Skip to content

Conversation

@jhrozek
Copy link
Contributor

@jhrozek jhrozek commented Dec 5, 2025

Not finished yet, want to get early feedback

jhrozek and others added 4 commits December 5, 2025 15:48
Tests cover core DAG execution patterns:
- Sequential two-step workflow
- Parallel fan-out with aggregation
- Diamond DAG (diverge/converge)
- Parameter template expansion

Uses table-driven approach for maintainability. Additional tests
for conditionals and nested data blocked by issue #2669.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Extends the vMCP composer integration tests with error handling scenarios:
- SingleStepToolError: basic error propagation
- AbortMode_StopsOnFirstError: abort mode stops on first failure
- ContinueMode_PartialSuccess: continue mode returns partial results
- MultipleParallelErrors: handles multiple concurrent failures

Also adds test infrastructure for error scenarios:
- ErrorHandler field on BackendTool for returning errors
- Helper functions: NewErrorBackendTool, NewSlowBackendTool, NewTransientBackendTool
- AssertToolCallError helper for asserting error responses

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Updates integration tests to use the proper OutputConfig feature instead
of relying on the plain text workaround. This aligns with the new
structured output support added in commit 51ac7d8.

- Add Output config to all successful workflow test cases
- Remove outdated note about issue #2669 workaround
- Error test cases intentionally left without Output config

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Adds 4 new integration tests covering edge cases identified during QE analysis:

- StepTimeoutDuringRetry: Verifies timeout is respected during retry attempts
- ConditionalStepReferencingSkippedStep: Documents behavior when template
  references output from a skipped step (returns <no value>)
- OutputMissingRequiredField: Documents that Required field doesn't validate
  templated values are non-empty (returns <no value>)
- ContinueModeWithFailedDependency: Documents that dependent steps still run
  when their dependency fails in continue mode (receives <no value>)

Tests include DISCUSSION comments noting behaviors that may warrant review.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@github-actions github-actions bot added the size/L Large PR: 600-999 lines changed label Dec 5, 2025
@codecov
Copy link

codecov bot commented Dec 5, 2025

Codecov Report

❌ Patch coverage is 0% with 67 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.25%. Comparing base (afe9b41) to head (ecd6480).
⚠️ Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
test/integration/vmcp/helpers/backend.go 0.00% 45 Missing ⚠️
test/integration/vmcp/helpers/mcp_client.go 0.00% 11 Missing ⚠️
test/integration/vmcp/helpers/vmcp_server.go 0.00% 11 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2922      +/-   ##
==========================================
- Coverage   56.36%   56.25%   -0.12%     
==========================================
  Files         323      323              
  Lines       31763    31828      +65     
==========================================
  Hits        17904    17904              
- Misses      12330    12395      +65     
  Partials     1529     1529              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

// Step 1 has a condition that evaluates to false (gets skipped).
// Step 2 depends on Step 1 and tries to use its output in arguments.
//
// DISCUSSION: When a step is skipped, referencing its output via template
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The step.OnError.Action config allows continue or abort, which implies whether the step is required for success. Similarly the condition implies whether the step is required. IMO if a downstream tool uses an input from a failed/skipped step, it should either:

  1. require a default value
  2. receive an empty value (instead of "<no value>")
    I don't think either of these is implemented. for 1) we could validate at the webhook, for 2) we could replace the at tool execution time.

I tend to think (1) is a better option, since we it can be validated early and provides more explicit behavior. Plus we don't have to guess what "<no value>" should be for non-string types.

// field (e.g., {{.steps.step1.output.nonexistent_field}}), it expands to
// "<no value>" (Go template default) WITHOUT failing.
//
// DISCUSSION: This is potentially surprising behavior that may warrant a fix.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this would be surprising and we should change the behavior. Options I can think of

  1. at composite tool validation time (in webhook), ensure required fields either a) reference values that cannot be missing (i.e. upstream step has OnError.Action = 'abort') or b) have a default value
  2. fail at tool execution time if template expansion fails (<no value>) and there is no default value

Here again I think (1) is better since it can be validated early.

// dependency failed. They receive "<no value>" placeholder for the failed
// step's output. Step B is NOT skipped - it executes with the placeholder.
//
// DISCUSSION: This may or may not be desired behavior. Alternative approaches:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned in a previous comment, I think a good option is to require a 'default' value for any references to outputs that come from an OnError.Action = 'continue' step (validated in webhook)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as for handling dependsOn for failed upstream steps using "continue", we could

  1. have an implied Condition for the downstream step on the upstream succeeding (as you mentioned), or
  2. require a default value for the downstream input that refers to steps that "continue"

I'm wary of an implied Condition (1) because it's less flexible and less explicit behavior. (2) aligns better with what I'm thinking for the other discussion topics in this PR.

@slyt3
Copy link
Contributor

slyt3 commented Dec 6, 2025

I noticed NewSlowBackendTool it just doesn't respect context cancellation it just use time.Sleep(delay) but it just simply ignores the context parameter.

SO if the context is cancleleed during the some deloya, the sleep continues no matter what. this could cuase some tests hand unnecessarily

Should it be:

Handler: func(ctx context.Context, _ map[string]any) string {
    select {
    case <-time.After(delay):
        return result
    case <-ctx.Done():
        return "" // or return maybe some kind of error
    }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Large PR: 600-999 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants