DRAFT: tool composition integration tests #2922

jhrozek · 2025-12-05T15:49:34Z

Not finished yet, want to get early feedback

Tests cover core DAG execution patterns: - Sequential two-step workflow - Parallel fan-out with aggregation - Diamond DAG (diverge/converge) - Parameter template expansion Uses table-driven approach for maintainability. Additional tests for conditionals and nested data blocked by issue #2669. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Extends the vMCP composer integration tests with error handling scenarios: - SingleStepToolError: basic error propagation - AbortMode_StopsOnFirstError: abort mode stops on first failure - ContinueMode_PartialSuccess: continue mode returns partial results - MultipleParallelErrors: handles multiple concurrent failures Also adds test infrastructure for error scenarios: - ErrorHandler field on BackendTool for returning errors - Helper functions: NewErrorBackendTool, NewSlowBackendTool, NewTransientBackendTool - AssertToolCallError helper for asserting error responses 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Updates integration tests to use the proper OutputConfig feature instead of relying on the plain text workaround. This aligns with the new structured output support added in commit 51ac7d8. - Add Output config to all successful workflow test cases - Remove outdated note about issue #2669 workaround - Error test cases intentionally left without Output config 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Adds 4 new integration tests covering edge cases identified during QE analysis: - StepTimeoutDuringRetry: Verifies timeout is respected during retry attempts - ConditionalStepReferencingSkippedStep: Documents behavior when template references output from a skipped step (returns <no value>) - OutputMissingRequiredField: Documents that Required field doesn't validate templated values are non-empty (returns <no value>) - ContinueModeWithFailedDependency: Documents that dependent steps still run when their dependency fails in continue mode (receives <no value>) Tests include DISCUSSION comments noting behaviors that may warrant review. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

codecov · 2025-12-05T15:52:00Z

Codecov Report

❌ Patch coverage is 0% with 67 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.25%. Comparing base (afe9b41) to head (ecd6480).
⚠️ Report is 7 commits behind head on main.

Files with missing lines	Patch %	Lines
test/integration/vmcp/helpers/backend.go	0.00%	45 Missing ⚠️
test/integration/vmcp/helpers/mcp_client.go	0.00%	11 Missing ⚠️
test/integration/vmcp/helpers/vmcp_server.go	0.00%	11 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2922      +/-   ##
==========================================
- Coverage   56.36%   56.25%   -0.12%     
==========================================
  Files         323      323              
  Lines       31763    31828      +65     
==========================================
  Hits        17904    17904              
- Misses      12330    12395      +65     
  Partials     1529     1529

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

tgrunnagle · 2025-12-05T17:08:57Z

test/integration/vmcp/composer_basic_test.go

+		// Step 1 has a condition that evaluates to false (gets skipped).
+		// Step 2 depends on Step 1 and tries to use its output in arguments.
+		//
+		// DISCUSSION: When a step is skipped, referencing its output via template


The step.OnError.Action config allows continue or abort, which implies whether the step is required for success. Similarly the condition implies whether the step is required. IMO if a downstream tool uses an input from a failed/skipped step, it should either:

require a default value

receive an empty value (instead of "<no value>")
I don't think either of these is implemented. for 1) we could validate at the webhook, for 2) we could replace the at tool execution time.

I tend to think (1) is a better option, since we it can be validated early and provides more explicit behavior. Plus we don't have to guess what "<no value>" should be for non-string types.

tgrunnagle · 2025-12-05T17:14:22Z

test/integration/vmcp/composer_basic_test.go

+		// field (e.g., {{.steps.step1.output.nonexistent_field}}), it expands to
+		// "<no value>" (Go template default) WITHOUT failing.
+		//
+		// DISCUSSION: This is potentially surprising behavior that may warrant a fix.


I agree this would be surprising and we should change the behavior. Options I can think of

at composite tool validation time (in webhook), ensure required fields either a) reference values that cannot be missing (i.e. upstream step has OnError.Action = 'abort') or b) have a default value

fail at tool execution time if template expansion fails (<no value>) and there is no default value

Here again I think (1) is better since it can be validated early.

tgrunnagle · 2025-12-05T17:15:55Z

test/integration/vmcp/composer_basic_test.go

+		// dependency failed. They receive "<no value>" placeholder for the failed
+		// step's output. Step B is NOT skipped - it executes with the placeholder.
+		//
+		// DISCUSSION: This may or may not be desired behavior. Alternative approaches:


Mentioned in a previous comment, I think a good option is to require a 'default' value for any references to outputs that come from an OnError.Action = 'continue' step (validated in webhook)

as for handling dependsOn for failed upstream steps using "continue", we could

have an implied Condition for the downstream step on the upstream succeeding (as you mentioned), or

require a default value for the downstream input that refers to steps that "continue"

I'm wary of an implied Condition (1) because it's less flexible and less explicit behavior. (2) aligns better with what I'm thinking for the other discussion topics in this PR.

slyt3 · 2025-12-06T20:33:43Z

I noticed NewSlowBackendTool it just doesn't respect context cancellation it just use time.Sleep(delay) but it just simply ignores the context parameter.

SO if the context is cancleleed during the some deloya, the sleep continues no matter what. this could cuase some tests hand unnecessarily

Should it be:

Handler: func(ctx context.Context, _ map[string]any) string {
    select {
    case <-time.After(delay):
        return result
    case <-ctx.Done():
        return "" // or return maybe some kind of error
    }
}

jhrozek and others added 4 commits December 5, 2025 15:48

github-actions bot added the size/L Large PR: 600-999 lines changed label Dec 5, 2025

tgrunnagle reviewed Dec 5, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

DRAFT: tool composition integration tests #2922

DRAFT: tool composition integration tests #2922

jhrozek commented Dec 5, 2025

Uh oh!

codecov bot commented Dec 5, 2025 •

edited

Loading

Uh oh!

tgrunnagle Dec 5, 2025

Uh oh!

tgrunnagle Dec 5, 2025

Uh oh!

tgrunnagle Dec 5, 2025

Uh oh!

tgrunnagle Dec 5, 2025

Uh oh!

slyt3 commented Dec 6, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

DRAFT: tool composition integration tests #2922

Are you sure you want to change the base?

DRAFT: tool composition integration tests #2922

Conversation

jhrozek commented Dec 5, 2025

Uh oh!

codecov bot commented Dec 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

tgrunnagle Dec 5, 2025

Choose a reason for hiding this comment

Uh oh!

tgrunnagle Dec 5, 2025

Choose a reason for hiding this comment

Uh oh!

tgrunnagle Dec 5, 2025

Choose a reason for hiding this comment

Uh oh!

tgrunnagle Dec 5, 2025

Choose a reason for hiding this comment

Uh oh!

slyt3 commented Dec 6, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov bot commented Dec 5, 2025 •

edited

Loading