-
Notifications
You must be signed in to change notification settings - Fork 156
DRAFT: tool composition integration tests #2922
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Tests cover core DAG execution patterns: - Sequential two-step workflow - Parallel fan-out with aggregation - Diamond DAG (diverge/converge) - Parameter template expansion Uses table-driven approach for maintainability. Additional tests for conditionals and nested data blocked by issue #2669. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Extends the vMCP composer integration tests with error handling scenarios: - SingleStepToolError: basic error propagation - AbortMode_StopsOnFirstError: abort mode stops on first failure - ContinueMode_PartialSuccess: continue mode returns partial results - MultipleParallelErrors: handles multiple concurrent failures Also adds test infrastructure for error scenarios: - ErrorHandler field on BackendTool for returning errors - Helper functions: NewErrorBackendTool, NewSlowBackendTool, NewTransientBackendTool - AssertToolCallError helper for asserting error responses 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Updates integration tests to use the proper OutputConfig feature instead of relying on the plain text workaround. This aligns with the new structured output support added in commit 51ac7d8. - Add Output config to all successful workflow test cases - Remove outdated note about issue #2669 workaround - Error test cases intentionally left without Output config 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Adds 4 new integration tests covering edge cases identified during QE analysis: - StepTimeoutDuringRetry: Verifies timeout is respected during retry attempts - ConditionalStepReferencingSkippedStep: Documents behavior when template references output from a skipped step (returns <no value>) - OutputMissingRequiredField: Documents that Required field doesn't validate templated values are non-empty (returns <no value>) - ContinueModeWithFailedDependency: Documents that dependent steps still run when their dependency fails in continue mode (receives <no value>) Tests include DISCUSSION comments noting behaviors that may warrant review. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2922 +/- ##
==========================================
- Coverage 56.36% 56.25% -0.12%
==========================================
Files 323 323
Lines 31763 31828 +65
==========================================
Hits 17904 17904
- Misses 12330 12395 +65
Partials 1529 1529 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| // Step 1 has a condition that evaluates to false (gets skipped). | ||
| // Step 2 depends on Step 1 and tries to use its output in arguments. | ||
| // | ||
| // DISCUSSION: When a step is skipped, referencing its output via template |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The step.OnError.Action config allows continue or abort, which implies whether the step is required for success. Similarly the condition implies whether the step is required. IMO if a downstream tool uses an input from a failed/skipped step, it should either:
- require a default value
- receive an empty value (instead of "<no value>")
I don't think either of these is implemented. for 1) we could validate at the webhook, for 2) we could replace the at tool execution time.
I tend to think (1) is a better option, since we it can be validated early and provides more explicit behavior. Plus we don't have to guess what "<no value>" should be for non-string types.
| // field (e.g., {{.steps.step1.output.nonexistent_field}}), it expands to | ||
| // "<no value>" (Go template default) WITHOUT failing. | ||
| // | ||
| // DISCUSSION: This is potentially surprising behavior that may warrant a fix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this would be surprising and we should change the behavior. Options I can think of
- at composite tool validation time (in webhook), ensure required fields either a) reference values that cannot be missing (i.e. upstream step has
OnError.Action = 'abort') or b) have a default value - fail at tool execution time if template expansion fails (
<no value>) and there is no default value
Here again I think (1) is better since it can be validated early.
| // dependency failed. They receive "<no value>" placeholder for the failed | ||
| // step's output. Step B is NOT skipped - it executes with the placeholder. | ||
| // | ||
| // DISCUSSION: This may or may not be desired behavior. Alternative approaches: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioned in a previous comment, I think a good option is to require a 'default' value for any references to outputs that come from an OnError.Action = 'continue' step (validated in webhook)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as for handling dependsOn for failed upstream steps using "continue", we could
- have an implied
Conditionfor the downstream step on the upstream succeeding (as you mentioned), or - require a default value for the downstream input that refers to steps that "continue"
I'm wary of an implied Condition (1) because it's less flexible and less explicit behavior. (2) aligns better with what I'm thinking for the other discussion topics in this PR.
|
I noticed SO if the context is cancleleed during the some deloya, the sleep continues no matter what. this could cuase some tests hand unnecessarily Should it be: Handler: func(ctx context.Context, _ map[string]any) string {
select {
case <-time.After(delay):
return result
case <-ctx.Done():
return "" // or return maybe some kind of error
}
} |
Not finished yet, want to get early feedback