-
-
Notifications
You must be signed in to change notification settings - Fork 355
fix: context propagation fifo #1530
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
e25fafe to
2bc3c08
Compare
2bc3c08 to
f793125
Compare
|
If you agree to this pr, can we put this in the next release? |
|
@tomazfernandes after your review we can include this in 4.0.0-M2 4.0.0-M1 is to be released really soon and we sadly won't have time to review and merge before release. |
|
Hey @igormq, thanks for the detailed analysis and PR. You're correct that, when The implementation you propose addresses that, but given A short-term workaround is to explicitly set For context, are you using Let me know your thoughts, thanks. |
|
Hey @tomazfernandes , thanks for the detailed review and for identifying the issue! You're absolutely right about the problem - my implementation creates the ObservationContext before the FIFO-specific headers are added by the framework, which means those headers are missing from the observation. To answer your question about the use case: I'm using send inside a synchronous flow where I care about whether the message was sent successfully (so not fire-and-forget). I understand the short-term workaround you suggested (explicitly setting TemplateContentBasedDeduplication to ENABLED or DISABLED), but this would require users to be aware of this limitation and work around it, which isn't ideal. The proper fix is to:
I'll update the PR with this approach. Let me know if you have any concerns with this solution! |
📢 Type of change
📜 Description
Fixes trace context propagation for FIFO queues when using
SqsTemplate.sendAsync(). The issue occurred on the first call to a FIFO queue where trace headers (traceparent, tracestate, etc.) were not propagated due to observation creation happening on the wrong thread after an async operation.💡 Motivation and Context
When using
SqsTemplate.sendAsync()to send messages to a FIFO queue (.fifo), the trace context (traceId/spanId from Micrometer Observation or distributed tracing) was not propagated on the first call, but worked correctly on subsequent calls when queue attributes were cached.Observed Behavior
Root Cause Analysis
Since Spring Cloud AWS 3.0.1 (PR #799),
SqsTemplateautomatically detects whether a FIFO queue hasContentBasedDeduplicationenabled. This detection is asynchronous and involves calling the AWS SQS API to fetch queue attributes.The Bug
The observation (which captures trace context and adds trace headers) was created in
AbstractMessagingTemplate.observeAndSendAsync(), which is called in athenComposecallback:Why It Failed on First Call But Worked on Second Call
CompletableFuture.thenCompose()behavior:sdk-async-*), so thethenComposecallback runs on that thread. The observation is created on the SDK thread, which doesn't have the calling thread's trace context.thenComposeruns synchronously on the calling thread, which has the correct trace context.Move observation creation and trace header capture to before the async preprocessing, ensuring it always happens on the calling thread with the correct trace context:
Key Changes
File:
spring-cloud-aws-sqs/src/main/java/io/awspring/cloud/sqs/operations/AbstractMessagingTemplate.javastartObservation()) to execute beforepreProcessMessageForSendAsync()doSendAndCompleteObservation()helper method that sends and completes an existing observation (rather than creating a new one)observeAndSendAsync()💚 How did you test it?
sendAsync_toFifoQueue_shouldPropagateObservationScopeOnFirstCallwill fail.📝 Checklist
🔮 Next steps