Exception Handling Improvements: #287
Merged
mason-sharp
reviewed
Dec 10, 2025
This commit fixes several critical issues to make Spock more stable and prevent
apply worker death loops (kill-restart-kill cycles). It also eliminates unhelpful
errors such as "exception handling had no exception(s)".
1. Fix TRANSDISCARD/SUB_DISABLE handling during commit phase in transaction retry mode
Problem: When a transaction errors on the first attempt but succeeds on retry in
these modes, committing it violates TRANSDISCARD semantics (all-or-nothing at the
transaction level). Previously the apply worker committed the transaction and then
terminated with an unhelpful error message, leading to a death loop.
Solution: Detect this condition (use_try_block=true with no exceptions during
replay) before commit. Abort the current transaction (all operations already
rolled back in subtransactions), start a new transaction, log the discard to
exception_log with the original error message and operation type, then commit
the log entry.
- SUB_DISABLE mode: Throw error to trigger subscription disable in parent PG_CATCH
- TRANSDISCARD mode: Use goto transdiscard_skip_commit to update progress and
continue, ensuring transaction is fully discarded with proper audit trail
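The decision described in item 1 can be sketched as a small standalone function. This is only an illustration of the branching, not Spock's code: the enum names, the function name, and the "normal path" action are hypothetical, and the real apply worker performs the abort, logging, and commit steps inside PostgreSQL transaction machinery that is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the exception handling modes. */
typedef enum { MODE_DISCARD, MODE_TRANSDISCARD, MODE_SUB_DISABLE } ExceptionMode;

/* What the commit phase should do next (names are illustrative only). */
typedef enum {
    ACT_NORMAL_PATH,          /* condition not met: existing handling applies */
    ACT_LOG_AND_SKIP_COMMIT,  /* TRANSDISCARD: log the discard, update progress */
    ACT_LOG_AND_RAISE_ERROR   /* SUB_DISABLE: log the discard, then raise an error */
} CommitAction;

static CommitAction
decide_commit_action(ExceptionMode mode, bool use_try_block, int replay_exceptions)
{
    assert(replay_exceptions >= 0);

    /* The case from the commit message: retry mode, yet the replay hit no error. */
    bool retried_clean = use_try_block && replay_exceptions == 0;

    if (!retried_clean)
        return ACT_NORMAL_PATH;

    switch (mode)
    {
        case MODE_TRANSDISCARD:
            return ACT_LOG_AND_SKIP_COMMIT;
        case MODE_SUB_DISABLE:
            return ACT_LOG_AND_RAISE_ERROR;
        default:
            return ACT_NORMAL_PATH;
    }
}
```

In both special cases the transaction's work is never committed; the only difference is whether the worker continues (TRANSDISCARD) or raises an error so the subscription gets disabled (SUB_DISABLE).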
2. Prevent NULL error messages in exception_log
Added fallback mechanism to initial_error_message for INSERT/UPDATE/DELETE/SQL
operations. Ensures context is logged even when operation succeeds on retry in
DISCARD mode.
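A minimal sketch of such a fallback, assuming the log writer receives both the error from the current attempt (possibly absent when the retry succeeded) and the saved initial_error_message. The function names and the final placeholder string are assumptions made for illustration; the actual fallback text in Spock may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when the string is present and non-empty. */
static bool
has_text(const char *s)
{
    return s != NULL && strlen(s) > 0;
}

/*
 * Pick the message to store in exception_log. Prefer the error from the
 * current attempt, fall back to the message saved from the first failure,
 * and never return NULL.
 */
static const char *
message_for_log(const char *current_error, const char *initial_error_message)
{
    const char *msg;

    if (has_text(current_error))
        msg = current_error;
    else if (has_text(initial_error_message))
        msg = initial_error_message;
    else
        msg = "unknown error";  /* hypothetical last-resort placeholder */

    assert(msg != NULL);        /* invariant: the log never sees NULL */
    return msg;
}
```

When the operation succeeds on retry there is no current error, so the second branch is the one that keeps the log entry informative.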
3. Eliminate "(unknown action)" in error contexts
Set errcallback_arg.action_name in all protocol message handlers that were missing it:
ORIGIN, COMMIT_ORDER, RELATION, DELETE, STARTUP, MESSAGE
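To illustrate why a missing action_name surfaces as "(unknown action)", here is a simplified model of an error context callback. The struct is a stand-in for errcallback_arg with only the one field mentioned above, and the "during %s" wording and function names are assumptions rather than Spock's actual format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for the callback argument; only action_name is modeled. */
typedef struct
{
    const char *action_name;
} ErrCallbackArgSketch;

/* Build the context line; a handler that never set the name yields the placeholder. */
static void
format_error_context(const ErrCallbackArgSketch *arg, char *buf, size_t buflen)
{
    assert(arg != NULL && buf != NULL && buflen > 0);
    snprintf(buf, buflen, "during %s",
             arg->action_name != NULL ? arg->action_name : "(unknown action)");
}

/* A handler following the fix: set the name before doing any work that may fail. */
static void
handle_origin_sketch(ErrCallbackArgSketch *arg)
{
    arg->action_name = "ORIGIN";
    /* ... message processing would follow here ... */
}
```

Setting the name as the first statement of each handler means any error raised later in that handler carries the right context.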
4. Track parent operation for transaction discards
Added initial_operation field to SpockExceptionLog structure to capture the
operation type (INSERT/UPDATE/DELETE/SQL) that caused the initial exception.
Shows which specific DML caused the transaction discard in TRANSDISCARD mode.
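One way to picture the new field is a record that remembers only the first failing operation of a transaction. The struct below is a hypothetical reduction of SpockExceptionLog to two members, and the keep-the-first rule and fixed buffer size are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { OP_NONE, OP_INSERT, OP_UPDATE, OP_DELETE, OP_SQL } OperationType;

/* Hypothetical reduction of the exception log state to the two relevant fields. */
typedef struct
{
    OperationType initial_operation;
    char          initial_error_message[256];
} ExceptionLogSketch;

/* Record the failing operation once; later failures do not overwrite it. */
static void
record_initial_failure(ExceptionLogSketch *log, OperationType op, const char *msg)
{
    assert(log != NULL && msg != NULL && strlen(msg) > 0);

    if (log->initial_operation != OP_NONE)
        return;

    log->initial_operation = op;
    snprintf(log->initial_error_message, sizeof log->initial_error_message, "%s", msg);
}

/* Text form of the operation, as it might appear in a log entry. */
static const char *
operation_name(OperationType op)
{
    switch (op)
    {
        case OP_INSERT: return "INSERT";
        case OP_UPDATE: return "UPDATE";
        case OP_DELETE: return "DELETE";
        case OP_SQL:    return "SQL";
        default:        return "NONE";
    }
}
```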
Regression Test Improvements:
Make regression tests deterministic and cover new behavior
- Normalize OIDs in spock.exception_log.error_message output using regexp_replace()
so expected output is stable across test runs
- Add TAP test 013_exception_handling to exercise TRANSDISCARD and SUB_DISABLE modes
end-to-end, verify exception_log entries have non-NULL error messages, and assert
a single SUB_DISABLE entry when subscription is disabled on conflict
Signed-off-by: Asif Rehman <asifr@pgedge.com>
b3eee26 to 1268356
When a transaction was successfully skipped using skip_lsn in SUB_DISABLE mode, the subscription would incorrectly get disabled again instead of continuing to replicate. This happened because the exception handling state was not cleared after a successful skip.
Additionally, there was an LSN mismatch when comparing skip_lsn:
- skip_lsn is set using replorigin_session_origin_lsn (BEGIN commit_lsn)
- But clear_subscription_skip_lsn() was called with end_lsn (COMMIT end_lsn)
- These LSNs are different, causing a mismatch warning.
mason-sharp
reviewed
Dec 15, 2025
mason-sharp
approved these changes
Dec 17, 2025