Discard an un-appliable TRUNCATE instead of looping the apply worker #505
Preparatory refactor with no change in behaviour: move the relation resolution (including partition expansion) and the ExecuteTruncateGuts() call out of handle_truncate() into a new apply_truncate() helper. This isolates the part of the TRUNCATE apply path that can throw, so the next commit can wrap it in a subtransaction for DISCARD-mode exception handling without obscuring the material change.
In DISCARD mode handle_truncate() had no per-action exception handling. On the retry pass it re-ran the truncate directly, so a failure - most commonly a relation that does not exist on the subscriber - escaped as an ERROR. With use_try_block set and xact_had_exception clear, apply_work()'s outer handler re-throws to exit the worker; the background worker then restarts, replays from the same position, and fails identically. The result is an apply-worker crash/restart loop that stalls every later change from that origin, sync_event included. Wrap apply_truncate() in a subtransaction on the DISCARD retry path, mirroring handle_insert/update/delete and handle_sql_or_exception(): on failure, roll back the savepoint, set xact_had_exception, log the discarded TRUNCATE, and carry on. TRANSDISCARD and SUB_DISABLE keep their unconditional skip - the whole transaction is being abandoned, so the truncate must not be attempted.
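The per-action handling described above can be sketched as a small standalone model. This is not Spock code: `setjmp`/`longjmp` stand in for PostgreSQL's `PG_TRY`/`PG_CATCH` and `ereport(ERROR)`, the comments name the subtransaction calls (`BeginInternalSubTransaction()` and friends) that the stand-ins represent, and `ApplyState`, `ExceptionMode`, `raise_error()` and `worker_survives()` are hypothetical names introduced for illustration. It also assumes that the "unconditional skip" in `TRANSDISCARD` and `SUB_DISABLE` applies on the retry pass.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Exception modes named in the text; this enum is a hypothetical stand-in. */
typedef enum { MODE_DISCARD, MODE_TRANSDISCARD, MODE_SUB_DISABLE } ExceptionMode;

typedef struct
{
    ExceptionMode mode;
    bool use_try_block;      /* true on the retry pass */
    bool xact_had_exception; /* set once an action has been discarded */
    int  discarded;          /* stand-in for rows logged to the exception log */
} ApplyState;

/* Innermost active error handler; models the PG_TRY handler stack. */
static jmp_buf *handler = NULL;

/* Stand-in for ereport(ERROR): unwind to the innermost handler. */
static void raise_error(void)
{
    longjmp(*handler, 1);
}

/* The part of the path that can throw (relation lookup + truncate). */
static void apply_truncate(bool relation_exists)
{
    if (!relation_exists)
        raise_error();
}

static void handle_truncate(ApplyState *st, bool relation_exists)
{
    if (!st->use_try_block)
    {
        /* First pass: no guard, an error escapes to the outer handler. */
        apply_truncate(relation_exists);
        return;
    }

    /* Assumed reading: TRANSDISCARD / SUB_DISABLE never attempt it on retry. */
    if (st->mode != MODE_DISCARD)
        return;

    jmp_buf  sub;
    jmp_buf *outer = handler;

    handler = &sub;              /* models BeginInternalSubTransaction() + PG_TRY */
    if (setjmp(sub) == 0)
    {
        apply_truncate(relation_exists);
        handler = outer;         /* models ReleaseCurrentSubTransaction() */
    }
    else
    {
        handler = outer;         /* models RollbackAndReleaseCurrentSubTransaction() */
        st->xact_had_exception = true;
        st->discarded++;         /* models logging the discarded TRUNCATE */
    }
}

/* Models the outer handler: false means the error escaped and the worker exits. */
static bool worker_survives(ApplyState *st, bool relation_exists)
{
    jmp_buf top;

    handler = &top;
    if (setjmp(top) != 0)
    {
        handler = NULL;
        return false;
    }
    handle_truncate(st, relation_exists);
    handler = NULL;
    return true;
}
```

With a missing relation, a `MODE_DISCARD` retry pass returns `true` from `worker_survives()` and leaves `xact_had_exception` set, whereas the unguarded first pass returns `false`. The guard is deliberately scoped to the one call that can throw, which is why the preparatory refactor isolates it first.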
Add TAP test 031, exercising the fix from the previous commit. A single replicated transaction mixes INSERT/UPDATE/DELETE on one table with a TRUNCATE of another table that has been dropped on the subscriber - a missing relation, modelled by dropping it manually. In DISCARD mode the test asserts that only the TRUNCATE is discarded (and logged to spock.exception_log), that the row changes in the same transaction still apply, and that the subscription keeps replicating, i.e. the missing relation no longer sends the apply worker into a crash/restart loop.
Problem

In `DISCARD` exception mode, `handle_truncate()` had no per-action exception handling. On the retry (`use_try_block`) pass it re-ran the truncate directly, so any failure (most commonly a relation that does not exist on the subscriber) escaped as an `ERROR`.

With `use_try_block` set and `xact_had_exception` still clear, `apply_work()`'s outer handler re-throws to exit the worker; the background worker then restarts, replays from the same position, and fails identically. The result is an apply-worker crash/restart loop that stalls every later change from that origin, `spock.sync_event()` included, so the whole transaction (its row changes too) never lands and the stream wedges.

The row handlers (`handle_insert/update/delete`) and the queued-SQL handler (`handle_sql_or_exception()`) already wrap their work in a subtransaction for exactly this reason. `TRUNCATE` was the one directly-applied operation missing it: `ExecuteTruncateGuts()` is called directly (not via `ProcessUtility`), and on a missing relation `handle_truncate()` simply `ereport`ed, with no savepoint to roll back to.

Fix

1. Extract `apply_truncate()`: a preparatory refactor with no behaviour change, isolating the relation resolution (including partition expansion) and the `ExecuteTruncateGuts()` call, i.e. the part of the path that can throw.
2. Discard the failing TRUNCATE in `DISCARD` mode: on the retry path, run `apply_truncate()` inside an internal subtransaction. On failure, roll back the savepoint, set `xact_had_exception`, log the discarded TRUNCATE to `spock.exception_log`, and carry on. This mirrors the pattern already used by the row and SQL handlers.

`TRANSDISCARD` and `SUB_DISABLE` keep their unconditional skip: the whole transaction is being abandoned, so the truncate must not be attempted at all.

Behaviour

A TRUNCATE that cannot be applied on the subscriber is now discarded individually, recorded in `spock.exception_log` with `operation = 'TRUNCATE'`, while the other changes in the same transaction still apply and the apply worker keeps running. No more crash/restart loop on a missing relation.
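
The transaction-level effect can be illustrated with a toy model of one `DISCARD`-mode retry pass. Nothing here touches PostgreSQL or Spock: `Change`, `ReplayResult`, `replay_discard()` and `starts_until_commit()` are hypothetical names, and the `truncate_guarded` flag simply switches between the old behaviour (a failing TRUNCATE aborts the whole replay) and the new one (it is discarded on its own).

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { OP_INSERT, OP_UPDATE, OP_DELETE, OP_TRUNCATE } OpKind;

/* One replicated change; can_apply is false for e.g. a missing relation. */
typedef struct { OpKind kind; bool can_apply; } Change;

typedef struct { int applied; int discarded; bool committed; } ReplayResult;

/*
 * Model one DISCARD-mode retry pass over a transaction.
 * Row changes are always guarded; TRUNCATE is guarded only if
 * truncate_guarded is true (the behaviour this PR describes).
 */
static ReplayResult replay_discard(const Change *xact, size_t n,
                                   bool truncate_guarded)
{
    ReplayResult r = { 0, 0, false };

    for (size_t i = 0; i < n; i++)
    {
        if (xact[i].can_apply)
        {
            r.applied++;
            continue;
        }

        bool guarded = (xact[i].kind != OP_TRUNCATE) || truncate_guarded;

        if (!guarded)
        {
            /* Error escapes: the transaction aborts and nothing lands. */
            r.applied = 0;
            r.discarded = 0;
            return r;
        }
        r.discarded++;          /* only this action is dropped and logged */
    }

    r.committed = true;
    return r;
}

/*
 * Count how many worker starts it takes to get past the transaction.
 * Each restart replays from the same position, so an unguarded failure
 * repeats identically; -1 means still stuck after max_starts attempts.
 */
static int starts_until_commit(const Change *xact, size_t n,
                               bool truncate_guarded, int max_starts)
{
    for (int start = 1; start <= max_starts; start++)
    {
        if (replay_discard(xact, n, truncate_guarded).committed)
            return start;
    }
    return -1;
}
```

For a transaction of three applicable row changes followed by a TRUNCATE of a missing relation, the guarded replay commits with three changes applied and one discarded, while the unguarded replay never commits no matter how many restarts are allowed. This is the same shape as the scenario TAP test 031 checks against a real subscriber.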