test: deflake spartan-v5 merge-train CI timeouts#24401
Draft
AztecBot wants to merge 2 commits into
Draft
Conversation
2939c9c to
09c188b
Compare
This was referenced Jun 30, 2026
PhilWindle
pushed a commit
that referenced
this pull request
Jul 1, 2026
) ## Problem `L1Publisher integration › timeouts › cancels block proposal when the L2 slot ends` flakes in CI (failed run http://ci.aztec-labs.com/997f3ffdac47572f). The same commit passes on retry. The failure is `expect(sendRequestsResult).toBeNull()` receiving a successful propose result (`sentActions: ["propose"], successfulActions: ["propose"]`) instead of `null`: the propose got mined when the test required it to stay pending until the L2 slot timeout cancelled it. ## Root cause The three `timeouts` tests advance L1 blocks with `ethCheatCodes.mineEmptyBlock()` while a propose tx is in flight (anvil automine off). The old `mineEmptyBlock` worked by: read the pool, `anvil_dropAllTransactions`, `hardhat_mine`, then re-add the raw txs via `eth_sendRawTransaction`. That drop -> mine -> re-add sequence is **not atomic** against the publisher, which is concurrently monitoring its in-flight tx on a 1s loop. During the window between the drop and the mine, the tx reappears in the pool (the publisher re-broadcasts it, and `mineEmptyBlock` itself re-adds it on the prior iteration), so the supposedly-empty block is not empty — `hardhat_mine` sweeps the pending propose into it. The propose is mined, the monitor reports MINED, and `sendRequests` resolves with a successful proposal instead of `null`. Log evidence from the failed run: - `Sent L1 transaction 0xd2d5… nonce 0 isBlobTx:true` — one propose send, no speed-ups anywhere in the run. - `Mined 1 empty L1 block` (call #1), then on call #2: `Failed to re-add transaction: … Details: nonce too low`. The re-add of the propose failed because its nonce was **already consumed on-chain** — i.e. `mineEmptyBlock`'s own mine step had just included it. - `L1 transaction 0x08fa… with nonce 0 mined` blockNumber 36 — the propose mined into the "empty" block. This was reproduced deterministically at the anvil level: with the old implementation, dropping all txs, re-broadcasting (as the publisher monitor does), then `hardhat_mine` produces a block containing the pending tx. The previous explanation in this PR — "the re-added propose lands in the block whose timestamp == `txTimeoutAt`, and the monitor checks mined before timed-out" — was incomplete. The boundary timestamp is incidental (a consequence of the 12s L1-block grid and where the loop happened to be); the real determinant is that the tx is present in the pool when `mineEmptyBlock` mines. The monitor's mined-before-timeout ordering is a red herring here: once a tx is mined, it is mined regardless of check order. ## Fix Reimplement `mineEmptyBlock` so it never touches the mempool, eliminating the race for **all** callers rather than working around it in one test: - Temporarily lower the block gas limit below any transaction's 21000-gas intrinsic minimum, so `hardhat_mine` cannot include any pending tx; the pending tx stays in the pool untouched. - Restore the gas limit, then `anvil_reorg` the just-mined blocks into empty blocks at the restored gas limit (same height and timestamps). The reorg is required because anvil applies a new gas limit only to future blocks — without it the just-mined blocks would keep the tiny limit and a later `eth_call` against `latest` (whose gas is capped by the block gas limit) would revert with "intrinsic gas too high". This is strictly more robust than dropping the pending tx (the approach in #24401 / this PR's prior commit): it removes the drop/re-add window entirely instead of narrowing it, and it preserves pending txs (the existing `eth_cheat_codes.test.ts` contract). Consequently the test-local `mineBlockWithoutPendingTxs` workaround is removed and the three timeout tests call `mineEmptyBlock()` again — the change to the test file is a net revert. ## Verification - Deterministic RED: with the old implementation, drop-all -> re-broadcast -> `hardhat_mine` mines the pending tx into the "empty" block. - Unit `eth_cheat_codes.test.ts › mineEmptyBlock › mines an empty block while preserving pending transactions`: passes (block advances, block empty, pending tx preserved and mined later). - All three `timeouts` integration tests pass; the cancel test passes 3/3 repeats. - Full `ethereum/src/l1_tx_utils/l1_tx_utils.test.ts` (51 tests, 22 `mineEmptyBlock` calls incl. multi-block) passes — the L1TxUtils monitor suite that interacts most with `mineEmptyBlock`. ## Relation to #24401 #24401 independently landed the same drop-pending-txs-without-re-add workaround in these tests. This PR supersedes that approach by fixing the shared `mineEmptyBlock` helper, so no per-test workaround is needed.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
merge-train/spartan-v5; the PR is mergeable and this was CI failure, not a merge conflict.merge-queue-heavytimeout inbb-prover/src/avm_proving_tests/avm_check_circuit_token.test.ts.28460715568: shardx3-fullfailedsequencer-client/src/publisher/l1_publisher.integration.test.tsbecausemineEmptyBlock()can fail to preserve/drop a pending blob proposal transaction, allowing the proposal to mine before the slot-timeout assertion. The timeout tests now explicitly drop pending txs before manually mining an L1 block.Verification
git diff --check origin/merge-train/spartan-v5...HEADpassed../bootstrap.sh ciwas requested but this branch exits withUnknown command: ci../bootstrap.sh ci-full-no-test-cachereached the test engine, then local test-engine failed after a few seconds and interrupted the build fanout../bootstrap.sh build yarn-projectis blocked locally by Noir bootstrap failure.noir/./bootstrap.shfails because this container cannot reachhttps://index.crates.io/config.jsonto installjust@1.42.4.yarn-project/./bootstrap.shis blocked by the missing Noir portal package@aztec/noir-acvm_js, so the focused Jest test cannot start locally (@swc-node/registeris absent).