feat(cache): stamp S3 content ETag onto disk copy at write time#326
feat(cache): stamp S3 content ETag onto disk copy at write time#326worstell wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 275cdcd5cb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| w.headers = http.Header{} | ||
| } | ||
| w.headers.Set(ContentETagHeader, etag) | ||
| return errors.WithStack(w.disk.db.set(w.key, w.namespace, w.expiresAt, w.headers)) |
There was a problem hiding this comment.
Guard ETag stamps against replaced disk entries
When two tiered Creates for the same key close concurrently, the first disk writer can commit its file, wait for its S3 upload, and then run this stamp after a later writer has already renamed a different file over w.path. Because this only checks that some file exists before rewriting metadata with this writer's S3 ETag, it can attach an old pin to newer disk bytes; a client holding the old pin would then be served the wrong range from disk instead of falling through to S3 and getting ErrPinStale. The stamp needs a generation/same-file check, or it must happen while the disk commit is still known to be the current entry.
Useful? React with 👍 / 👎.
A tiered Create writes the same byte stream to disk and S3, but only an S3-backed Open backfill persisted the content ETag, so freshly generated disk entries weren't pinnable and pinned ranges proxied through S3 until a backfill ran. Capture S3's ETag from PutObject and stamp it onto the disk copy on Close, so the disk tier serves pinned ranges immediately. Best effort: a stamp failure only falls pinned reads back to S3. Amp-Thread-ID: https://ampcode.com/threads/T-019eae3d-a2fd-70ca-80fe-a7536ec6748c Co-authored-by: Amp <amp@ampcode.com>
275cdcd to
84a6492
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 84a6492f3c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if r, ok := w.(contentETagReceiver); ok { | ||
| r.setContentETag(etag) |
There was a problem hiding this comment.
Avoid minting pins from trailing disk tiers
When the tiers are ordered S3+Disk, this stamps the disk tier even though it sits after the S3 provider. Tiered.Pin scans tiers in reverse, so after a tiered write it will return the local disk ETag before consulting S3; if another replica later overwrites S3, re-pinning on this cache keeps returning the stale disk pin and OpenPinnedRange hits S3 first with ErrPinStale, preventing clients from obtaining the current object. Only stamp disk receivers that precede the authoritative provider, or make Pin skip disk as an authoritative tier.
Useful? React with 👍 / 👎.
Stacks on #324.
A tiered
Createfeeds the same byte stream to disk and S3, but only an S3-backedOpenbackfill persisted the content ETag onto the disk copy. So freshly generated disk entries weren't pinnable, and pinned ranges proxied through S3 until a backfill happened to run — defeating disk-first range serving for hot repos.This captures S3's ETag from
PutObjectand stamps it onto the disk copy when the tiered writer closes. Both tiers were written from the same bytes, so the ETag is valid for the disk copy and it serves pinned ranges immediately. Stamping is best effort: a failure only falls pinned reads back to S3, and a disk ETag that doesn't match the authoritative S3 pin simply misses to S3.