feat(snapshot): ETag-pinned range serving across disk and S3 tiers#324
feat(snapshot): ETag-pinned range serving across disk and S3 tiers#324worstell wants to merge 1 commit into
Conversation
f2699f4 to
66758a6
Compare
66758a6 to
64ffecd
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 64ffecd55a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| w.Header().Set("Accept-Ranges", "bytes") | ||
| w.Header().Set("Content-Length", strconv.FormatInt(length, 10)) | ||
| w.Header().Set("Content-Range", fmt.Sprintf("bytes %d-%d/%d", startByte, endByte, total)) | ||
| w.WriteHeader(http.StatusPartialContent) |
There was a problem hiding this comment.
Delay 206 until the pinned S3 reader is validated
When the S3 tier wins after a disk miss, OpenPinnedRange returns a lazy minio reader; its own comment notes that the If-Match guard's 412 is surfaced only on Read (internal/cache/s3_pinned.go lines 64-66). This line commits 206 Partial Content before the first read, so if the object is overwritten between the stat and the GET/read, io.CopyN can only observe ErrPinStale after the response status is already sent. In that S3 race, clients do not receive the promised 412 and may treat the range as a truncated/failed 206 instead of restarting from a fresh validator.
Useful? React with 👍 / 👎.
Serve snapshot byte ranges pinned to a content ETag via standard HTTP ETag/If-Match/Range, so parallel chunk downloads stitch correctly across replicas and a snapshot regenerated mid-download fails closed (412). S3 surfaces its content ETag, which flows through tiered backfill onto the disk copy. The tiered cache serves pinned ranges from local disk when its stored ETag matches (sendfile zero-copy) and falls back to S3 otherwise; disk entries without a content ETag are not pinnable. Normal serves now advertise the ETag and Accept-Ranges so clients can discover the pin. Co-authored-by: Amp <amp@ampcode.com> Amp-Thread-ID: https://ampcode.com/threads/T-019eae3d-a2fd-70ca-80fe-a7536ec6748c
64ffecd to
adc4657
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: adc46570bb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // tiered backfill and is persisted on the disk copy. This lets a disk and | ||
| // an S3 serve of the same revision report the same pin token. | ||
| if objInfo.ETag != "" { | ||
| headers.Set(ContentETagHeader, objInfo.ETag) |
There was a problem hiding this comment.
Pin small S3 reads before persisting the stat ETag
When an S3 object is small enough for the single-GET path, parallelGetReader still uses an unguarded lazy GetObject (internal/cache/s3_parallel_get.go:19-25). If the snapshot key is overwritten after this StatObject but before that GET is actually read, this header contains the old ETag while the response and tiered backfill can contain the new bytes. That persists/advertises a pin for the wrong revision, so later If-Match ranges can fail against S3 or be served from a disk copy labeled with a stale ETag and mix with another replica still holding the old bytes; the single-GET path needs the same SetMatchETag guard before this ETag is surfaced.
Useful? React with 👍 / 👎.
Serve snapshot byte ranges pinned to a content ETag using standard HTTP
ETag/If-Match/Range. Parallel chunk downloads stitch correctly across replicas, and a snapshot regenerated mid-download fails closed with412rather than mixing revisions.S3 surfaces its content ETag, which flows through the tiered backfill onto the disk copy. The tiered cache serves a pinned range from local disk when its stored ETag matches (sendfile zero-copy) and falls back to S3 otherwise; disk entries that never round-tripped through S3 carry no content ETag and are not pinnable. Normal snapshot responses advertise
ETag+Accept-Rangesso clients can discover the pin.The
PinnedRangeCacheinterface is implemented by disk, S3, and tiered. Tests cover S3 and disk range stitching, disk fail-closed on ETag mismatch, non-pinnable ETag-less disk entries, tiered disk-first/S3-fallback after backfill, and handlerIf-Match/range validation.