Add file-based transfer APIs to build cache provider interface and all cache plugins#5746
Copilot wants to merge 50 commits
Conversation
@copilot - add APIs to
Added
@copilot - are there async variants of those
No,
@copilot - take a look at the other
Added
@copilot - can we add stream support to the other cache plugins?
Added streaming support to both remaining cache plugins: Amazon S3 plugin (
Azure Storage plugin (
All three cache plugins (HTTP, S3, Azure) now implement both streaming methods. All existing tests pass (45 S3, 10 Azure). |
@copilot - Can you put usage of this functionality behind a Rush experiment?
Added a
The delay variable is in milliseconds but the log message said "s". This produced misleading output like "Will retry request in 4000s...". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… signing Remove unused onBlobAlreadyExists parameter from _trySetBlobDataAsync This callback was previously used to drain incoming streams when the blob already existed. With the switch to file-based APIs, no callers pass this parameter anymore.
- HTTP: add 404 cache miss test for tryDownloadCacheEntryToFileAsync
- HTTP: add pipeline assertion in download success test
- S3: add retry test for downloadObjectToFileAsync on transient 5xx
- S3: add pipeline assertions in download success/miss tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
For stream-body requests, the log read "unknown bytes" which is awkward. Change to "unknown length" for clarity.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add clarifying comment on maxAttempts: 1 for uploads explaining why the parameter exists (shared between download with retries and upload without)
- Replace S3 download snapshot containing auth headers with explicit field assertions, avoiding credential-looking strings in snapshots
- Update inline snapshot for "unknown length" wording change
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dd9bf94 to f8fecd4
// Clean up any partial file left by the failed download so it isn't
// mistaken for a valid cache entry on the next build.
try {
  await FileSystem.deleteFileAsync(targetPath);
Seems like this only deletes the target file when the provider throws during download.
However, providers generally swallow download failures and return false/undefined instead of throwing:
- HTTP provider catches any error in `tryDownloadCacheEntryToFileAsync()` and returns `false`
- S3 provider catches errors and returns `false`
- Azure provider catches errors in `_tryGetBlobDataAsync()` and ultimately returns `false`
At the same time, the local cache provider considers a cache entry valid if the file merely exists, without validating size, checksum, or completeness.
It seems possible that a partially written (truncated or corrupted) file from a failed streamed download could remain on disk and be treated as a valid cache hit on subsequent runs.
Good catch. The cleanup now runs for any non-success case (both a `false` return and a thrown exception), not just in the catch block. Moved `deleteFileAsync` outside the try/catch so it triggers whenever `cloudCacheHit` remains `false`. This prevents partially written files from being treated as valid cache entries on subsequent runs.
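A minimal sketch of the cleanup pattern described in this reply, assuming a hypothetical `DownloadToFile` callback in place of the real provider method and Node's `fs.promises.rm` in place of `FileSystem.deleteFileAsync`. It is an illustration of the idea, not the code in `OperationBuildCache`:

```typescript
import { promises as fsp } from 'node:fs';

// Hypothetical shape of a provider's file-based download method.
type DownloadToFile = (cacheId: string, targetPath: string) => Promise<boolean>;

// Returns true only when the provider reports a cache hit. For every other
// outcome (a false return or a thrown error) the target file is removed so a
// truncated download cannot be mistaken for a valid entry later.
async function downloadWithCleanupAsync(
  download: DownloadToFile,
  cacheId: string,
  targetPath: string
): Promise<boolean> {
  let cloudCacheHit: boolean = false;
  try {
    cloudCacheHit = await download(cacheId, targetPath);
  } catch {
    // A thrown error is treated as a miss; cleanup happens below.
    cloudCacheHit = false;
  }

  if (!cloudCacheHit) {
    // force: true makes this a no-op if the file was never created.
    await fsp.rm(targetPath, { force: true });
  }
  return cloudCacheHit;
}
```

Placing the delete after the try/catch, rather than inside the catch block, is what covers providers that swallow errors and return `false`.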
@copilot - this branch has conflicts. Rebase or merge main into this branch and resolve conflicts.
…urns false Providers may catch errors internally and return false instead of throwing, leaving a partially written file on disk. Move the cleanup logic outside the catch block so it runs for any non-success case, preventing corrupted partial files from being treated as valid cache entries on subsequent builds.
…entry-for-http-plugin
# Conflicts:
#	common/reviews/api/rush-lib.api.md
#	libraries/rush-lib/src/api/ExperimentsConfiguration.ts
#	libraries/rush-lib/src/schemas/experiments.schema.json
Merged main and resolved three conflicts ( |
body: entryStream,
warningText: 'Could not write cache entry',
// Streaming uploads cannot be retried because the stream is consumed
maxAttempts: 1
Pull request overview
This PR adds optional file-based transfer APIs to the cloud build cache provider interface and wires them through Rush’s build cache flow, enabling streaming uploads/downloads to avoid buffering large cache entries in memory (gated behind the useDirectFileTransfersForBuildCache experiment).
Changes:
- Added optional `tryDownloadCacheEntryToFileAsync` / `tryUploadCacheEntryFromFileAsync` to `ICloudBuildCacheProvider` and integrated conditional usage into `OperationBuildCache`.
- Refactored `WebClient` to support streaming responses (`fetchStreamAsync`) with shared request core and Content-Encoding decompression for streams.
- Implemented/updated providers and tests for HTTP, Amazon S3, and Azure Blob storage to support the new file-based path.
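To illustrate the streaming download idea in the list above, here is a rough sketch that pipes a response body to a file and undoes its Content-Encoding along the way. The function names `createDecoder` and `writeBodyToFileAsync` are hypothetical and are not the `WebClient` API; only Node's built-in `stream` and `zlib` modules are used:

```typescript
import { createWriteStream } from 'node:fs';
import { Readable, Transform } from 'node:stream';
import { pipeline } from 'node:stream/promises';
import { createBrotliDecompress, createGunzip, createInflate } from 'node:zlib';

// Picks a decompression transform for a Content-Encoding header value.
// Returns undefined when the body can be written as-is.
function createDecoder(contentEncoding: string | undefined): Transform | undefined {
  switch (contentEncoding?.toLowerCase()) {
    case undefined:
    case '':
    case 'identity':
      return undefined;
    case 'gzip':
      return createGunzip();
    case 'deflate':
      return createInflate();
    case 'br':
      return createBrotliDecompress();
    default:
      throw new Error(`Unsupported Content-Encoding: ${contentEncoding}`);
  }
}

// Streams the body to targetPath without buffering it fully in memory.
async function writeBodyToFileAsync(
  body: Readable,
  contentEncoding: string | undefined,
  targetPath: string
): Promise<void> {
  // Resolve the decoder first so an unsupported encoding fails before a file is created.
  const decoder: Transform | undefined = createDecoder(contentEncoding);
  const destination = createWriteStream(targetPath);
  if (decoder) {
    await pipeline(body, decoder, destination);
  } else {
    await pipeline(body, destination);
  }
}
```

`pipeline()` rejects if any stage fails, but a partially written file may still be left on disk, which is why the caller still needs its own cleanup.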
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| rush-plugins/rush-http-build-cache-plugin/src/test/HttpBuildCacheProvider.test.ts | Expands unit coverage for buffer vs file-based HTTP cache transfers and retry/credential behaviors. |
| rush-plugins/rush-http-build-cache-plugin/src/HttpBuildCacheProvider.ts | Adds file-based download/upload APIs and introduces stream-based request path via WebClient.fetchStreamAsync. |
| rush-plugins/rush-bridge-cache-plugin/src/BridgeCachePlugin.ts | Plumbs the experiment flag into cache plugin options. |
| rush-plugins/rush-azure-storage-build-cache-plugin/src/AzureStorageBuildCacheProvider.ts | Adds Azure SDK-native file-based download/upload implementations and shared helpers. |
| rush-plugins/rush-amazon-s3-build-cache-plugin/src/test/AmazonS3Client.test.ts | Adds coverage for file-based GET/PUT behavior, retry expectations, and payload signing hash. |
| rush-plugins/rush-amazon-s3-build-cache-plugin/src/test/snapshots/AmazonS3Client.test.ts.snap | Snapshot update for added/modified tests. |
| rush-plugins/rush-amazon-s3-build-cache-plugin/src/AmazonS3Client.ts | Adds streaming file transfers and reinstates payload signing via SHA-256 hash for file uploads. |
| rush-plugins/rush-amazon-s3-build-cache-plugin/src/AmazonS3BuildCacheProvider.ts | Adds file-based provider APIs and centralizes object-name computation. |
| libraries/rush-lib/src/utilities/WebClient.ts | Introduces streaming response API and shared raw request core; adds stream decompression support. |
| libraries/rush-lib/src/schemas/experiments.schema.json | Adds schema entry for useDirectFileTransfersForBuildCache. |
| libraries/rush-lib/src/logic/operations/CacheableOperationPlugin.ts | Threads experiment option through cache plugin initialization. |
| libraries/rush-lib/src/logic/buildCache/test/OperationBuildCache.test.ts | Updates test setup to include the new option. |
| libraries/rush-lib/src/logic/buildCache/OperationBuildCache.ts | Uses file-based download/upload when enabled and supported, with cleanup of partial downloads. |
| libraries/rush-lib/src/logic/buildCache/ICloudBuildCacheProvider.ts | Extends interface with optional file-based transfer methods. |
| libraries/rush-lib/src/logic/buildCache/FileSystemBuildCacheProvider.ts | Refactors local cache path generation and existence checks. |
| libraries/rush-lib/src/cli/scriptActions/PhasedScriptAction.ts | Plumbs experiments configuration into cache/skip plugin initialization. |
| libraries/rush-lib/src/api/ExperimentsConfiguration.ts | Adds typed config surface for the new experiment. |
| common/reviews/api/rush-lib.api.md | Updates API report for new interface and option surface. |
| common/reviews/api/rush-amazon-s3-build-cache-plugin.api.md | Updates API report for new S3 client methods. |
| common/changes/@microsoft/rush/copilot-stream-cache-entry-for-http-plugin_2026-04-05-03-56.json | Adds change entry describing the new APIs and gating. |
const fetchOptions: IGetFetchOptions | IFetchOptionsWithBody = {
  verb: method,
  headers: headers,
  body: body,
  headers,
  body,
  redirect: 'follow',
const webFetchOptions: IGetFetchOptions | IFetchOptionsWithBody = {
  verb,
  headers
};
if (verb === 'PUT' && body) {
  (webFetchOptions as IFetchOptionsWithBody).body = body;
}
/**
 * If implemented, the build cache will prefer to use this method over
 * {@link ICloudBuildCacheProvider.tryGetCacheEntryBufferByIdAsync} to avoid loading the entire
 * cache entry into memory, if possible. The implementation should download the cache entry and write it
 * to the specified local file path.
 *
 * @returns `true` if the cache entry was found and written to the file, `false` if it was
 * not found. Throws on errors.
 */
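The "prefer this method if implemented" contract in the doc comment above could look roughly like the following on the calling side. `ICacheProviderSketch` and `fetchEntryToFileAsync` are hypothetical, simplified stand-ins (the real interface takes additional parameters), so treat this as a sketch of the fallback logic only:

```typescript
import { promises as fsp } from 'node:fs';

// Simplified, hypothetical stand-in for ICloudBuildCacheProvider.
interface ICacheProviderSketch {
  tryGetCacheEntryBufferByIdAsync(cacheId: string): Promise<Buffer | undefined>;
  // Optional: providers that omit this fall back to the buffer-based path.
  tryDownloadCacheEntryToFileAsync?(cacheId: string, targetPath: string): Promise<boolean>;
}

async function fetchEntryToFileAsync(
  provider: ICacheProviderSketch,
  cacheId: string,
  targetPath: string,
  useDirectFileTransfers: boolean
): Promise<boolean> {
  // Use the file-based path only when the experiment is on AND the provider supports it.
  if (useDirectFileTransfers && provider.tryDownloadCacheEntryToFileAsync) {
    return await provider.tryDownloadCacheEntryToFileAsync(cacheId, targetPath);
  }

  // Fallback: load the whole entry into memory, then write it out.
  const buffer: Buffer | undefined = await provider.tryGetCacheEntryBufferByIdAsync(cacheId);
  if (!buffer) {
    return false;
  }
  await fsp.writeFile(targetPath, buffer);
  return true;
}
```

Keeping the new method optional means existing third-party providers continue to work unchanged.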
} else if (status === 400 || status === 401 || status === 403) {
  cleanup?.();
  throw new Error(`Amazon S3 responded with status code ${status} (${statusText})`);
} else {
  cleanup?.();
  return {
    hasNetworkError: true,
    error: new Error(`Amazon S3 responded with status code ${status} (${statusText})`)
  };
}
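The snippet above separates statuses that fail immediately (400, 401, 403) from those reported as a network error that the caller may retry. A self-contained sketch of that split, with a hypothetical `withRetriesAsync` loop standing in for whatever retry helper the S3 client actually uses:

```typescript
// Result of one attempt: either a value, or an error the caller may retry.
type AttemptResult<T> =
  | { hasNetworkError: false; value: T }
  | { hasNetworkError: true; error: Error };

// Mirrors the branch above: client/auth errors throw, anything else is retryable.
function classifyFailure(status: number, statusText: string): { hasNetworkError: true; error: Error } {
  const error: Error = new Error(`Amazon S3 responded with status code ${status} (${statusText})`);
  if (status === 400 || status === 401 || status === 403) {
    throw error;
  }
  return { hasNetworkError: true, error };
}

// Hypothetical retry loop: stops on the first success, rethrows fatal errors
// immediately, and gives up after maxAttempts retryable failures.
async function withRetriesAsync<T>(
  attempt: () => Promise<AttemptResult<T>>,
  maxAttempts: number
): Promise<T> {
  let lastError: Error = new Error('maxAttempts must be at least 1');
  for (let i = 1; i <= maxAttempts; i++) {
    const result: AttemptResult<T> = await attempt();
    if (result.hasNetworkError === false) {
      return result.value;
    }
    lastError = result.error;
  }
  throw lastError;
}
```

With `maxAttempts: 1`, as used for uploads in this PR, the loop degenerates to a single try, which matches the constraint that a consumed stream body cannot be replayed.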
Summary
Adds optional file-based transfer APIs to the build cache provider interface (`ICloudBuildCacheProvider`) and implements them across all cache plugins (HTTP, Amazon S3, Azure Blob Storage). When enabled via the `useDirectFileTransfersForBuildCache` experiment flag, cache entries are transferred directly between local files and cloud storage without buffering entire contents in memory, preventing out-of-memory errors for large build outputs.
Details
Core changes:
- `ICloudBuildCacheProvider` gains two optional methods: `tryDownloadCacheEntryToFileAsync` and `tryUploadCacheEntryFromFileAsync`. Providers that don't implement them gracefully fall back to the existing buffer-based APIs.
- `OperationBuildCache` conditionally uses the file-based path when `useDirectFileTransfersForBuildCache` is enabled and the provider supports it. Includes cleanup of partial files on failed downloads.
- `WebClient` is refactored to extract a shared `_makeRawRequestAsync` core used by both buffer and streaming request paths, with a new `fetchStreamAsync` method and Content-Encoding decompression support for streaming responses.
- `FileSystem` in node-core-library gains `createReadStream`, `createWriteStream`, and `createWriteStreamAsync` methods (wrapped in `_wrapException` for consistent error handling).
- `FileSystemBuildCacheProvider` is simplified: the stream method is removed since cloud providers now handle file I/O directly.
Plugin implementations:
- HTTP: Downloads via `fetchStreamAsync` → `pipeline()` to file. Uploads via `createReadStream` → `fetchStreamAsync`. Uses `maxAttempts: 1` for uploads (stream consumed after first attempt), with credential fallback skipped for stream bodies.
- Amazon S3: File uploads are hashed with `_hashFileAsync`, then streamed with the SHA-256 hash included in the AWS Signature V4 request, restoring full payload signing (no more `UNSIGNED_PAYLOAD`). No retry on uploads.
- Azure Blob Storage: Downloads via `blobClient.downloadToFile()`. Uploads via `blockBlobClient.uploadFile()`. Parent directory creation ensured before download.
Gating:
- Gated behind `useDirectFileTransfersForBuildCache` in `experiments.json`. Defaults to off. Falls back to buffer-based APIs if the cloud provider plugin doesn't implement the file-based methods.
How it was tested
- `HttpBuildCacheProvider` (14 tests): buffer and file-based GET/SET, 404 cache miss, credential fallback skip for file uploads, write-not-allowed checks, retry behavior, pipeline assertions
- `AmazonS3Client` (38 tests): buffer and file-based GET/SET, signed payload hash verification (not UNSIGNED-PAYLOAD), download retry on transient 5xx, no-retry on upload, credential validation, pipeline assertions
Impacted documentation
- `experiments.schema.json` updated with `useDirectFileTransfersForBuildCache` description
- `common/reviews/api/rush-lib.api.md` updated with new API surface
- `common/reviews/api/node-core-library.api.md` updated with new `FileSystem` stream methods
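For the S3 upload path, the description mentions hashing the file before streaming it so the request can carry a real payload hash. The sketch below shows one way to compute a SHA-256 hex digest of a file without loading it into memory. `hashFileSha256Async` is a hypothetical name and is not claimed to match the plugin's `_hashFileAsync`:

```typescript
import { createHash } from 'node:crypto';
import { createReadStream } from 'node:fs';

// Computes the lowercase hex SHA-256 of a file by feeding it to the hash
// chunk by chunk, so memory use stays bounded regardless of file size.
async function hashFileSha256Async(filePath: string): Promise<string> {
  const hash = createHash('sha256');
  for await (const chunk of createReadStream(filePath)) {
    hash.update(chunk as Buffer);
  }
  return hash.digest('hex');
}
```

The trade-off is that the file is read twice, once to hash and once to upload, in exchange for a fully signed payload.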