Fix corrupted VCS command output (diffs, blobs, errors)#538
Open
tony wants to merge 7 commits into
Open
Conversation
why: The legacy run() helper post-processes captured output (per-line strip, drops blank lines, no trailing newline), which corrupts whitespace-significant output -- captured diffs no longer apply, blob reads lose content, multi-line errors smash together. The decision and rejected alternatives need a durable record so the behavior isn't "simplified" back into the bug. what: - Record the decision: capture pristine, trim/decode at the edge, phased migration onto SubprocessCommand. - Capture the measured alternatives matrix and prior-art lessons (pip, uv, mise, mercurial, gitoxide, jujutsu, git) as evidence. - Start a docs/project/adr/ section with an index and toctree.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #538 +/- ##
==========================================
+ Coverage 60.98% 61.15% +0.16%
==========================================
Files 40 40
Lines 6531 6564 +33
Branches 1101 1103 +2
==========================================
+ Hits 3983 4014 +31
- Misses 1951 1952 +1
- Partials 597 598 +1 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
why: The runner trimmed each captured line and dropped blank lines, so whitespace-significant output was corrupted: captured diffs were rejected by `git apply`, `cat-file blob` lost indentation and blank lines, and multi-line stderr was concatenated into one run-on line. Single-token reads like `rev-parse HEAD` hid the defect. what: - Capture stdout/stderr verbatim; decode once; trim only the whole output (str.rstrip), preserving leading indentation, blank lines, and interior structure. - Add `trim` (default True). Pass `trim=False` for byte-faithful output -- diffs that feed `git apply`, exact blob contents. - Rejoin stderr without smashing lines together. - Correct the Svn.blame doctest, which had encoded the stripped output. - Cover trim=False fidelity, the default trim contract, and stderr line preservation with functional tests.
7d956cd to
9619dba
Compare
why: Trimming in the capture path corrupted whitespace-significant output. Returning verbatim by default makes the plain `.run(["diff"])` faithful and applyable, fixing the original bug without an opt-in. Bare-value reads opt in with `trim=True`. what: - Flip `trim` default from True to False (verbatim). - Rewrite the docstring around the verbatim default.
why: get_revision() is a scalar API downstream compares to bare SHAs; with verbatim run() it would carry a trailing newline. what: - Strip the rev-parse result so get_revision() stays bare.
why: With verbatim run() output, example output retains its trailing newline. what: - Update git/hg/svn command doctests to the real verbatim output. - Mark affected docstrings raw so the literal newlines render.
why: Tests that read scalars via the low-level run() or thin cmd wrappers relied on the old implicit trim. what: - Strip rev-parse / symbolic-ref / is-shallow / version reads. - Invert the default-behavior test to assert verbatim output.
why: The chosen design flipped to verbatim-by-default during review. what: - Update ADR 0001 Phase 1, alternatives, and consequences to reflect verbatim default with `trim=True` opt-in.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
libvcs._internal.run.runcorrupting whitespace-significant output: it stripped every captured line, dropped blank lines, and removed the trailing newline — so a capturedgit diffwas rejected bygit apply,git cat-file bloblost indentation and blank lines, and multi-line stderr was concatenated into one run-on string. Scalar reads likerev-parse HEADhid the defect.trimparameter (defaultTrue). The default trims only the whole output (str.rstrip), preserving leading indentation, blank lines, and interior structure while keeping the established no-trailing-newline contract.trim=Falsereturns byte-faithful output for diffs destined forgit applyand exact blob contents.libvcs.exc.CommandError.output.docs/project/adr/0001-faithful-subprocess-output-capture.md).Before / After
Before (default
.run(["diff"])) — leading spaces gone, blank context line dropped, no trailing newline:git apply→error: corrupt patch at line 6.After (
.run(["diff"], trim=False)) — byte-identical togit diff, applies cleanly:Changes by area
src/libvcs/_internal/run.py: Capture stdout/stderr as raw bytes, decode once, and trim only the whole output whentrim=True. Adds thetrimparameter (reachable through theGit/Hg/Svn.run()**kwargspassthrough). Removes the per-linestrip+ blank-linefilterand the separator-less stderr join.src/libvcs/cmd/svn.py: Correct theSvn.blamedoctest, which expected the previously-stripped (corrupted) output.tests/cmd/test_git.py: Functional tests fortrim=Falsediff fidelity +git apply, byte-exact blob round-trip, the default no-trailing-newline contract, and stderr line preservation.docs/project/adr/: New ADR section (index + ADR 0001) wired into the Project docs.Design decisions
trim, defaultTrue(not verbatim-by-default): returning fully verbatim output breaks the no-trailing-newline contract that callers, doctests, and downstreamvcspullrely on for scalar reads. Keeping the trimmed default preserves that contract;trim=Falseis the explicit opt-in for byte fidelity.rstrip, not per-linestrip: matches the convention of every tool surveyed in the ADR (pip, uv, mise, mercurial, gitoxide, jujutsu) — capture pristine, transform at the edge. The per-line strip was the outlier and the source of the corruption.CompletedProcessbackend (libvcs._internal.subprocess.SubprocessCommand) and retiringconsole_to_strare Phase 2, deliberately deferred to keep this change additive and back-compatible.Verification
The old per-line strip/filter is gone:
$ rg "line.strip\(\) for line in" src/libvcs/_internal/run.py(returns zero matches)
Test plan
test_run_trim_false_preserves_diff— captured diff is byte-identical togit diffand passesgit apply --checktest_run_trim_false_preserves_blob—cat-file blobround-trips file contents byte-for-bytetest_run_default_trims_trailing_newline— default.run()keeps the no-trailing-newline contracttest_run_failure_preserves_stderr_lines— failed command keeps stderr line structure inCommandError.outputuv run pytest— full suite greenuv run ruff format --check . && uv run ruff check .uv run mypy src tests