From 988be3511f7c72573311f32adad4ee273a36b16e Mon Sep 17 00:00:00 2001
From: igerber <isaac.gerber@gmail.com>
Date: Thu, 2 Jul 2026 19:40:47 -0400
Subject: [PATCH 1/2] perf(utils): factorize-once + bincount MAP demeaning,
 max_iter 10k parity, FE-spanned regressor guard

Rewrites the fixed-effects absorption engine and closes a silent-inference
hazard the rewrite's identity gate surfaced. Three coupled changes:

1. demean_by_groups N>=2 branch: each absorbed dimension factorized ONCE
   (pd.factorize), MAP sweeps via np.bincount group-sum + gather instead of
   pandas rebuilding the group hash table per (iteration x variable x
   dimension). Same convergence semantics (per-variable sweep order,
   max|x - x_old| < tol, zero-total-weight inert guard, warn_if_not_converged).
   NaN group keys raise ValueError at every arity (one-way included; pandas
   groupby silently dropped NaN keys, factorize would code them -1).
   Numerical contract: bincount accumulation is not Kahan-compensated, so
   outputs agree with the prior pandas loop to ~1e-10 order, not bit-for-bit.

2. max_iter raised 100 -> 10,000 in BOTH demean_by_groups and
   within_transform (fixest fixef.iter / pyfixest fixef_maxiter parity).
   Correlated FE incidence (contiguous unit lifetimes) genuinely needs ~270
   iterations; the old cap warned and returned slightly-off residuals.

3. FE-spanned regressor guard (new utils.snap_absorbed_regressors +
   pre_demean_norms + _fe_span_residual_norm), wired at DiD/MPD absorb=,
   TWFE, SunAbraham, Bacon, Wooldridge AND the DiD/MPD/TWFE replicate-refit
   closures: regressors spanned by the absorbed FEs previously demeaned to
   junk columns that survived rank detection via column equilibration and
   perturbed identified estimates - up to ~3e-3 on ATT with ~1e14-scale
   garbage coefficients in the joint-span (a_unit + b_time) case. Detection
   is TWO-STAGE because the MAP stopping rule bounds the last step, not the
   distance to the limit: fast relative-norm path at 1e-10, then an exact
   sparse-LSMR span-membership confirmation (on the already-demeaned column,
   so it converges in a few iterations) for truncation-masked candidates in
   (1e-10, 1e-3]. Norms are sqrt(w)-weighted so zero-weight domain rows
   cannot mask spanning on the positive-weight sample. Spanned regressors
   drop deterministically (coefficient NaN) with a cause-specific UserWarning
   under rank_deficient_action="warn"; identified estimates become invariant
   (~1e-14) to the demeaning tolerance where they previously swung at ~1e-5.

Measured (fe_absorption suite, AFTER baselines committed, generated on frozen
code): 1.6-2.7x end-to-end on the headline scenarios; the correlated-FE
stress case now converges (0.8x wall-clock as the price of correctness:
2.7x the iterations to full convergence + the exact span confirmation).
Identity gate: 1e-14-1e-16 everywhere except the documented one-time
corrections (survey_absorb ATT 9.8e-6, geo_experiment SE 1e-7 - junk columns
removed). 875 targeted tests pass; byte-identity locks converted to
allclose + full-dummy ground-truth guards; Wooldridge 1e-14 baseline
recaptured (~1e-15 shift); a pre-existing degenerate test spec that asserted
isfinite on garbage output now asserts the identification error. Dead
_within_transform methods removed (twfe, sun_abraham). REGISTRY "Absorbed
Fixed Effects" + CHANGELOG + doc-deps updated.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
 CHANGELOG.md                                  |  41 ++
 .../baselines/fe_absorption_after.json        | 151 ++++
 .../speed_review/bench_fe_absorption.py       |  27 +-
 diff_diff/bacon.py                            |  15 +
 diff_diff/estimators.py                       |  62 ++
 diff_diff/sun_abraham.py                      |  30 +-
 diff_diff/twfe.py                             |  64 +-
 diff_diff/utils.py                            | 308 +++++++-
 diff_diff/wooldridge.py                       |  22 +-
 docs/doc-deps.yaml                            |   4 +
 docs/methodology/REGISTRY.md                  |  59 +-
 docs/performance-plan.md                      |  44 +-
 tests/test_bacon.py                           | 299 +++-----
 tests/test_estimators.py                      | 345 ++++++++-
 tests/test_sun_abraham.py                     |  44 ++
 tests/test_utils.py                           | 672 +++++++++++++-----
 tests/test_wooldridge.py                      | 428 +++++++----
 17 files changed, 1986 insertions(+), 629 deletions(-)
 create mode 100644 benchmarks/speed_review/baselines/fe_absorption_after.json

diff --git a/CHANGELOG.md b/CHANGELOG.md
index dcb4684ac..9c3e05b0e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,47 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Changed
+- **FE-absorption demeaning rewritten: factorize-once + `np.bincount` method of alternating
+  projections** (`demean_by_groups` / `within_transform`). Each absorbed dimension is factorized
+  once and group means are formed via `np.bincount` instead of re-hashing the group keys with
+  `pandas.groupby().transform("mean")` on every iteration x variable x dimension. Affects every
+  fit of `TwoWayFixedEffects`, `SunAbraham`, `BaconDecomposition`, `WooldridgeDiD`,
+  `DifferenceInDifferences`/`MultiPeriodDiD` with `absorb=`, and the survey replicate-refit path.
+  Convergence semantics (per-variable sweep order, `max|x - x_old| < tol` criterion, zero-total-
+  weight group guard, non-convergence warning) are unchanged. **Numerical contract**: bincount
+  accumulation is not Kahan-compensated the way pandas' grouped mean is, so demeaned values agree
+  with the prior implementation to ~1e-10 order rather than bit-for-bit; estimator estimates are
+  validated unchanged at the benchmark identity gate
+  (`bench_fe_absorption.py --check-estimates`, coefficients atol=1e-9).
+- **MAP iteration cap raised from 100 to 10,000** (`max_iter` in both `demean_by_groups` and
+  `within_transform`), matching the R `fixest` (`fixef.iter`) and `pyfixest` (`fixef_maxiter`)
+  defaults. Correlated FE incidence (e.g. stores active in contiguous week windows) genuinely
+  needs hundreds of iterations; the old cap made such fits warn and return slightly-off residuals.
+  Worst-case trade-off accepted deliberately: a truly non-convergent input now iterates 10,000x
+  before warning. Documented in `docs/methodology/REGISTRY.md` "Absorbed Fixed Effects".
+- **NaN in an absorbed group column now raises `ValueError`** naming the column (previously the
+  unweighted path silently NaN-poisoned the affected rows and the weighted path silently passed
+  them through un-demeaned).
+- **FE-spanned regressors are now snapped to exact zero before the solver** (new
+  `diff_diff.utils.snap_absorbed_regressors`, wired at every demeaning consumer: DiD/MPD
+  `absorb=`, TWFE, SunAbraham, BaconDecomposition, WooldridgeDiD, and the DiD/MPD/TWFE
+  replicate-refit closures). A regressor spanned by the absorbed FEs (treated-group indicator with
+  the unit dimension absorbed, a period dummy with time absorbed, a unit-constant covariate, or an
+  additive `a_unit + b_time` combination) demeans to numerical junk that previously survived the
+  rank check via column equilibration and perturbed the identified coefficients - up to ~3e-3 on
+  ATT with a ~1e14-scale garbage coefficient reported for the spanned column in the joint-span
+  case. Detection is two-stage (fast relative-norm path at 1e-10, then an exact sparse-LSMR
+  span-membership confirmation for candidates masked by MAP truncation on unbalanced/correlated
+  panels); norms are `sqrt(w)`-weighted under WLS so zero-weight domain rows cannot mask spanning.
+  Spanned regressors are dropped deterministically (coefficient NaN) with a cause-specific
+  `UserWarning` naming them under `rank_deficient_action="warn"`.
+  **Point-estimate note**: designs carrying such spanned regressors (e.g. 2x2 DiD with
+  `absorb=[unit_dim, time_dim]`) see identified estimates shift by up to ~1e-5 relative to
+  v3.6.1 - the removal of the junk direction, documented in REGISTRY "Absorbed Fixed Effects";
+  estimates are now invariant (~1e-14) to the demeaning tolerance where they previously swung
+  at ~1e-5.
+
 ### Added
 - **FE-absorption benchmark suite** (`benchmarks/speed_review/bench_fe_absorption.py`, scenarios
   7-13 in `docs/performance-scenarios.md`): seven realistic workloads (county policy event study,
diff --git a/benchmarks/speed_review/baselines/fe_absorption_after.json b/benchmarks/speed_review/baselines/fe_absorption_after.json
new file mode 100644
index 000000000..30e2ec590
--- /dev/null
+++ b/benchmarks/speed_review/baselines/fe_absorption_after.json
@@ -0,0 +1,151 @@
+{
+  "suite": "fe_absorption",
+  "platform": "macOS-26.5.1-arm64-arm-64bit-Mach-O",
+  "python": "3.14.4",
+  "versions": {
+    "diff_diff": "3.6.1",
+    "numpy": "2.5.0",
+    "pandas": "3.0.3"
+  },
+  "backend_requested": "python",
+  "repeats": 3,
+  "quick": false,
+  "tolerances": {
+    "att_atol": 1e-09,
+    "se_rtol": 1e-07,
+    "se_rtol_survey": 1e-06,
+    "gate_exempt": [
+      "tail_stress"
+    ]
+  },
+  "results": [
+    {
+      "scenario": "county_policy",
+      "n_obs": 177289,
+      "checksum": 6315.071127371084,
+      "att": 0.2803785413990301,
+      "se": 0.031244864610215972,
+      "fit_median_s": 2.3882,
+      "fit_min_s": 2.3505,
+      "fit_max_s": 2.4948,
+      "fit_cv": 0.0257,
+      "noisy": false,
+      "n_timed_fits": 9,
+      "datagen_s": 0.004,
+      "peak_rss_mb": 3177.5,
+      "backend_resolved": "python",
+      "warnings": []
+    },
+    {
+      "scenario": "firm_churn",
+      "n_obs": 2400000,
+      "checksum": 240997.06223418942,
+      "att": 0.3144327184497798,
+      "se": 0.005837281701253927,
+      "fit_median_s": 49.502,
+      "fit_min_s": 48.4532,
+      "fit_max_s": 50.3118,
+      "fit_cv": 0.0189,
+      "noisy": false,
+      "n_timed_fits": 3,
+      "datagen_s": 0.07,
+      "peak_rss_mb": 15900.8,
+      "backend_resolved": "python",
+      "warnings": []
+    },
+    {
+      "scenario": "scanner_twfe",
+      "n_obs": 3255000,
+      "checksum": 76832.22571739106,
+      "att": 0.251164131495566,
+      "se": 0.002414938742160613,
+      "fit_median_s": 0.9834,
+      "fit_min_s": 0.9825,
+      "fit_max_s": 0.9845,
+      "fit_cv": 0.001,
+      "noisy": false,
+      "n_timed_fits": 3,
+      "datagen_s": 0.077,
+      "peak_rss_mb": 976.9,
+      "backend_resolved": "python",
+      "warnings": []
+    },
+    {
+      "scenario": "geo_experiment",
+      "n_obs": 5000000,
+      "checksum": 443696.3834000407,
+      "att": 0.25194372599916653,
+      "se": 0.0017886314723251327,
+      "fit_median_s": 0.973,
+      "fit_min_s": 0.968,
+      "fit_max_s": 0.9759,
+      "fit_cv": 0.0041,
+      "noisy": false,
+      "n_timed_fits": 3,
+      "datagen_s": 0.062,
+      "peak_rss_mb": 1664.8,
+      "backend_resolved": "python",
+      "warnings": [
+        "Rank-deficient design matrix: dropping 2 of 4 columns (column 1, column 2). Coefficients for these columns are",
+        "Regressor(s) ['treated', 'post'] are collinear with the absorbed fixed effects (absorb=['store', 'week']): the"
+      ]
+    },
+    {
+      "scenario": "survey_absorb",
+      "n_obs": 500000,
+      "checksum": -176482.6014001448,
+      "att": 0.24554568144130481,
+      "se": 0.007662698712656388,
+      "fit_median_s": 4.0767,
+      "fit_min_s": 4.0736,
+      "fit_max_s": 4.0778,
+      "fit_cv": 0.0005,
+      "noisy": false,
+      "n_timed_fits": 3,
+      "datagen_s": 0.107,
+      "peak_rss_mb": 3589.2,
+      "backend_resolved": "python",
+      "warnings": [
+        "Rank-deficient design matrix: dropping 2 of 4 columns (column 1, column 2). Coefficients for these columns are",
+        "Regressor(s) ['treated', 'post'] are collinear with the absorbed fixed effects (absorb=['state', 'month']): th"
+      ]
+    },
+    {
+      "scenario": "tail_stress",
+      "n_obs": 5000000,
+      "checksum": 376703.7248141762,
+      "att": 0.2502082560083505,
+      "se": 0.004392830594147583,
+      "fit_median_s": 31.7648,
+      "fit_min_s": 31.4957,
+      "fit_max_s": 31.8372,
+      "fit_cv": 0.0057,
+      "noisy": false,
+      "n_timed_fits": 3,
+      "datagen_s": 0.067,
+      "peak_rss_mb": 1840.7,
+      "backend_resolved": "python",
+      "warnings": [
+        "Rank-deficient design matrix: dropping 2 of 4 columns (column 1, column 2). Coefficients for these columns are",
+        "Regressor(s) ['treated', 'post'] are collinear with the absorbed fixed effects (absorb=['store', 'week']): the"
+      ]
+    },
+    {
+      "scenario": "guard_small",
+      "n_obs": 20000,
+      "checksum": -571.6190664591859,
+      "att": 0.2515087011451766,
+      "se": 0.028726073333820948,
+      "fit_median_s": 0.0035,
+      "fit_min_s": 0.0034,
+      "fit_max_s": 0.0039,
+      "fit_cv": 0.0437,
+      "noisy": false,
+      "n_timed_fits": 21,
+      "datagen_s": 0.001,
+      "peak_rss_mb": 145.1,
+      "backend_resolved": "python",
+      "warnings": []
+    }
+  ]
+}
\ No newline at end of file
diff --git a/benchmarks/speed_review/bench_fe_absorption.py b/benchmarks/speed_review/bench_fe_absorption.py
index 316801d73..f6867628e 100644
--- a/benchmarks/speed_review/bench_fe_absorption.py
+++ b/benchmarks/speed_review/bench_fe_absorption.py
@@ -189,9 +189,15 @@ def _measure(scenario, repeats, backend, quick):
 
 
 # --------------------------------------------------------------- identity gate
-def check_estimates(results, before_path):
+def check_estimates(results, before_path, allow_shift=()):
     """Compare ATT/SE per scenario against a prior baseline JSON. Returns the
     number of failures (0 = pass). tail_stress reports its delta, ungated.
+
+    allow_shift: scenario ids whose estimates are EXPECTED to move this run
+    because of a documented estimate-affecting fix (the reason must be in the
+    PR/CHANGELOG). Their deltas are printed loudly but do not fail the gate -
+    this is a per-run flag, never a permanent exemption, so future runs
+    re-gate them against the refreshed baseline.
     """
     before = {r["scenario"]: r for r in json.loads(Path(before_path).read_text())["results"]}
     failures = 0
@@ -210,6 +216,12 @@ def check_estimates(results, before_path):
                 f"|d att|={d_att:.3e} rel d se={rel_se:.3e}"
             )
             continue
+        if scen in allow_shift:
+            print(
+                f"  identity {scen:16s} ALLOWED SHIFT (documented fix): "
+                f"|d att|={d_att:.3e} rel d se={rel_se:.3e}"
+            )
+            continue
         ok = d_att <= ATT_ATOL and rel_se <= se_rtol
         status = "ok" if ok else "FAIL"
         print(
@@ -239,6 +251,14 @@ def main():
         metavar="BEFORE_JSON",
         help="compare ATT/SE against a prior baseline at the unified tolerances",
     )
+    ap.add_argument(
+        "--allow-shift",
+        metavar="SCEN[,SCEN...]",
+        default="",
+        help="comma-separated scenarios whose estimates are expected to move "
+        "this run due to a documented estimate-affecting fix (printed loudly, "
+        "not failed; per-run only - never a permanent exemption)",
+    )
     args = ap.parse_args()
 
     if args.worker:
@@ -301,7 +321,10 @@ def main():
     exit_code = 0
     if args.check_estimates:
         print("\nidentity gate vs", args.check_estimates)
-        failures = check_estimates([r for r in results if "error" not in r], args.check_estimates)
+        allow = tuple(s for s in args.allow_shift.split(",") if s)
+        failures = check_estimates(
+            [r for r in results if "error" not in r], args.check_estimates, allow
+        )
         if failures:
             print(f"IDENTITY GATE FAILED: {failures} scenario(s) moved")
             exit_code = 1
diff --git a/diff_diff/bacon.py b/diff_diff/bacon.py
index 4276cfab3..28b2b5c98 100644
--- a/diff_diff/bacon.py
+++ b/diff_diff/bacon.py
@@ -18,6 +18,7 @@
 import pandas as pd
 
 from diff_diff.results import _format_survey_block
+from diff_diff.utils import pre_demean_norms, snap_absorbed_regressors
 from diff_diff.utils import within_transform as _within_transform_util
 
 
@@ -795,6 +796,7 @@ def _compute_twfe(
     ) -> float:
         """Compute TWFE estimate using within-transformation."""
         # Apply two-way within transformation (weighted if survey weights provided)
+        _pre_norms = pre_demean_norms(df, [treat_col], weights=weights)
         df_dm = _within_transform_util(
             df,
             [outcome, treat_col],
@@ -803,6 +805,19 @@ def _compute_twfe(
             suffix="_within",
             weights=weights,
         )
+        # Snap an FE-spanned treatment to exact zero: the d_var == 0 guard
+        # below then returns its deterministic 0.0 (with the cause warning)
+        # instead of an arbitrary junk/junk division.
+        snap_absorbed_regressors(
+            df_dm,
+            [treat_col],
+            _pre_norms,
+            absorbed_desc=f"unit '{unit}' and time '{time}' fixed effects",
+            group_vars=[unit, time],
+            suffix="_within",
+            display_names={treat_col: "treatment"},
+            weights=weights,
+        )
 
         # Extract within-transformed values
         y_within = df_dm[f"{outcome}_within"].values
diff --git a/diff_diff/estimators.py b/diff_diff/estimators.py
index b48ddf643..2c996bf36 100644
--- a/diff_diff/estimators.py
+++ b/diff_diff/estimators.py
@@ -30,7 +30,9 @@
     WildBootstrapResults,
     demean_by_groups,
     fe_dummy_names,
+    pre_demean_norms,
     safe_inference,
+    snap_absorbed_regressors,
     validate_binary,
     validate_covariate_names,
     validate_design_term_names,
@@ -454,6 +456,8 @@ def fit(
                 float
             ) * working_data[time].values.astype(float)
             vars_to_demean = [outcome, treatment, time, "_treat_time"] + (covariates or [])
+            _absorb_regressors = vars_to_demean[1:]  # everything except outcome
+            _pre_norms = pre_demean_norms(working_data, _absorb_regressors, weights=survey_weights)
             # Method of alternating projections: for N > 1 absorbed dimensions a
             # single sequential sweep is only exact on balanced (orthogonal-FE)
             # panels; demean_by_groups iterates to the exact (W)LS-FWL residual.
@@ -464,6 +468,19 @@ def fit(
                 inplace=True,
                 weights=survey_weights,
             )
+            # FE-spanned regressors demean to numerical junk, not exact zero;
+            # snap them so rank handling drops them deterministically instead
+            # of the junk direction perturbing the identified coefficients.
+            snap_absorbed_regressors(
+                working_data,
+                _absorb_regressors,
+                _pre_norms,
+                absorbed_desc=f"absorb={list(absorb)}",
+                group_vars=list(absorb),
+                rank_deficient_action=self.rank_deficient_action,
+                display_names={"_treat_time": f"{treatment}:{time}"},
+                weights=survey_weights,
+            )
             n_absorbed_effects += n_fe
             absorbed_vars = list(absorb)
 
@@ -638,7 +655,21 @@ def _refit_did_absorb(w_r):
                     float
                 )
                 vars_dm = [outcome, treatment, time, "_treat_time"] + (covariates or [])
+                _rep_norms = pre_demean_norms(wd, vars_dm[1:], weights=w_nz)
                 wd, _ = demean_by_groups(wd, vars_dm, _absorb_list, inplace=True, weights=w_nz)
+                # A regressor can become FE-spanned WITHIN a replicate's
+                # effective sample (half-sample zeroing): snap silently so the
+                # replicate solve drops it (NaN replicate -> invalid) instead
+                # of consuming a junk direction.
+                snap_absorbed_regressors(
+                    wd,
+                    vars_dm[1:],
+                    _rep_norms,
+                    absorbed_desc=f"absorb={_absorb_list}",
+                    group_vars=_absorb_list,
+                    rank_deficient_action="silent",
+                    weights=w_nz,
+                )
                 y_r = wd[outcome].values.astype(float)
                 d_r = wd[treatment].values.astype(float)
                 t_r = wd[time].values.astype(float)
@@ -1607,6 +1638,8 @@ def fit(  # type: ignore[override]
                 + [f"_did_interact_{p}" for p in non_ref_periods]
                 + (covariates or [])
             )
+            _absorb_regressors = vars_to_demean[1:]  # everything except outcome
+            _pre_norms = pre_demean_norms(working_data, _absorb_regressors, weights=survey_weights)
             # Method of alternating projections (exact for unbalanced panels; a
             # single sequential sweep is exact only on balanced orthogonal-FE panels).
             working_data, n_fe = demean_by_groups(
@@ -1616,6 +1649,24 @@ def fit(  # type: ignore[override]
                 inplace=True,
                 weights=survey_weights,
             )
+            # Snap FE-spanned regressors (e.g. period dummies when a time
+            # dimension is absorbed) to exact zero: rank handling then drops
+            # them deterministically instead of their junk directions
+            # perturbing the identified interaction coefficients.
+            snap_absorbed_regressors(
+                working_data,
+                _absorb_regressors,
+                _pre_norms,
+                absorbed_desc=f"absorb={list(absorb)}",
+                group_vars=list(absorb),
+                rank_deficient_action=self.rank_deficient_action,
+                display_names={
+                    "_did_treatment": treatment,
+                    **{f"_did_period_{p}": f"{time}=={p}" for p in non_ref_periods},
+                    **{f"_did_interact_{p}": f"{treatment}:{time}=={p}" for p in non_ref_periods},
+                },
+                weights=survey_weights,
+            )
             n_absorbed_effects += n_fe
 
         # Extract outcome and treatment (may be demeaned if absorb was used)
@@ -1840,7 +1891,18 @@ def _refit_mp_absorb(w_r):
                     + [f"_did_interact_{p}" for p in non_ref_periods]
                     + (covariates or [])
                 )
+                _rep_norms_mp = pre_demean_norms(wd, vars_dm_[1:], weights=w_nz)
                 wd, _ = demean_by_groups(wd, vars_dm_, _absorb_list_mp, inplace=True, weights=w_nz)
+                # Replicate-local FE spanning: snap silently (see DiD closure).
+                snap_absorbed_regressors(
+                    wd,
+                    vars_dm_[1:],
+                    _rep_norms_mp,
+                    absorbed_desc=f"absorb={_absorb_list_mp}",
+                    group_vars=_absorb_list_mp,
+                    rank_deficient_action="silent",
+                    weights=w_nz,
+                )
                 y_r = wd[outcome].values.astype(float)
                 d_r = wd["_did_treatment"].values.astype(float)
                 X_r = np.column_stack([np.ones(len(y_r)), d_r])
diff --git a/diff_diff/sun_abraham.py b/diff_diff/sun_abraham.py
index 931831f8c..c14286739 100644
--- a/diff_diff/sun_abraham.py
+++ b/diff_diff/sun_abraham.py
@@ -20,7 +20,9 @@
 from diff_diff.linalg import LinearRegression
 from diff_diff.results import _format_survey_block, _get_significance_stars
 from diff_diff.utils import (
+    pre_demean_norms,
     safe_inference,
+    snap_absorbed_regressors,
 )
 from diff_diff.utils import (
     within_transform as _within_transform_util,
@@ -1491,10 +1493,24 @@ def _fit_saturated_regression(
             variables_to_demean = [outcome] + interaction_cols
             if covariates:
                 variables_to_demean.extend(covariates)
+            _sa_regressors = variables_to_demean[1:]  # everything except outcome
+            _pre_norms = pre_demean_norms(df, _sa_regressors, weights=survey_weights)
 
             df_demeaned = _within_transform_util(
                 df, variables_to_demean, unit, time, suffix="_dm", weights=survey_weights
             )
+            # Snap FE-spanned regressors (e.g. a unit-constant covariate) to
+            # exact zero so rank handling drops them deterministically.
+            snap_absorbed_regressors(
+                df_demeaned,
+                _sa_regressors,
+                _pre_norms,
+                absorbed_desc=f"unit '{unit}' and time '{time}' fixed effects",
+                group_vars=[unit, time],
+                rank_deficient_action=self.rank_deficient_action,
+                suffix="_dm",
+                weights=survey_weights,
+            )
 
             X_cols = [f"{col}_dm" for col in interaction_cols]
             if covariates:
@@ -1585,20 +1601,6 @@ def _fit_saturated_regression(
 
         return cohort_effects, cohort_ses, vcov_cohort, coef_index_map, bm_artifacts
 
-    def _within_transform(
-        self,
-        df: pd.DataFrame,
-        variables: List[str],
-        unit: str,
-        time: str,
-    ) -> pd.DataFrame:
-        """
-        Apply two-way within transformation to remove unit and time fixed effects.
-
-        y_it - y_i. - y_.t + y_..
-        """
-        return _within_transform_util(df, variables, unit, time, suffix="_dm")
-
     def _compute_iw_effects(
         self,
         df: pd.DataFrame,
diff --git a/diff_diff/twfe.py b/diff_diff/twfe.py
index 750a12040..70b7ccd62 100644
--- a/diff_diff/twfe.py
+++ b/diff_diff/twfe.py
@@ -16,6 +16,8 @@
 from diff_diff.results import DiDResults
 from diff_diff.utils import (
     fe_dummy_names,
+    pre_demean_norms,
+    snap_absorbed_regressors,
     validate_covariate_names,
     validate_design_term_names,
 )
@@ -382,6 +384,8 @@ def fit(  # type: ignore[override]
             # demean outcome, covariates, AND interaction in a single pass
             # so the regression uses demeaned regressors (FWL theorem).
             all_vars = [outcome] + (covariates or []) + ["_treatment_post"]
+            _twfe_regressors = all_vars[1:]  # everything except outcome
+            _pre_norms = pre_demean_norms(data, _twfe_regressors, weights=survey_weights)
             data_demeaned = _within_transform_util(
                 data,
                 all_vars,
@@ -390,6 +394,19 @@ def fit(  # type: ignore[override]
                 suffix="_demeaned",
                 weights=survey_weights,
             )
+            # Snap FE-spanned regressors (e.g. a unit-constant covariate) to
+            # exact zero so rank handling drops them deterministically.
+            snap_absorbed_regressors(
+                data_demeaned,
+                _twfe_regressors,
+                _pre_norms,
+                absorbed_desc=f"unit '{unit}' and time '{time}' fixed effects",
+                group_vars=[unit, time],
+                rank_deficient_action=self.rank_deficient_action,
+                suffix="_demeaned",
+                display_names={"_treatment_post": f"{treatment}:{time}"},
+                weights=survey_weights,
+            )
             y = data_demeaned[f"{outcome}_demeaned"].values
             X_list = [data_demeaned["_treatment_post_demeaned"].values]
             if covariates:
@@ -595,6 +612,7 @@ def _refit_twfe(w_r):
                 nz = w_r > 0
                 data_nz = data[nz].copy()
                 w_nz = w_r[nz]
+                _rep_norms_twfe = pre_demean_norms(data_nz, _all_vars_twfe[1:], weights=w_nz)
                 data_dem_r = _within_transform_util(
                     data_nz,
                     _all_vars_twfe,
@@ -603,6 +621,17 @@ def _refit_twfe(w_r):
                     suffix="_demeaned",
                     weights=w_nz,
                 )
+                # Replicate-local FE spanning: snap silently (see DiD closure).
+                snap_absorbed_regressors(
+                    data_dem_r,
+                    _all_vars_twfe[1:],
+                    _rep_norms_twfe,
+                    absorbed_desc=f"unit '{unit}' and time '{time}' fixed effects",
+                    group_vars=[unit, time],
+                    rank_deficient_action="silent",
+                    suffix="_demeaned",
+                    weights=w_nz,
+                )
                 y_r = data_dem_r[f"{outcome}_demeaned"].values
                 X_list_r = [data_dem_r["_treatment_post_demeaned"].values]
                 for cov_ in _covariates_twfe:
@@ -732,41 +761,6 @@ def _refit_twfe(w_r):
         self.is_fitted_ = True
         return self.results_
 
-    def _within_transform(
-        self,
-        data: pd.DataFrame,
-        outcome: str,
-        unit: str,
-        time: str,
-        covariates: Optional[List[str]] = None,
-    ) -> pd.DataFrame:
-        """
-        Apply within transformation to remove unit and time fixed effects.
-
-        This implements the standard two-way within transformation:
-        y_it - y_i. - y_.t + y_..
-
-        Parameters
-        ----------
-        data : pd.DataFrame
-            Panel data.
-        outcome : str
-            Outcome variable name.
-        unit : str
-            Unit identifier column.
-        time : str
-            Time period column.
-        covariates : list, optional
-            Covariate column names.
-
-        Returns
-        -------
-        pd.DataFrame
-            Data with demeaned variables.
-        """
-        variables = [outcome] + (covariates or [])
-        return _within_transform_util(data, variables, unit, time, suffix="_demeaned")
-
     def _check_staggered_treatment(
         self,
         data: pd.DataFrame,
diff --git a/diff_diff/utils.py b/diff_diff/utils.py
index 0afddb711..e4a44c201 100644
--- a/diff_diff/utils.py
+++ b/diff_diff/utils.py
@@ -2648,7 +2648,7 @@ def demean_by_groups(
     inplace: bool = False,
     suffix: str = "",
     weights: Optional[np.ndarray] = None,
-    max_iter: int = 100,
+    max_iter: int = 10_000,
     tol: float = 1e-10,
 ) -> Tuple[pd.DataFrame, int]:
     """N-way within transformation via the method of alternating projections (MAP).
@@ -2685,9 +2685,12 @@ def demean_by_groups(
     weights : np.ndarray, optional
         Observation weights. When provided, uses weighted group means
         ``sum(w*x)/sum(w)`` per group and converges to the WLS-FWL residual.
-    max_iter : int, default 100
-        Maximum number of alternating-projection iterations per variable. Emits a
-        single ``UserWarning`` per call listing any variable that fails to converge.
+    max_iter : int, default 10_000
+        Maximum number of alternating-projection iterations per variable
+        (matching the R ``fixest`` ``fixef.iter`` / ``pyfixest``
+        ``fixef_maxiter`` defaults; correlated FE incidence can genuinely
+        require hundreds of iterations). Emits a single ``UserWarning`` per
+        call listing any variable that fails to converge.
     tol : float, default 1e-10
         Convergence tolerance on the max absolute change across the iterate.
 
@@ -2699,6 +2702,25 @@ def demean_by_groups(
         Number of absorbed fixed effects, ``sum_d (nunique_d - 1)`` over
         ``group_vars`` (the standard DOF-accounting convention).
 
+    Raises
+    ------
+    ValueError
+        If ``group_vars`` is empty, or if any absorbed group column contains
+        NaN (checked for every ``len(group_vars)``, one-way included) —
+        pandas groupby silently drops NaN keys and ``pd.factorize`` codes
+        them ``-1``, which would silently mis-index the group means, so
+        missing group keys must be dropped or imputed by the caller.
+
+    Notes
+    -----
+    For ``len(group_vars) >= 2`` each dimension is factorized once and the MAP
+    sweeps compute group means via ``np.bincount`` (two O(n) passes per
+    dimension). Plain bincount summation is not compensated the way pandas'
+    grouped mean is, so results agree with a pandas ``groupby`` implementation
+    to ~1e-10 order (drift compounds across iterations), not bit-for-bit. See
+    ``docs/methodology/REGISTRY.md`` "Absorbed Fixed Effects with Survey
+    Weights".
+
     Examples
     --------
     >>> df, n_fe = demean_by_groups(df, ['y', 'x'], ['unit', 'period'])
@@ -2706,8 +2728,22 @@ def demean_by_groups(
     if not group_vars:
         raise ValueError("demean_by_groups requires at least one grouping variable.")
 
+    # NaN group keys are rejected for EVERY N (one-way included): pandas
+    # groupby silently drops NaN keys (NaN-poisoning the affected rows
+    # unweighted, passing them through un-demeaned weighted) and
+    # pd.factorize codes them -1, which would mis-index the last group's
+    # mean in the bincount sweeps below.
+    for g in group_vars:
+        if pd.isna(data[g].values).any():
+            raise ValueError(
+                f"demean_by_groups: absorbed group column '{g}' contains NaN. "
+                "Drop or impute missing group keys before absorbing this "
+                "dimension."
+            )
+
     # One dimension: the within projection is exact in a single pass. Delegate so
-    # the one-way callers stay byte-identical to demean_by_group.
+    # the one-way callers stay byte-identical to demean_by_group (for valid,
+    # NaN-free group keys).
     if len(group_vars) == 1:
         return demean_by_group(
             data,
@@ -2719,38 +2755,49 @@ def demean_by_groups(
         )
 
     # N >= 2: method of alternating projections. Per-variable independent
-    # convergence (outer loop = variable, inner = iterations) — this mirrors
-    # within_transform's weighted loop exactly so that the two-way weighted case
-    # (group_vars == [unit, time]) is bit-identical.
-    n_effects = sum(int(data[g].nunique()) - 1 for g in group_vars)
+    # convergence (outer loop = variable, inner = iterations), same sweep order
+    # over group_vars as within_transform's historical unit-then-time loop.
+    #
+    # Each absorbed dimension is factorized ONCE; every MAP sweep is then two
+    # O(n) passes (np.bincount group-sum + fancy-index gather) with no group
+    # re-hashing — pandas groupby.transform rebuilt its hash table on every
+    # (iteration x variable x dimension) call. Same precompute pattern as
+    # survey._PsuScaffolding. Bincount accumulation is not compensated the way
+    # pandas' grouped mean is: results agree with the prior implementation to
+    # ~1e-10 order, not bit-for-bit (see REGISTRY "Absorbed Fixed Effects").
     target_cols = [var if not suffix else f"{var}{suffix}" for var in variables]
-    group_arrays = [data[g].values for g in group_vars]
+    codes_list: List[np.ndarray] = []
+    n_groups_list: List[int] = []
+    for g in group_vars:
+        # NaN keys already rejected above, so codes are all >= 0 here.
+        codes, uniques = pd.factorize(data[g].values, sort=False)
+        codes_list.append(codes.astype(np.intp, copy=False))
+        n_groups_list.append(len(uniques))
+    n_effects = sum(n_g - 1 for n_g in n_groups_list)
     demeaned_values: List[np.ndarray] = []
     non_converged_vars: List[str] = []
 
     if weights is not None:
         w = np.asarray(weights, dtype=np.float64)
         # Cache per-group weight sums once (invariant across variables/iterations).
-        w_sums = [pd.Series(w).groupby(g).transform("sum").values for g in group_arrays]
-
-        def _weighted_group_demean(x, groups, w, w_sum):
-            wx_sum = pd.Series(w * x).groupby(groups).transform("sum").values
-            # Guard zero-total-weight groups (survey subpopulation / zero-weight
-            # domain padding): leave such rows unchanged (mean 0) so they remain
-            # inert in the downstream WLS instead of poisoning the design with
-            # NaN/Inf. For positive-weight groups this is bit-identical to wx/w_sum.
-            means = np.divide(
-                wx_sum, w_sum, out=np.zeros_like(wx_sum, dtype=np.float64), where=w_sum > 0
-            )
-            return x - means
+        w_sums = [
+            np.bincount(codes, weights=w, minlength=n_g)
+            for codes, n_g in zip(codes_list, n_groups_list)
+        ]
 
         for var in variables:
             x = data[var].values.astype(np.float64)
             converged = False
             for _iter in range(max_iter):
                 x_old = x.copy()
-                for groups, w_sum in zip(group_arrays, w_sums):
-                    x = _weighted_group_demean(x, groups, w, w_sum)
+                for codes, n_g, w_sum in zip(codes_list, n_groups_list, w_sums):
+                    wx_sum = np.bincount(codes, weights=w * x, minlength=n_g)
+                    # Guard zero-total-weight groups (survey subpopulation /
+                    # zero-weight domain padding): leave such rows unchanged
+                    # (mean 0) so they remain inert in the downstream WLS
+                    # instead of poisoning the design with NaN/Inf.
+                    means = np.divide(wx_sum, w_sum, out=np.zeros_like(wx_sum), where=w_sum > 0)
+                    x = x - means[codes]
                 if np.max(np.abs(x - x_old)) < tol:
                     converged = True
                     break
@@ -2758,13 +2805,18 @@ def _weighted_group_demean(x, groups, w, w_sum):
                 non_converged_vars.append(var)
             demeaned_values.append(x)
     else:
+        counts = [
+            np.bincount(codes, minlength=n_g).astype(np.float64)
+            for codes, n_g in zip(codes_list, n_groups_list)
+        ]
         for var in variables:
             x = data[var].values.astype(np.float64)
             converged = False
             for _iter in range(max_iter):
                 x_old = x.copy()
-                for groups in group_arrays:
-                    x = x - pd.Series(x).groupby(groups).transform("mean").values
+                for codes, n_g, cnt in zip(codes_list, n_groups_list, counts):
+                    means = np.bincount(codes, weights=x, minlength=n_g) / cnt
+                    x = x - means[codes]
                 if np.max(np.abs(x - x_old)) < tol:
                     converged = True
                     break
@@ -2796,6 +2848,191 @@ def _weighted_group_demean(x, groups, w, w_sum):
     return pd.concat([data, new_block], axis=1), n_effects
 
 
+def pre_demean_norms(
+    data: pd.DataFrame,
+    regressors: List[str],
+    weights: Optional[np.ndarray] = None,
+) -> Dict[str, float]:
+    """L2 norms of regressor columns BEFORE demeaning, for
+    :func:`snap_absorbed_regressors` (capture before an in-place demean
+    overwrites the source values).
+
+    When ``weights`` is given, norms are ``||sqrt(w) * x||`` — the same
+    effective-sample scaling ``solve_ols`` applies in WLS, so the snap
+    decision is made on the sample the solver actually sees (zero-weight
+    domain rows, which the weighted demean leaves inert by design, must not
+    mask a regressor that is FE-spanned on the positive-weight sample).
+    """
+    sw = None if weights is None else np.sqrt(np.asarray(weights, dtype=np.float64))
+    out: Dict[str, float] = {}
+    for v in regressors:
+        x = data[v].to_numpy(dtype=np.float64)
+        out[v] = float(np.linalg.norm(x if sw is None else sw * x))
+    return out
+
+
+def _fe_span_residual_norm(
+    x_eff: np.ndarray,
+    group_codes: List[np.ndarray],
+    n_groups: List[int],
+    sqrt_w: Optional[np.ndarray],
+) -> float:
+    """Exact residual norm of an (effective-sample) column projected onto the
+    absorbed-FE span, via sparse LSMR.
+
+    The MAP demean's per-iteration stopping rule bounds the last STEP, not the
+    distance to the limit, so a truncated iterate of an exactly-spanned column
+    can carry a structured residual orders of magnitude above the convergence
+    tolerance (slow-convergence regimes: unbalanced, correlated FE incidence).
+    A Krylov solve on the sparse FE incidence decides span membership exactly
+    (up to fp), and because ``x_eff`` is the ALREADY-DEMEANED column (nearly
+    orthogonal to the span), LSMR converges in a handful of iterations.
+
+    Returns the achieved residual norm ``||x_eff - A @ sol||`` — an upper
+    bound on the true projection residual, so an unconverged LSMR errs toward
+    KEEPING the column (the pre-guard status quo), never toward over-snapping.
+    """
+    from scipy.sparse import csr_matrix, hstack
+    from scipy.sparse.linalg import lsmr
+
+    n = x_eff.shape[0]
+    rows = np.arange(n)
+    ones = np.ones(n) if sqrt_w is None else sqrt_w
+    blocks = [
+        csr_matrix((ones, (rows, codes)), shape=(n, n_g))
+        for codes, n_g in zip(group_codes, n_groups)
+    ]
+    a_mat = hstack(blocks, format="csr")
+    sol = lsmr(a_mat, x_eff, atol=1e-13, btol=1e-13)[0]
+    return float(np.linalg.norm(x_eff - a_mat @ sol))
+
+
+def snap_absorbed_regressors(
+    demeaned: pd.DataFrame,
+    regressors: List[str],
+    pre_norms: Dict[str, float],
+    absorbed_desc: str,
+    group_vars: List[str],
+    rank_deficient_action: str = "warn",
+    suffix: str = "",
+    display_names: Optional[Dict[str, str]] = None,
+    rel_tol: float = 1e-10,
+    weights: Optional[np.ndarray] = None,
+    screen_tol: float = 1e-3,
+) -> List[str]:
+    """Zero out regressors that were absorbed (spanned) by the fixed effects.
+
+    A regressor lying exactly in the span of the absorbed FE dummies demeans
+    to numerical junk rather than exact zero. Such a column must not reach the
+    solver: column equilibration re-inflates it to unit norm, it passes the
+    rank check as linearly independent, and its arbitrary direction perturbs
+    the OTHER coefficients (measured up to ~3e-3 on ATT with a garbage
+    ~1e14-scale coefficient reported for the spanned column itself). Snapping
+    it to exact zero makes the downstream rank-deficiency handling drop it
+    deterministically (coefficient NaN) — the documented contract for
+    FE-collinear regressors. See ``docs/methodology/REGISTRY.md`` "Absorbed
+    Fixed Effects".
+
+    Detection is TWO-STAGE, because MAP truncation bounds the last iteration
+    step, not the distance to the limit — a spanned column in a
+    slow-convergence regime (unbalanced, correlated FE incidence) can stop
+    with a structured residual far above ``rel_tol``:
+
+    1. fast path: relative demeaned norm ``<= rel_tol`` snaps immediately
+       (covers exactly-converged spanned columns);
+    2. candidates with relative norm in ``(rel_tol, screen_tol]`` get an
+       exact span-membership confirmation via sparse LSMR on the FE incidence
+       (:func:`_fe_span_residual_norm`) and snap iff the true projection
+       residual is ``<= rel_tol`` — genuinely identified low-within-variation
+       regressors keep their (real) residual and are left untouched.
+
+    Parameters
+    ----------
+    demeaned : pd.DataFrame
+        Frame holding the demeaned columns AND the ``group_vars`` columns
+        (modified in place).
+    regressors : list of str
+        SOURCE names of the regressors (never the outcome — an outcome spanned
+        by the FEs is a zero-residual-variance situation, not a collinearity).
+    pre_norms : dict
+        Pre-demean L2 norm per source name (:func:`pre_demean_norms`).
+    absorbed_desc : str
+        Human description of the absorbed dimensions, for the warning.
+    group_vars : list of str
+        The absorbed FE columns (present in ``demeaned``), needed to build
+        the sparse incidence for the stage-2 confirmation.
+    rank_deficient_action : str, default "warn"
+        The owning estimator's contract: ``"warn"`` emits the cause-specific
+        ``UserWarning`` here; ``"silent"`` and ``"error"`` defer entirely to
+        the downstream rank machinery (silent drop / raise respectively).
+    suffix : str, default ""
+        Demeaned column naming (``f"{var}{suffix}"``).
+    display_names : dict, optional
+        Source name -> user-facing name for the warning message.
+    rel_tol : float, default 1e-10
+        Snap threshold on the (confirmed) relative projection residual.
+    weights : np.ndarray, optional
+        WLS observation weights. When given, BOTH norms must be
+        ``sqrt(w)``-weighted (pass the same weights to
+        :func:`pre_demean_norms`) so the snap decision matches the effective
+        sample ``solve_ols`` sees — zero-weight rows, which the weighted
+        demean leaves inert by design, must not mask an FE-spanned regressor
+        on the positive-weight sample.
+    screen_tol : float, default 1e-3
+        Stage-2 screening bound. Spanned columns whose MAP truncation
+        residual exceeds this are outside the guard — but at that point the
+        demeaning of every other column is equally unconverged and the
+        non-convergence warning contract applies.
+
+    Returns
+    -------
+    list of str
+        Source names of the snapped regressors (empty if none).
+    """
+    sw = None if weights is None else np.sqrt(np.asarray(weights, dtype=np.float64))
+    snapped: List[str] = []
+    span_cache: Optional[Tuple[List[np.ndarray], List[int]]] = None
+    for var in regressors:
+        pre = pre_norms.get(var, 0.0)
+        if pre <= 0.0:
+            continue  # all-zero input column: rank handling covers it as-is
+        col = f"{var}{suffix}" if suffix else var
+        vals = demeaned[col].to_numpy(dtype=np.float64)
+        eff = vals if sw is None else sw * vals
+        eff_norm = float(np.linalg.norm(eff))
+        if eff_norm <= rel_tol * pre:
+            demeaned[col] = np.zeros(len(vals), dtype=np.float64)
+            snapped.append(var)
+            continue
+        if eff_norm <= screen_tol * pre:
+            # Stage 2: MAP truncation may mask an exactly-spanned column;
+            # confirm with an exact projection on the FE incidence.
+            if span_cache is None:
+                codes_l: List[np.ndarray] = []
+                sizes_l: List[int] = []
+                for g in group_vars:
+                    codes_g, uniques_g = pd.factorize(demeaned[g].values, sort=False)
+                    codes_l.append(codes_g.astype(np.intp, copy=False))
+                    sizes_l.append(len(uniques_g))
+                span_cache = (codes_l, sizes_l)
+            resid = _fe_span_residual_norm(eff, span_cache[0], span_cache[1], sw)
+            if resid <= rel_tol * pre:
+                demeaned[col] = np.zeros(len(vals), dtype=np.float64)
+                snapped.append(var)
+    if snapped and rank_deficient_action == "warn":
+        shown = [display_names.get(v, v) if display_names else v for v in snapped]
+        warnings.warn(
+            f"Regressor(s) {shown} are collinear with the absorbed fixed "
+            f"effects ({absorbed_desc}): their within-transformed values are "
+            f"numerically zero (relative projection residual <= {rel_tol:g}), "
+            "so their coefficients are not identified and will be reported "
+            "as NaN.",
+            UserWarning,
+            stacklevel=3,
+        )
+    return snapped
+
+
 def within_transform(
     data: pd.DataFrame,
     variables: List[str],
@@ -2804,7 +3041,7 @@ def within_transform(
     inplace: bool = False,
     suffix: str = "_demeaned",
     weights: Optional[np.ndarray] = None,
-    max_iter: int = 100,
+    max_iter: int = 10_000,
     tol: float = 1e-8,
 ) -> pd.DataFrame:
     """
@@ -2846,12 +3083,14 @@ def within_transform(
         column name overwrites it rather than appending a duplicate label.
     weights : np.ndarray, optional
         Observation weights for weighted group means.
-    max_iter : int, default 100
+    max_iter : int, default 10_000
         Maximum number of alternating-projection iterations. Applies to BOTH the
         weighted and unweighted paths (both now iterate via the method of
         alternating projections, exact for unbalanced panels). Emits a
         ``UserWarning`` per call when any variable fails to converge within this
-        budget. Balanced panels converge in ~2 iterations.
+        budget. Balanced panels converge in ~2 iterations; correlated FE
+        incidence (contiguous unit lifetimes) can require hundreds. The default
+        matches ``fixest``'s ``fixef.iter`` / ``pyfixest``'s ``fixef_maxiter``.
     tol : float, default 1e-8
         Convergence tolerance on the max absolute change across the iterate.
 
@@ -2874,12 +3113,11 @@ def within_transform(
     """
     # Two-way (unit + time) within transformation is the N-way method of
     # alternating projections specialized to two FE dimensions. Delegate to the
-    # shared engine so there is one MAP implementation. The weighted path is
-    # bit-identical to the previous inline loop (same per-variable convergence,
-    # unit-then-time order, and forwarded tol=1e-8); the unweighted path now also
-    # iterates (MAP) instead of the closed-form additive demean, which was exact
-    # only for balanced panels. ``tol`` is forwarded explicitly (within_transform's
-    # default is 1e-8, not demean_by_groups' 1e-10) to preserve weighted goldens.
+    # shared engine so there is one MAP implementation (same per-variable
+    # convergence, unit-then-time sweep order; agrees with the historical
+    # pandas-groupby loop to ~1e-10 order — see the engine's Notes). ``tol`` is
+    # forwarded explicitly (within_transform's default is 1e-8, not
+    # demean_by_groups' 1e-10) to preserve the historical convergence budget.
     return demean_by_groups(
         data,
         variables,
diff --git a/diff_diff/wooldridge.py b/diff_diff/wooldridge.py
index d6597f5d3..1869e80ee 100644
--- a/diff_diff/wooldridge.py
+++ b/diff_diff/wooldridge.py
@@ -20,7 +20,12 @@
 import pandas as pd
 
 from diff_diff.linalg import compute_robust_vcov, solve_logit, solve_ols, solve_poisson
-from diff_diff.utils import safe_inference, within_transform
+from diff_diff.utils import (
+    pre_demean_norms,
+    safe_inference,
+    snap_absorbed_regressors,
+    within_transform,
+)
 from diff_diff.wooldridge_results import WooldridgeDiDResults
 
 _VALID_METHODS = ("ols", "logit", "poisson")
@@ -1195,6 +1200,8 @@ def _fit_ols(
                             f"{grp_label}s or use non-zero weights."
                         )
 
+            _w_regressors = [f"_x{i}" for i in range(X_design.shape[1])]
+            _pre_norms = pre_demean_norms(tmp, _w_regressors, weights=wt_weights)
             transformed = within_transform(
                 tmp,
                 all_vars,
@@ -1203,6 +1210,19 @@ def _fit_ols(
                 suffix="_demeaned",
                 weights=wt_weights,
             )
+            # Snap FE-spanned design columns (e.g. a unit-constant covariate)
+            # to exact zero so rank handling drops them deterministically.
+            snap_absorbed_regressors(
+                transformed,
+                _w_regressors,
+                _pre_norms,
+                absorbed_desc=f"unit '{unit}' and time '{time}' fixed effects",
+                group_vars=[unit, time],
+                rank_deficient_action=self.rank_deficient_action,
+                suffix="_demeaned",
+                display_names=dict(zip(_w_regressors, col_names)),
+                weights=wt_weights,
+            )
 
             y = transformed[f"{outcome}_demeaned"].values
             X_cols = [f"_x{i}_demeaned" for i in range(X_design.shape[1])]
diff --git a/docs/doc-deps.yaml b/docs/doc-deps.yaml
index 7753311f6..e90ea6b70 100644
--- a/docs/doc-deps.yaml
+++ b/docs/doc-deps.yaml
@@ -901,6 +901,10 @@ sources:
         section: "Wild cluster bootstrap (WCR)"
         type: methodology
         note: "wild_bootstrap_se: WCR null-imposition, enumeration, test-inversion CI, fwildclusterboot parity"
+      - path: docs/methodology/REGISTRY.md
+        section: "Absorbed Fixed Effects with Survey Weights"
+        type: methodology
+        note: "demean_by_groups / within_transform: MAP convergence contract (max_iter/tol), bincount accumulation numerics, NaN-group-key guard"
       - path: docs/practitioner_getting_started.rst
         section: "Step 4: Check Whether the Result Is Trustworthy"
         type: user_guide
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
index a0cbd70fc..38e0c3558 100644
--- a/docs/methodology/REGISTRY.md
+++ b/docs/methodology/REGISTRY.md
@@ -4319,11 +4319,64 @@ unequal selection probabilities).
   when any transformed variable exits the alternating-projection loop without
   reaching `tol` within `max_iter`. This now covers the **unweighted** path as well
   (previously the unweighted two-way transform used a closed-form additive demean
-  and could not warn). Defaults: `max_iter=100`; `tol=1e-8` via `within_transform`
+  and could not warn). Defaults: `max_iter=10_000` in both `demean_by_groups` and
+  `within_transform` (raised from 100 in v3.6.x to match the compiled-library
+  convention - R `fixest` `fixef.iter` and `pyfixest` `fixef_maxiter` both default
+  to 10,000; correlated FE incidence such as contiguous unit lifetimes in
+  order-level data genuinely requires hundreds of iterations, measured ~250-280 on
+  the `tail_stress` benchmark scenario where the old cap of 100 warned and
+  returned slightly-off residuals). `tol=1e-8` via `within_transform`
   (TwoWayFixedEffects, SunAbraham, BaconDecomposition, WooldridgeDiD) and `tol=1e-10`
   for the DiD/MultiPeriodDiD `absorb=` path. Balanced panels converge in ~2
-  iterations. Silent return of the current iterate was classified as a silent
-  failure under the Phase 2 audit and replaced with this explicit signal.
+  iterations. Worst-case trade-off, accepted deliberately: an input that cannot
+  converge at all now burns the full 10,000 iterations before warning
+  (correctness-over-latency). Silent return of the current iterate was classified
+  as a silent failure under the Phase 2 audit and replaced with this explicit signal.
+- **Note:** The MAP inner loop factorizes each absorbed dimension once
+  (`pd.factorize`) and forms group means via `np.bincount` accumulation. Plain
+  summation is not Kahan-compensated the way pandas `groupby().mean()` is, so
+  demeaned values agree with the pre-v3.6.x pandas implementation to ~1e-10 order
+  (drift compounds across MAP iterations), not bit-for-bit; estimator estimates
+  are validated unchanged at the FE-absorption benchmark identity gate
+  (`benchmarks/speed_review/bench_fe_absorption.py --check-estimates`).
+- **Edge case:** NaN in an absorbed group column raises a `ValueError` naming the
+  column. `pd.factorize` codes NaN keys as -1, which would otherwise silently index
+  the last group's mean; the prior pandas behavior was itself silently bad
+  (unweighted: NaN-poisoned the affected rows; weighted: passed those rows through
+  un-demeaned into the regression), so the explicit error replaces two distinct
+  silent failure modes.
+- **Edge case (FE-spanned regressors, v3.6.x):** a regressor lying exactly in the
+  span of the absorbed/within-transformed FE dummies (e.g. the treated-group
+  indicator after absorbing the unit dimension, a period dummy after absorbing
+  time, or a unit-constant covariate) demeans to numerical junk (relative norm
+  ~1e-13), NOT exact zero. Such a column previously reached the solver, where
+  column equilibration re-inflated it to unit norm, it passed the rank check as
+  linearly independent, and its arbitrary direction perturbed the identified
+  coefficients at the ~1e-5 level (tolerance- and implementation-dependent).
+  `diff_diff.utils.snap_absorbed_regressors` now zeroes spanned regressors at
+  every demeaning consumer (DiD/MultiPeriodDiD `absorb=`, TwoWayFixedEffects,
+  SunAbraham, BaconDecomposition, WooldridgeDiD) — including the replicate-refit
+  closures (silently, matching their `rank_deficient_action="silent"` solves) —
+  so the rank-deficiency machinery drops them deterministically (coefficient
+  NaN). Detection is **two-stage**, because the MAP stopping rule bounds the
+  last iteration step, not the distance to the limit, and a spanned column in a
+  slow-convergence regime (unbalanced, correlated FE incidence — e.g.
+  `x = a_unit + b_time` on a contiguous-lifetimes panel) can stop with a
+  structured truncation residual far above any fixed norm threshold (measured
+  1.9e-10 at tol=1e-10 and 2e-8 at tol=1e-8; left unsnapped it shifted ATT by
+  ~3e-3 and reported a ~1e14-scale garbage coefficient): (1) relative demeaned
+  norm <= 1e-10 snaps immediately; (2) candidates in (1e-10, 1e-3] get an exact
+  span-membership confirmation via sparse LSMR on the (weighted) FE incidence
+  and snap iff the true projection residual is <= 1e-10 relative — genuinely
+  identified low-within-variation regressors are left untouched (their LSMR
+  residual equals their real within-variation). Norms are `sqrt(w)`-weighted
+  under WLS so zero-weight domain rows (left inert by the weighted demean)
+  cannot mask spanning on the positive-weight sample. A cause-specific
+  `UserWarning` naming the spanned regressors is emitted under
+  `rank_deficient_action="warn"`; `"silent"` and `"error"` defer to the rank
+  machinery per that parameter's existing contract. Identified coefficients are
+  consequently stable in the demeaning tolerance (verified to ~1e-14 across
+  tol=1e-8..1e-12, previously ~1e-5 swings).
 
 ### Survey Degrees of Freedom
 
diff --git a/docs/performance-plan.md b/docs/performance-plan.md
index e3aaf3c14..864c30ddc 100644
--- a/docs/performance-plan.md
+++ b/docs/performance-plan.md
@@ -41,25 +41,43 @@ the gap is machinery, not architecture:
 
 The measurement surface is the seven-scenario FE-absorption suite
 (`benchmarks/speed_review/bench_fe_absorption.py`; shapes documented in
-`docs/performance-scenarios.md` scenarios 7-13; committed baseline
-`baselines/fe_absorption_before.json`, with `_after.json` to be added by the
-optimization PR). The optional pyfixest yardstick
-(`bench_fe_absorption_pyfixest.py`, guarded on import) asserts coefficient
-parity < 1e-6 on the four exact-estimand scenarios so the comparison stays
-honest over time.
+`docs/performance-scenarios.md` scenarios 7-13; committed baselines
+`baselines/fe_absorption_{before,after}.json`). The optional pyfixest
+yardstick (`bench_fe_absorption_pyfixest.py`, guarded on import) asserts
+coefficient parity < 1e-6 on the four exact-estimand scenarios so the
+comparison stays honest over time.
+
+**Landed (v3.6.x): factorize-once + bincount MAP rewrite.** End-to-end fit
+speedups of 1.6-2.7x across the headline scenarios (table below). The
+correlated-FE stress case now CONVERGES (~270 iterations under the raised
+`max_iter=10_000` cap; ~2.7x cheaper per iteration) where the old loop hit
+the cap of 100 and returned slightly-off residuals - its 0.8x wall-clock
+ratio is the price of correctness: it runs 2.7x the iterations to full
+convergence AND pays the exact LSMR span confirmation that now catches its
+truncation-masked FE-spanned `post` column (previously kept as a junk
+regressor in the design). The identity gate held at 1e-14-1e-16 on all
+scenarios except the two carrying FE-spanned regressors through the DiD
+`absorb=` path, whose estimates moved once (survey_absorb ATT by 9.8e-6;
+geo_experiment SE by 1e-7) when the FE-spanned junk columns were snapped to
+zero - a deliberate correction documented in REGISTRY "Absorbed Fixed
+Effects" and the CHANGELOG. Remaining gap vs the pyfixest yardstick (2-26x
+depending on scenario) is the compiled+parallel margin; a Rust demean kernel
+in the existing `rust/` backend is the candidate next step, to be scoped
+only if the remaining gap matters in practice (pyfixest itself validates
+that architecture).
 
 ### FE-absorption suite results
 
 <!-- TABLE:start fe_absorption -->
 | Scenario | n rows | Before (s) | After (s) | Speedup | pyfixest (s) |
 |---|---:|---:|---:|---:|---:|
-| 7. County policy event study (SunAbraham) | 177,289 | 3.865 (cv 2.5%) | - | - | 0.190 (cv 4.0%, proxy) |
-| 8. Firm panel with churn (SunAbraham) | 2,400,000 | 92.957 (cv 0.9%) | - | - | 1.912 (cv 20.3%, noisy, proxy) |
-| 9. Scanner store-week (TWFE) | 3,255,000 | 1.549 (cv 0.3%) | - | - | 0.644 (cv 12.3%, noisy) |
-| 10. Geo experiment 5M orders (DiD absorb) | 5,000,000 | 2.630 (cv 0.6%) | - | - | 0.623 (cv 7.8%) |
-| 11. Survey BRR replicates (DiD absorb) | 500,000 | 7.385 (cv 8.6%) | - | - | - |
-| 12. Correlated-FE stress (DiD absorb) | 5,000,000 | 26.271 (cv 0.4%) | - | - | 3.075 (cv 1.5%) |
-| 13. Small-panel guard (TWFE) | 20,000 | 0.005 (cv 4.1%) | - | - | 0.007 (cv 2.2%) |
+| 7. County policy event study (SunAbraham) | 177,289 | 3.865 (cv 2.5%) | 2.388 (cv 2.6%) | 1.6x | 0.190 (cv 4.0%, proxy) |
+| 8. Firm panel with churn (SunAbraham) | 2,400,000 | 92.957 (cv 0.9%) | 49.502 (cv 1.9%) | 1.9x | 1.912 (cv 20.3%, noisy, proxy) |
+| 9. Scanner store-week (TWFE) | 3,255,000 | 1.549 (cv 0.3%) | 0.983 (cv 0.1%) | 1.6x | 0.644 (cv 12.3%, noisy) |
+| 10. Geo experiment 5M orders (DiD absorb) | 5,000,000 | 2.630 (cv 0.6%) | 0.973 (cv 0.4%) | 2.7x | 0.623 (cv 7.8%) |
+| 11. Survey BRR replicates (DiD absorb) | 500,000 | 7.385 (cv 8.6%) | 4.077 (cv 0.1%) | 1.8x | - |
+| 12. Correlated-FE stress (DiD absorb) | 5,000,000 | 26.271 (cv 0.4%) | 31.765 (cv 0.6%) | 0.8x | 3.075 (cv 1.5%) |
+| 13. Small-panel guard (TWFE) | 20,000 | 0.005 (cv 4.1%) | 0.004 (cv 4.4%) | 1.3x | 0.007 (cv 2.2%) |
 
 *noisy* = CV above the 10% unusable threshold from the noise protocol. *proxy* = timing-only pyfixest stand-in: the Sun-Abraham scenarios run a saturated `i(rel_time)` event study there (pyfixest 0.60 has no `sunab()`) - comparable demeaning load, different estimand, so those cells are not exact-estimand comparisons (see `bench_fe_absorption_pyfixest.py`).
 <!-- TABLE:end fe_absorption -->
diff --git a/tests/test_bacon.py b/tests/test_bacon.py
index cce72d3da..7761a3450 100644
--- a/tests/test_bacon.py
+++ b/tests/test_bacon.py
@@ -65,19 +65,18 @@ def generate_staggered_data(
         effect = np.full(len(units), treatment_effect)
 
     outcomes = (
-        unit_fe_expanded +
-        time_fe_expanded +
-        effect * post +
-        np.random.randn(len(units)) * 0.5
+        unit_fe_expanded + time_fe_expanded + effect * post + np.random.randn(len(units)) * 0.5
     )
 
-    df = pd.DataFrame({
-        'unit': units,
-        'time': times,
-        'outcome': outcomes,
-        'first_treat': first_treat_expanded.astype(int),
-        'treated': post.astype(int),
-    })
+    df = pd.DataFrame(
+        {
+            "unit": units,
+            "time": times,
+            "outcome": outcomes,
+            "first_treat": first_treat_expanded.astype(int),
+            "treated": post.astype(int),
+        }
+    )
 
     return df
 
@@ -91,11 +90,7 @@ def test_basic_fit(self):
 
         decomp = BaconDecomposition()
         results = decomp.fit(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         assert decomp.is_fitted_
@@ -108,11 +103,7 @@ def test_weights_sum_to_one(self):
         data = generate_staggered_data(seed=123)
 
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         total_weight = sum(c.weight for c in results.comparisons)
@@ -128,11 +119,11 @@ def test_weighted_sum_equals_twfe(self):
 
         results = bacon_decompose(
             data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat',
-            weights='exact',
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            weights="exact",
         )
 
         weighted_sum = sum(c.weight * c.estimate for c in results.comparisons)
@@ -147,11 +138,7 @@ def test_comparison_types(self):
         data = generate_staggered_data(n_cohorts=3, never_treated_frac=0.3)
 
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         comp_types = set(c.comparison_type for c in results.comparisons)
@@ -166,11 +153,7 @@ def test_no_never_treated(self):
         data = generate_staggered_data(never_treated_frac=0.0)
 
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         # Should still work
@@ -182,11 +165,7 @@ def test_single_cohort(self):
         data = generate_staggered_data(n_cohorts=1, never_treated_frac=0.3)
 
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         # With single cohort, should only have treated vs never
@@ -199,11 +178,7 @@ def test_weight_by_type(self):
         data = generate_staggered_data()
 
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         weights = results.weight_by_type()
@@ -218,11 +193,7 @@ def test_effect_by_type(self):
         data = generate_staggered_data()
 
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         effects = results.effect_by_type()
@@ -236,11 +207,7 @@ def test_to_dataframe(self):
         data = generate_staggered_data()
 
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         df = results.to_dataframe()
@@ -258,11 +225,7 @@ def test_summary(self):
         data = generate_staggered_data()
 
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         summary = results.summary()
@@ -277,11 +240,7 @@ def test_missing_column_error(self):
 
         with pytest.raises(ValueError, match="Missing columns"):
             bacon_decompose(
-                data,
-                outcome='nonexistent',
-                unit='unit',
-                time='time',
-                first_treat='first_treat'
+                data, outcome="nonexistent", unit="unit", time="time", first_treat="first_treat"
             )
 
 
@@ -315,11 +274,7 @@ def test_twfe_decompose_method(self):
 
         twfe = TwoWayFixedEffects()
         decomp = twfe.decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         assert isinstance(decomp, BaconDecompositionResults)
@@ -333,19 +288,10 @@ def test_twfe_staggered_warning(self):
 
         with warnings.catch_warnings(record=True) as w:
             warnings.simplefilter("always")
-            twfe.fit(
-                data,
-                outcome='outcome',
-                treatment='treated',
-                time='time',
-                unit='unit'
-            )
+            twfe.fit(data, outcome="outcome", treatment="treated", time="time", unit="unit")
 
             # Should have emitted a warning about staggered treatment
-            staggered_warnings = [
-                x for x in w
-                if "staggered" in str(x.message).lower()
-            ]
+            staggered_warnings = [x for x in w if "staggered" in str(x.message).lower()]
             assert len(staggered_warnings) > 0
 
 
@@ -357,11 +303,7 @@ def test_convenience_function(self):
         data = generate_staggered_data()
 
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         assert isinstance(results, BaconDecompositionResults)
@@ -377,15 +319,11 @@ def test_plot_bacon_scatter(self):
 
         data = generate_staggered_data()
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         # Should not raise
-        ax = plot_bacon(results, plot_type='scatter', show=False)
+        ax = plot_bacon(results, plot_type="scatter", show=False)
         assert ax is not None
 
     def test_plot_bacon_bar(self):
@@ -395,15 +333,11 @@ def test_plot_bacon_bar(self):
 
         data = generate_staggered_data()
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         # Should not raise
-        ax = plot_bacon(results, plot_type='bar', show=False)
+        ax = plot_bacon(results, plot_type="bar", show=False)
         assert ax is not None
 
     def test_plot_bacon_invalid_type(self):
@@ -413,15 +347,11 @@ def test_plot_bacon_invalid_type(self):
 
         data = generate_staggered_data()
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         with pytest.raises(ValueError, match="Unknown plot_type"):
-            plot_bacon(results, plot_type='invalid', show=False)
+            plot_bacon(results, plot_type="invalid", show=False)
 
 
 class TestWeightsParameter:
@@ -441,11 +371,7 @@ def test_exact_weights_default(self):
         assert decomp.weights == "exact"
 
         results = decomp.fit(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         # Weights should sum to 1
@@ -467,11 +393,7 @@ def test_approximate_weights_opt_in(self):
         assert decomp.weights == "approximate"
 
         results = decomp.fit(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         # Sum-to-1 contract preserved via normalization, but tolerance is
@@ -487,11 +409,7 @@ def test_exact_weights(self):
         assert decomp.weights == "exact"
 
         results = decomp.fit(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         # Weights should still sum to 1
@@ -504,20 +422,20 @@ def test_exact_vs_approximate_different(self):
 
         results_approx = bacon_decompose(
             data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat',
-            weights="approximate"
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            weights="approximate",
         )
 
         results_exact = bacon_decompose(
             data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat',
-            weights="exact"
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            weights="exact",
         )
 
         # TWFE estimates should be the same
@@ -532,20 +450,20 @@ def test_exact_weights_lower_decomposition_error(self):
 
         results_approx = bacon_decompose(
             data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat',
-            weights="approximate"
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            weights="approximate",
         )
 
         results_exact = bacon_decompose(
             data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat',
-            weights="exact"
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            weights="exact",
         )
 
         # Exact weights should have equal or lower decomposition error
@@ -563,11 +481,11 @@ def test_convenience_function_weights_param(self):
 
         results = bacon_decompose(
             data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat',
-            weights="exact"
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            weights="exact",
         )
 
         assert isinstance(results, BaconDecompositionResults)
@@ -581,22 +499,22 @@ def test_twfe_decompose_weights_param(self):
         # Test with approximate
         decomp_approx = twfe.decompose(
             data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat',
-            weights="approximate"
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            weights="approximate",
         )
         assert isinstance(decomp_approx, BaconDecompositionResults)
 
         # Test with exact
         decomp_exact = twfe.decompose(
             data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat',
-            weights="exact"
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            weights="exact",
         )
         assert isinstance(decomp_exact, BaconDecompositionResults)
 
@@ -622,24 +540,17 @@ def test_unbalanced_panel_warning(self):
 
         # Remove some observations from specific units to make it unbalanced
         # This ensures different units have different numbers of periods
-        mask = ~((data['unit'] == 0) & (data['time'] == 0))  # Remove one period from unit 0
+        mask = ~((data["unit"] == 0) & (data["time"] == 0))  # Remove one period from unit 0
         data = data[mask].reset_index(drop=True)
 
         with warnings.catch_warnings(record=True) as w:
             warnings.simplefilter("always")
             bacon_decompose(
-                data,
-                outcome='outcome',
-                unit='unit',
-                time='time',
-                first_treat='first_treat'
+                data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
             )
 
             # Should have warning about unbalanced panel
-            unbalanced_warnings = [
-                x for x in w
-                if "unbalanced" in str(x.message).lower()
-            ]
+            unbalanced_warnings = [x for x in w if "unbalanced" in str(x.message).lower()]
             assert len(unbalanced_warnings) > 0
 
     def test_balanced_panel_no_warning(self):
@@ -649,18 +560,11 @@ def test_balanced_panel_no_warning(self):
         with warnings.catch_warnings(record=True) as w:
             warnings.simplefilter("always")
             bacon_decompose(
-                data,
-                outcome='outcome',
-                unit='unit',
-                time='time',
-                first_treat='first_treat'
+                data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
             )
 
             # Should NOT have warning about unbalanced panel
-            unbalanced_warnings = [
-                x for x in w
-                if "unbalanced" in str(x.message).lower()
-            ]
+            unbalanced_warnings = [x for x in w if "unbalanced" in str(x.message).lower()]
             assert len(unbalanced_warnings) == 0
 
 
@@ -672,11 +576,7 @@ def test_small_sample(self):
         data = generate_staggered_data(n_units=20, n_periods=5, n_cohorts=2)
 
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         assert len(results.comparisons) > 0
@@ -688,11 +588,7 @@ def test_many_cohorts(self):
         )
 
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         # Should have comparisons from all cohort pairs
@@ -703,15 +599,38 @@ def test_inf_for_never_treated(self):
         data = generate_staggered_data()
 
         # Replace 0 with inf for never-treated
-        data['first_treat'] = data['first_treat'].replace(0, np.inf)
+        data["first_treat"] = data["first_treat"].replace(0, np.inf)
 
         results = bacon_decompose(
-            data,
-            outcome='outcome',
-            unit='unit',
-            time='time',
-            first_treat='first_treat'
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
         )
 
         assert results.n_never_treated > 0
         assert len(results.comparisons) > 0
+
+
+class TestAbsorbedTreatmentSnap:
+    """A treatment indicator spanned by the FEs in an internal 2x2 TWFE cell
+    must hit the deterministic zero-variance guard with a cause warning,
+    not an arbitrary junk/junk division (REGISTRY 'Absorbed FE'). The 0.0
+    return for the degenerate cell is _compute_twfe's pre-existing contract.
+    """
+
+    def test_fe_spanned_treatment_hits_zero_variance_guard_with_warning(self):
+        rng = np.random.default_rng(0)
+        # treat == post for every unit -> treatment is a function of time,
+        # spanned by the time FE
+        unit = np.repeat(np.arange(20), 4)
+        time = np.tile(np.arange(4), 20)
+        df = pd.DataFrame(
+            {
+                "unit": unit,
+                "time": time,
+                "y": rng.normal(size=unit.size),
+                "__bacon_treated_internal__": (time >= 2).astype(float),
+            }
+        )
+        decomp = BaconDecomposition()
+        with pytest.warns(UserWarning, match="collinear with the absorbed"):
+            beta = decomp._compute_twfe(df, outcome="y", unit="unit", time="time")
+        assert beta == 0.0  # deterministic d_var == 0 guard, not junk/junk
diff --git a/tests/test_estimators.py b/tests/test_estimators.py
index 5448a347f..2c61b8d0a 100644
--- a/tests/test_estimators.py
+++ b/tests/test_estimators.py
@@ -1,5 +1,7 @@
 """Tests for difference-in-differences estimators."""
 
+import warnings
+
 import numpy as np
 import pandas as pd
 import pytest
@@ -12,6 +14,7 @@
     PeriodEffect,
     SyntheticDiD,
     SyntheticDiDResults,
+    TwoWayFixedEffects,
 )
 
 
@@ -3264,11 +3267,25 @@ def test_twfe_with_unbalanced_panel(self):
         df = pd.DataFrame(data)
 
         twfe = TwoWayFixedEffects()
-        results = twfe.fit(df, outcome="outcome", treatment="post", unit="unit", time="period")
+        # NOTE: this test previously passed treatment="post" with time="period",
+        # making the treatment interaction a pure function of the period FE -
+        # an unidentifiable specification that "worked" only because the
+        # FE-spanned junk column survived rank detection and produced a finite
+        # garbage ATT. The v3.6.x span guard now routes that spec into TWFE's
+        # collinearity error, so the test uses the well-specified form.
+        results = twfe.fit(df, outcome="outcome", treatment="treated", unit="unit", time="post")
 
         # Should produce valid results
         assert np.isfinite(results.att)
         assert results.se > 0
+        # the degenerate spec now raises the pre-existing identification error
+        # (routed there deterministically by the FE-span snap):
+        with pytest.raises(ValueError, match="cannot be identified"):
+            with warnings.catch_warnings():
+                warnings.simplefilter("ignore")
+                TwoWayFixedEffects().fit(
+                    df, outcome="outcome", treatment="post", unit="unit", time="period"
+                )
 
     def test_multiperiod_with_sparse_data(self):
         """Test MultiPeriodDiD with sparse data across periods."""
@@ -3549,8 +3566,12 @@ def test_twfe_with_absorbed_covariate(self):
         # absorbed by unit FE (becomes zero after within-transformation)
         twfe = TwoWayFixedEffects(rank_deficient_action="silent")
         results = twfe.fit(
-            df, outcome="outcome", treatment="treated", unit="unit",
-            time="post", covariates=["unit_covariate"],
+            df,
+            outcome="outcome",
+            treatment="treated",
+            unit="unit",
+            time="post",
+            covariates=["unit_covariate"],
         )
 
         assert np.isfinite(results.att)
@@ -3586,13 +3607,13 @@ def _mpd_data():
             ueff = rng.normal()
             for period in range(6):
                 y = (
-                    10.0 + ueff + 0.5 * period
+                    10.0
+                    + ueff
+                    + 0.5 * period
                     + (3.0 if (treated and period >= 3) else 0.0)
                     + rng.normal(0, 0.5)
                 )
-                rows.append(
-                    {"unit": unit, "treated": treated, "time": period, "outcome": y}
-                )
+                rows.append({"unit": unit, "treated": treated, "time": period, "outcome": y})
         df = pd.DataFrame(rows)
         df["x1"] = np.random.default_rng(11).normal(size=len(df))
         return df
@@ -3602,9 +3623,7 @@ def test_did_collision_for_each_structural_name(self):
         # the test cannot drift from the real var_names (const/treated/post/
         # treated:post), plus the internal _treat_time working column.
         data = self._did_data()
-        clean = DifferenceInDifferences().fit(
-            data, "outcome", "treated", "post", covariates=["x1"]
-        )
+        clean = DifferenceInDifferences().fit(data, "outcome", "treated", "post", covariates=["x1"])
         reserved = [k for k in clean.coefficients if k != "x1"] + ["_treat_time"]
         assert "const" in reserved and "treated:post" in reserved  # sanity
         for name in reserved:
@@ -3612,9 +3631,7 @@ def test_did_collision_for_each_structural_name(self):
             if name not in d2.columns:
                 d2[name] = np.random.default_rng(5).normal(size=len(d2))
             with pytest.raises(ValueError, match="collide"):
-                DifferenceInDifferences().fit(
-                    d2, "outcome", "treated", "post", covariates=[name]
-                )
+                DifferenceInDifferences().fit(d2, "outcome", "treated", "post", covariates=[name])
 
     def test_did_fixed_effects_dummy_collision(self):
         # get_dummies(region, prefix="region", drop_first=True) drops "region_A"
@@ -3624,15 +3641,17 @@ def test_did_fixed_effects_dummy_collision(self):
         data["region_B"] = np.random.default_rng(2).normal(size=len(data))
         with pytest.raises(ValueError, match="collide"):
             DifferenceInDifferences().fit(
-                data, "outcome", "treated", "post",
-                covariates=["region_B"], fixed_effects=["region"],
+                data,
+                "outcome",
+                "treated",
+                "post",
+                covariates=["region_B"],
+                fixed_effects=["region"],
             )
 
     def test_did_noncolliding_preserves_structural_coefs(self):
         data = self._did_data()
-        r = DifferenceInDifferences().fit(
-            data, "outcome", "treated", "post", covariates=["x1"]
-        )
+        r = DifferenceInDifferences().fit(data, "outcome", "treated", "post", covariates=["x1"])
         ck = r.coefficients
         assert np.isfinite(ck["const"]) and np.isfinite(ck["treated:post"])
         assert "x1" in ck and ck["x1"] != ck["treated:post"]
@@ -3659,9 +3678,7 @@ def test_did_duplicate_covariates_raise(self):
 
     def test_mpd_collision_for_each_structural_name(self):
         data = self._mpd_data()
-        clean = MultiPeriodDiD().fit(
-            data, "outcome", "treated", "time", covariates=["x1"]
-        )
+        clean = MultiPeriodDiD().fit(data, "outcome", "treated", "time", covariates=["x1"])
         # const/treated/period_*/treated:period_* (actual keys) + internal column.
         reserved = [k for k in clean.coefficients if k != "x1"] + ["_did_treatment"]
         assert any(k.startswith("period_") for k in reserved)  # sanity
@@ -3670,15 +3687,11 @@ def test_mpd_collision_for_each_structural_name(self):
             if name not in d2.columns:
                 d2[name] = np.random.default_rng(5).normal(size=len(d2))
             with pytest.raises(ValueError, match="collide"):
-                MultiPeriodDiD().fit(
-                    d2, "outcome", "treated", "time", covariates=[name]
-                )
+                MultiPeriodDiD().fit(d2, "outcome", "treated", "time", covariates=[name])
 
     def test_mpd_noncolliding_preserves_structural_coefs(self):
         data = self._mpd_data()
-        r = MultiPeriodDiD().fit(
-            data, "outcome", "treated", "time", covariates=["x1"]
-        )
+        r = MultiPeriodDiD().fit(data, "outcome", "treated", "time", covariates=["x1"])
         ck = r.coefficients
         assert "x1" in ck and np.isfinite(ck["const"])
         assert any(k.startswith("period_") for k in ck)
@@ -3693,8 +3706,12 @@ def test_mpd_fixed_effect_dummy_collides_with_period_keys(self):
         data["period"] = data["time"]  # FE column 'period' -> 'period_1'... dummies
         with pytest.raises(ValueError, match="collide"):
             MultiPeriodDiD().fit(
-                data, "outcome", "treated", "time",
-                covariates=["x1"], fixed_effects=["period"],
+                data,
+                "outcome",
+                "treated",
+                "time",
+                covariates=["x1"],
+                fixed_effects=["period"],
             )
 
     def test_synthetic_did_unaffected_by_guard(self):
@@ -3707,17 +3724,275 @@ def test_synthetic_did_unaffected_by_guard(self):
             ueff = rng.normal(0, 3)
             for period in range(8):
                 y = (
-                    10.0 + ueff + 0.5 * period
+                    10.0
+                    + ueff
+                    + 0.5 * period
                     + (5.0 if (treated and period >= 4) else 0.0)
                     + rng.normal(0, 0.5)
                 )
-                rows.append(
-                    {"unit": unit, "period": period, "treated": treated, "outcome": y}
-                )
+                rows.append({"unit": unit, "period": period, "treated": treated, "outcome": y})
         d = pd.DataFrame(rows)
         d["x1"] = rng.normal(size=len(d))
         r = SyntheticDiD().fit(
-            d, "outcome", "treated", unit="unit", time="period",
-            post_periods=[4, 5, 6, 7], covariates=["x1"],
+            d,
+            "outcome",
+            "treated",
+            unit="unit",
+            time="period",
+            post_periods=[4, 5, 6, 7],
+            covariates=["x1"],
         )
         assert np.isfinite(r.att)
+
+
+class TestAbsorbedRegressorSnap:
+    """FE-spanned regressors snap to zero -> deterministic NaN + warning
+    (REGISTRY 'Absorbed Fixed Effects'; the pre-snap junk columns perturbed
+    identified coefficients at ~1e-5, tolerance- and implementation-dependent).
+    """
+
+    @staticmethod
+    def _geo_frame(seed=0, n_states=20, n_months=12, n_rows=4_000):
+        rng = np.random.default_rng(seed)
+        state = rng.integers(0, n_states, n_rows)
+        month = rng.integers(0, n_months, n_rows)
+        treated = (state < n_states // 2).astype(int)
+        post = (month >= n_months // 2).astype(int)
+        y = (
+            rng.normal(0, 1, n_states)[state]
+            + rng.normal(0, 0.5, n_months)[month]
+            + 0.3 * treated * post
+            + rng.normal(size=n_rows)
+        )
+        return pd.DataFrame(
+            {"y": y, "treated": treated, "post": post, "state": state, "month": month}
+        )
+
+    def test_did_absorb_spanned_main_effects_nan_with_cause_warning(self):
+        df = self._geo_frame()
+        est = DifferenceInDifferences()
+        with pytest.warns(UserWarning, match="collinear with the absorbed"):
+            res = est.fit(
+                df,
+                outcome="y",
+                treatment="treated",
+                time="post",
+                absorb=["state", "month"],
+            )
+        assert np.isnan(res.coefficients["treated"])
+        assert np.isnan(res.coefficients["post"])
+        assert np.isfinite(res.att) and np.isfinite(res.se)
+
+    def test_did_absorb_att_matches_clean_fwl_ground_truth(self):
+        """With the junk columns snapped, ATT must equal the regression on the
+        demeaned interaction ALONE (the identified FWL estimand) at 1e-10."""
+        from diff_diff.utils import demean_by_groups
+
+        df = self._geo_frame(seed=1)
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            res = DifferenceInDifferences().fit(
+                df,
+                outcome="y",
+                treatment="treated",
+                time="post",
+                absorb=["state", "month"],
+            )
+        d = df.copy()
+        d["_tt"] = d["treated"].values.astype(float) * d["post"].values.astype(float)
+        out, _ = demean_by_groups(d, ["y", "_tt"], ["state", "month"], suffix="_dm", tol=1e-12)
+        tt = out["_tt_dm"].values
+        gt = float(np.dot(tt, out["y_dm"].values) / np.dot(tt, tt))
+        assert res.att == pytest.approx(gt, abs=1e-10)
+
+    def test_multiperiod_absorb_spanned_period_dummies_snap(self):
+        """MPD with an absorbed time dimension: the period dummies are spanned
+        (NaN), the interaction effects stay identified."""
+        df = self._geo_frame(seed=2)
+        est = MultiPeriodDiD()
+        with pytest.warns(UserWarning, match="collinear with the absorbed"):
+            res = est.fit(
+                df,
+                outcome="y",
+                treatment="treated",
+                time="month",
+                absorb=["state", "month"],
+            )
+        assert any(np.isfinite(e.effect) for e in res.period_effects.values())
+
+    def test_twfe_unit_constant_covariate_nan_att_unaffected(self):
+        rng = np.random.default_rng(3)
+        n_units, n_periods = 80, 6
+        unit = np.repeat(np.arange(n_units), n_periods)
+        time = np.tile(np.arange(n_periods), n_units)
+        treated = (unit < n_units // 2).astype(int)
+        post = (time >= n_periods // 2).astype(int)
+        y = 0.4 * treated * post + rng.normal(size=unit.size)
+        df = pd.DataFrame({"y": y, "treated": treated, "post": post, "unit": unit, "time": time})
+        df["xc"] = np.repeat(rng.normal(size=n_units), n_periods)  # unit-constant
+
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            base = TwoWayFixedEffects().fit(
+                df, outcome="y", treatment="treated", time="post", unit="unit"
+            )
+        with pytest.warns(UserWarning, match="collinear with the absorbed"):
+            res = TwoWayFixedEffects().fit(
+                df,
+                outcome="y",
+                treatment="treated",
+                time="post",
+                unit="unit",
+                covariates=["xc"],
+            )
+        # TWFE results expose only the ATT coefficient; the snapped covariate
+        # column is dropped by rank handling, so the fit must reduce EXACTLY
+        # to the no-covariate design.
+        assert res.att == pytest.approx(base.att, abs=1e-10)
+        assert res.se == pytest.approx(base.se, rel=1e-8)
+
+
+class TestReplicateRefitSnap:
+    """Replicate refits must fail closed when a regressor becomes FE-spanned
+    within a replicate's effective half-sample (local-review follow-up)."""
+
+    def test_replicate_local_spanning_yields_finite_inference(self):
+        from diff_diff.survey import SurveyDesign
+
+        rng = np.random.default_rng(30)
+        n_states, n_months, n_rows = 8, 10, 4_000
+        state = rng.integers(0, n_states, n_rows)
+        month = rng.integers(0, n_months, n_rows)
+        treated = (state < n_states // 2).astype(int)
+        post = (month >= n_months // 2).astype(int)
+        # covariate varies ONLY within state 0's rows; on any replicate that
+        # zeroes state 0 it becomes state-spanned in the effective sample
+        xc = np.where(state == 0, rng.normal(size=n_rows), 1.0)
+        y = (
+            rng.normal(0, 1, n_states)[state]
+            + rng.normal(0, 0.3, n_months)[month]
+            + 0.3 * treated * post
+            + 0.1 * xc
+            + rng.normal(size=n_rows)
+        )
+        df = pd.DataFrame(
+            {
+                "y": y,
+                "treated": treated,
+                "post": post,
+                "xc": xc,
+                "state": state,
+                "month": month,
+                "w": np.ones(n_rows),
+            }
+        )
+        # BRR-style replicates; replicate 1 zeroes states {0, 1} so xc is
+        # constant (spanned by state FE) on its effective sample
+        halves = [
+            np.array([1, 0, 1, 0, 1, 0, 1, 0]),
+            np.array([0, 1, 0, 1, 1, 0, 0, 1]),
+            np.array([1, 1, 0, 0, 0, 1, 1, 0]),
+            np.array([0, 0, 1, 1, 0, 1, 0, 1]),
+        ]
+        for i, h in enumerate(halves, 1):
+            df[f"rw{i}"] = 2.0 * h[state]
+        design = SurveyDesign(
+            weights="w",
+            replicate_weights=[f"rw{i}" for i in range(1, 5)],
+            replicate_method="BRR",
+        )
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            res = DifferenceInDifferences().fit(
+                df,
+                outcome="y",
+                treatment="treated",
+                time="post",
+                absorb=["state", "month"],
+                covariates=["xc"],
+                survey_design=design,
+            )
+        # fail-closed contract: inference is finite (invalid replicates are
+        # dropped by compute_replicate_refit_variance), never junk-inflated
+        assert np.isfinite(res.att)
+        assert np.isfinite(res.se) and res.se > 0
+
+
+class TestJointSpanCovariateSnap:
+    """Local-review P0 end-to-end: a covariate exactly in the joint span of
+    the absorbed dimensions must be dropped (NaN + warning) and leave the
+    identified ATT untouched, even when MAP truncation masks it from the
+    fast-path norm test (unbalanced, correlated FE incidence)."""
+
+    @staticmethod
+    def _frame(seed=7):
+        rng = np.random.default_rng(seed)
+        n_units, n_periods, span = 150, 30, 9
+        unit = np.repeat(np.arange(n_units), n_periods)
+        time = np.tile(np.arange(n_periods), n_units)
+        entry = rng.integers(0, n_periods - span, n_units)
+        keep = (time >= entry[unit]) & (time < entry[unit] + span)
+        unit, time = unit[keep], time[keep]
+        treated = (unit < n_units // 2).astype(int)
+        post = (time >= n_periods // 2).astype(int)
+        y = 0.4 * treated * post + rng.normal(size=unit.size)
+        a = rng.normal(0, 1, n_units)
+        b = rng.normal(0, 1, n_periods)
+        return pd.DataFrame(
+            {
+                "y": y,
+                "treated": treated,
+                "post": post,
+                "unit": unit,
+                "time": time,
+                "xspan": a[unit] + b[time],
+            }
+        )
+
+    def test_did_absorb_joint_span_covariate_dropped_att_stable(self):
+        df = self._frame()
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            base = DifferenceInDifferences().fit(
+                df,
+                outcome="y",
+                treatment="treated",
+                time="post",
+                absorb=["unit", "time"],
+            )
+        with pytest.warns(UserWarning, match=r"xspan.*collinear with the absorbed"):
+            res = DifferenceInDifferences().fit(
+                df,
+                outcome="y",
+                treatment="treated",
+                time="post",
+                absorb=["unit", "time"],
+                covariates=["xspan"],
+            )
+        assert np.isnan(res.coefficients["xspan"])
+        assert res.att == pytest.approx(base.att, abs=1e-10)
+
+    def test_sun_abraham_joint_span_covariate_dropped_att_stable(self):
+        from diff_diff import SunAbraham
+
+        df = self._frame(seed=8)
+        rng = np.random.default_rng(9)
+        first = np.where(np.arange(df["unit"].nunique()) % 3 == 0, 0, 15)[
+            df["unit"].values % df["unit"].nunique()
+        ]
+        df = df.assign(first_treat=first)
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            base = SunAbraham().fit(
+                df, outcome="y", unit="unit", time="time", first_treat="first_treat"
+            )
+        with pytest.warns(UserWarning, match="collinear with the absorbed"):
+            res = SunAbraham().fit(
+                df,
+                outcome="y",
+                unit="unit",
+                time="time",
+                first_treat="first_treat",
+                covariates=["xspan"],
+            )
+        assert res.att == pytest.approx(base.att, abs=1e-8)
diff --git a/tests/test_sun_abraham.py b/tests/test_sun_abraham.py
index 21c21d573..28fc55a16 100644
--- a/tests/test_sun_abraham.py
+++ b/tests/test_sun_abraham.py
@@ -1968,3 +1968,47 @@ def test_bootstrap_finite_and_point_estimates_invariant(self, ci_params):
                 f"{vt}={results[vt].overall_att}. Bootstrap refits in the new "
                 "full-dummy branch are not FWL-invariant."
             )
+
+
+class TestAbsorbedCovariateSnap:
+    """A unit-constant covariate is spanned by the unit FE: it must snap to
+    zero (deterministic drop + cause warning) instead of leaving a junk
+    column that perturbs the IW estimates (REGISTRY 'Absorbed FE')."""
+
+    @staticmethod
+    def _panel(seed=0, with_cov=False):
+        rng = np.random.default_rng(seed)
+        n_units, n_periods = 90, 8
+        unit = np.repeat(np.arange(n_units), n_periods)
+        time = np.tile(np.arange(n_periods), n_units)
+        first = np.repeat(rng.choice([0, 4, 5], n_units), n_periods)
+        d = (first > 0) & (time >= first)
+        y = 0.3 * d + rng.normal(size=unit.size)
+        df = pd.DataFrame({"y": y, "unit": unit, "time": time, "first_treat": first})
+        if with_cov:
+            df["xc"] = np.repeat(rng.normal(size=n_units), n_periods)
+        return df
+
+    def test_unit_constant_covariate_snaps_att_unaffected(self):
+        base_df = self._panel(seed=42)
+        cov_df = self._panel(seed=42, with_cov=True)
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            base = SunAbraham().fit(
+                base_df,
+                outcome="y",
+                unit="unit",
+                time="time",
+                first_treat="first_treat",
+            )
+        with pytest.warns(UserWarning, match="collinear with the absorbed"):
+            res = SunAbraham().fit(
+                cov_df,
+                outcome="y",
+                unit="unit",
+                time="time",
+                first_treat="first_treat",
+                covariates=["xc"],
+            )
+        # snapped covariate -> dropped -> identical to the no-covariate design
+        assert res.att == pytest.approx(base.att, abs=1e-10)
diff --git a/tests/test_utils.py b/tests/test_utils.py
index 3edfa69df..ae2144566 100644
--- a/tests/test_utils.py
+++ b/tests/test_utils.py
@@ -11,6 +11,8 @@
 - Placebo effects for Synthetic DiD
 """
 
+import warnings
+
 import numpy as np
 import pandas as pd
 import pytest
@@ -47,11 +49,13 @@ def simple_regression_data():
     """Create simple regression data for testing robust SE."""
     np.random.seed(42)
     n = 100
-    X = np.column_stack([
-        np.ones(n),
-        np.random.randn(n),
-        np.random.randn(n),
-    ])
+    X = np.column_stack(
+        [
+            np.ones(n),
+            np.random.randn(n),
+            np.random.randn(n),
+        ]
+    )
     beta_true = np.array([1.0, 2.0, -1.0])
     # Heteroskedastic errors
     errors = np.random.randn(n) * (1 + np.abs(X[:, 1]))
@@ -70,10 +74,12 @@ def clustered_regression_data():
     cluster_ids = np.repeat(np.arange(n_clusters), obs_per_cluster)
     cluster_effects = np.random.randn(n_clusters)
 
-    X = np.column_stack([
-        np.ones(n),
-        np.random.randn(n),
-    ])
+    X = np.column_stack(
+        [
+            np.ones(n),
+            np.random.randn(n),
+        ]
+    )
 
     beta_true = np.array([5.0, 2.0])
     # Cluster-correlated errors
@@ -106,12 +112,14 @@ def parallel_trends_data():
 
             y += np.random.normal(0, 0.5)
 
-            data.append({
-                "unit": unit,
-                "period": period,
-                "treated": int(is_treated),
-                "outcome": y,
-            })
+            data.append(
+                {
+                    "unit": unit,
+                    "period": period,
+                    "treated": int(is_treated),
+                    "outcome": y,
+                }
+            )
 
     return pd.DataFrame(data)
 
@@ -143,12 +151,14 @@ def non_parallel_trends_data():
 
             y += np.random.normal(0, 0.5)
 
-            data.append({
-                "unit": unit,
-                "period": period,
-                "treated": int(is_treated),
-                "outcome": y,
-            })
+            data.append(
+                {
+                    "unit": unit,
+                    "period": period,
+                    "treated": int(is_treated),
+                    "outcome": y,
+                }
+            )
 
     return pd.DataFrame(data)
 
@@ -169,12 +179,14 @@ def sdid_panel_data():
         unit_effect = np.random.normal(0, 2)
         for period in range(n_pre + n_post):
             y = 10.0 + unit_effect + period * 0.5 + np.random.normal(0, 0.3)
-            data.append({
-                "unit": unit,
-                "period": period,
-                "treated": 0,
-                "outcome": y,
-            })
+            data.append(
+                {
+                    "unit": unit,
+                    "period": period,
+                    "treated": 0,
+                    "outcome": y,
+                }
+            )
 
     # Treated units
     for unit in range(n_control, n_control + n_treated):
@@ -184,12 +196,14 @@ def sdid_panel_data():
             if period >= n_pre:
                 y += 3.0  # Treatment effect
             y += np.random.normal(0, 0.3)
-            data.append({
-                "unit": unit,
-                "period": period,
-                "treated": 1,
-                "outcome": y,
-            })
+            data.append(
+                {
+                    "unit": unit,
+                    "period": period,
+                    "treated": 1,
+                    "outcome": y,
+                }
+            )
 
     return pd.DataFrame(data)
 
@@ -640,7 +654,7 @@ def test_returns_expected_keys(self, parallel_trends_data):
             outcome="outcome",
             time="period",
             treatment_group="treated",
-            pre_periods=[0, 1, 2]
+            pre_periods=[0, 1, 2],
         )
 
         expected_keys = [
@@ -665,7 +679,7 @@ def test_parallel_trends_detected(self, parallel_trends_data):
             outcome="outcome",
             time="period",
             treatment_group="treated",
-            pre_periods=[0, 1, 2]
+            pre_periods=[0, 1, 2],
         )
 
         # Should not reject parallel trends
@@ -679,7 +693,7 @@ def test_non_parallel_trends_detected(self, non_parallel_trends_data):
             outcome="outcome",
             time="period",
             treatment_group="treated",
-            pre_periods=[0, 1, 2]
+            pre_periods=[0, 1, 2],
         )
 
         # Should reject parallel trends (different slopes)
@@ -693,7 +707,7 @@ def test_trend_difference_sign(self, non_parallel_trends_data):
             outcome="outcome",
             time="period",
             treatment_group="treated",
-            pre_periods=[0, 1, 2]
+            pre_periods=[0, 1, 2],
         )
 
         # Treated has steeper trend (3.0 vs 1.0), so difference should be positive
@@ -705,7 +719,7 @@ def test_auto_infer_pre_periods(self, parallel_trends_data):
             parallel_trends_data,
             outcome="outcome",
             time="period",
-            treatment_group="treated"
+            treatment_group="treated",
             # pre_periods not specified
         )
 
@@ -715,18 +729,16 @@ def test_auto_infer_pre_periods(self, parallel_trends_data):
 
     def test_single_period_returns_nan(self):
         """Test that single pre-period returns NaN for trends."""
-        data = pd.DataFrame({
-            "outcome": [10, 11, 12, 13],
-            "period": [0, 0, 0, 0],  # All same period
-            "treated": [1, 1, 0, 0],
-        })
+        data = pd.DataFrame(
+            {
+                "outcome": [10, 11, 12, 13],
+                "period": [0, 0, 0, 0],  # All same period
+                "treated": [1, 1, 0, 0],
+            }
+        )
 
         results = check_parallel_trends(
-            data,
-            outcome="outcome",
-            time="period",
-            treatment_group="treated",
-            pre_periods=[0]
+            data, outcome="outcome", time="period", treatment_group="treated", pre_periods=[0]
         )
 
         # Cannot compute trend with single period
@@ -740,7 +752,7 @@ def test_standard_errors_positive(self, parallel_trends_data):
             outcome="outcome",
             time="period",
             treatment_group="treated",
-            pre_periods=[0, 1, 2]
+            pre_periods=[0, 1, 2],
         )
 
         assert results["treated_trend_se"] > 0
@@ -761,11 +773,7 @@ def test_with_unit_specified(self, parallel_trends_data):
         pre_data = parallel_trends_data[parallel_trends_data["period"] < 3]
 
         treated_changes, control_changes = _compute_outcome_changes(
-            pre_data,
-            outcome="outcome",
-            time="period",
-            treatment_group="treated",
-            unit="unit"
+            pre_data, outcome="outcome", time="period", treatment_group="treated", unit="unit"
         )
 
         # Should have changes for each unit-period transition
@@ -777,11 +785,7 @@ def test_without_unit_specified(self, parallel_trends_data):
         pre_data = parallel_trends_data[parallel_trends_data["period"] < 3]
 
         treated_changes, control_changes = _compute_outcome_changes(
-            pre_data,
-            outcome="outcome",
-            time="period",
-            treatment_group="treated",
-            unit=None
+            pre_data, outcome="outcome", time="period", treatment_group="treated", unit=None
         )
 
         # Should have aggregate changes (fewer than with unit)
@@ -793,11 +797,7 @@ def test_returns_float_arrays(self, parallel_trends_data):
         pre_data = parallel_trends_data[parallel_trends_data["period"] < 3]
 
         treated_changes, control_changes = _compute_outcome_changes(
-            pre_data,
-            outcome="outcome",
-            time="period",
-            treatment_group="treated",
-            unit="unit"
+            pre_data, outcome="outcome", time="period", treatment_group="treated", unit="unit"
         )
 
         assert treated_changes.dtype == np.float64
@@ -811,21 +811,19 @@ def test_changes_reflect_trend(self):
             is_treated = unit < 5
             for period in range(3):
                 y = 10.0 + period * 2.0  # Trend of 2.0 per period
-                data.append({
-                    "unit": unit,
-                    "period": period,
-                    "treated": int(is_treated),
-                    "outcome": y,
-                })
+                data.append(
+                    {
+                        "unit": unit,
+                        "period": period,
+                        "treated": int(is_treated),
+                        "outcome": y,
+                    }
+                )
 
         df = pd.DataFrame(data)
 
         treated_changes, control_changes = _compute_outcome_changes(
-            df,
-            outcome="outcome",
-            time="period",
-            treatment_group="treated",
-            unit="unit"
+            df, outcome="outcome", time="period", treatment_group="treated", unit="unit"
         )
 
         # All changes should be approximately 2.0
@@ -841,17 +839,24 @@ def test_silent_on_balanced_panel(self):
         for unit in range(10):
             treated = int(unit >= 5)
             for t in range(1, 5):
-                rows.append({
-                    "unit": unit, "period": t,
-                    "treated": treated, "outcome": rng.normal(),
-                })
+                rows.append(
+                    {
+                        "unit": unit,
+                        "period": t,
+                        "treated": treated,
+                        "outcome": rng.normal(),
+                    }
+                )
         df = pd.DataFrame(rows)
 
         with warnings.catch_warnings(record=True) as w:
             warnings.simplefilter("always")
             _compute_outcome_changes(
-                df, outcome="outcome", time="period",
-                treatment_group="treated", unit="unit",
+                df,
+                outcome="outcome",
+                time="period",
+                treatment_group="treated",
+                unit="unit",
             )
 
         # Generic filter on "dropped" catches both the old and new label so a
@@ -867,10 +872,14 @@ def test_warns_on_nan_outcomes_with_excess_drop_count(self):
         for unit in range(10):
             treated = int(unit >= 5)
             for t in range(1, 5):
-                rows.append({
-                    "unit": unit, "period": t,
-                    "treated": treated, "outcome": rng.normal(),
-                })
+                rows.append(
+                    {
+                        "unit": unit,
+                        "period": t,
+                        "treated": treated,
+                        "outcome": rng.normal(),
+                    }
+                )
         df = pd.DataFrame(rows)
         df.loc[[5, 12, 22], "outcome"] = np.nan
 
@@ -879,8 +888,11 @@ def test_warns_on_nan_outcomes_with_excess_drop_count(self):
             match=r"parallel-trend diagnostic: dropped \d+ row\(s\).*additional NaN first-differences",
         ):
             _compute_outcome_changes(
-                df, outcome="outcome", time="period",
-                treatment_group="treated", unit="unit",
+                df,
+                outcome="outcome",
+                time="period",
+                treatment_group="treated",
+                unit="unit",
             )
 
     def test_warning_label_reflects_public_caller(self):
@@ -892,24 +904,35 @@ def test_warning_label_reflects_public_caller(self):
         for unit in range(10):
             treated = int(unit >= 5)
             for t in range(1, 5):
-                rows.append({
-                    "unit": unit, "period": t,
-                    "treated": treated, "outcome": rng.normal(),
-                })
+                rows.append(
+                    {
+                        "unit": unit,
+                        "period": t,
+                        "treated": treated,
+                        "outcome": rng.normal(),
+                    }
+                )
         df = pd.DataFrame(rows)
         df.loc[[5, 12, 22], "outcome"] = np.nan
 
         with pytest.warns(UserWarning, match="check_parallel_trends_robust:"):
             check_parallel_trends_robust(
-                df, outcome="outcome", time="period",
-                treatment_group="treated", unit="unit",
-                n_permutations=100, seed=0,
+                df,
+                outcome="outcome",
+                time="period",
+                treatment_group="treated",
+                unit="unit",
+                n_permutations=100,
+                seed=0,
             )
 
         with pytest.warns(UserWarning, match="equivalence_test_trends:"):
             equivalence_test_trends(
-                df, outcome="outcome", time="period",
-                treatment_group="treated", unit="unit",
+                df,
+                outcome="outcome",
+                time="period",
+                treatment_group="treated",
+                unit="unit",
             )
 
 
@@ -930,7 +953,7 @@ def test_reproducibility_with_seed(self, parallel_trends_data):
             treatment_group="treated",
             unit="unit",
             pre_periods=[0, 1, 2],
-            seed=42
+            seed=42,
         )
 
         results2 = check_parallel_trends_robust(
@@ -940,7 +963,7 @@ def test_reproducibility_with_seed(self, parallel_trends_data):
             treatment_group="treated",
             unit="unit",
             pre_periods=[0, 1, 2],
-            seed=42
+            seed=42,
         )
 
         assert results1["wasserstein_p_value"] == results2["wasserstein_p_value"]
@@ -954,7 +977,7 @@ def test_different_seeds_different_results(self, parallel_trends_data):
             treatment_group="treated",
             unit="unit",
             pre_periods=[0, 1, 2],
-            seed=42
+            seed=42,
         )
 
         results2 = check_parallel_trends_robust(
@@ -964,7 +987,7 @@ def test_different_seeds_different_results(self, parallel_trends_data):
             treatment_group="treated",
             unit="unit",
             pre_periods=[0, 1, 2],
-            seed=123
+            seed=123,
         )
 
         # May be equal by chance but typically different
@@ -982,7 +1005,7 @@ def test_n_permutations_affects_precision(self, parallel_trends_data):
             unit="unit",
             pre_periods=[0, 1, 2],
             n_permutations=100,
-            seed=42
+            seed=42,
         )
 
         results_many = check_parallel_trends_robust(
@@ -993,7 +1016,7 @@ def test_n_permutations_affects_precision(self, parallel_trends_data):
             unit="unit",
             pre_periods=[0, 1, 2],
             n_permutations=1000,
-            seed=42
+            seed=42,
         )
 
         # Both should be valid
@@ -1009,7 +1032,7 @@ def test_wasserstein_normalized_returned(self, parallel_trends_data):
             treatment_group="treated",
             unit="unit",
             pre_periods=[0, 1, 2],
-            seed=42
+            seed=42,
         )
 
         assert "wasserstein_normalized" in results
@@ -1024,7 +1047,7 @@ def test_sample_sizes_returned(self, parallel_trends_data):
             treatment_group="treated",
             unit="unit",
             pre_periods=[0, 1, 2],
-            seed=42
+            seed=42,
         )
 
         assert "n_treated" in results
@@ -1035,12 +1058,14 @@ def test_sample_sizes_returned(self, parallel_trends_data):
     def test_insufficient_data_returns_nan(self):
         """Test that insufficient data returns NaN values."""
         # Only one observation per group
-        data = pd.DataFrame({
-            "unit": [0, 1],
-            "period": [0, 0],
-            "treated": [1, 0],
-            "outcome": [10.0, 12.0],
-        })
+        data = pd.DataFrame(
+            {
+                "unit": [0, 1],
+                "period": [0, 0],
+                "treated": [1, 0],
+                "outcome": [10.0, 12.0],
+            }
+        )
 
         results = check_parallel_trends_robust(
             data,
@@ -1049,7 +1074,7 @@ def test_insufficient_data_returns_nan(self):
             treatment_group="treated",
             unit="unit",
             pre_periods=[0],
-            seed=42
+            seed=42,
         )
 
         assert np.isnan(results["wasserstein_distance"])
@@ -1072,7 +1097,7 @@ def test_tost_p_value_in_range(self, parallel_trends_data):
             time="period",
             treatment_group="treated",
             unit="unit",
-            pre_periods=[0, 1, 2]
+            pre_periods=[0, 1, 2],
         )
 
         assert 0 <= results["tost_p_value"] <= 1
@@ -1085,7 +1110,7 @@ def test_equivalence_margin_auto_set(self, parallel_trends_data):
             time="period",
             treatment_group="treated",
             unit="unit",
-            pre_periods=[0, 1, 2]
+            pre_periods=[0, 1, 2],
         )
 
         assert results["equivalence_margin"] > 0
@@ -1098,7 +1123,7 @@ def test_degrees_of_freedom_returned(self, parallel_trends_data):
             time="period",
             treatment_group="treated",
             unit="unit",
-            pre_periods=[0, 1, 2]
+            pre_periods=[0, 1, 2],
         )
 
         assert "degrees_of_freedom" in results
@@ -1113,7 +1138,7 @@ def test_tighter_margin_harder_to_pass(self, parallel_trends_data):
             treatment_group="treated",
             unit="unit",
             pre_periods=[0, 1, 2],
-            equivalence_margin=10.0  # Very wide margin
+            equivalence_margin=10.0,  # Very wide margin
         )
 
         results_tight = equivalence_test_trends(
@@ -1123,7 +1148,7 @@ def test_tighter_margin_harder_to_pass(self, parallel_trends_data):
             treatment_group="treated",
             unit="unit",
             pre_periods=[0, 1, 2],
-            equivalence_margin=0.001  # Very tight margin
+            equivalence_margin=0.001,  # Very tight margin
         )
 
         # Wide margin should have smaller TOST p-value (easier to show equivalence)
@@ -1258,9 +1283,7 @@ def test_uniform_weights_equals_did(self):
         time_weights = np.array([1.0])
 
         tau = compute_sdid_estimator(
-            Y_pre_control, Y_post_control,
-            Y_pre_treated, Y_post_treated,
-            unit_weights, time_weights
+            Y_pre_control, Y_post_control, Y_pre_treated, Y_post_treated, unit_weights, time_weights
         )
 
         # Standard DiD: (16-10) - (12-10) = 6 - 2 = 4
@@ -1278,9 +1301,7 @@ def test_concentrated_unit_weights(self):
         time_weights = np.array([1.0])
 
         tau = compute_sdid_estimator(
-            Y_pre_control, Y_post_control,
-            Y_pre_treated, Y_post_treated,
-            unit_weights, time_weights
+            Y_pre_control, Y_post_control, Y_pre_treated, Y_post_treated, unit_weights, time_weights
         )
 
         # DiD using only first control: (20-15) - (12-10) = 5 - 2 = 3
@@ -1297,9 +1318,7 @@ def test_multiple_post_periods(self):
         time_weights = np.array([1.0])
 
         tau = compute_sdid_estimator(
-            Y_pre_control, Y_post_control,
-            Y_pre_treated, Y_post_treated,
-            unit_weights, time_weights
+            Y_pre_control, Y_post_control, Y_pre_treated, Y_post_treated, unit_weights, time_weights
         )
 
         # Treated post mean: (17+19+21)/3 = 19
@@ -1399,12 +1418,7 @@ def test_matches_get_dummies(self, col):
 def _unbalanced_2way_panel(seed=0, drop=0.30):
     """Unbalanced (non-orthogonal) 2-way panel: some unit-period cells dropped."""
     rng = np.random.default_rng(seed)
-    rows = [
-        (u, t)
-        for u in range(8)
-        for t in range(6)
-        if rng.random() >= drop
-    ]
+    rows = [(u, t) for u in range(8) for t in range(6) if rng.random() >= drop]
     df = pd.DataFrame(rows, columns=["unit", "period"])
     n = len(df)
     df["x1"] = rng.normal(size=n)
@@ -1445,10 +1459,14 @@ def _fwl_slopes(demeaned, xcols, weights=None):
 def _frozen_old_within_transform_weighted(
     df, variables, unit, time, weights, max_iter=100, tol=1e-8
 ):
-    """Byte-for-byte copy of the PRE-REFACTOR within_transform weighted loop.
-
-    Used as the byte-identity guard: demean_by_groups([unit, time], weighted) and
-    the refactored within_transform must reproduce this exactly.
+    """Byte-for-byte copy of the PRE-v3.6.x pandas-groupby weighted MAP loop.
+
+    Historical reference implementation. The v3.6.x factorize-once + bincount
+    engine is NOT bit-identical to this loop (bincount accumulation is not
+    Kahan-compensated the way pandas' grouped mean is; drift compounds across
+    MAP iterations to ~1e-10 order — see REGISTRY "Absorbed Fixed Effects").
+    Kept as a drift-bound guard: the new engine must agree with this loop at
+    atol=1e-9, and with full-dummy WLS ground truth at atol=1e-9.
     """
     w = np.asarray(weights, dtype=np.float64)
     unit_groups = df[unit].values
@@ -1479,19 +1497,11 @@ def test_len1_byte_identical_to_demean_by_group(self, weighted):
         """One grouping var must delegate to demean_by_group (byte-identical)."""
         df = _unbalanced_2way_panel(seed=1)
         w = df["w"].values if weighted else None
-        out_groups, n_g = demean_by_groups(
-            df, ["y", "x1"], ["unit"], suffix="_dm", weights=w
-        )
-        out_single, n_s = demean_by_group(
-            df, ["y", "x1"], "unit", suffix="_dm", weights=w
-        )
+        out_groups, n_g = demean_by_groups(df, ["y", "x1"], ["unit"], suffix="_dm", weights=w)
+        out_single, n_s = demean_by_group(df, ["y", "x1"], "unit", suffix="_dm", weights=w)
         assert n_g == n_s
-        np.testing.assert_array_equal(
-            out_groups["y_dm"].values, out_single["y_dm"].values
-        )
-        np.testing.assert_array_equal(
-            out_groups["x1_dm"].values, out_single["x1_dm"].values
-        )
+        np.testing.assert_array_equal(out_groups["y_dm"].values, out_single["y_dm"].values)
+        np.testing.assert_array_equal(out_groups["x1_dm"].values, out_single["x1_dm"].values)
 
     def test_n_effects_is_sum_nunique_minus_one(self):
         df = _unbalanced_2way_panel(seed=2)
@@ -1515,19 +1525,12 @@ def test_unbalanced_2way_matches_full_dummy_ols(self, weighted):
     def test_n3_absorb_matches_full_dummy_ols(self):
         """Generalizes to 3 absorbed dimensions."""
         rng = np.random.default_rng(4)
-        rows = [
-            (u, t, (u + t) % 4)
-            for u in range(10)
-            for t in range(6)
-            if rng.random() >= 0.4
-        ]
+        rows = [(u, t, (u + t) % 4) for u in range(10) for t in range(6) if rng.random() >= 0.4]
         df = pd.DataFrame(rows, columns=["unit", "period", "firm"])
         n = len(df)
         df["x1"] = rng.normal(size=n)
         df["y"] = 2.0 * df["x1"] + rng.normal(size=n)
-        out, _ = demean_by_groups(
-            df, ["y", "x1"], ["unit", "period", "firm"], suffix="_dm"
-        )
+        out, _ = demean_by_groups(df, ["y", "x1"], ["unit", "period", "firm"], suffix="_dm")
         demeaned = {c: out[f"{c}_dm"].values for c in ("y", "x1")}
         map_slope = _fwl_slopes(demeaned, ["x1"])[0]
         gt = _full_dummy_slopes(df, ["unit", "period", "firm"], ["x1"])[0]
@@ -1538,9 +1541,7 @@ def test_result_orthogonal_to_fe_spans(self, weighted):
         """Demeaned variables must have ~0 (weighted) group means in every FE dim."""
         df = _unbalanced_2way_panel(seed=5)
         w = df["w"].values if weighted else None
-        out, _ = demean_by_groups(
-            df, ["y"], ["unit", "period"], suffix="_dm", weights=w, tol=1e-12
-        )
+        out, _ = demean_by_groups(df, ["y"], ["unit", "period"], suffix="_dm", weights=w, tol=1e-12)
         ydm = out["y_dm"].values
         for g in ("unit", "period"):
             if weighted:
@@ -1551,29 +1552,38 @@ def test_result_orthogonal_to_fe_spans(self, weighted):
                 means = pd.Series(ydm).groupby(df[g].values).transform("mean").values
             assert np.max(np.abs(means)) < 1e-9
 
-    def test_weighted_byte_identity_vs_frozen_within_transform(self):
-        """demean_by_groups([unit, time], weighted) reproduces the old loop exactly."""
+    def test_weighted_close_to_frozen_pandas_loop(self):
+        """demean_by_groups([unit, time], weighted) agrees with the legacy
+        pandas-groupby loop at the documented ~1e-9 drift bound (bincount
+        accumulation vs Kahan-compensated pandas mean; not bit-identical)."""
         df = _unbalanced_2way_panel(seed=6)
         w = df["w"].values
-        frozen = _frozen_old_within_transform_weighted(
-            df, ["y", "x1", "x2"], "unit", "period", w
-        )
+        frozen = _frozen_old_within_transform_weighted(df, ["y", "x1", "x2"], "unit", "period", w)
         out, _ = demean_by_groups(
             df, ["y", "x1", "x2"], ["unit", "period"], suffix="_dm", weights=w, tol=1e-8
         )
         for var in ("y", "x1", "x2"):
-            np.testing.assert_array_equal(out[f"{var}_dm"].values, frozen[var])
+            np.testing.assert_allclose(out[f"{var}_dm"].values, frozen[var], atol=1e-9, rtol=0)
 
-    def test_within_transform_weighted_unchanged(self):
-        """The refactored within_transform weighted path is byte-identical to before."""
+    def test_within_transform_weighted_close_to_frozen_loop(self):
+        """within_transform weighted path agrees with the legacy loop at 1e-9."""
         df = _unbalanced_2way_panel(seed=7)
         w = df["w"].values
-        frozen = _frozen_old_within_transform_weighted(
-            df, ["y", "x1"], "unit", "period", w
-        )
+        frozen = _frozen_old_within_transform_weighted(df, ["y", "x1"], "unit", "period", w)
         out = within_transform(df, ["y", "x1"], "unit", "period", weights=w)
-        np.testing.assert_array_equal(out["y_demeaned"].values, frozen["y"])
-        np.testing.assert_array_equal(out["x1_demeaned"].values, frozen["x1"])
+        np.testing.assert_allclose(out["y_demeaned"].values, frozen["y"], atol=1e-9, rtol=0)
+        np.testing.assert_allclose(out["x1_demeaned"].values, frozen["x1"], atol=1e-9, rtol=0)
+
+    def test_within_transform_weighted_matches_full_dummy_wls(self):
+        """Ground truth (not implementation identity): weighted within_transform
+        residualization must reproduce full-dummy WLS slopes."""
+        df = _unbalanced_2way_panel(seed=7)
+        w = df["w"].values
+        out = within_transform(df, ["y", "x1", "x2"], "unit", "period", weights=w)
+        demeaned = {c: out[f"{c}_demeaned"].values for c in ("y", "x1", "x2")}
+        map_slopes = _fwl_slopes(demeaned, ["x1", "x2"], weights=w)
+        gt = _full_dummy_slopes(df, ["unit", "period"], ["x1", "x2"], weights=w)
+        np.testing.assert_allclose(map_slopes, gt, atol=1e-9, rtol=0)
 
     def test_within_transform_unweighted_now_matches_full_dummy(self):
         """Unweighted within_transform now uses MAP -> exact on unbalanced panels."""
@@ -1588,11 +1598,323 @@ def test_nonconvergence_emits_warning(self):
         """A starved iteration budget on an unbalanced panel warns (not silent)."""
         df = _unbalanced_2way_panel(seed=9)
         with pytest.warns(UserWarning, match="did not converge"):
-            demean_by_groups(
-                df, ["y"], ["unit", "period"], suffix="_dm", max_iter=1, tol=1e-15
-            )
+            demean_by_groups(df, ["y"], ["unit", "period"], suffix="_dm", max_iter=1, tol=1e-15)
 
     def test_empty_group_vars_raises(self):
         df = _unbalanced_2way_panel(seed=10)
         with pytest.raises(ValueError, match="at least one grouping variable"):
             demean_by_groups(df, ["y"], [])
+
+    @pytest.mark.parametrize("weighted", [False, True])
+    def test_nan_group_key_raises(self, weighted):
+        """NaN in an absorbed group column must raise, naming the column.
+
+        pd.factorize codes NaN as -1, which would silently index the LAST
+        group's mean; the pre-v3.6.x behavior was also silently wrong
+        (unweighted NaN-poisoned rows; weighted passed them through
+        un-demeaned). REGISTRY "Absorbed Fixed Effects" edge case.
+        """
+        df = _unbalanced_2way_panel(seed=11)
+        df.loc[df.index[5], "period"] = np.nan
+        w = df["w"].values if weighted else None
+        with pytest.raises(ValueError, match="'period' contains NaN"):
+            demean_by_groups(df, ["y"], ["unit", "period"], suffix="_dm", weights=w)
+
+    def test_zero_total_weight_group_rows_inert_and_finite(self):
+        """Zero-total-weight groups stay inert (no NaN/Inf poisoning) and
+        positive-weight groups remain weighted-orthogonal."""
+        df = _unbalanced_2way_panel(seed=12)
+        w = df["w"].values.copy()
+        zero_units = df["unit"].values == df["unit"].values[0]
+        w[zero_units] = 0.0
+        out, _ = demean_by_groups(df, ["y"], ["unit", "period"], suffix="_dm", weights=w, tol=1e-12)
+        ydm = out["y_dm"].values
+        assert np.isfinite(ydm).all()
+        pos = ~zero_units
+        num = pd.Series(w * ydm).groupby(df["unit"].values).transform("sum").values
+        den = pd.Series(w).groupby(df["unit"].values).transform("sum").values
+        means = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
+        assert np.max(np.abs(means[pos])) < 1e-9
+
+    def test_defaults_max_iter_10000_both_entry_points(self):
+        """Both re-declared max_iter defaults must stay in sync at 10,000
+        (fixest fixef.iter / pyfixest fixef_maxiter parity); raising only one
+        would leave the within_transform family capped differently."""
+        import inspect
+
+        assert inspect.signature(demean_by_groups).parameters["max_iter"].default == 10_000
+        assert inspect.signature(within_transform).parameters["max_iter"].default == 10_000
+
+    def test_balanced_panel_converges_within_two_iterations(self):
+        """Balanced fully-crossed panels have orthogonal FE subspaces: MAP must
+        converge in ~2 sweeps (iteration-count regression guard)."""
+        rng = np.random.default_rng(13)
+        df = pd.DataFrame(
+            {
+                "unit": np.repeat(np.arange(30), 8),
+                "period": np.tile(np.arange(8), 30),
+            }
+        )
+        df["y"] = rng.normal(size=len(df))
+        with warnings.catch_warnings():
+            warnings.simplefilter("error")  # a convergence warning fails the test
+            demean_by_groups(df, ["y"], ["unit", "period"], suffix="_dm", max_iter=2, tol=1e-10)
+
+    def test_correlated_fe_needs_over_100_iterations_and_converges(self):
+        """The headline max_iter change: banded (contiguous-lifetime) two-way
+        incidence genuinely needs >100 MAP iterations. Under the old cap of 100
+        this warned and returned slightly-off residuals; under the 10,000
+        default it must converge silently AND match full-dummy OLS.
+
+        Iteration count depends on the angle between the FE subspaces, not on
+        data size, so this fixture stays small and fast.
+        """
+        rng = np.random.default_rng(14)
+        n_units, n_periods = 300, 60
+        span = max(2, int(n_periods * 0.1))  # 10% contiguous lifetimes
+        entry = rng.integers(0, n_periods - span + 1, n_units)
+        rows = [(u, t) for u in range(n_units) for t in range(entry[u], entry[u] + span)]
+        df = pd.DataFrame(rows, columns=["unit", "period"])
+        df["x1"] = rng.normal(size=len(df))
+        df["y"] = 1.5 * df["x1"] + rng.normal(size=len(df))
+
+        # the fixture is honest: the old cap genuinely fails on it
+        with pytest.warns(UserWarning, match="did not converge"):
+            demean_by_groups(df.copy(), ["y", "x1"], ["unit", "period"], suffix="_dm", max_iter=100)
+
+        # the new default converges silently and matches ground truth
+        with warnings.catch_warnings():
+            warnings.simplefilter("error")
+            out, _ = demean_by_groups(df, ["y", "x1"], ["unit", "period"], suffix="_dm")
+        demeaned = {c: out[f"{c}_dm"].values for c in ("y", "x1")}
+        map_slope = _fwl_slopes(demeaned, ["x1"])[0]
+        gt = _full_dummy_slopes(df, ["unit", "period"], ["x1"])[0]
+        np.testing.assert_allclose(map_slope, gt, atol=1e-8, rtol=0)
+
+
+class TestSnapAbsorbedRegressors:
+    """Unit tests for the FE-spanned-regressor snap (REGISTRY 'Absorbed FE')."""
+
+    def _frame(self, seed=0):
+        rng = np.random.default_rng(seed)
+        n = 40
+        return pd.DataFrame(
+            {
+                "junk_dm": rng.normal(0.0, 1e-14, n),  # spanned: numerical junk
+                "real_dm": rng.normal(0.0, 1.0, n),  # genuinely identified
+                "g": np.arange(n) % 4,  # FE column for group_vars
+            }
+        )
+
+    def test_snaps_junk_keeps_real(self):
+        from diff_diff.utils import snap_absorbed_regressors
+
+        df = self._frame()
+        real_before = df["real_dm"].values.copy()
+        with pytest.warns(UserWarning, match="collinear with the absorbed"):
+            snapped = snap_absorbed_regressors(
+                df,
+                ["junk", "real"],
+                {"junk": 1.0, "real": 1.0},
+                absorbed_desc="test FEs",
+                group_vars=["g"],
+                suffix="_dm",
+            )
+        assert snapped == ["junk"]
+        assert (df["junk_dm"].values == 0.0).all()
+        np.testing.assert_array_equal(df["real_dm"].values, real_before)
+
+    @pytest.mark.parametrize("action", ["silent", "error"])
+    def test_non_warn_actions_snap_without_warning(self, action):
+        from diff_diff.utils import snap_absorbed_regressors
+
+        df = self._frame()
+        with warnings.catch_warnings():
+            warnings.simplefilter("error")  # any warning fails the test
+            snapped = snap_absorbed_regressors(
+                df,
+                ["junk"],
+                {"junk": 1.0},
+                absorbed_desc="test FEs",
+                group_vars=["g"],
+                rank_deficient_action=action,
+                suffix="_dm",
+            )
+        assert snapped == ["junk"]
+        assert (df["junk_dm"].values == 0.0).all()
+
+    def test_zero_pre_norm_skipped(self):
+        from diff_diff.utils import snap_absorbed_regressors
+
+        df = self._frame()
+        with warnings.catch_warnings():
+            warnings.simplefilter("error")
+            snapped = snap_absorbed_regressors(
+                df,
+                ["junk"],
+                {"junk": 0.0},  # all-zero input column: rank handling owns it
+                absorbed_desc="test FEs",
+                group_vars=["g"],
+                suffix="_dm",
+            )
+        assert snapped == []
+
+    def test_display_names_in_warning(self):
+        from diff_diff.utils import snap_absorbed_regressors
+
+        df = self._frame()
+        with pytest.warns(UserWarning, match=r"treated:post"):
+            snap_absorbed_regressors(
+                df,
+                ["junk"],
+                {"junk": 1.0},
+                absorbed_desc="test FEs",
+                group_vars=["g"],
+                suffix="_dm",
+                display_names={"junk": "treated:post"},
+            )
+
+    def test_pre_demean_norms_captures_l2(self):
+        from diff_diff.utils import pre_demean_norms
+
+        df = pd.DataFrame({"a": [3.0, 4.0], "b": [0, 0]})
+        norms = pre_demean_norms(df, ["a", "b"])
+        assert norms["a"] == pytest.approx(5.0)
+        assert norms["b"] == 0.0
+
+
+class TestReviewFollowupsAbsorbedFE:
+    """Local-review follow-ups: one-way NaN guard + weight-aware snap."""
+
+    @pytest.mark.parametrize("weighted", [False, True])
+    def test_one_way_nan_group_key_raises(self, weighted):
+        """The NaN-group guard must cover len(group_vars) == 1 too (the
+        delegation to demean_by_group previously bypassed it)."""
+        df = _unbalanced_2way_panel(seed=20)
+        df["unit"] = df["unit"].astype(float)
+        df.loc[df.index[3], "unit"] = np.nan
+        w = df["w"].values if weighted else None
+        with pytest.raises(ValueError, match="'unit' contains NaN"):
+            demean_by_groups(df, ["y"], ["unit"], suffix="_dm", weights=w)
+
+    def test_estimator_one_way_absorb_nan_raises(self):
+        from diff_diff import DifferenceInDifferences
+
+        df = _unbalanced_2way_panel(seed=21)
+        df["treated"] = (df["unit"] < df["unit"].median()).astype(int)
+        df["post"] = (df["period"] >= df["period"].median()).astype(int)
+        df["fe"] = df["unit"].astype(float)
+        df.loc[df.index[5], "fe"] = np.nan
+        with pytest.raises(ValueError, match="'fe' contains NaN"):
+            DifferenceInDifferences().fit(
+                df, outcome="y", treatment="treated", time="post", absorb=["fe"]
+            )
+
+    def test_weight_aware_snap_zero_weight_domain(self):
+        """A regressor FE-spanned on the POSITIVE-weight sample must snap even
+        though zero-weight domain rows (left inert by the weighted demean)
+        carry non-zero demeaned values that would mask it unweighted."""
+        from diff_diff.utils import pre_demean_norms, snap_absorbed_regressors
+
+        rng = np.random.default_rng(22)
+        n_units, n_periods = 30, 6
+        unit = np.repeat(np.arange(n_units), n_periods)
+        period = np.tile(np.arange(n_periods), n_units)
+        df = pd.DataFrame({"unit": unit, "period": period})
+        # xc is unit-constant on positive-weight units (spanned by unit FE
+        # there) but VARIES on the zero-weight unit's rows
+        xc = np.repeat(rng.normal(size=n_units), n_periods).astype(float)
+        zero_rows = unit == 0
+        xc[zero_rows] = rng.normal(size=zero_rows.sum())  # varying junk
+        df["xc"] = xc
+        df["y"] = rng.normal(size=len(df))
+        w = np.ones(len(df))
+        w[zero_rows] = 0.0  # domain-excluded unit
+
+        pre = pre_demean_norms(df, ["xc"], weights=w)
+        out, _ = demean_by_groups(
+            df, ["y", "xc"], ["unit", "period"], suffix="_dm", weights=w, tol=1e-12
+        )
+        # unweighted norm is large (inert zero-weight rows keep raw values):
+        assert float(np.linalg.norm(out["xc_dm"].values)) > 1e-3
+        # ...but the weight-aware snap sees the effective sample and fires
+        with pytest.warns(UserWarning, match="collinear with the absorbed"):
+            snapped = snap_absorbed_regressors(
+                out,
+                ["xc"],
+                pre,
+                absorbed_desc="unit and period FEs",
+                group_vars=["unit", "period"],
+                suffix="_dm",
+                weights=w,
+            )
+        assert snapped == ["xc"]
+        assert (out["xc_dm"].values == 0.0).all()
+
+
+class TestJointSpanSnapConfirmation:
+    """Local-review P0: a column in the JOINT span of two FE dimensions
+    (x = a[unit] + b[time]) converges to zero slowly under MAP on unbalanced
+    panels, so its truncation residual can sit far above the fast-path snap
+    threshold. Stage 2 (exact LSMR span confirmation) must catch it; a
+    genuinely identified low-within-variation covariate must NOT be snapped.
+    """
+
+    @staticmethod
+    def _unbalanced_panel(seed=7, n_units=120, n_periods=30, span=9):
+        rng = np.random.default_rng(seed)
+        unit = np.repeat(np.arange(n_units), n_periods)
+        period = np.tile(np.arange(n_periods), n_units)
+        entry = rng.integers(0, n_periods - span, n_units)
+        keep = (period >= entry[unit]) & (period < entry[unit] + span)
+        unit, period = unit[keep], period[keep]
+        df = pd.DataFrame({"unit": unit, "period": period})
+        df["y"] = rng.normal(size=len(df))
+        a = rng.normal(0, 1, n_units)
+        b = rng.normal(0, 1, n_periods)
+        df["xspan"] = a[unit] + b[period]  # exactly in span(unit + period)
+        return df, rng
+
+    def test_joint_span_column_snapped_via_confirmation(self):
+        from diff_diff.utils import pre_demean_norms, snap_absorbed_regressors
+
+        df, _ = self._unbalanced_panel()
+        pre = pre_demean_norms(df, ["xspan"])
+        out, _ = demean_by_groups(df, ["y", "xspan"], ["unit", "period"], suffix="_dm", tol=1e-8)
+        rel = np.linalg.norm(out["xspan_dm"].values) / pre["xspan"]
+        assert rel > 1e-10  # the fixture is honest: fast path alone misses it
+        with pytest.warns(UserWarning, match="collinear with the absorbed"):
+            snapped = snap_absorbed_regressors(
+                out,
+                ["xspan"],
+                pre,
+                absorbed_desc="unit and period FEs",
+                group_vars=["unit", "period"],
+                suffix="_dm",
+            )
+        assert snapped == ["xspan"]
+        assert (out["xspan_dm"].values == 0.0).all()
+
+    def test_identified_low_variation_covariate_not_snapped(self):
+        """Counter-test: near-spanned but genuinely identified (tiny real
+        within-variation) must survive stage 2 untouched."""
+        from diff_diff.utils import pre_demean_norms, snap_absorbed_regressors
+
+        df, rng = self._unbalanced_panel(seed=8)
+        z = rng.normal(size=len(df))
+        df["xnear"] = df["xspan"] + 1e-5 * z  # real within-FE variation ~1e-5
+        pre = pre_demean_norms(df, ["xnear"])
+        out, _ = demean_by_groups(df, ["xnear"], ["unit", "period"], suffix="_dm", tol=1e-10)
+        before = out["xnear_dm"].values.copy()
+        with warnings.catch_warnings():
+            warnings.simplefilter("error")  # any snap warning fails the test
+            snapped = snap_absorbed_regressors(
+                out,
+                ["xnear"],
+                pre,
+                absorbed_desc="unit and period FEs",
+                group_vars=["unit", "period"],
+                suffix="_dm",
+            )
+        assert snapped == []
+        np.testing.assert_array_equal(out["xnear_dm"].values, before)
diff --git a/tests/test_wooldridge.py b/tests/test_wooldridge.py
index e5264a45a..ef9f0ad05 100644
--- a/tests/test_wooldridge.py
+++ b/tests/test_wooldridge.py
@@ -1093,8 +1093,12 @@ def test_parity_with_dummy_ols_not_yet_treated(self, unbalanced_data):
         # Build explicit dummy regression on same sample
         sample = _filter_sample(df, "unit", "time", "cohort", "not_yet_treated", 0)
         X_int, _, gt_keys = _build_interaction_matrix(
-            sample, cohort="cohort", time="time", anticipation=0,
-            control_group="not_yet_treated", method="ols",
+            sample,
+            cohort="cohort",
+            time="time",
+            anticipation=0,
+            control_group="not_yet_treated",
+            method="ols",
         )
         unit_dummies = pd.get_dummies(sample["unit"], drop_first=True).values.astype(float)
         time_dummies = pd.get_dummies(sample["time"], drop_first=True).values.astype(float)
@@ -1169,7 +1173,7 @@ def test_logit_never_treated_post_treatment_only(self, binary_data):
             binary_data, outcome="y", unit="unit", time="time", cohort="cohort"
         )
         # All cells should be post-treatment
-        for (g, t) in r.group_time_effects:
+        for g, t in r.group_time_effects:
             assert t >= g, f"Pre-treatment cell ({g},{t}) in nonlinear never_treated"
         # All expected post-treatment cells present
         expected = {(g, t) for g in [3, 4] for t in range(1, 6) if t >= g}
@@ -1180,7 +1184,7 @@ def test_poisson_never_treated_post_treatment_only(self, count_data):
         r = WooldridgeDiD(method="poisson", control_group="never_treated").fit(
             count_data, outcome="y", unit="unit", time="time", cohort="cohort"
         )
-        for (g, t) in r.group_time_effects:
+        for g, t in r.group_time_effects:
             assert t >= g, f"Pre-treatment cell ({g},{t}) in nonlinear never_treated"
         expected = {(g, t) for g in [3, 4] for t in range(1, 6) if t >= g}
         assert set(r.group_time_effects.keys()) == expected
@@ -1197,15 +1201,9 @@ def test_interaction_matrix_fewer_cols_for_nonlinear(self):
                 rows.append({"unit": u, "time": t, "cohort": g, "y": 0.0})
         df = pd.DataFrame(rows)
 
-        X_ols, _, _ = _build_interaction_matrix(
-            df, "cohort", "time", 0, "never_treated", "ols"
-        )
-        X_logit, _, _ = _build_interaction_matrix(
-            df, "cohort", "time", 0, "never_treated", "logit"
-        )
-        X_nyt, _, _ = _build_interaction_matrix(
-            df, "cohort", "time", 0, "not_yet_treated", "ols"
-        )
+        X_ols, _, _ = _build_interaction_matrix(df, "cohort", "time", 0, "never_treated", "ols")
+        X_logit, _, _ = _build_interaction_matrix(df, "cohort", "time", 0, "never_treated", "logit")
+        X_nyt, _, _ = _build_interaction_matrix(df, "cohort", "time", 0, "not_yet_treated", "ols")
         # OLS never_treated > nonlinear never_treated == not_yet_treated
         assert X_ols.shape[1] > X_logit.shape[1]
         assert X_logit.shape[1] == X_nyt.shape[1]
@@ -1267,24 +1265,32 @@ def test_ols_covariate_parity_with_full_basis_dummy_ols(self, cov_data):
         cell_cov = np.column_stack([X_int[:, i] * x1_demeaned for i in range(n_int)])
 
         # D_g × X (cohort × covariate)
-        cohort_cov = np.column_stack([
-            (sample["cohort"].values == g).astype(float) * x1_raw for g in groups
-        ])
+        cohort_cov = np.column_stack(
+            [(sample["cohort"].values == g).astype(float) * x1_raw for g in groups]
+        )
 
         # f_t × X (time × covariate, drop first)
         times = sorted(sample["time"].unique())
-        time_cov = np.column_stack([
-            (sample["time"].values == t).astype(float) * x1_raw for t in times[1:]
-        ])
+        time_cov = np.column_stack(
+            [(sample["time"].values == t).astype(float) * x1_raw for t in times[1:]]
+        )
 
         # Full design: intercept + cells + cell×cov + D_g×X + f_t×X + raw_X + unit + time dummies
         unit_dummies = pd.get_dummies(sample["unit"], drop_first=True).values.astype(float)
         time_dummies = pd.get_dummies(sample["time"], drop_first=True).values.astype(float)
         intercept = np.ones((len(sample), 1))
-        X_full = np.hstack([
-            intercept, X_int, cell_cov, cohort_cov, time_cov,
-            x1_raw.reshape(-1, 1), unit_dummies, time_dummies,
-        ])
+        X_full = np.hstack(
+            [
+                intercept,
+                X_int,
+                cell_cov,
+                cohort_cov,
+                time_cov,
+                x1_raw.reshape(-1, 1),
+                unit_dummies,
+                time_dummies,
+            ]
+        )
         y = sample["y"].values
         coefs_dummy, _, _ = solve_ols(X_full, y, rank_deficient_action="silent")
 
@@ -1304,12 +1310,10 @@ def test_covariates_affect_ols_att(self, cov_data):
         r_cov = WooldridgeDiD().fit(
             df, outcome="y", unit="unit", time="time", cohort="cohort", exovar=["x1"]
         )
-        r_nocov = WooldridgeDiD().fit(
-            df, outcome="y", unit="unit", time="time", cohort="cohort"
-        )
-        assert r_cov.overall_att != r_nocov.overall_att, (
-            "Covariate-adjusted ATT should differ from unadjusted"
-        )
+        r_nocov = WooldridgeDiD().fit(df, outcome="y", unit="unit", time="time", cohort="cohort")
+        assert (
+            r_cov.overall_att != r_nocov.overall_att
+        ), "Covariate-adjusted ATT should differ from unadjusted"
 
 
 class TestWooldridgeSurvey:
@@ -1333,20 +1337,33 @@ def survey_panel(self):
                 y_bin = int(rng.random() < 1 / (1 + np.exp(-eta)))
                 mu = np.exp(0.5 + 0.5 * treated + 0.1 * rng.standard_normal())
                 y_count = float(rng.poisson(mu))
-                rows.append({
-                    "unit": u, "time": t, "cohort": cohort,
-                    "y": y_cont, "y_bin": y_bin, "y_count": y_count,
-                    "stratum": stratum, "psu": psu, "weight": weight,
-                })
+                rows.append(
+                    {
+                        "unit": u,
+                        "time": t,
+                        "cohort": cohort,
+                        "y": y_cont,
+                        "y_bin": y_bin,
+                        "y_count": y_count,
+                        "stratum": stratum,
+                        "psu": psu,
+                        "weight": weight,
+                    }
+                )
         return pd.DataFrame(rows)
 
     def test_ols_survey_runs(self, survey_panel):
         """OLS with full survey design completes."""
         from diff_diff.survey import SurveyDesign
+
         sd = SurveyDesign(weights="weight", strata="stratum", psu="psu")
         r = WooldridgeDiD().fit(
-            survey_panel, outcome="y", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            survey_panel,
+            outcome="y",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         assert np.isfinite(r.overall_att)
         assert np.isfinite(r.overall_se)
@@ -1355,13 +1372,21 @@ def test_ols_survey_runs(self, survey_panel):
     def test_ols_survey_se_differs_from_naive(self, survey_panel):
         """Survey SE should differ from naive (unweighted) SE."""
         from diff_diff.survey import SurveyDesign
+
         sd = SurveyDesign(weights="weight", strata="stratum", psu="psu")
         r_survey = WooldridgeDiD().fit(
-            survey_panel, outcome="y", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            survey_panel,
+            outcome="y",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         r_naive = WooldridgeDiD().fit(
-            survey_panel, outcome="y", unit="unit", time="time",
+            survey_panel,
+            outcome="y",
+            unit="unit",
+            time="time",
             cohort="cohort",
         )
         assert r_survey.overall_se != r_naive.overall_se
@@ -1369,10 +1394,15 @@ def test_ols_survey_se_differs_from_naive(self, survey_panel):
     def test_logit_survey_runs(self, survey_panel):
         """Logit with survey design completes."""
         from diff_diff.survey import SurveyDesign
+
         sd = SurveyDesign(weights="weight", strata="stratum", psu="psu")
         r = WooldridgeDiD(method="logit").fit(
-            survey_panel, outcome="y_bin", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            survey_panel,
+            outcome="y_bin",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         assert np.isfinite(r.overall_att)
         assert np.isfinite(r.overall_se)
@@ -1381,10 +1411,15 @@ def test_logit_survey_runs(self, survey_panel):
     def test_poisson_survey_runs(self, survey_panel):
         """Poisson with survey design completes."""
         from diff_diff.survey import SurveyDesign
+
         sd = SurveyDesign(weights="weight", strata="stratum", psu="psu")
         r = WooldridgeDiD(method="poisson").fit(
-            survey_panel, outcome="y_count", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            survey_panel,
+            outcome="y_count",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         assert np.isfinite(r.overall_att)
         assert np.isfinite(r.overall_se)
@@ -1393,20 +1428,32 @@ def test_poisson_survey_runs(self, survey_panel):
     def test_bootstrap_survey_rejected(self, survey_panel):
         """n_bootstrap > 0 with survey_design raises ValueError."""
         from diff_diff.survey import SurveyDesign
+
         sd = SurveyDesign(weights="weight")
-        with pytest.raises(ValueError, match="Bootstrap inference is not supported with survey_design"):
+        with pytest.raises(
+            ValueError, match="Bootstrap inference is not supported with survey_design"
+        ):
             WooldridgeDiD(n_bootstrap=100).fit(
-                survey_panel, outcome="y", unit="unit", time="time",
-                cohort="cohort", survey_design=sd,
+                survey_panel,
+                outcome="y",
+                unit="unit",
+                time="time",
+                cohort="cohort",
+                survey_design=sd,
             )
 
     def test_weights_only_survey(self, survey_panel):
         """Weights-only survey (no strata/PSU) works."""
         from diff_diff.survey import SurveyDesign
+
         sd = SurveyDesign(weights="weight")
         r = WooldridgeDiD().fit(
-            survey_panel, outcome="y", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            survey_panel,
+            outcome="y",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         assert np.isfinite(r.overall_att)
         assert np.isfinite(r.overall_se)
@@ -1415,10 +1462,15 @@ def test_weights_only_survey(self, survey_panel):
     def test_survey_metadata_present(self, survey_panel):
         """survey_metadata is populated with correct fields."""
         from diff_diff.survey import SurveyDesign
+
         sd = SurveyDesign(weights="weight", strata="stratum", psu="psu")
         r = WooldridgeDiD().fit(
-            survey_panel, outcome="y", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            survey_panel,
+            outcome="y",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         sm = r.survey_metadata
         assert sm is not None
@@ -1432,6 +1484,7 @@ def test_survey_metadata_present(self, survey_panel):
     def test_replicate_weights_rejected(self, survey_panel):
         """Replicate-weight designs raise NotImplementedError."""
         from diff_diff.survey import SurveyDesign
+
         # Add replicate weight columns
         survey_panel["rep_w1"] = 1.0
         survey_panel["rep_w2"] = 1.0
@@ -1442,17 +1495,26 @@ def test_replicate_weights_rejected(self, survey_panel):
         )
         with pytest.raises(NotImplementedError, match="replicate-weight variance"):
             WooldridgeDiD().fit(
-                survey_panel, outcome="y", unit="unit", time="time",
-                cohort="cohort", survey_design=sd,
+                survey_panel,
+                outcome="y",
+                unit="unit",
+                time="time",
+                cohort="cohort",
+                survey_design=sd,
             )
 
     def test_weights_only_plus_cluster(self, survey_panel):
         """Weights-only survey + cluster= injects cluster as PSU."""
         from diff_diff.survey import SurveyDesign
+
         sd = SurveyDesign(weights="weight")
         r = WooldridgeDiD(cluster="stratum").fit(
-            survey_panel, outcome="y", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            survey_panel,
+            outcome="y",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         # Cluster should have been injected as PSU
         n_strata = survey_panel["stratum"].nunique()
@@ -1461,38 +1523,55 @@ def test_weights_only_plus_cluster(self, survey_panel):
 
         # SE should differ from same run without cluster
         r_no_cluster = WooldridgeDiD().fit(
-            survey_panel, outcome="y", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            survey_panel,
+            outcome="y",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         assert r.overall_se != r_no_cluster.overall_se
 
     def test_survey_gt_weights_are_counts(self, survey_panel):
         """Survey aggregation uses cell counts, not survey-weight sums."""
         from diff_diff.survey import SurveyDesign
+
         sd = SurveyDesign(weights="weight", strata="stratum", psu="unit")
         r = WooldridgeDiD(method="logit").fit(
-            survey_panel, outcome="y_bin", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            survey_panel,
+            outcome="y_bin",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         for k, w in r._gt_weights.items():
             assert isinstance(w, int), (
-                f"gt_weights[{k}] = {w} (type {type(w).__name__}); "
-                f"expected int (cell count)"
+                f"gt_weights[{k}] = {w} (type {type(w).__name__}); " f"expected int (cell count)"
             )
 
     def test_weights_only_no_cluster_implicit_psu(self, survey_panel):
         """Weights-only survey without cluster= keeps implicit per-obs PSUs."""
         from diff_diff.survey import SurveyDesign
         from diff_diff.wooldridge import _filter_sample
+
         sd = SurveyDesign(weights="weight")
         r = WooldridgeDiD().fit(
-            survey_panel, outcome="y", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            survey_panel,
+            outcome="y",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         # n_psu should equal n_obs in the filtered sample (not n_units)
         sample = _filter_sample(
             survey_panel.copy().assign(cohort=lambda d: d["cohort"].fillna(0)),
-            "unit", "time", "cohort", "not_yet_treated", 0,
+            "unit",
+            "time",
+            "cohort",
+            "not_yet_treated",
+            0,
         )
         assert r.survey_metadata is not None
         assert r.survey_metadata.n_psu == len(sample)
@@ -1500,6 +1579,7 @@ def test_weights_only_no_cluster_implicit_psu(self, survey_panel):
     def test_fweight_rejected(self, survey_panel):
         """fweight raises ValueError (pweight only)."""
         from diff_diff.survey import SurveyDesign
+
         # Use integer weights so fweight validation passes in resolve(),
         # and the pweight guard in _resolve_survey_for_wooldridge fires.
         df = survey_panel.copy()
@@ -1507,20 +1587,29 @@ def test_fweight_rejected(self, survey_panel):
         sd = SurveyDesign(weights="int_weight", weight_type="fweight")
         with pytest.raises(ValueError, match="weight_type='pweight'"):
             WooldridgeDiD().fit(
-                df, outcome="y", unit="unit", time="time",
-                cohort="cohort", survey_design=sd,
+                df,
+                outcome="y",
+                unit="unit",
+                time="time",
+                cohort="cohort",
+                survey_design=sd,
             )
 
     def test_poisson_zero_weight_cell(self, survey_panel):
         """Poisson survey fit handles zero-weight treated cells cleanly."""
         from diff_diff.survey import SurveyDesign
+
         df = survey_panel.copy()
         # Zero out weights for one treated cohort so some cells have zero weight
         df.loc[df["cohort"] == 3, "weight"] = 0.0
         sd = SurveyDesign(weights="weight", strata="stratum", psu="unit")
         r = WooldridgeDiD(method="poisson").fit(
-            df, outcome="y_count", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            df,
+            outcome="y_count",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         assert np.isfinite(r.overall_att)
         assert np.isfinite(r.overall_se)
@@ -1528,12 +1617,17 @@ def test_poisson_zero_weight_cell(self, survey_panel):
     def test_ols_survey_rank_deficient(self, survey_panel):
         """Survey OLS handles rank-deficient all-eventually-treated designs."""
         from diff_diff.survey import SurveyDesign
+
         # Remove never-treated (cohort=0) to create rank-deficient design
         df = survey_panel[survey_panel["cohort"] > 0].copy()
         sd = SurveyDesign(weights="weight", strata="stratum", psu="unit")
         r = WooldridgeDiD(control_group="not_yet_treated").fit(
-            df, outcome="y", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            df,
+            outcome="y",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         assert np.isfinite(r.overall_att)
         assert np.isfinite(r.overall_se)
@@ -1541,25 +1635,35 @@ def test_ols_survey_rank_deficient(self, survey_panel):
     def test_ols_survey_zero_weight_unit_rejected(self, survey_panel):
         """Zero-weight unit raises ValueError before within_transform."""
         from diff_diff.survey import SurveyDesign
+
         df = survey_panel.copy()
         # Zero out all weights for unit 0
         df.loc[df["unit"] == 0, "weight"] = 0.0
         sd = SurveyDesign(weights="weight", strata="stratum", psu="unit")
         with pytest.raises(ValueError, match="Survey weights sum to zero for unit"):
             WooldridgeDiD().fit(
-                df, outcome="y", unit="unit", time="time",
-                cohort="cohort", survey_design=sd,
+                df,
+                outcome="y",
+                unit="unit",
+                time="time",
+                cohort="cohort",
+                survey_design=sd,
             )
 
     def test_logit_survey_zero_weight_cell(self, survey_panel):
         """Logit survey fit skips zero-weight treated cells cleanly."""
         from diff_diff.survey import SurveyDesign
+
         df = survey_panel.copy()
         df.loc[df["cohort"] == 3, "weight"] = 0.0
         sd = SurveyDesign(weights="weight", strata="stratum", psu="unit")
         r = WooldridgeDiD(method="logit").fit(
-            df, outcome="y_bin", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            df,
+            outcome="y_bin",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         assert np.isfinite(r.overall_att)
         assert np.isfinite(r.overall_se)
@@ -1567,23 +1671,33 @@ def test_logit_survey_zero_weight_cell(self, survey_panel):
     def test_ols_survey_non_range_index(self, survey_panel):
         """OLS survey zero-weight guard works with non-RangeIndex DataFrames."""
         from diff_diff.survey import SurveyDesign
+
         df = survey_panel.copy()
         df.index = df.index + 1000  # shift to non-zero-based index
         df.loc[df["unit"] == 0, "weight"] = 0.0
         sd = SurveyDesign(weights="weight", strata="stratum", psu="unit")
         with pytest.raises(ValueError, match="Survey weights sum to zero for unit"):
             WooldridgeDiD().fit(
-                df, outcome="y", unit="unit", time="time",
-                cohort="cohort", survey_design=sd,
+                df,
+                outcome="y",
+                unit="unit",
+                time="time",
+                cohort="cohort",
+                survey_design=sd,
             )
 
     def test_survey_aggregate_and_summary(self, survey_panel):
         """Survey aggregate() uses df_survey and summary() shows survey block."""
         from diff_diff.survey import SurveyDesign
+
         sd = SurveyDesign(weights="weight", strata="stratum", psu="unit")
         r = WooldridgeDiD().fit(
-            survey_panel, outcome="y", unit="unit", time="time",
-            cohort="cohort", survey_design=sd,
+            survey_panel,
+            outcome="y",
+            unit="unit",
+            time="time",
+            cohort="cohort",
+            survey_design=sd,
         )
         # aggregate() should use t-distribution with survey df
         r.aggregate("group")
@@ -1607,10 +1721,14 @@ def _make_panel_with_nan_cohort():
         for unit in range(10):
             cohort_val = np.nan if unit < 2 else 0.0
             for t in range(1, 5):
-                rows.append({
-                    "unit": unit, "time": t, "cohort": cohort_val,
-                    "y": unit + t + np.random.default_rng(unit).normal(0, 0.1),
-                })
+                rows.append(
+                    {
+                        "unit": unit,
+                        "time": t,
+                        "cohort": cohort_val,
+                        "y": unit + t + np.random.default_rng(unit).normal(0, 0.1),
+                    }
+                )
         return pd.DataFrame(rows)
 
     def test_fit_warns_on_nan_cohort_with_count(self):
@@ -1624,6 +1742,7 @@ def test_fit_warns_on_nan_cohort_with_count(self):
 
     def test_fit_silent_on_clean_cohort(self):
         import warnings
+
         df = self._make_panel_with_nan_cohort()
         df["cohort"] = df["cohort"].fillna(0)
         est = WooldridgeDiD(method="ols")
@@ -1640,8 +1759,12 @@ def test_select_sample_helper_warns(self):
         df = self._make_panel_with_nan_cohort()
         with pytest.warns(UserWarning, match=r"8 row\(s\) have NaN cohort values"):
             _filter_sample(
-                df, unit="unit", time="time", cohort="cohort",
-                control_group="never_treated", anticipation=0,
+                df,
+                unit="unit",
+                time="time",
+                cohort="cohort",
+                control_group="never_treated",
+                anticipation=0,
             )
 
 
@@ -1669,9 +1792,7 @@ def _make_vcov_panel(n_units=40, n_periods=6, seed=202605211230):
         0.0,
     )
     y = 0.7 + 0.1 * periods + 0.05 * units + tau + 0.15 * rng.normal(size=len(units))
-    return pd.DataFrame(
-        {"unit": units, "time": periods, "cohort": cohort_per_obs, "y": y}
-    )
+    return pd.DataFrame({"unit": units, "time": periods, "cohort": cohort_per_obs, "y": y})
 
 
 class TestWooldridgeVcovType:
@@ -1683,19 +1804,23 @@ def test_default_vcov_type_is_hc1(self):
         assert est._vcov_type_explicit is False
 
     def test_hc1_se_bit_equal_to_pre_pr_baseline(self):
-        """HC1 within-transform path must match pre-PR baseline at atol=1e-14.
-
-        Baseline captured on the Phase 1b PR 3/8 branch with
-        ``_make_vcov_panel(seed=202605211230)``. FWL preserves the CR1
-        cluster-robust score, so the new ``vcov_type`` branching keeps HC1
-        bit-equal to the prior hard-coded HC1 behavior.
+        """HC1 within-transform path must match the frozen baseline at atol=1e-14.
+
+        Baseline originally captured on the Phase 1b PR 3/8 branch with
+        ``_make_vcov_panel(seed=202605211230)`` (FWL preserves the CR1
+        cluster-robust score, so the ``vcov_type`` branching kept HC1 bit-equal
+        to the prior hard-coded HC1 behavior). Recaptured under the v3.6.x
+        factorize-once + bincount demeaner, which moved the values by ~1e-15
+        (bincount accumulation vs pandas' compensated grouped mean - see
+        REGISTRY "Absorbed Fixed Effects"); the 1e-14 lock semantics are
+        unchanged.
         """
         df = _make_vcov_panel()
         res = WooldridgeDiD(method="ols", vcov_type="hc1").fit(
             df, outcome="y", unit="unit", time="time", cohort="cohort"
         )
-        assert res.overall_att == pytest.approx(0.9178849934516247, abs=1e-14)
-        assert res.overall_se == pytest.approx(0.03149488781317814, abs=1e-14)
+        assert res.overall_att == pytest.approx(0.9178849934516233, abs=1e-14)
+        assert res.overall_se == pytest.approx(0.03149488781317813, abs=1e-14)
 
     def test_hc2_bm_finite_and_inflates_over_hc1(self):
         df = _make_vcov_panel()
@@ -1790,7 +1915,11 @@ def test_survey_design_plus_hc2_bm_rejected(self):
         est = WooldridgeDiD(method="ols", vcov_type="hc2_bm")
         with pytest.raises(NotImplementedError, match=r"survey_design"):
             est.fit(
-                df, outcome="y", unit="unit", time="time", cohort="cohort",
+                df,
+                outcome="y",
+                unit="unit",
+                time="time",
+                cohort="cohort",
                 survey_design=design,
             )
 
@@ -1803,7 +1932,11 @@ def test_survey_design_plus_classical_rejected(self):
         est = WooldridgeDiD(method="ols", vcov_type="classical")
         with pytest.raises(NotImplementedError, match=r"survey_design"):
             est.fit(
-                df, outcome="y", unit="unit", time="time", cohort="cohort",
+                df,
+                outcome="y",
+                unit="unit",
+                time="time",
+                cohort="cohort",
                 survey_design=design,
             )
 
@@ -1841,13 +1974,11 @@ def test_hc2_bm_plus_bootstrap_finite_inference(self):
             df, outcome="y", unit="unit", time="time", cohort="cohort"
         )
         # Bootstrap fit on the same data + seed.
-        res_boot = WooldridgeDiD(
-            method="ols", vcov_type="hc2_bm", n_bootstrap=50, seed=0
-        ).fit(df, outcome="y", unit="unit", time="time", cohort="cohort")
-        # ATT is unchanged by the bootstrap (only SE is overridden)
-        assert res_boot.overall_att == pytest.approx(
-            res_analytical.overall_att, abs=1e-10
+        res_boot = WooldridgeDiD(method="ols", vcov_type="hc2_bm", n_bootstrap=50, seed=0).fit(
+            df, outcome="y", unit="unit", time="time", cohort="cohort"
         )
+        # ATT is unchanged by the bootstrap (only SE is overridden)
+        assert res_boot.overall_att == pytest.approx(res_analytical.overall_att, abs=1e-10)
         # SE finite + sensible (positive, smaller than the panel SD of y)
         assert np.isfinite(res_boot.overall_se)
         assert res_boot.overall_se > 0
@@ -1884,18 +2015,14 @@ def test_hc2_bm_plus_bootstrap_rank_deficient(self):
         periods = np.tile(np.arange(1, n_periods + 1), n_units)
         cohorts = rng.choice([3, 5, 7], size=n_units)
         cohort_per_obs = cohorts[units]
-        tau = np.where(
-            periods >= cohort_per_obs, 0.5 + 0.2 * (periods - cohort_per_obs), 0.0
-        )
+        tau = np.where(periods >= cohort_per_obs, 0.5 + 0.2 * (periods - cohort_per_obs), 0.0)
         y = 1.0 + 0.1 * periods + tau + 0.1 * rng.normal(size=len(units))
-        df = pd.DataFrame(
-            {"unit": units, "time": periods, "cohort": cohort_per_obs, "y": y}
-        )
+        df = pd.DataFrame({"unit": units, "time": periods, "cohort": cohort_per_obs, "y": y})
         with warnings.catch_warnings():
             warnings.simplefilter("ignore", UserWarning)
-            res = WooldridgeDiD(
-                method="ols", vcov_type="hc2_bm", n_bootstrap=50, seed=0
-            ).fit(df, outcome="y", unit="unit", time="time", cohort="cohort")
+            res = WooldridgeDiD(method="ols", vcov_type="hc2_bm", n_bootstrap=50, seed=0).fit(
+                df, outcome="y", unit="unit", time="time", cohort="cohort"
+            )
         assert np.isfinite(res.overall_att)
         assert np.isfinite(res.overall_se)
         assert res.overall_se > 0
@@ -1963,7 +2090,11 @@ def test_survey_design_clears_cluster_metadata(self):
         # have surfaced cluster_name='unit', n_clusters=N — but survey TSL
         # replaces that vcov, so the dataclass must report None.
         res = WooldridgeDiD(method="ols", vcov_type="hc1").fit(
-            df, outcome="y", unit="unit", time="time", cohort="cohort",
+            df,
+            outcome="y",
+            unit="unit",
+            time="time",
+            cohort="cohort",
             survey_design=design,
         )
         assert res.survey_metadata is not None
@@ -2077,7 +2208,9 @@ def test_aggregate_event_under_hc2_bm_uses_bm_contrast_dof(self):
         for k, eff in res.event_study_effects.items():
             assert np.isfinite(eff["att"])
             assert np.isfinite(eff["se"])
-            assert np.isfinite(eff["t_stat"]), f"event k={k} t_stat NaN — BM DOF threading regressed"
+            assert np.isfinite(
+                eff["t_stat"]
+            ), f"event k={k} t_stat NaN — BM DOF threading regressed"
             assert np.isfinite(eff["p_value"])
             assert np.isfinite(eff["conf_int"][0])
             assert np.isfinite(eff["conf_int"][1])
@@ -2094,7 +2227,9 @@ def test_aggregate_calendar_under_hc2_bm_uses_bm_contrast_dof(self):
         for t, eff in res.calendar_effects.items():
             assert np.isfinite(eff["att"])
             assert np.isfinite(eff["se"])
-            assert np.isfinite(eff["t_stat"]), f"calendar t={t} t_stat NaN — BM DOF threading regressed"
+            assert np.isfinite(
+                eff["t_stat"]
+            ), f"calendar t={t} t_stat NaN — BM DOF threading regressed"
             assert np.isfinite(eff["p_value"])
             assert np.isfinite(eff["conf_int"][0])
             assert np.isfinite(eff["conf_int"][1])
@@ -2113,13 +2248,9 @@ def test_hc2_bm_handles_rank_deficient_all_eventually_treated(self):
         periods = np.tile(np.arange(1, n_periods + 1), n_units)
         cohorts = rng.choice([3, 5, 7], size=n_units)
         cohort_per_obs = cohorts[units]
-        tau = np.where(
-            periods >= cohort_per_obs, 0.5 + 0.2 * (periods - cohort_per_obs), 0.0
-        )
+        tau = np.where(periods >= cohort_per_obs, 0.5 + 0.2 * (periods - cohort_per_obs), 0.0)
         y = 1.0 + 0.1 * periods + tau + 0.1 * rng.normal(size=len(units))
-        df = pd.DataFrame(
-            {"unit": units, "time": periods, "cohort": cohort_per_obs, "y": y}
-        )
+        df = pd.DataFrame({"unit": units, "time": periods, "cohort": cohort_per_obs, "y": y})
         # Expect a rank-deficient warning from solve_ols (late-cohort drop).
         with warnings.catch_warnings():
             warnings.simplefilter("ignore", UserWarning)
@@ -2133,7 +2264,9 @@ def test_hc2_bm_handles_rank_deficient_all_eventually_treated(self):
         for k, eff in res.group_time_effects.items():
             assert np.isfinite(eff["att"]), f"({k}) att NaN"
             assert np.isfinite(eff["se"]), f"({k}) se NaN"
-            assert np.isfinite(eff["t_stat"]), f"({k}) t_stat NaN — BM DOF not threaded on reduced design"
+            assert np.isfinite(
+                eff["t_stat"]
+            ), f"({k}) t_stat NaN — BM DOF not threaded on reduced design"
             assert np.isfinite(eff["p_value"]), f"({k}) p_value NaN"
             assert np.isfinite(eff["conf_int"][0])
             assert np.isfinite(eff["conf_int"][1])
@@ -2146,18 +2279,24 @@ def test_hc2_bm_handles_rank_deficient_all_eventually_treated(self):
             res.aggregate(agg_type)
         assert res.event_study_effects is not None
         for k, eff in res.event_study_effects.items():
-            assert np.isfinite(eff["t_stat"]), f"event k={k} t_stat NaN — aggregate BM DOF on reduced design regressed"
+            assert np.isfinite(
+                eff["t_stat"]
+            ), f"event k={k} t_stat NaN — aggregate BM DOF on reduced design regressed"
             assert np.isfinite(eff["p_value"])
         assert res.group_effects is not None
         for g, eff in res.group_effects.items():
-            assert np.isfinite(eff["t_stat"]), f"group g={g} t_stat NaN — aggregate BM DOF on reduced design regressed"
+            assert np.isfinite(
+                eff["t_stat"]
+            ), f"group g={g} t_stat NaN — aggregate BM DOF on reduced design regressed"
             assert np.isfinite(eff["p_value"])
         assert res.calendar_effects is not None
         # Calendar entries with at least one identified treated cell should
         # have finite BM inference; entirely-pre-treatment calendar periods
         # are absent from calendar_effects (their cells aren't post-treatment).
         for t, eff in res.calendar_effects.items():
-            assert np.isfinite(eff["t_stat"]), f"calendar t={t} t_stat NaN — aggregate BM DOF on reduced design regressed"
+            assert np.isfinite(
+                eff["t_stat"]
+            ), f"calendar t={t} t_stat NaN — aggregate BM DOF on reduced design regressed"
             assert np.isfinite(eff["p_value"])
 
     def test_hc2_bm_handles_rank_deficient_with_unit_invariant_exovar(self):
@@ -2182,7 +2321,9 @@ def test_hc2_bm_handles_rank_deficient_with_unit_invariant_exovar(self):
         for k, eff in res.group_time_effects.items():
             assert np.isfinite(eff["att"]), f"({k}) att NaN"
             assert np.isfinite(eff["se"]), f"({k}) se NaN"
-            assert np.isfinite(eff["t_stat"]), f"({k}) t_stat NaN under rank-deficient exovar — BM DOF not threaded"
+            assert np.isfinite(
+                eff["t_stat"]
+            ), f"({k}) t_stat NaN under rank-deficient exovar — BM DOF not threaded"
             assert np.isfinite(eff["p_value"])
         assert np.isfinite(res.overall_t_stat)
         assert np.isfinite(res.overall_p_value)
@@ -2193,7 +2334,9 @@ def test_hc2_bm_handles_rank_deficient_with_unit_invariant_exovar(self):
         for g, eff in (res.group_effects or {}).items():
             assert np.isfinite(eff["t_stat"]), f"group g={g} t_stat NaN under rank-deficient exovar"
         for t, eff in (res.calendar_effects or {}).items():
-            assert np.isfinite(eff["t_stat"]), f"calendar t={t} t_stat NaN under rank-deficient exovar"
+            assert np.isfinite(
+                eff["t_stat"]
+            ), f"calendar t={t} t_stat NaN under rank-deficient exovar"
         for k, eff in (res.event_study_effects or {}).items():
             assert np.isfinite(eff["t_stat"]), f"event k={k} t_stat NaN under rank-deficient exovar"
 
@@ -2351,3 +2494,36 @@ def test_filter_sample_preserves_outcome_support(self):
         for cg in ("not_yet_treated", "never_treated"):
             sample = _filter_sample(df, "unit", "time", "cohort", cg, 0)
             assert sorted(sample["y"].unique()) == sorted(df["y"].unique())
+
+
+class TestAbsorbedCovariateSnap:
+    """Unit-constant exovar columns (and their cohort interactions, which are
+    also unit-constant) are spanned by the unit FE on the within-transform
+    path: they must snap to deterministic NaN with a cause warning
+    (REGISTRY 'Absorbed FE')."""
+
+    def test_unit_constant_exovar_snaps_with_cause_warning(self):
+        rng = np.random.default_rng(5)
+        n_units, n_periods = 60, 6
+        units = np.repeat(np.arange(n_units), n_periods)
+        t = np.tile(np.arange(n_periods), n_units)
+        cohort = np.repeat(
+            np.where(np.arange(n_units) % 3 == 0, 0, np.where(np.arange(n_units) % 3 == 1, 3, 4)),
+            n_periods,
+        )
+        d = (cohort > 0) & (t >= cohort)
+        y = 0.5 * d + rng.normal(size=len(units))
+        df = pd.DataFrame({"unit": units, "time": t, "cohort": cohort, "y": y})
+        df["xc"] = np.repeat(rng.normal(size=n_units), n_periods)
+
+        with pytest.warns(UserWarning, match="collinear with the absorbed"):
+            res = WooldridgeDiD(method="ols", vcov_type="hc1").fit(
+                df,
+                outcome="y",
+                unit="unit",
+                time="time",
+                cohort="cohort",
+                exovar=["xc"],
+            )
+        assert np.isfinite(res.overall_att)
+        assert np.isfinite(res.overall_se)

From a03319a048e69d4da0739044e72fe4c3fbdffa58 Mon Sep 17 00:00:00 2001
From: igerber <isaac.gerber@gmail.com>
Date: Thu, 2 Jul 2026 19:46:33 -0400
Subject: [PATCH 2/2] docs: mention snap_absorbed_regressors in the utils
 doc-deps note (review P3)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
 docs/doc-deps.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/doc-deps.yaml b/docs/doc-deps.yaml
index e90ea6b70..a773db435 100644
--- a/docs/doc-deps.yaml
+++ b/docs/doc-deps.yaml
@@ -904,7 +904,7 @@ sources:
       - path: docs/methodology/REGISTRY.md
         section: "Absorbed Fixed Effects with Survey Weights"
         type: methodology
-        note: "demean_by_groups / within_transform: MAP convergence contract (max_iter/tol), bincount accumulation numerics, NaN-group-key guard"
+        note: "demean_by_groups / within_transform: MAP convergence contract (max_iter/tol), bincount accumulation numerics, NaN-group-key guard, snap_absorbed_regressors FE-spanned snap + exact LSMR span confirmation"
       - path: docs/practitioner_getting_started.rst
         section: "Step 4: Check Whether the Result Is Trustworthy"
         type: user_guide