feat(#1424): self.upstream property for pre-restricted ancestor access#1473
Open
dimitri-yatsenko wants to merge 1 commit into
Open
feat(#1424): self.upstream property for pre-restricted ancestor access#1473dimitri-yatsenko wants to merge 1 commit into
dimitri-yatsenko wants to merge 1 commit into
Conversation
4 tasks
Implements T2.2.b of the provenance trinity. Inside make(), self.upstream exposes a pre-constructed Diagram.trace(self & key) so users can read declared ancestors with provenance-safe, ergonomic syntax. Branch stacked on feat/1423-diagram-trace (#1471) for Diagram.trace(). What's added: - src/datajoint/autopopulate.py: - AutoPopulate._upstream class attribute (default None) — instance storage for the per-make() trace. - AutoPopulate.upstream property — returns the trace if set, raises DataJointError with a clear "only available inside make()" message otherwise. The error includes the fallback pattern (dj.Diagram.trace(self & key)) so the user knows the escape hatch. - In _populate_one, set self._upstream = Diagram.trace(self & dict(key)) immediately before the make() invocation block. Construction is lazy at this layer (graph copy only); the SQL fetch fires when the user accesses self.upstream[T].fetch(...). - The existing `finally` block (line 716) that resets _allow_insert now also resets _upstream to None, so subsequent attribute access raises a clear error rather than silently returning a stale trace from the previous make() call. What's not changed: - make() signature: unchanged — key remains a dict, make_kwargs work as before. self.upstream is a new attribute on self, not a new parameter. - Tripartite make pattern: self._upstream is set once before all three make() invocations, so all three phases see the same upstream view. Tests in tests/integration/test_autopopulate.py (5 new): - test_upstream_provides_pre_restricted_ancestor — basic case: make() reads self.upstream[Subject].fetch1("name") and the value is correctly pre-restricted to the current key. - test_upstream_rejects_non_ancestor — self.upstream[Unrelated] raises DataJointError ("not in this trace"). Inherited from Diagram.__getitem__. - test_upstream_unset_outside_make — accessing the property outside of make() raises with the helpful "only available inside make()" message. - test_upstream_cleared_after_make — after populate() completes, accessing the property on a fresh instance still raises (verifies the finally cleanup; no stale state). - test_upstream_seen_across_tripartite_make — both make_fetch / make_compute / make_insert see the same self.upstream value. Full regression: 17/17 autopopulate tests pass on MySQL. Slated for DataJoint 2.3. Blocked on #1471 (Diagram.trace) merging; T2.2.c (strict_provenance) is stacked on this PR.
dimitri-yatsenko
added a commit
that referenced
this pull request
Jun 23, 2026
Implements T2.2.c of the provenance trinity, completing the trio (Diagram.trace → self.upstream → strict_provenance). When dj.config["strict_provenance"] = True, runtime gates enforce the upstream-only convention inside make(): - Reads must target a table in the active trace's allowed set (declared ancestors + self + self's Parts). - Writes must target self or self's Parts. - Inserted rows' PK columns that overlap with the current key must equal the key's values (key-consistency rule). Default is False. Existing make() bodies are unaffected. Branch stacked on feat/1424-self-upstream (#1473) → feat/1423-diagram-trace (#1471) → fix/1429-cascade-part-part-renamed-fk (#1468). Will rebase onto master after the chain merges. What's added: - src/datajoint/provenance.py (new): the runtime context module. - `_active_strict_make` ContextVar holding (target, allowed_tables, key) for the currently-executing make() invocation. ContextVar chosen over threading.local to propagate correctly across contextvars-aware concurrency boundaries. - `push_strict_make_context` / `pop_strict_make_context` — context lifecycle managed by `_populate_one`'s try/finally. - `assert_read_allowed(query_expression)` — read gate. Recursively discovers base tables via the QueryExpression's `_support` chain and checks each against the allowed set. - `assert_write_allowed(target_table, rows)` — write gate. Verifies the target is self or one of self's Part tables, and checks the key-consistency rule on each dict row. - src/datajoint/settings.py: new `strict_provenance: bool` field on Config (default False), env-var `DJ_STRICT_PROVENANCE`, ENV_VAR_MAPPING entry. - src/datajoint/autopopulate.py: in `_populate_one`, push the strict context (when the flag is on) just before the make() invocation block. The allowed table set = trace's ancestor nodes ∪ {self.full_table_name} ∪ {self's Parts}. Pop in the existing `finally` block. - src/datajoint/expression.py: `QueryExpression.cursor` now calls `assert_read_allowed(self)` before issuing SQL. No-op outside make(). - src/datajoint/table.py: `Table.insert` calls `assert_write_allowed(self, rows)` after the existing `_allow_insert` check. No-op outside make(). Part-table detection uses class `__dict__` traversal (filtered to Part subclasses) instead of `dir/getattr` to avoid triggering the `_JobsDescriptor` (which would lazy-declare ~~table inside the populate transaction — caught by the first test iteration). Documented limitation (deferred): the read gate does not distinguish reads that came through `self.upstream` from reads of the same ancestor via a direct expression. Both are allowed if the table is in the allowed set. The intent is to catch reads from *undeclared* dependencies; tightening the "must come through self.upstream" path requires propagating an attribution marker through QueryExpression composition and is left for a follow-up release. Tests in tests/integration/test_strict_provenance.py (6 new): - test_strict_compliant_make_passes — make() reading via self.upstream and writing self.insert1 with matching key runs cleanly under strict. - test_strict_blocks_read_from_undeclared_table — read from an unrelated table raises with "strict_provenance ... undeclared" message. - test_strict_blocks_write_to_other_table — insert into a non-self, non-Part target raises "not permitted". - test_strict_blocks_write_with_mismatched_key — row PK that disagrees with the current key raises "does not match the current make() key". - test_strict_writes_to_part_table_pass — self.PartName.insert(...) works. - test_strict_off_by_default_no_change — default-off regression check; the canonical "direct (Ancestor & key).fetch1()" pattern still works when strict_provenance is unset. Regression: 17/17 autopopulate tests pass with strict_provenance unset (default). 6/6 new strict tests pass with strict_provenance=True. 8/8 trace tests + 9/9 cascade tests unaffected. Slated for DataJoint 2.3.
86b1ae7 to
7e4130f
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
T2.2.b of the provenance trinity. Inside `make()`, `self.upstream` exposes a pre-constructed `Diagram.trace(self & key)` so user code can read declared ancestors with provenance-safe, ergonomic syntax.
Closes #1424. Slated for DataJoint 2.3.
Branch dependency
Stacked on `feat/1423-diagram-trace` (#1471). This PR's base is `feat/1423-diagram-trace`, not `master`. Will rebase onto master once #1471 (and its parent #1468) merge.
What's added
Design notes
Tests
Sequencing
Once this PR merges, T2.2.c (`strict_provenance` config flag, the runtime gate) goes next. It branches off this PR.
Test plan