Fix structural annotation document visibility#2089
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
…ty-in-structural-annotations
Code ReviewThis PR fixes a real IDOR vulnerability: structural annotations sharing a Finding 1 — N+1 queries in the
|
…ty-in-structural-annotations
resolve_document (config/graphql/annotation_types.py) could leak a private document that shares a content-hash-deduplicated StructuralAnnotationSet. - structural_document_prefetch now requires a keyword-only user and scopes the prefetch with Document.objects.visible_to_user(user), so the visibility gate runs once per page; resolve_document trusts the user-scoped prefetch and no longer issues a per-row visibility SELECT (review finding 2). - document_id branch returns the already select_related-cached FK instead of a fresh per-row query; falls back to a single permission-scoped fetch only when the FK was not cached (review finding 1). - An empty (fully filtered) prefetch falls through to the corpus-scoped DB fallback instead of short-circuiting to null (review finding 3). - SemanticSearchResultType.resolve_document (config/graphql/social_types.py) delegates to the gated AnnotationType.resolve_document, closing the convenience field's raw-FK bypass and adding structural resolution. - Update both prefetch callers (annotation_queries, search_queries) to pass user; positive assertion added to the regression test. Consolidates the fix from the closed duplicate PR #2088.
Code ReviewThis PR closes a real security gap: 1.
|
…rough service layer Follow-up to eca04a3's structural-annotation visibility fix, addressing second-round review findings on config/graphql/annotation_types.py: - An empty (fully permission-filtered) structural_set__documents prefetch now returns None directly instead of falling through to a second, identically-scoped DB query — the prefetch result was already definitive. - The structural DB fallback now requires corpus_id and returns None without it, rather than risking a cross-corpus document pick when no corpus context is available to scope against. - Both the non-structural and structural fallback branches move out of the resolver and into AnnotationService.resolve_owned_document / resolve_structural_document_fallback, replacing inline Document.objects.filter() + BaseService.filter_visible_qs composition. - Adds a regression test exercising the corpus-scoped prefetch path (not just the DB fallback) with a private shared document, plus a captured-query-count assertion that fails if the _prefetched_objects_cache detection ever silently degrades to a per-row fallback query. Verified the FK is_cached() check flagged as "dead code" in review is correct as written: Field.is_cached() (Django's FieldCacheMixin) checks instance._state.fields_cache, not the raw document_id column, confirmed against the pinned Django 5.2.15 source and an empirical select_related test. Left in place with a clarifying comment instead of a no-op change.
ReviewThis is a well-scoped, well-documented fix for a real IDOR/privacy leak: Correctness
Minor / non-blocking
Test coverageGood — the added tests specifically construct a document that sorts first alphabetically in an unscoped/unfiltered prefetch ( Nits
Overall: solid fix, correctly reasoned through multiple review rounds, no blocking issues found. |
AnnotationService.resolve_owned_document and
resolve_structural_document_fallback only run when a caller fetches
Annotation without select_related("document") / the structural
prefetch — every production query path always applies both, so these
defensive fallbacks were unreachable from the existing GraphQL-level
tests and codecov/patch failed at 41.67%. Exercises both methods
directly and through AnnotationType.resolve_document with
deliberately uncached/unprefetched Annotation instances.
Review: Fix structural annotation document visibilityThis is a well-scoped, well-tested security fix. I traced the full call graph (both GraphQL entry points, the service layer, and the new/updated tests) and the fix is architecturally sound. What's good
Minor / non-blocking observations
No blocking issues found. The security fix is correct, the regression tests meaningfully exercise both the leak and the N+1 risk introduced by the fix, and the code follows the project's established permissioning patterns. |
Motivation
AnnotationType.resolve_documentpath (enabled by@bypass_get_queryset) could return a Document from a sharedStructuralAnnotationSetwithout enforcingDocumentvisibility, allowing private documents or signed URLs to leak when an unrelated shared document is visible.Description
config/graphql/annotation_types.py::AnnotationType.resolve_documentto gate returned documents throughBaseService.filter_visible_qs(...)for the directdocument_idcase, the prefetched structural-set candidate list, and the fallbackdocumentsqueryset so only visible documents are returned.opencontractserver/tests/test_corpus_cards_structural_document_resolution.pythat createsother_userand aprivate_docthat shares the same structural set and would sort first in an unscoped prefetch, along with_UNSCOPED_QUERYandtest_unscoped_structural_resolution_skips_private_shared_documentsto assert no private document is returned.Testing
python -m compileall config/graphql/annotation_types.py opencontractserver/tests/test_corpus_cards_structural_document_resolution.pywhich succeeded.python -m pytest opencontractserver/tests/test_corpus_cards_structural_document_resolution.py -qbut test execution was blocked by the environment due toModuleNotFoundError: No module named 'django'so the new test was not executed here.Fix structural annotation document visibilityand include the code and test updates in the described files.Codex Task