Make the async processing service (psv2) the default and update all existing projects #1353

Merged
mihow merged 5 commits into main from chore/enable-async-pipeline-workers-all-projects on Jun 27, 2026

@mihow mihow commented Jun 26, 2026

Collaborator

Summary

This flips the switch on the new async processing service (psv2) platform-wide. It does two things together: every project that already exists is switched over, and every project created from now on uses psv2 by default. Concretely, the async_pipeline_workers feature flag drives whether a project's ML jobs run on workers that pull tasks from the NATS queue (psv2) instead of the synchronous push API; this PR makes that the default and rolls it out to the back-catalogue so operators don't have to opt each project in by hand.

We are flipping the default now because psv2 has proven substantially more reliable and much faster than the synchronous service in operator testing: roughly two orders of magnitude higher throughput on large capture sets (an operational observation on production data, not a controlled benchmark). The remaining psv2 work is tracked, but none of it blocks turning psv2 on by default; the open items are refinements, hardening, and follow-ups rather than correctness gates for the common path.

Operational precondition: this routes every project's ML jobs through async (NATS) workers. Every deployment this change runs against must have psv2 workers live and consuming the queue. Async jobs left stranded without a worker are caught by the stale-job reaper rather than hanging indefinitely.

List of Changes

| Change (user/operator effect) | How (implementation) |
| --- | --- |
| New projects use the async processing service (psv2) by default. | `ProjectFeatureFlags.async_pipeline_workers` default changed from False to True in `ami/main/models.py`. No migration is needed: the `feature_flags` field deconstructs by its `schema=` class reference and default-factory reference, neither of which changes when a default inside the pydantic model changes; `makemigrations --check` confirms no model migration. |
| Every project that already exists is switched over to psv2, without an operator flipping each one by hand. | New data migration `0094_enable_async_pipeline_workers` runs a single DB-side `UPDATE ... jsonb_set(...)` that toggles only the `async_pipeline_workers` key in place for every existing Project, with a `WHERE ... IS DISTINCT FROM` guard so rows already at the target are skipped. Updating the one key server-side (rather than reading the whole JSON into Python and writing it back) leaves the other feature flags untouched even under a concurrent change, and runs as one statement instead of one save per row. |
| The existing-project rollout can be undone in one step. | The data migration's reverse flips the flag back to False for every project (a blanket disable). It does not restore per-project values from before the rollout: some projects may have been opted in individually beforehand, so the reverse returns every project to the off state. The reverse does not change the new model default; that is a code change, reverted by reverting the commit. |
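
The no-migration reasoning above can be illustrated with a simplified stand-in. This is a sketch, not the real field implementation: `SchemaField`, `Flags`, `default_flags`, and `dotted_path` are hypothetical names, and a stdlib dataclass stands in for the pydantic model. It assumes the field records only import-path references to its schema class and default factory, which is what the description states.

```python
from dataclasses import dataclass


def dotted_path(obj):
    """Return the import path that would be recorded for obj."""
    return f"{obj.__module__}.{obj.__qualname__}"


class SchemaField:
    """Hypothetical stand-in for a JSON field backed by a schema class."""

    def __init__(self, schema, default):
        self.schema = schema
        self.default = default

    def deconstruct(self):
        # Only references are recorded, never the values inside the schema.
        return {
            "schema": dotted_path(self.schema),
            "default": dotted_path(self.default),
        }


@dataclass
class Flags:
    async_pipeline_workers: bool = False  # old default


def default_flags():
    # Looks up Flags at call time, so it follows the class redefinition below.
    return Flags()


before = SchemaField(Flags, default_flags).deconstruct()
value_before = default_flags().async_pipeline_workers


@dataclass
class Flags:  # same import path, new default
    async_pipeline_workers: bool = True


after = SchemaField(Flags, default_flags).deconstruct()
value_after = default_flags().async_pipeline_workers
```

The recorded description is identical before and after, while the value produced by the default factory changes. Under that assumption, a migration autodetector comparing the two descriptions would see nothing to migrate.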

Notes

  • The default change and the data migration are complementary: the default only affects rows created after deploy, so the data migration is still required to cover projects that already exist.
  • Validated on a fresh test database and end-to-end on a dev box: the data migration enables the flag for all projects and the reverse clears it, while an unrelated flag set on a project is preserved through both directions. The factory that backs new-project creation now returns async_pipeline_workers = True with the other defaults unchanged.
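
The difference between rewriting the whole flags value and updating one key can be sketched with plain dictionaries. This is an in-memory analogy only, not the migration code: `read_modify_write`, `set_single_key`, `enable_other_flag`, and the `other_flag` key are hypothetical, and the "concurrent" write is simulated by a callback that runs between the read and the write.

```python
import copy


def read_modify_write(store, project_id, concurrent_write=None):
    """Whole-value rewrite: load all flags, change one, save all of them."""
    snapshot = copy.deepcopy(store[project_id])  # read
    if concurrent_write is not None:
        concurrent_write(store)  # another process commits a change here
    snapshot["async_pipeline_workers"] = True  # modify
    store[project_id] = snapshot  # write back the stale copy


def set_single_key(store, project_id, concurrent_write=None):
    """Key-level update: the dict analogue of setting one JSONB path."""
    if concurrent_write is not None:
        concurrent_write(store)
    store[project_id]["async_pipeline_workers"] = True


def enable_other_flag(store):
    # Hypothetical sibling flag changed by someone else during the rollout.
    store[1]["other_flag"] = True


loop_store = {1: {"async_pipeline_workers": False, "other_flag": False}}
read_modify_write(loop_store, 1, enable_other_flag)

keyed_store = {1: {"async_pipeline_workers": False, "other_flag": False}}
set_single_key(keyed_store, 1, enable_other_flag)
```

In the whole-value version the sibling change is overwritten by the stale snapshot; in the key-level version it survives. A real database adds locking and transaction semantics that this sketch does not model.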

Known follow-ups (PSv2) — none blocking this change

psv2 is the new async/distributed ML backend tracked under the umbrella issue #515 and the PSv2 label. The items below are known and open at the time of this change. None of them block making psv2 the default — they are hardening, performance, and feature-completeness follow-ups. Full lists: Antenna PSv2 issues · Antenna PSv2 PRs · ADC (ami-data-companion) PRs.

Reliability / job-state correctness (Antenna)

Performance / scaling (Antenna)

Auth & permissions (Antenna + ADC)

Pipeline config & result contract (in flight)

Docs, templates, and infra (in flight)

Summary by CodeRabbit

  • New Features

    • Async pipeline workers are now enabled by default for new projects.
    • Existing projects have been updated to use the new default setting automatically.
  • Bug Fixes

    • Improved consistency in job dispatch behavior by making related tests set the project flag explicitly.

Turn on the `async_pipeline_workers` feature flag for every project that
exists at deploy time, rolling out async ML processing (workers that pull
tasks from the NATS queue instead of the synchronous push API) across the
whole platform at once.

The flag lives in the `feature_flags` JSONB column. The data migration reads
each project's flags, sets the one boolean, and writes it back, leaving the
other feature flags untouched. The reverse flips the flag back off for every
project. New projects keep the model default of False until opted in
separately.

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 26, 2026 22:59
@netlify

netlify Bot commented Jun 26, 2026

Deploy Preview for antenna-ssec canceled.

Name Link
🔨 Latest commit 29cf619
🔍 Latest deploy log https://app.netlify.com/projects/antenna-ssec/deploys/6a3f153c035f360009d600ab

@netlify

netlify Bot commented Jun 26, 2026

👷 Deploy Preview for antenna-preview processing.

Name Link
🔨 Latest commit 276d7c8
🔍 Latest deploy log https://app.netlify.com/projects/antenna-preview/deploys/6a3f043c5431e00008b6db68

@netlify

netlify Bot commented Jun 26, 2026

Deploy Preview for antenna-preview canceled.

Name Link
🔨 Latest commit 29cf619
🔍 Latest deploy log https://app.netlify.com/projects/antenna-preview/deploys/6a3f153bf3b46300080b06f4

@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Contributor

Review Change Stack

Caution

Review failed

Pull request was closed or merged during review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 99e28375-b79b-4dfc-9cf5-ce6c711c7527

📥 Commits

Reviewing files that changed from the base of the PR and between 276d7c8 and 29cf619.

📒 Files selected for processing (3)
  • ami/jobs/tests/test_jobs.py
  • ami/main/migrations/0094_enable_async_pipeline_workers.py
  • ami/main/models.py

📝 Walkthrough

Walkthrough

Updates the async_pipeline_workers default to True, adds a migration that updates existing Project.feature_flags values with SQL, and adjusts one job test to set the flag explicitly before job creation.

Changes

Project flag rollout

| Layer / File(s) | Summary |
| --- | --- |
| Default and data migration: ami/main/models.py, ami/main/migrations/0094_enable_async_pipeline_workers.py | Sets the ProjectFeatureFlags.async_pipeline_workers default to True and updates existing projects through a SQL-backed RunPython migration with forward and reverse toggles. |
| Job test flag setup: ami/jobs/tests/test_jobs.py | Updates the ML job dispatch test to disable async_pipeline_workers on the project before creating the auto-sync job. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A bunny hops on fields of code,
Where flags now glow in default mode.
Old rows shift with SQL light,
New jobs hop straight and land just right.
🐰

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: making psv2 default and rolling it out to existing projects. |
| Description check | ✅ Passed | It covers the summary, change list, notes, and deployment/testing context, though some template sections like issues/screenshots/checklist are absent. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Contributor

🧹 Nitpick comments (1)
ami/main/migrations/0094_enable_async_pipeline_workers.py (1)

20-37: 🗄️ Data Integrity & Integration | 🔵 Trivial | ⚡ Quick win

Prefer an atomic JSONB update for this rollout.

The read/modify/save loop writes the whole feature_flags value back per project, so a concurrent update to another flag between Line 23/33 and Line 27/37 can be lost. A DB-side jsonb_set update also avoids caching every Project instance and issuing one save per row.

♻️ Proposed direction
+def set_async_pipeline_workers(apps, schema_editor, enabled):
+    Project = apps.get_model("main", "Project")
+    table = schema_editor.connection.ops.quote_name(Project._meta.db_table)
+    with schema_editor.connection.cursor() as cursor:
+        cursor.execute(
+            f"""
+            UPDATE {table}
+            SET feature_flags = jsonb_set(
+                COALESCE(feature_flags, '{{}}'::jsonb),
+                '{{async_pipeline_workers}}',
+                to_jsonb(%s::boolean),
+                true
+            )
+            WHERE COALESCE((feature_flags->>'async_pipeline_workers')::boolean, false)
+                  IS DISTINCT FROM %s
+            """,
+            [enabled, enabled],
+        )
+
+
 def enable_async_pipeline_workers(apps, schema_editor):
-    Project = apps.get_model("main", "Project")
-    for project in Project.objects.all():
-        flags = project.feature_flags
-        if not flags.async_pipeline_workers:
-            flags.async_pipeline_workers = True
-            project.feature_flags = flags
-            project.save(update_fields=["feature_flags"])
+    set_async_pipeline_workers(apps, schema_editor, True)
 
 
 def disable_async_pipeline_workers(apps, schema_editor):
-    Project = apps.get_model("main", "Project")
-    for project in Project.objects.all():
-        flags = project.feature_flags
-        if flags.async_pipeline_workers:
-            flags.async_pipeline_workers = False
-            project.feature_flags = flags
-            project.save(update_fields=["feature_flags"])
+    set_async_pipeline_workers(apps, schema_editor, False)
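
The guard and update semantics of the SQL proposed above can be modelled in plain Python to see which rows are touched. This is a sketch of the statement in the review comment, not of the merged migration: `apply_rollout` is a hypothetical helper, a list of dicts stands in for the table, and `None` stands in for a SQL NULL column.

```python
def apply_rollout(rows, enabled):
    """Mimic the proposed UPDATE over a list of feature_flags values.

    Returns the number of rows the WHERE guard let through.
    """
    updated = 0
    for i, flags in enumerate(rows):
        current = (flags or {}).get("async_pipeline_workers")
        # COALESCE(..., false): a missing key or NULL column counts as False.
        effective = False if current is None else current
        # IS DISTINCT FROM: skip rows already at the target value.
        if effective != enabled:
            # jsonb_set with create-if-missing: add or overwrite only this key.
            rows[i] = {**(flags or {}), "async_pipeline_workers": enabled}
            updated += 1
    return updated


rows = [
    {"async_pipeline_workers": False, "other": True},  # flipped, sibling kept
    {"async_pipeline_workers": True},                  # already at target
    {},                                                # key missing
    None,                                              # NULL column
]
first_pass = apply_rollout(rows, True)
second_pass = apply_rollout(rows, True)
```

Running the rollout a second time touches no rows, which is the point of the guard. Note that under this guard a missing key is treated as False, so a reverse run with `enabled=False` would skip such rows rather than writing an explicit False.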
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/migrations/0094_enable_async_pipeline_workers.py` around lines 20 -
37, The `enable_async_pipeline_workers` and `disable_async_pipeline_workers`
migration helpers currently read/modify/save `Project.feature_flags` in Python,
which can overwrite concurrent flag changes and is inefficient per row. Update
these rollout functions to use an atomic DB-side JSONB update on `feature_flags`
(for example via an `update()` with `jsonb_set`-style logic) so only
`async_pipeline_workers` is toggled in place without loading each `Project`
instance or rewriting the full JSON object.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9c5db873-d1d4-45fa-a0a1-b151588515b1

📥 Commits

Reviewing files that changed from the base of the PR and between 08ca0a4 and 276d7c8.

📒 Files selected for processing (1)
  • ami/main/migrations/0094_enable_async_pipeline_workers.py

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces a Django data migration that globally enables async ML processing for all existing projects by setting the feature_flags.async_pipeline_workers flag to True, with a reversible migration that disables it again.

Changes:

  • Add migration 0094_enable_async_pipeline_workers to enable async_pipeline_workers for every existing Project.
  • Add reverse migration logic to disable async_pipeline_workers for every Project.

Comment threads on ami/main/migrations/0094_enable_async_pipeline_workers.py (4, all outdated)

Flip the `async_pipeline_workers` default in `ProjectFeatureFlags` to True so
projects created from now on use the async processing service (psv2) — workers
that pull tasks from the NATS queue — without an operator opting them in.

No migration is required: the `feature_flags` field deconstructs by its schema
class and default-factory references, neither of which changes when a default
inside the pydantic model changes (`makemigrations --check` reports no changes).
The companion data migration handles existing projects.

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow mihow changed the title from "Turn on async ML processing for all existing projects" to "Make the async processing service (psv2) the default and switch all existing projects to it" Jun 26, 2026
@mihow mihow changed the title from "Make the async processing service (psv2) the default and switch all existing projects to it" to "Make the async processing service (psv2) the default and update all existing projects" Jun 26, 2026
Address CodeRabbit review on the data migration: replace the read/modify/save
loop with a single DB-side `jsonb_set` UPDATE that toggles only the
`async_pipeline_workers` key.

Updating the one key server-side leaves the other feature flags untouched even
if another process changes one of them during the deploy (the previous loop
rewrote the whole JSONB value and could clobber a concurrent sibling change),
and it runs as one statement instead of one save per row. A `WHERE ... IS
DISTINCT FROM` guard skips rows already at the target value.

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow

mihow commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator Author

Claude says: Addressed the data-migration nitpick in bd4d8814 — switched the rollout to a single DB-side jsonb_set UPDATE that toggles only the async_pipeline_workers key, with a WHERE ... IS DISTINCT FROM guard so already-correct rows are skipped. This removes the read/modify/save loop, so a concurrent change to a sibling flag during deploy can't be clobbered, and it runs as one statement instead of one save per row. Validated on a fresh test DB: flips both directions and preserves an unrelated flag set on a project.

`test_ml_job_dispatch_mode_set_on_creation` asserted that an ML job on a
default project dispatches via sync_api, which relied on
`async_pipeline_workers` defaulting to False. Now that the default is True,
the sync branch must set the flag off explicitly to exercise that path — the
async branch already sets it on. Pins both transitions instead of leaning on
the default.

Co-Authored-By: Claude <noreply@anthropic.com>
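
The test change described in the commit above can be sketched in isolation. This is an illustrative stand-in, not the project's test: `FeatureFlags` and `dispatch_mode` are hypothetical, and `"async_api"` is an assumed name for the async mode (only `sync_api` appears in the commit message).

```python
from dataclasses import dataclass


@dataclass
class FeatureFlags:
    async_pipeline_workers: bool = True  # the new default


def dispatch_mode(flags):
    # Hypothetical selector; "async_api" is an assumed mode name.
    return "async_api" if flags.async_pipeline_workers else "sync_api"


# Relying on the default no longer exercises the sync path.
default_mode = dispatch_mode(FeatureFlags())

# Pinning the flag in each branch keeps both paths covered
# regardless of what the default is.
sync_mode = dispatch_mode(FeatureFlags(async_pipeline_workers=False))
async_mode = dispatch_mode(FeatureFlags(async_pipeline_workers=True))
```

Setting the flag explicitly in both branches means the test keeps passing if the default changes again.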
The reverse is a blanket disable; some projects may have had the flag enabled
individually before this rollout, so the docstring no longer claims no project
was True beforehand — it states the reverse returns every project to the off
state rather than to its prior value.

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow mihow merged commit cdcb3b1 into main Jun 27, 2026
6 of 7 checks passed
@mihow mihow deleted the chore/enable-async-pipeline-workers-all-projects branch June 27, 2026 00:13

2 participants