
agent/engine: keep cron/hook client alive for live background tasks #141

Merged: pufit merged 1 commit into ClickHouse:main from polyglotAI-bot:polyglot/cron-bg-task-keepalive on Jun 23, 2026

@polyglotAI-bot

Contributor

Summary

One-shot runs (run_cron / run_hook) tore down the SDK client the instant the agent yielded its turn, even when a Bash(run_in_background) task was still running. Discarding the client kills both the CLI subprocess and the idle-stream watcher that delivers the background task's completion turn, so the task is orphaned and the run never resumes to finish its work (push, update state, notify). This stranded isolated cron workers: a run kicked off a build in the background, ended its turn to "await the completion notification", and was never woken.

Route the three one-shot teardowns through a new _teardown_oneshot_client() that skips the discard while _has_live_background_tasks() is true — the same predicate run_idle_client_sweep already uses to keep such clients alive (#125). The watcher then delivers the completion turn (the agent resumes and finishes) and the idle sweep reaps the client once the task settles — exactly the lifecycle a web/interactive session already has.
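
As a rough illustration of the guard described above, here is a minimal sketch. Only the names _teardown_oneshot_client, _has_live_background_tasks and keepalive_if_bg come from this PR; the ToyEngine class, its fields, the _discard_client helper and the boolean return value are assumptions made for the example and may not match the real engine.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEngine:
    """Hypothetical stand-in for the engine; fields are illustrative only."""
    clients: dict = field(default_factory=dict)        # session_id -> client
    live_bg_tasks: dict = field(default_factory=dict)  # session_id -> set of task ids

    def _has_live_background_tasks(self, session_id):
        # True while at least one background task is still running.
        return bool(self.live_bg_tasks.get(session_id))

    def _discard_client(self, session_id):
        # Assumed helper: drops the client (and, in the real engine,
        # its subprocess and idle-stream watcher).
        self.clients.pop(session_id, None)

    def _teardown_oneshot_client(self, session_id, keepalive_if_bg=True):
        """Return True if the client was discarded, False if it was parked."""
        if keepalive_if_bg and self._has_live_background_tasks(session_id):
            return False  # leave it for the watcher and the idle sweep
        self._discard_client(session_id)
        return True


engine = ToyEngine(
    clients={"cron-a": object(), "cron-b": object()},
    live_bg_tasks={"cron-a": {"build-1"}},
)
parked = not engine._teardown_oneshot_client("cron-a")  # live task: kept
discarded = engine._teardown_oneshot_client("cron-b")   # no task: discarded
```

In this sketch the one-shot path never blocks on the background task; it simply declines to discard and relies on the existing idle sweep to clean up later.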

Why this is minimal

The capability already worked for web/persistent sessions: the idle-stream watcher delivers a background task's completion as an autonomous turn, and the idle sweep already refuses to discard a client with a live background task. The only gap was the one-shot teardown discarding unconditionally. This makes the cron/hook teardown consistent with that existing guard.
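
The existing guard in the idle sweep can be pictured roughly as follows. This is a sketch under stated assumptions: the function name sweep_idle_clients, its parameters and the timestamp bookkeeping are hypothetical and are not the signature of run_idle_client_sweep.

```python
def sweep_idle_clients(clients, live_bg_tasks, last_active, now, idle_timeout):
    """Discard idle clients, skipping any with a live background task.

    All arguments are plain dicts/floats invented for this example.
    Returns the list of session ids that were reaped.
    """
    reaped = []
    for session_id in list(clients):
        if live_bg_tasks.get(session_id):
            continue  # live background task: never reap
        if now - last_active[session_id] >= idle_timeout:
            del clients[session_id]
            reaped.append(session_id)
    return reaped


clients = {"web-1": object(), "cron-a": object()}
live_bg_tasks = {"cron-a": {"build-1"}}
last_active = {"web-1": 0.0, "cron-a": 0.0}

# First pass: cron-a is idle but has a live task, so only web-1 is reaped.
first = sweep_idle_clients(clients, live_bg_tasks, last_active,
                           now=600.0, idle_timeout=300.0)

# The task settles; the next pass can reap the parked cron client.
live_bg_tasks["cron-a"].discard("build-1")
second = sweep_idle_clients(clients, live_bg_tasks, last_active,
                            now=900.0, idle_timeout=300.0)
```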

run_persistent_cron is intentionally excluded

Its session_id is stable and reused across runs, so parking the client on a live background task would let the next scheduled run reuse the same client/conversation while the prior task is still in flight. Keep-alive is therefore restricted to the unique-per-run isolated paths (run_cron, run_hook); persistent crons discard as before (keepalive_if_bg=False).
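
The collision risk can be shown with a toy client registry. The session-id formats and the client_for helper below are invented for illustration; the PR does not specify how session ids are actually built.

```python
import uuid


def isolated_session_id(job_name):
    # Hypothetical format: unique per run, so a parked client is never reused.
    return f"cron:{job_name}:{uuid.uuid4().hex}"


def persistent_session_id(job_name):
    # Hypothetical format: stable across runs of the same job.
    return f"cron:{job_name}"


def client_for(clients, session_id):
    # Return the existing client for this session, or create a new one.
    return clients.setdefault(session_id, {"session_id": session_id, "turns": []})


clients = {}

# Two consecutive persistent runs land on the same client object, so a
# parked client with an in-flight task would be shared with the next run.
run1 = client_for(clients, persistent_session_id("nightly"))
run2 = client_for(clients, persistent_session_id("nightly"))

# Two isolated runs each get their own client.
iso1 = client_for(clients, isolated_session_id("nightly"))
iso2 = client_for(clients, isolated_session_id("nightly"))
```

Here run1 and run2 are the same object while iso1 and iso2 are distinct, which is why the sketch only parks clients on the isolated paths.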

Out of scope (pre-existing)

A background task that never settles parks the subprocess indefinitely — but that is already true for web/interactive sessions today (the sweep skips live-bg sessions unconditionally). Bounding it is a separate, general idle-sweep change for all session types, not cron-specific.

Test plan

  • run_cron / run_hook keep the client alive when a background task is live; run_persistent_cron discards even with a live task.
  • End-to-end: driving the real _idle_stream_watcher against a parked client, a background-completion sequence is delivered as exactly one autonomous resume turn, and the task flips out of "live" so the sweep can reap it.
  • No regression: the affected engine/cron/session suites pass (247 tests). The full suite is green except for 7 pre-existing, unrelated test_codex_* failures (confirmed failing on main without this change).

🤖 Generated with Claude Code

One-shot runs (run_cron / run_hook) discarded the SDK client in their
finally the instant the agent yielded — even with a Bash(run_in_background)
task still running. Discarding kills the subprocess and the idle-stream
watcher that delivers the task's completion turn, so the background task is
orphaned and the run never resumes to finish its work (push, update state,
notify). This stranded isolated cron workers.

Route the three one-shot teardowns through a new _teardown_oneshot_client()
that skips the discard while _has_live_background_tasks() is true — the same
predicate run_idle_client_sweep already uses to keep such clients alive
(ClickHouse#125). The watcher then delivers the completion turn and the idle sweep
reaps the client once the task settles, exactly as for a web/interactive
session.

run_persistent_cron is excluded (keepalive_if_bg=False): its session_id is
stable and reused across runs, so parking the client would let the next run
collide with the still-in-flight task on the same conversation. Keep-alive is
only safe for the unique-per-run isolated paths.

Tests: cron/hook keep the client alive on a live bg task; persistent-cron
still discards; plus an end-to-end test driving the real idle-stream watcher
to resume a parked session when its background task completes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@pufit pufit merged commit 1aadf19 into ClickHouse:main Jun 23, 2026