44 changes: 44 additions & 0 deletions .github/scripts/podman-cr.sh
@@ -0,0 +1,44 @@
#!/usr/bin/env bash
# Run the gated Podman checkpoint/restore tests inside a modern-podman
# container (invoked by ci.yml's test-integration-podman job, which docker-runs
# this privileged + --cgroupns=host so CRIU can use the runner's kernel).
#
# The hosted runner's own apt podman is unusable (24.04 has no criu; 22.04
# ships podman 3.4.4 whose runtime can't checkpoint and predates the libpod
# v5 API), so we bring podman 5.x + crun + criu via the container image and
# only need the runner for its kernel + Docker.
#
# Skips GREEN (exit 0 + ::warning::) if this runner can't actually
# checkpoint — e.g. the nested cgroup freezer is not permitted. Runs the
# tests for real (failing red) only once a checkpoint smoke test proves the
# environment is capable. Real C/R is also validated locally on podman 5.x +
# criu (OrbStack).
set -uo pipefail

dnf install -y -q criu iptables >/dev/null 2>&1 \
|| { echo "::warning::criu install failed in container — skipping C/R run"; exit 0; }
echo "stack: $(podman --version) / $(criu --version | head -1) / $(crun --version | head -1)"

criu check || { echo "::warning::criu check failed on this runner kernel — skipping C/R run"; exit 0; }

mkdir -p /etc/containers /run/podman
printf '[engine]\nevents_logger="file"\nruntime="crun"\n' > /etc/containers/containers.conf
podman system service --time=0 unix:///run/podman/podman.sock &
for _ in $(seq 1 30); do [ -S /run/podman/podman.sock ] && break; sleep 1; done
test -S /run/podman/podman.sock || { echo "::warning::podman service socket did not come up — skipping"; exit 0; }

# Capability smoke test: can this runner actually freeze + dump a container?
# Nested CRIU frequently can't ("Unable to freeze tasks: Operation not
# permitted"). If it can't, skip green with the real reason rather than fail.
podman run -d --name smoke docker.io/library/alpine:3.20 sleep 600 >/dev/null
sleep 2
if ! podman container checkpoint smoke >/tmp/ckpt.log 2>&1; then
  echo "::warning::this runner cannot checkpoint a container (likely cgroup freezer perms): $(tail -1 /tmp/ckpt.log) — skipping. Real C/R is validated locally on podman 5.x + criu."
  exit 0
fi
podman rm -f smoke >/dev/null 2>&1 || true
echo "checkpoint smoke passed — running the gated tests for real"

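export PODMAN_SOCKET=unix:///run/podman/podman.sock
# `set -e` is not enabled above, so track failures explicitly: otherwise a
# failing podman.test would be masked by a passing int.test (the script's
# exit status is that of its last command).
rc=0
/w/podman.test -test.run TestIntegration -test.v -test.timeout 15m || rc=1
/w/int.test -test.run '^TestPodman' -test.v -test.timeout 15m || rc=1
exit "$rc"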
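
The script exports `PODMAN_SOCKET` before invoking the gated test binaries. The gate itself is not shown in this diff, so the following is only a sketch of how such a gate might parse the variable. `podmanSocketPath` is a hypothetical helper name, and treating a non-`unix://` value as "skip" is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// podmanSocketPath is a hypothetical helper (not taken from this PR). It turns
// a PODMAN_SOCKET value such as "unix:///run/podman/podman.sock" into a
// filesystem path. It reports false when the value is empty or is not a
// unix:// URL, which a gated test would treat as "skip".
func podmanSocketPath(value string) (string, bool) {
	path, ok := strings.CutPrefix(value, "unix://")
	if !ok || path == "" {
		return "", false
	}
	return path, true
}

func main() {
	path, ok := podmanSocketPath(os.Getenv("PODMAN_SOCKET"))
	if !ok {
		// In a real test this branch would call t.Skip(...).
		fmt.Println("PODMAN_SOCKET not set; skipping checkpoint/restore tests")
		return
	}
	fmt.Println("running against", path)
}
```

Keeping the parse step as a pure function makes the gate easy to unit-test without a live Podman service.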
39 changes: 39 additions & 0 deletions .github/workflows/ci.yml
@@ -125,6 +125,45 @@ jobs:
go test -race -count=1 -tags=integration -timeout=15m \
-run "$pattern" ./test/integration/...

# Real Podman + CRIU checkpoint/restore. The runtime/podman backend and
# the Engine checkpoint/restore + project-orchestrator paths only execute
# against a live Podman socket with CRIU (PODMAN_SOCKET-gated). The
# cross-node test (TestPodmanXNode_*) needs two hosts, so it skips here
# (no DCCKPT_XNODE_DIR) — run it on two machines by hand.
#
# Why a container: the hosted runner's apt podman is unusable (24.04 has
# no criu; 22.04's podman 3.4.4 can't checkpoint and predates the libpod
# v5 API). So we build the gated tests on the runner (compile coverage,
# static CGO_ENABLED=0 so they run on Fedora), then run them INSIDE a
# modern-podman container (podman 5.x + crun + criu) that's privileged +
# --cgroupns=host so CRIU can drive the runner's kernel. The script
# smoke-tests an actual checkpoint first and skips green (with the real
# reason) if this runner can't — e.g. nested cgroup-freezer perms — so a
# capable runner runs for real while an incapable one stays green.
  test-integration-podman:
    runs-on: ubuntu-latest
    needs: [lint, test-linux]
    steps:
      - uses: actions/checkout@v6
      - uses: actions/setup-go@v6
        with:
          go-version: "1.25"
          cache: true
      - name: Build gated test binaries (static; compile coverage + run in container)
        env:
          CGO_ENABLED: "0"
        run: |
          go test -tags=integration -c ./test/integration -o ./int.test
          go test -c ./runtime/podman -o ./podman.test
      - name: Checkpoint/restore in a modern-podman privileged container
        run: |
          docker run --rm --privileged --cgroupns=host \
            --security-opt seccomp=unconfined \
            --security-opt apparmor=unconfined \
            --security-opt label=disable \
            -v "$PWD":/w -w /w \
            quay.io/podman/stable bash /w/.github/scripts/podman-cr.sh

# Integration tests against a live Apple `container` daemon.
#
# Verified-on-CI status:
4 changes: 4 additions & 0 deletions .gitignore
@@ -4,6 +4,10 @@
*.so
*.dylib

# Locally-built CLI/example binaries (extension-less Mach-O/ELF outputs)
/dap
/devcontainer

# Test artifacts
*.test
*.out
18 changes: 16 additions & 2 deletions README.md
@@ -37,8 +37,15 @@ The container backend is pluggable. Pick one at engine construction time:
embedded Swift bridge (`libACBridge.dylib`, dlopen'd at runtime).
Lets you run devcontainers on Apple Silicon without Docker
Desktop.

Both backends implement the same `runtime.Runtime` interface — the
- **`runtime/podman`** — Podman over its docker-compatible socket (the
same `moby/moby/client`, embedded), plus CRIU-backed
**checkpoint/restore** (`runtime.CheckpointRuntime`) driven through the
libpod REST API on that one socket. The only backend that can migrate a
running devcontainer — process + memory — to another node; see
[`design/checkpoint-restore.md`](design/checkpoint-restore.md). Linux
only; needs a running `podman system service`.

All three backends implement the same `runtime.Runtime` interface — the
engine, feature pipeline, lifecycle, and compose paths don't care which
one you wire in.

@@ -177,6 +184,13 @@ Requires:
system start` already up. Swift toolchain only if you're
building the bridge from source — releases embed the
pre-built dylib.
- **Podman:** Linux only. A running `podman system service` exposing
its socket (it serves both the docker-compatible and libpod APIs);
point the backend at it with `podman.Options{Socket}`.
Checkpoint/restore additionally requires `criu` (a CRIU-capable
kernel and an OCI runtime such as `crun`/`runc`). No BuildKit:
in-container builds go through buildah, so pre-built/pulled images
are the fast path.

## Quick start

35 changes: 22 additions & 13 deletions attach.go
@@ -63,17 +63,30 @@ func (e *Engine) AttachWith(ctx context.Context, id WorkspaceID, opts AttachOpti
return nil, err
}

// Reconstruct just enough config for the substituter. Attach can't
// reproduce the full ResolvedConfig (the source devcontainer.json may
// have changed since Up); callers that need it should Resolve again.
return e.reattachWorkspace(ctx, details, id, opts.LocalEnv), nil
}

// reattachWorkspace rebuilds a *Workspace from an already-inspected,
// running container. It is shared by Attach (container found by label)
// and Restore (container freshly imported from a checkpoint archive):
// both have a live container and need the same MINIMAL config + bound
// substituter + userEnv probe, without re-reading devcontainer.json.
//
// It reconstructs just enough config for the substituter (Attach can't
// reproduce the full ResolvedConfig — the source devcontainer.json may
// have changed since Up; callers that need it should Resolve again),
// folds in the image's merged-config metadata label so callers see the
// same RemoteUser / lifecycle hooks / probe config as Up, and re-probes
// userEnv so subsequent Exec calls see the user's rc-file PATH additions.
//
// id stamps the workspace and cfg.DevcontainerID; localEnv may be nil
// (falls back to os.Environ()).
func (e *Engine) reattachWorkspace(ctx context.Context, details *runtime.ContainerDetails, id WorkspaceID, localEnv map[string]string) *Workspace {
cfg := configFromContainerLabels(details)
cfg.DevcontainerID = string(id)

// The container's image carries the merged-config metadata label
// from when Up created it; folding it in here means Attach-only
// callers see the same RemoteUser / lifecycle hooks / probe config
// as Up. Failures to read or parse the label are non-fatal — Attach
// then gives back a minimal cfg as before.
// Reading or parsing the metadata label is best-effort: failures
// leave baseLayers nil and we fall back to the minimal cfg.
var baseLayers []config.FeatureMetadata
if details.Image != "" {
if imgDetails, err := e.runtime.InspectImage(ctx, details.Image); err == nil && imgDetails != nil {
@@ -84,7 +97,6 @@ func (e *Engine) AttachWith(ctx context.Context, id WorkspaceID, opts AttachOpti
}
}
}
localEnv := opts.LocalEnv
if localEnv == nil {
localEnv = environAsMap(os.Environ())
}
@@ -104,11 +116,8 @@ func (e *Engine) AttachWith(ctx context.Context, id WorkspaceID, opts AttachOpti
subst: newSubstituter(cfg, details, localEnv),
}

// Re-probe on attach so subsequent Exec calls see PATH additions
// from the user's rc files. The original Up populated probedEnv,
// but a fresh Attach doesn't share that workspace value.
if probed, err := e.probeUserEnv(ctx, ws, cfg.UserEnvProbe); err == nil {
ws.probedEnv = probed
}
return ws, nil
return ws
}
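
`reattachWorkspace` falls back to `environAsMap(os.Environ())` when `localEnv` is nil. The body of `environAsMap` is not part of this diff, so the sketch below only assumes that it splits each `KEY=VALUE` entry on the first `=`. The names `environAsMapSketch` and `resolveLocalEnv` are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// environAsMapSketch is an assumed stand-in for the repository's
// environAsMap helper, whose real body is not shown in this diff. It splits
// each "KEY=VALUE" entry on the first '=' and skips malformed entries.
func environAsMapSketch(environ []string) map[string]string {
	out := make(map[string]string, len(environ))
	for _, kv := range environ {
		key, value, ok := strings.Cut(kv, "=")
		if !ok || key == "" {
			continue // no '=' or empty key: not a usable variable
		}
		out[key] = value
	}
	return out
}

// resolveLocalEnv mirrors the nil check in reattachWorkspace: a caller-pinned
// map wins, otherwise the current process environment is used.
func resolveLocalEnv(localEnv map[string]string) map[string]string {
	if localEnv == nil {
		return environAsMapSketch(os.Environ())
	}
	return localEnv
}

func main() {
	pinned := map[string]string{"HOME": "/home/dev"}
	fmt.Println("pinned HOME:", resolveLocalEnv(pinned)["HOME"])
	fmt.Println("process env entries:", len(resolveLocalEnv(nil)))
}
```

Pinning the map matters most on a cross-node restore, where the destination's process environment may differ from the source's.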
139 changes: 139 additions & 0 deletions checkpoint.go
@@ -0,0 +1,139 @@
package devcontainer

import (
"context"
"fmt"

"github.com/crunchloop/devcontainer/runtime"
)

// CheckpointOptions configures Engine.Checkpoint.
type CheckpointOptions struct {
// ArchivePath is where the portable checkpoint archive is written.
// Required. Point it at durable, transferable storage (the workspace
// volume, object storage) — the archive is self-contained, so a
// later Restore can run on a different node by moving this file.
ArchivePath string

// StopAfter stops/removes the container once the archive is written
// — the spot-eviction path, where the node is going away anyway.
// False keeps the container running ("backup" checkpoint).
StopAfter bool

// TCPEstablished requests checkpoint of established TCP connections.
// Recommended true for devcontainers: a container holding a live
// connection at checkpoint time fails to checkpoint without it.
TCPEstablished bool
}

// RestoreOptions configures Engine.Restore.
type RestoreOptions struct {
// ArchivePath is the archive a prior Checkpoint wrote. Required.
ArchivePath string

// Name optionally names the restored container.
Name string

// TCPEstablished must match the checkpoint when the archive captured
// established connections.
TCPEstablished bool

// LocalEnv overrides os.Environ() for the reattached workspace's
// substituter localEnv pass. Nil means use the current process
// environment — matches AttachOptions.LocalEnv. On a cross-node
// restore the destination's env may differ from the source's, so a
// caller that cares can pin it here.
LocalEnv map[string]string
}

// Checkpoint writes a portable checkpoint archive for the workspace's
// container (process + memory state plus the writable rootfs), so it can
// later be restored — possibly on another node — by Restore.
//
// Returns ErrCheckpointUnsupported (wrapped) if the active backend does
// not implement runtime.CheckpointRuntime or advertises
// Capabilities().Checkpoint == false. Callers can errors.Is against
// runtime.ErrCheckpointUnsupported and fall back to a cold path.
//
// Checkpoint is the primitive; deciding *when* to checkpoint (e.g. on a
// spot-reclaim notice) is the caller's job.
func (e *Engine) Checkpoint(ctx context.Context, ws *Workspace, opts CheckpointOptions) (runtime.CheckpointRef, error) {
if err := ctxIfDone(ctx); err != nil {
return runtime.CheckpointRef{}, err
}
if ws == nil || ws.Container == nil {
return runtime.CheckpointRef{}, fmt.Errorf("Checkpoint: workspace has no container")
}
if opts.ArchivePath == "" {
return runtime.CheckpointRef{}, fmt.Errorf("Checkpoint: ArchivePath is required")
}

cr, ok := e.runtime.(runtime.CheckpointRuntime)
if !ok || !e.runtime.Capabilities().Checkpoint {
return runtime.CheckpointRef{}, fmt.Errorf("Checkpoint: %w", runtime.ErrCheckpointUnsupported)
}

ref, err := cr.Checkpoint(ctx, ws.Container.ID, runtime.CheckpointSpec{
ArchivePath: opts.ArchivePath,
StopAfter: opts.StopAfter,
TCPEstablished: opts.TCPEstablished,
})
if err != nil {
return runtime.CheckpointRef{}, fmt.Errorf("checkpoint: %w", err)
}
return ref, nil
}

// Restore re-creates and resumes a container from a checkpoint archive
// written by Checkpoint, reconstructing its mounts and re-attaching
// networking, then rebuilds the *Workspace around it. The original
// container may be gone (the migration case).
//
// The returned Workspace has the MINIMAL config Attach produces — the
// devcontainer labels the checkpoint archive preserves plus the image's
// merged-config metadata — with the substituter bound to the restored
// container's live env and userEnv re-probed. It is enough to drive Exec
// and Down; callers needing the full devcontainer.json view should
// Resolve from source. See the Workspace type docs.
//
// Returns ErrCheckpointUnsupported (wrapped) when the backend can't, and
// a *runtime.RestoreFailedError (from the backend) on a restore failure
// — distinct from a cold-start failure, so callers can fall back to a
// cold Up on the (intact) workspace volume.
func (e *Engine) Restore(ctx context.Context, opts RestoreOptions) (*Workspace, error) {
if err := ctxIfDone(ctx); err != nil {
return nil, err
}
if opts.ArchivePath == "" {
return nil, fmt.Errorf("Restore: ArchivePath is required")
}

cr, ok := e.runtime.(runtime.CheckpointRuntime)
if !ok || !e.runtime.Capabilities().Checkpoint {
return nil, fmt.Errorf("Restore: %w", runtime.ErrCheckpointUnsupported)
}

c, err := cr.Restore(ctx, runtime.RestoreSpec{
ArchivePath: opts.ArchivePath,
Name: opts.Name,
TCPEstablished: opts.TCPEstablished,
})
if err != nil {
return nil, fmt.Errorf("restore: %w", err)
}

// Reattach: the restored container carries the devcontainer labels
// from the archive, so rebuild the Workspace the same way Attach
// does. inspectStable absorbs the post-restore state lag (the daemon
// reports state asynchronously after import-and-start). The workspace
// id is recovered from the container's label.
details, err := e.inspectStable(ctx, c.ID)
if err != nil {
return nil, fmt.Errorf("restore: inspect restored container %s: %w", c.ID, err)
}
id := WorkspaceID(details.Labels[LabelDevcontainerID])
if id == "" {
return nil, fmt.Errorf("restore: restored container %s has no %s label — not a devcontainer workspace archive", c.ID, LabelDevcontainerID)
}
return e.reattachWorkspace(ctx, details, id, opts.LocalEnv), nil
}