27 changes: 15 additions & 12 deletions README.md
@@ -12,7 +12,7 @@

Original Contributors: Hang Yin, Kevin Wang, Andrew Miller

[Documentation](https://docs.phala.com/dstack) · [Examples](https://github.com/Dstack-TEE/dstack-examples) · [Community](https://t.me/+UO4bS4jflr45YmUx)
[Documentation](https://docs.phala.com/dstack) · [Security](./SECURITY.md) · [Examples](https://github.com/Dstack-TEE/dstack-examples) · [Community](https://t.me/+UO4bS4jflr45YmUx)

</div>

@@ -89,6 +89,19 @@ Your container runs inside a Confidential VM, such as Intel TDX or AMD SEV-SNP,

[Full security model →](./docs/security/security-model.md)

## Security and Trust

Security docs are linked here so deployers and reviewers can quickly find the trust model, production guidance, audit, and the status of already-answered public findings.

- [Security Overview](./docs/security/) - entry point for users, operators, researchers, and AI agents
- [Security Model](./docs/security/security-model.md) - threat model, trust boundaries, and verification checklist
- [Security Issue Triage](./docs/security/security-issue-triage.md) - public status for answered, fixed, accepted, and roadmap security reports
- [Security Best Practices](./docs/security/security-best-practices.md) - production settings and hardening guidance
- [Security Audit](./docs/security/dstack-audit.pdf) - third-party audit by zkSecurity
- [Report a Vulnerability](./SECURITY.md) - use GitHub's private security reporting path

Please do not disclose exploitable vulnerabilities in public GitHub issues. Use the private reporting path in [SECURITY.md](./SECURITY.md).

## SDKs

Apps communicate with the guest agent via HTTP over `/var/run/dstack.sock`. Use the [HTTP API](./sdk/curl/api.md) directly with curl, or use a language SDK:
@@ -121,14 +134,6 @@ Apps communicate with the guest agent via HTTP over `/var/run/dstack.sock`. Use
- [Design Decisions](./docs/design-and-hardening-decisions.md) - Architecture rationale
- [FAQ](./docs/faq.md) - Frequently asked questions

## Security

- [Security Overview](./docs/security/) - Security documentation and responsible disclosure
- [Security Model](./docs/security/security-model.md) - Threat model and trust boundaries
- [Security Best Practices](./docs/security/security-best-practices.md) - Production hardening
- [Security Audit](./docs/security/dstack-audit.pdf) - Third-party audit by zkSecurity
- [CVM Boundaries](./docs/security/cvm-boundaries.md) - Information exchange and isolation

## FAQ

<details>
@@ -180,7 +185,7 @@ Yes. dstack runs on supported TEE-capable servers, including Intel TDX-capable h

- **GCP**: Intel TDX (Confidential VMs)
- **AWS**: Nitro Enclaves (NSM attestation)
- **Bare metal**: Intel TDX (4th/5th Gen Xeon) and AMD SEV-SNP on supported dstack OS images
- **Bare metal**: Intel TDX (4th/5th Gen Xeon) and AMD SEV-SNP on supported dstack OS images. Intel TDX is the production path; AMD SEV-SNP is new and experimental.
- **GPUs**: NVIDIA Confidential Computing (H100, Blackwell)

</details>
@@ -227,5 +232,3 @@ Logo and branding assets: [dstack-logo-kit](./docs/assets/dstack-logo-kit/)
## License

Apache 2.0
</content>
</invoke>
21 changes: 21 additions & 0 deletions SECURITY.md
@@ -0,0 +1,21 @@
# Security

Use this file for vulnerability reports. For the security model, production guidance, audit, and already-answered public findings, start with [Security Documentation](./docs/security/).

## Report a vulnerability

If you believe you found a vulnerability, please use [GitHub's private security reporting features](https://docs.github.com/en/code-security/how-tos/report-and-fix-vulnerabilities/report-privately) for this repository. If GitHub private reporting is unavailable, contact security@phala.network.

Do not open public GitHub issues for exploitable vulnerabilities or details that could help exploit production deployments.

Use private reporting for issues that could expose secrets, bypass attestation or authorization, compromise KMS keys, weaken workload isolation, or enable unauthorized code or configuration changes in production deployments.

## Public security questions

Use public issues only for questions about documented behavior, documentation gaps, already-public findings, or hardening ideas that do not include an exploit path.

Before opening a public security question, check [Security Issue Triage](./docs/security/security-issue-triage.md). It records public findings that were fixed, accepted by design, documented, or moved to roadmap work.

## Production trust boundary

Development settings are not production-safe merely because they are present in the codebase. Production deployments must rely on measured configuration, expected TEE measurements, authorization policy, and attestation verification. The [Security Model](./docs/security/security-model.md#development-modes-are-auditable-not-production-safe) is the source of truth for what dstack treats as a production guarantee.
18 changes: 16 additions & 2 deletions docs/security/README.md
@@ -2,6 +2,13 @@

dstack security resources for auditors, researchers, and operators.

## Start Here

- **Users and verifiers:** read the [Security Model](./security-model.md) to understand what dstack guarantees and what you must verify.
- **Operators:** read [Security Best Practices](./security-best-practices.md) before deploying production KMS, gateway, or VMM services.
- **Security researchers and AI agents:** report exploitable vulnerabilities through the private path in [SECURITY.md](../../SECURITY.md). For already-public findings or docs questions, check [Security Issue Triage](./security-issue-triage.md) before opening a public issue.
- **Maintainers:** use [Security Issue Triage](./security-issue-triage.md) to classify public reports and close issues once the maintainer position is clear.

## Audit

dstack has been audited by zkSecurity. See the [full audit report](./dstack-audit.pdf).
@@ -10,8 +17,15 @@ dstack has been audited by zkSecurity. See the [full audit report](./dstack-audi

- [Security Model](./security-model.md) - Threat model, trust boundaries, and verification checklist
- [Security Best Practices](./security-best-practices.md) - Production hardening guide
- [Security Issue Triage](./security-issue-triage.md) - Public status for answered, fixed, accepted, and roadmap reports
- [CVM Boundaries](./cvm-boundaries.md) - Information exchange and isolation details

## Responsible Disclosure
## Already Answered Reports

Some public security reports describe real hardening work. Some describe behavior that is intentional for development or compatibility, and some are false positives under production configuration. The canonical list is [Security Issue Triage](./security-issue-triage.md). Search that page by issue number, component, or exact setting name before treating an old report as unresolved.

## Report Vulnerabilities

If you believe you found an exploitable vulnerability, use GitHub's private security reporting features as described in [SECURITY.md](../../SECURITY.md). If GitHub private reporting is unavailable, contact security@phala.network.

To report a security vulnerability, email security@phala.network. We will respond within 48 hours.
Do not open GitHub issues for exploitable vulnerabilities.
15 changes: 15 additions & 0 deletions docs/security/security-best-practices.md
@@ -68,6 +68,21 @@ Example app-compose.json:

**Keep in mind that disabling exposure of app-compose.json only hides it from the public API; whoever controls the physical machine can still read it from the file system.**

## Do not use development trust settings in production

Development settings are intentionally easy to audit, but they are not production-safe. A production deployment should satisfy all of the following:

- KMS quote verification remains enabled. Do not deploy production KMS with `quote_enabled = false`.
- KMS authorization uses webhook/on-chain policy. Do not use `auth_api.type = "dev"` with real key material.
- The KMS contract pins a concrete gateway app id. Do not use `gateway_app_id = "any"` for production traffic.
- TEE quotes are evaluated by deployment policy, including TCB status and expected OS/application measurements.
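
The first three items in the checklist above can be turned into a quick pre-deployment check. The following is a minimal sketch, not part of dstack: the function name `find_dev_settings` and the flat dotted-key dictionary are hypothetical, and it assumes you have already collected the effective values from wherever they live (for example, the KMS config file and the KMS contract). It does not evaluate TEE quotes or TCB status, which remain a matter of attestation policy.

```python
# Hypothetical pre-deployment check. The key names and development values
# are taken from the checklist above; representing the effective
# configuration as a flat dict of dotted keys is an assumption.
DEV_ONLY_VALUES = {
    "quote_enabled": False,
    "auth_api.type": "dev",
    "gateway_app_id": "any",
}


def find_dev_settings(settings):
    """Return the dotted keys whose values match a known development setting."""
    findings = []
    for key, dev_value in DEV_ONLY_VALUES.items():
        if key in settings and settings[key] == dev_value:
            findings.append(key)
    return findings


if __name__ == "__main__":
    effective = {
        "quote_enabled": True,
        "auth_api.type": "dev",
        "gateway_app_id": "any",
    }
    for key in find_dev_settings(effective):
        print(f"development setting in use: {key}")
```

An empty result only means none of these three settings was found at its development value; it is not evidence that a deployment is production-safe.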

The KMS TLS listener may keep `rpc.tls.mutual.mandatory = false` because bootstrap endpoints need to be reachable before a client has an RA-TLS certificate. Sensitive KMS routes still require the client certificate and attestation evidence in application code before releasing keys or signing certificates.

## Keep private material owner-only

Secret-bearing files should be owner-only (`0600`) wherever possible, including app keys, decrypted env files, KMS root keys, gateway WireGuard/TLS keys, and ACME credentials. Preserve restrictive permissions when copying volumes, backing up `/etc/kms/certs`, or moving gateway and certbot state between hosts. Public issue [#606](https://github.com/Dstack-TEE/dstack/issues/606) tracks the remaining low-cost hardening work in dstack-managed file writes.
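
As an illustration of owner-only writes, here is a minimal sketch for Unix-like hosts. The helper names `write_secret` and `is_owner_only` are hypothetical and do not refer to dstack code; the sketch only shows one common way to create a file with mode `0600` and to audit an existing file.

```python
import os
import stat


def write_secret(path, data):
    """Create or truncate `path` and write `data` (bytes) with mode 0600."""
    # The mode passed to os.open applies only when the file is created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        # Tighten the mode explicitly in case the file already existed
        # with looser permissions.
        os.fchmod(f.fileno(), 0o600)
        f.write(data)


def is_owner_only(path):
    """Return True if neither group nor others have any permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

A check like `is_owner_only` can be run after copying volumes or restoring backups to confirm that restrictive permissions were preserved.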

## Docker logs are publicly available by default

Similarly, to make apps easier to observe, docker logs are public by default. To stop exposing them, set `public_logs=false`.
54 changes: 54 additions & 0 deletions docs/security/security-issue-triage.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
# Security Issue Triage

Security issues should not remain open after the maintainer position is clear. An open issue means one of two things: a fix is still required, or a concrete design/roadmap item is intentionally being tracked. Everything else should be closed with a final maintainer comment and a link to the code or documentation that records the decision.

This page is not a vulnerability reporting channel. Report exploitable vulnerabilities privately through [SECURITY.md](../../SECURITY.md). Use public issues only for questions, documentation gaps, duplicate-prone prior findings, or hardening ideas that do not disclose an exploit path.

## Triage labels

Use these categories when evaluating public security questions and already-public reports:

| Category | Meaning | Expected issue state |
| --- | --- | --- |
| Real blocker | Confirmed vulnerability that can compromise production security under supported configuration | Keep open until fixed; close as completed when the fix lands |
| Needs hardening | Not a broken trust boundary, but a defense-in-depth improvement with no compatibility cost | Keep open only while the patch is pending; close as completed when merged |
| Fixed | The reported behavior has already been fixed or is fixed by the linked change | Close as completed |
| Docs-only | The behavior is intentional or lower severity, but the repo must say so clearly | Close after documentation is merged |
| Accepted by design | The report conflicts with the documented threat model or with an intentional compatibility constraint | Close as not planned, with the design rationale linked |

When a report mixes several claims, split the actionable work into separate issues before closing the original. Do not leave a broad "security" issue open just to remember future work.

## March 2026 audit cluster (#549-#609)

The March audit cluster contained a mix of real fixes, hardening work, compatibility decisions, false positives, and public threat-model questions. Several implementation PRs in this number range fixed reports, and some issue state has not caught up with the maintainer position.

Already closed as completed: [#550](https://github.com/Dstack-TEE/dstack/issues/550), [#551](https://github.com/Dstack-TEE/dstack/issues/551), [#553](https://github.com/Dstack-TEE/dstack/issues/553), [#558](https://github.com/Dstack-TEE/dstack/issues/558), [#565](https://github.com/Dstack-TEE/dstack/issues/565), and [#568](https://github.com/Dstack-TEE/dstack/issues/568). The sibling public security issues [#614](https://github.com/Dstack-TEE/dstack/issues/614), [#615](https://github.com/Dstack-TEE/dstack/issues/615), [#616](https://github.com/Dstack-TEE/dstack/issues/616), [#617](https://github.com/Dstack-TEE/dstack/issues/617), [#618](https://github.com/Dstack-TEE/dstack/issues/618), and [#619](https://github.com/Dstack-TEE/dstack/issues/619) are also closed as completed.

| Issue | Classification | Maintainer action |
| --- | --- | --- |
| [#549](https://github.com/Dstack-TEE/dstack/issues/549) Disk encryption key collision when `no_instance_id=true` and HKDF context ambiguity | Accepted by design, optional hardening | `no_instance_id=true` intentionally shares disk keys across instances, and the HKDF inputs have fixed lengths. Close the original as not planned, or split zero-padding for the unset instance ID into a separate hardening issue if an owner wants it |
| [#552](https://github.com/Dstack-TEE/dstack/issues/552) Static HKDF salt and no key versioning | Design roadmap, not a near-term vulnerability | Static salt is acceptable with high-entropy KMS root material and explicit context; key versioning/rotation requires a broader compatibility design |
| [#554](https://github.com/Dstack-TEE/dstack/issues/554) Signature concatenation without length prefixes enables collision | Fixed | [#604](https://github.com/Dstack-TEE/dstack/pull/604) enforces the 20-byte `app_id` length in CVM setup; close as completed |
| [#555](https://github.com/Dstack-TEE/dstack/issues/555) LUKS header TOCTOU between validation and `luksOpen` | Accepted by design | The setup code validates and opens the same in-memory LUKS header. Close as not planned with the maintainer rationale |
| [#556](https://github.com/Dstack-TEE/dstack/issues/556) Disk encryption key and WireGuard key visible in `/proc/PID/cmdline` | Needs hardening | Keep open while removing transient command-line exposure for secret-bearing setup commands, or close only if the maintainer explicitly accepts the early-boot exposure in the documented threat model |
| [#557](https://github.com/Dstack-TEE/dstack/issues/557) Runtime event log writable by any VM process | Fixed | [#602](https://github.com/Dstack-TEE/dstack/pull/602) restricts runtime event-log permissions; close as completed |
| [#559](https://github.com/Dstack-TEE/dstack/issues/559) Zero `mr_config_id` bypasses verification and weakens `mr_aggregated` identity | Accepted compatibility decision, docs-only | Zero `mr_config_id` remains an unset-value compatibility case, and configuration changes are still reflected through RTMR-based measurements. Close as not planned after linking the threat-model rationale |
| [#560](https://github.com/Dstack-TEE/dstack/issues/560) Admin token comparison not constant-time | Accepted by design | The comparison is over a SHA-256 digest of a high-entropy token, not the raw token. Close as not planned unless the token format changes |
| [#561](https://github.com/Dstack-TEE/dstack/issues/561) KMS TLS client certificates are non-mandatory in Rocket config | Docs-only for current architecture | The TLS listener allows unauthenticated bootstrap endpoints, while sensitive KMS handlers enforce client certificate and attestation checks in application code |
| [#562](https://github.com/Dstack-TEE/dstack/issues/562) Configfs path overridable through an environment variable | Accepted threat-model decision, possible hardening | A process that can choose its own quote path is already inside the measured CVM behavior. Close the original with that rationale, or split a production guard for `DCAP_TDX_QUOTE_CONFIGFS_PATH` into a hardening issue |
| [#563](https://github.com/Dstack-TEE/dstack/issues/563) `simulate_quote` runtime path in production guest agent | Fixed | [#582](https://github.com/Dstack-TEE/dstack/pull/582) isolates the simulator into a dedicated binary; close as completed |
| [#564](https://github.com/Dstack-TEE/dstack/issues/564) `GetAppEnvEncryptPubKey` unauthenticated app ID enumeration | Accepted by design | The RPC returns a public encryption key before an app has an attested identity, and `app_id` is not treated as secret. Close as not planned after linking the bootstrap rationale |
| [#566](https://github.com/Dstack-TEE/dstack/issues/566) Gzip decompression bomb in RA-TLS cert extension | Fixed | [#595](https://github.com/Dstack-TEE/dstack/pull/595) bounds decompressed RA-TLS event-log extension size; close as completed |
| [#567](https://github.com/Dstack-TEE/dstack/issues/567) Unbounded allocation in `VecOf` decode | Fixed | [#570](https://github.com/Dstack-TEE/dstack/pull/570) caps `VecOf` decode length and pre-allocation; close as completed |
| [#605](https://github.com/Dstack-TEE/dstack/issues/605) Identical raw key material across `ed25519` and `secp256k1` for the same path | Accepted compatibility decision, docs-only | Existing derived key bytes are preserved; docs now state that `path` is the domain separator and callers must use algorithm-specific paths when they require independent keys |
| [#606](https://github.com/Dstack-TEE/dstack/issues/606) App keys and decrypted env files world-readable | Needs hardening | Tightening secret-bearing file writes to owner-only permissions (`0600`) is a valid defense-in-depth improvement with no expected compatibility cost |
| [#607](https://github.com/Dstack-TEE/dstack/issues/607) `gateway_app_id = "any"` disables gateway identity pinning | Accepted by design for dev/test deployments | `gateway_app_id` is KMS contract configuration and is publicly auditable; production deployments must not use `"any"` |
| [#608](https://github.com/Dstack-TEE/dstack/issues/608) `auth_api.type = "dev"` allows all authorization | Accepted by design for local/integration testing | Dev auth is measured runtime configuration, not a production mode; production must use webhook/on-chain authorization |
| [#609](https://github.com/Dstack-TEE/dstack/issues/609) `quote_enabled = false` bypasses attestation | Accepted by design for local development | The flag is measured in runtime configuration and should fail production attestation policy |

Recommended GitHub cleanup for this cluster:

- Keep #556 and #606 open only while their hardening patches are pending, then close them as completed.
- Close #554, #557, #563, #566, and #567 as completed, with links to the fixing PRs.
- Close #549, #555, #559, #560, #561, #562, #564, #605, #607, #608, and #609 with links to the relevant security docs and maintainer rationale.
- Keep a separate roadmap issue for #552 key versioning/rotation if it has an owner and migration plan; otherwise close #552 as not planned for the current KDF version.