Execution Sandboxing: A Practical Guide
Key Takeaway: Start by defining trust zones, then apply matching controls that keep untrusted validation isolated from release paths while enforcing ephemeral runners, short-lived credentials, and deny-by-default egress.
This guide translates sandboxing principles into concrete controls for real CI/CD environments.
Use it when you need to answer: "What should we deploy this quarter to materially reduce risk from untrusted execution?"
Scope and assumptions
This playbook focuses on execution risks from:
- untrusted pull requests and dependencies
- shared/self-hosted runners
- build and release tooling with write privileges
- secret exposure and token misuse
- unrestricted network egress
It applies whether automation is triggered by scripts, bots, or AI-enabled systems.
0) Define trust zones first
Before implementing controls, classify workflows into trust zones:
| Zone | Typical sources | Allowed outcomes |
|---|---|---|
| Untrusted validation | fork PRs, external contributions, unknown dependencies | lint/test/build checks only |
| Internal trusted CI | protected branches, approved maintainers | artifact build + internal publish |
| Release/deploy | signed tags, release managers, protected environments | production deploy and signing |
Rule: never run untrusted-zone jobs on release/deploy infrastructure.
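One way to express the zone split in a GitHub Actions workflow is to route each zone to its own runner pool. The sketch below is illustrative only: the runner labels `zone-untrusted` and `zone-release` and the `make` targets are hypothetical names, not something this guide prescribes.

```yaml
# Sketch: route jobs to separate self-hosted runner pools by trust zone.
# Labels "zone-untrusted" / "zone-release" and the make targets are hypothetical.
name: zoned-ci
on:
  pull_request:
  push:
    tags: ["v*"]

permissions:
  contents: read

jobs:
  validate:
    # Untrusted validation zone: lint/test/build checks only.
    if: github.event_name == 'pull_request'
    runs-on: [self-hosted, zone-untrusted]
    steps:
      - uses: actions/checkout@v4
      - run: make lint test

  release:
    # Release/deploy zone: only tag pushes reach the release pool.
    if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v')
    runs-on: [self-hosted, zone-release]
    steps:
      - uses: actions/checkout@v4
      - run: make release
```

Labels only route jobs; they are not a security boundary on their own. Pair them with a platform-level restriction on which repositories and workflows may use each pool (for example, runner groups).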
1) Runner isolation baseline
Required baseline
- Ephemeral runner per job (destroy after completion).
- No privileged containers unless explicitly justified and isolated.
- No host docker socket mount for untrusted jobs.
- Read-only root filesystem where possible.
- Separate runner pools per trust zone.
Good defaults for containerized runners
- Drop Linux capabilities by default (cap-drop=ALL).
- Use seccomp profile and AppArmor/SELinux policy.
- Set CPU/memory/pids limits.
- Set job timeout and process count limits.
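For runners scheduled as Kubernetes Pods, most of these defaults map onto standard Pod fields. The manifest below is a minimal sketch, assuming a Kubernetes-based runner; the Pod name, namespace, image, and limit values are placeholders to adapt.

```yaml
# Sketch of a hardened per-job runner Pod. Names, image, and sizes are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: ci-job-runner
  namespace: ci-untrusted
spec:
  restartPolicy: Never                  # one Pod per job, never reused
  automountServiceAccountToken: false   # no cluster API token inside the job
  activeDeadlineSeconds: 1800           # hard timeout for the whole job
  containers:
    - name: job
      image: registry.example.com/ci/runner:latest
      securityContext:
        privileged: false
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        runAsNonRoot: true
        capabilities:
          drop: ["ALL"]
        seccompProfile:
          type: RuntimeDefault
      resources:
        limits:
          cpu: "2"
          memory: 4Gi
          ephemeral-storage: 10Gi
      volumeMounts:
        - name: workspace
          mountPath: /workspace         # the only writable path
  volumes:
    - name: workspace
      emptyDir:
        sizeLimit: 5Gi
# Note: a PID limit is not a Pod field; it is typically set on the kubelet
# (podPidsLimit) or by the container runtime.
```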
2) Safe handling for untrusted PRs
Untrusted PR execution is a common CI breach path.
Controls
- Trigger untrusted jobs with minimal token permissions.
- Do not expose deployment credentials, signing keys, or production secrets.
- Prevent write operations to protected branches/tags.
- Restrict artifact publication from untrusted workflows.
- Apply strict outbound network allowlist.
GitHub Actions example (least privilege token)
    permissions:
      contents: read
      pull-requests: read

Only grant elevated scopes per job when required.
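As an illustration of per-job elevation, the sketch below keeps the workflow default read-only and grants one extra scope to a single job that runs only on pushes to the main branch. The job names and `make` targets are hypothetical.

```yaml
# Sketch: read-only default, one job elevated. Job names and make targets are hypothetical.
name: ci
on:
  pull_request:
  push:
    branches: [main]

permissions:
  contents: read
  pull-requests: read

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test

  publish:
    # Runs only for pushes to main, never for pull requests.
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    needs: test
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # the only elevated scope, limited to this job
    steps:
      - uses: actions/checkout@v4
      - run: make publish
```

A job-level `permissions` block replaces the workflow-level one for that job, so any scope not listed there is not granted.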
Important workflow safety note
Avoid using pull_request_target to run untrusted code with privileged context unless you fully understand and constrain checkout, permissions, and secret access behavior.
3) Secrets and identity controls
Prefer short-lived credentials
- Use OIDC federation to cloud providers instead of static cloud keys in CI secrets.
- Scope roles per workflow purpose (build-only, publish-only, deploy-only).
- Apply time-bound sessions and audience restrictions.
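A minimal sketch of OIDC federation from GitHub Actions, assuming AWS as the cloud provider. The account ID, role name, region, and bucket are placeholders, and the role's trust policy (not shown) is where audience and subject restrictions would be enforced.

```yaml
# Sketch: short-lived cloud credentials via OIDC. AWS is an assumed provider;
# the role ARN, region, and bucket are placeholders.
jobs:
  publish:
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # allows the job to request an OIDC token
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Assume a publish-only role for 15 minutes
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-publish-only
          aws-region: us-east-1
          role-duration-seconds: 900
      - run: aws s3 cp dist/app.tar.gz s3://example-artifacts/
```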
Secret exposure minimization
- Inject secrets only into jobs that require them.
- Mask and redact logs.
- Mount secrets read-only and avoid writing them to workspace.
- Rotate credentials and invalidate on suspected leakage.
Segregate high-impact secrets
Signing keys, registry publish credentials, and production deploy tokens should be available only in protected environments with approval gates.
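In GitHub Actions, one way to approximate this is to store high-impact secrets as environment secrets and reference the environment from the single job that needs them. The sketch below assumes an environment named `production` with required reviewers configured in repository settings; the runner label, secret name, and signing script are hypothetical.

```yaml
# Sketch: signing secret available only to one job in a protected environment.
# Runner label, secret name, and script path are hypothetical.
jobs:
  sign-and-publish:
    runs-on: [self-hosted, zone-release]
    environment: production   # approval gates are configured on the environment
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Sign artifact
        run: ./scripts/sign.sh dist/app.tar.gz
        env:
          # Scoped to this step only, not to the whole job or workflow.
          SIGNING_KEY: ${{ secrets.SIGNING_KEY }}
```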
4) Network egress control
Start from deny-all egress, then open only the destinations each job requires.
Minimum egress policy structure
- allow source control host (fetch)
- allow package mirrors/registries needed for build
- allow artifact store
- deny direct access to internal admin/control planes
- deny metadata endpoints unless explicitly required
Kubernetes example pattern
- namespace-per-trust-zone
- default deny egress NetworkPolicy
- explicit allowlist policies per pipeline stage
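The pattern above could look roughly like the two policies below. This is a sketch under assumptions: the namespace `ci-untrusted`, the `ci-stage: build` label, and the `10.20.0.0/24` mirror subnet are placeholders, and the DNS rule assumes cluster DNS pods carry the common `k8s-app: kube-dns` label.

```yaml
# Sketch: default-deny egress plus one allowlist for the build stage.
# Namespace, labels, and CIDR are placeholders.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: ci-untrusted
spec:
  podSelector: {}        # applies to every Pod in the namespace
  policyTypes:
    - Egress             # no egress rules listed, so all egress is denied
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-build-egress
  namespace: ci-untrusted
spec:
  podSelector:
    matchLabels:
      ci-stage: build
  policyTypes:
    - Egress
  egress:
    # Internal package mirror / artifact store subnet over HTTPS.
    - to:
        - ipBlock:
            cidr: 10.20.0.0/24
      ports:
        - protocol: TCP
          port: 443
    # Cluster DNS.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Standard NetworkPolicy matches IP blocks and selectors, not hostnames, so allowing an external source control host by name usually needs an egress proxy or a CNI-specific policy type. Because the metadata address is not in any allowlist, it stays blocked by the default-deny policy.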
5) Resource and abuse controls
Prevent runaway or abusive workloads:
- CPU/memory/pids quotas per job
- max execution time per stage
- bounded log size and artifact size
- concurrency controls for expensive workflows
- filesystem quota for temporary storage
These controls reduce denial-of-service risk and limit cost impact from malicious or broken tasks.
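Several of these limits can be set directly in a GitHub Actions workflow. The values, job name, `make` target, and artifact path below are illustrative assumptions, not recommended settings.

```yaml
# Sketch: time, concurrency, and artifact retention limits. Values are illustrative.
concurrency:
  group: ci-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true    # a newer run on the same ref cancels the older one

jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 20       # cap for the whole job
    steps:
      - uses: actions/checkout@v4
      - run: make test
        timeout-minutes: 10   # cap for this step
      - uses: actions/upload-artifact@v4
        with:
          name: test-report
          path: reports/
          retention-days: 7
```

CPU, memory, PID, and filesystem quotas are enforced on the runner side (container runtime or orchestrator) rather than in the workflow file.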
6) Build integrity and provenance
Sandboxing reduces runtime blast radius; integrity controls reduce supply-chain risk.
Recommended:
- Generate provenance attestations for build artifacts.
- Sign release artifacts.
- Enforce verification before deployment.
- Use reproducible/deterministic build settings where feasible.
Use SLSA-aligned controls to define maturity targets over time.
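As one possible starting point on GitHub-hosted workflows, build provenance can be generated with GitHub's artifact attestation action. This is a sketch, assuming that action is available for your repository; the `make dist` target and the artifact path are hypothetical.

```yaml
# Sketch: generate a provenance attestation for one build artifact.
# The build command and artifact path are hypothetical.
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write       # needed to sign the attestation
      attestations: write   # needed to store the attestation
    steps:
      - uses: actions/checkout@v4
      - run: make dist
      - uses: actions/attest-build-provenance@v1
        with:
          subject-path: dist/app.tar.gz
```

A deployment gate could then check the artifact before rollout, for example with `gh attestation verify dist/app.tar.gz --repo OWNER/REPO`, where `OWNER/REPO` is a placeholder.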
7) Policy gates (before and after execution)
Implement policy-as-code checks at multiple points:
- Pre-execution: validate workflow configuration (forbidden privileged flags, forbidden secret contexts).
- Runtime: enforce admission and sandbox policy (seccomp, allowed images, namespace constraints).
- Post-execution: verify provenance/signatures and deployment policy.
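For the runtime gate on Kubernetes, the built-in Pod Security Admission controller can reject Pods that violate the Pod Security Standards. The sketch below applies the `restricted` profile to a namespace; the namespace name is a placeholder.

```yaml
# Sketch: enforce the "restricted" Pod Security Standard on the untrusted CI namespace.
# The namespace name is a placeholder.
apiVersion: v1
kind: Namespace
metadata:
  name: ci-untrusted
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

This covers only part of the runtime stage (privileged flags, capabilities, seccomp); image allowlists and workflow-level checks need a separate policy engine or linter.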
8) Logging and incident readiness
At minimum, capture:
- workflow identity (repo, ref, actor, workflow, runner)
- command/process telemetry (high-risk operations)
- network destinations contacted
- secret access events
- artifact hashes and provenance metadata
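To make the list concrete, a per-job audit record might be shaped like the sketch below. Every field name here is hypothetical and the values are placeholders; this is not a standard schema.

```yaml
# Hypothetical audit record layout; field names are illustrative only.
workflow:
  repo: "<org>/<repo>"
  ref: "<git ref>"
  actor: "<user or bot>"
  workflow: "<workflow name>"
  runner: "<runner id and pool>"
process_events:
  - command: "<high-risk command line>"
    exit_code: "<integer>"
network:
  destinations:
    - "<host:port>"
secret_access:
  - name: "<secret name, never the value>"
    job: "<job id>"
artifacts:
  - path: "<artifact path>"
    sha256: "<digest>"
    provenance: "<attestation reference>"
```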
Have a playbook that covers how to:
- revoke active tokens
- quarantine runner pool
- rotate secrets
- invalidate suspicious artifacts
- re-run trusted build from clean source
30/60/90 day implementation plan
First 30 days
- Split trusted vs untrusted runner pools.
- Enforce least-privilege CI token defaults.
- Disable secrets in untrusted PR workflows.
- Add job timeouts and resource limits.
By day 60
- Introduce default-deny egress with allowlists.
- Deploy seccomp/AppArmor (or equivalent) profiles.
- Migrate static cloud credentials to OIDC short-lived roles.
By day 90
- Add artifact signing and provenance checks.
- Enforce policy-as-code gates for workflow and runtime configuration.
- Run a tabletop exercise for CI compromise response.
Operational checklist
- Untrusted PRs isolated from privileged runners
- Ephemeral runners enabled
- CI token defaults set to read-only
- No long-lived cloud keys in CI
- Egress deny-by-default + destination allowlists
- Seccomp/profile enforcement active
- Resource quotas and timeouts configured
- Artifact signing/provenance verified at release
- Incident response runbook tested
References
- NIST SP 800-190, Application Container Security Guide: https://csrc.nist.gov/pubs/sp/800/190/final
- NIST SP 800-204A, Building Secure Microservices-based Applications Using Service-Mesh Architecture: https://csrc.nist.gov/pubs/sp/800/204/a/final
- NIST SSDF (SP 800-218): https://csrc.nist.gov/pubs/sp/800/218/final
- GitHub Actions security hardening: https://docs.github.com/en/actions/security-for-github-actions/security-guides/security-hardening-for-github-actions
- Docker, Docker Engine Security: https://docs.docker.com/engine/security/
- Kubernetes, Network Policies: https://kubernetes.io/docs/concepts/services-networking/network-policies/
- Kubernetes, Pod Security Standards: https://kubernetes.io/docs/concepts/security/pod-security-standards/
- Linux kernel documentation, Seccomp BPF: https://www.kernel.org/doc/html/latest/userspace-api/seccomp_filter.html
- SLSA specification: https://slsa.dev/spec/v1.0/