How I stopped shipping “mystery meat” container images — and the tools that changed the way my team thinks about Docker security in 2025.
Why This Keeps Me Up at Night (And Should Keep You Up Too)
A few years back, I inherited a microservices platform that had been running in production for 18 months. When I finally sat down to audit the supply chain, I found images pulling from unverified base layers, zero provenance on third-party dependencies, and the latest tag being deployed straight to production. We were, to put it bluntly, one compromised upstream package away from a catastrophe.
The SolarWinds attack, the Codecov Bash Uploader incident, the XZ Utils backdoor: these aren’t abstract threats anymore. Supply chain attacks are now among the most consequential vectors against containerized workloads, and most engineering teams are woefully underprepared.
This isn’t a “here’s how to run docker pull” tutorial. I’m going to walk you through the exact hardening strategy I now enforce across every production Kubernetes cluster I run: cryptographic image signing with Cosign, generating and attesting Software Bill of Materials (SBOMs), multi-stage build discipline, and automated vulnerability scanning with Trivy and Grype. We’ll go deep on each one.
The Threat Model: What “Supply Chain Attack” Actually Means in Docker Context
Before we harden anything, let’s be precise about what we’re defending against. In Docker’s supply chain, the attack surface spans four layers:
- Base image poisoning: An attacker compromises a popular base image (node:18, python:3.12-slim) before or after it’s pushed to a registry.
- Dependency confusion: A malicious package with the same name as an internal one gets resolved from a public registry.
- Compromised build pipelines: CI systems injecting malicious artifacts during the build phase.
- Registry tampering: An image is modified in transit or at rest before your cluster pulls it.
Each layer requires a different defense. Let’s build them one by one.
Layer 1: Multi-Stage Builds — Your First Line of Defense
I cannot stress this enough: if you are still using single-stage Dockerfiles in production, you are shipping a build environment as a runtime artifact. That means compilers, package managers, build caches, and often credentials that were used during the build, all sitting inside the final image.
Multi-stage builds solve this by keeping your build toolchain completely separate from your runtime environment.
The Pattern I Use for Node.js Services
# ---- Stage 1: Dependency Installation ----
FROM node:20-alpine AS deps
WORKDIR /app
# Copy only what's needed for npm install (layer cache optimization)
COPY package.json package-lock.json ./
# Use ci for reproducible installs; omit devDependencies
RUN npm ci --omit=dev && npm cache clean --force
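# ---- Stage 2: Build ----
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
# The build step usually needs devDependencies (TypeScript, bundlers), so install the full set here
RUN npm ci
COPY . .
RUN npm run build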
# ---- Stage 3: Production Runtime (minimal attack surface) ----
FROM node:20-alpine AS runtime
# Run as non-root user — ALWAYS
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
WORKDIR /app
# Copy ONLY the built artifact and production deps
COPY --from=builder --chown=appuser:appgroup /app/dist ./dist
COPY --from=deps --chown=appuser:appgroup /app/node_modules ./node_modules
EXPOSE 3000
CMD ["node", "dist/server.js"]
Why This Matters for Supply Chain Security
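The final image above contains no devDependencies, no build caches, and no source code, so the attack surface is much smaller than a single-stage image. Note that node:20-alpine itself still ships npm and a shell; a distroless base removes those as well. If a build-only dependency was compromised, it never reaches the final image, so the blast radius is limited to what actually executes at runtime.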
Pro Tip: Pin your base images to a specific digest, not just a tag. Tags are mutable; digests are not.
# ❌ Mutable — tag can be overwritten at any time
FROM node:20-alpine
# ✅ Immutable — cryptographically pinned to a specific manifest
FROM node:20-alpine@sha256:b9bb8c6ce1c02b9e5e0fabb4a24c54e1f62a0e2e3de50a52a2e1e15d3f5e7c8d
I use crane digest node:20-alpine to fetch the current digest and lock it in my Dockerfiles, then update it deliberately as part of a controlled upgrade cycle.
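Digest pinning is easy to let slip, so it helps to check it mechanically. Below is a minimal sketch of a CI check that flags FROM lines without a sha256 digest. The function name and the sample Dockerfile are hypothetical, the digest is a fake placeholder, and the regex covers common FROM forms only (it does not expand ARG variables).

```python
import re

# FROM instruction: optional flags (e.g. --platform=...), the image
# reference, and an optional "AS <stage>" alias.
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--\S+\s+)*(?P<image>\S+)(?:\s+AS\s+(?P<alias>\S+))?\s*$",
    re.IGNORECASE,
)
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_base_images(dockerfile_text: str) -> list[str]:
    """Return base image references that are not pinned to a sha256 digest."""
    stage_names = set()
    unpinned = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        image = match.group("image")
        alias = match.group("alias")
        # "scratch" and references to earlier stages have no digest to pin.
        is_stage_ref = image == "scratch" or image.lower() in stage_names
        if not is_stage_ref and not DIGEST_RE.search(image):
            unpinned.append(image)
        if alias:
            stage_names.add(alias.lower())
    return unpinned


FAKE_DIGEST = "0" * 64  # placeholder, not a real image digest
example = (
    "FROM node:20-alpine AS deps\n"
    f"FROM node:20-alpine@sha256:{FAKE_DIGEST} AS builder\n"
    "FROM deps AS runtime\n"
)
print(unpinned_base_images(example))
```

Failing the build when this list is non-empty turns the pinning rule into something the pipeline enforces rather than something reviewers have to remember.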
Layer 2: Image Signing with Cosign — Cryptographic Provenance at Scale
If multi-stage builds reduce the attack surface, Cosign closes the verification gap between what you built and what you deployed. Without signing, your cluster has no way to know whether the image it’s about to pull is the one your CI pipeline produced — or something an attacker slipped in.
Cosign is part of the Sigstore project and has become the de facto standard for container image signing. It supports both keyless signing (via OIDC-based ephemeral keys, my preference in CI) and long-lived key pairs.
Setting Up Keyless Signing in GitHub Actions
Keyless signing is the mode I use for all my CI pipelines. Instead of managing long-lived key material, Cosign leverages the OIDC token from your CI provider (GitHub, GitLab, etc.) to get a short-lived certificate from Sigstore’s Fulcio CA. The signing event is also logged in Rekor, a transparency log, giving you an immutable audit trail.
# .github/workflows/build-and-sign.yml
name: Build, Scan, Sign
on:
  push:
    branches: [main]
jobs:
  build-sign:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      id-token: write # Required for keyless Cosign signing
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and push by digest
        id: build-push
        uses: docker/build-push-action@v5
        with:
          context: .
          # Push by digest only (a tag cannot be combined with push-by-digest);
          # the digest is exposed as steps.build-push.outputs.digest
          outputs: type=image,name=ghcr.io/${{ github.repository }},push-by-digest=true,name-canonical=true,push=true
      - name: Install Cosign
        uses: sigstore/cosign-installer@v3
      - name: Sign the image (keyless)
        # Cosign 2.x signs keylessly by default, so COSIGN_EXPERIMENTAL is not needed
        run: |
          cosign sign --yes \
            ghcr.io/${{ github.repository }}@${{ steps.build-push.outputs.digest }}
Verifying Signatures Before Deployment
Signing means nothing if you don’t verify. I enforce this at the admission controller level in Kubernetes using Policy Controller (from Sigstore), but you can also verify manually:
# Verify keyless signature — checks Rekor transparency log automatically
cosign verify \
--certificate-identity="https://github.com/myorg/myrepo/.github/workflows/build-and-sign.yml@refs/heads/main" \
--certificate-oidc-issuer="https://token.actions.githubusercontent.com" \
ghcr.io/myorg/myrepo@sha256:<digest>
Common Pitfall: Engineers often sign the tag (myimage:latest) rather than the digest. Always sign by digest. Tags can be overwritten; a digest is immutable. If you sign a tag and someone pushes a new image to that tag, your signature is now on a different image.
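To make the "digest, never tag" rule hard to bypass, a deploy script can refuse to verify anything that is not a digest reference. This is a minimal sketch, assuming cosign is on PATH and using only the two flags shown in the manual command above; the function names are hypothetical.

```python
import re
import subprocess

# A reference is acceptable only if it ends in @sha256:<64 hex chars>.
DIGEST_REF = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")
GITHUB_OIDC_ISSUER = "https://token.actions.githubusercontent.com"


def cosign_verify_command(image_ref: str, identity: str,
                          issuer: str = GITHUB_OIDC_ISSUER) -> list[str]:
    """Build the argv for `cosign verify`, rejecting mutable (tag) references."""
    if not DIGEST_REF.match(image_ref):
        raise ValueError(f"refusing to verify mutable reference: {image_ref!r}")
    return [
        "cosign", "verify",
        f"--certificate-identity={identity}",
        f"--certificate-oidc-issuer={issuer}",
        image_ref,
    ]


def is_signed(image_ref: str, identity: str) -> bool:
    """Run cosign and report whether signature verification succeeded."""
    result = subprocess.run(cosign_verify_command(image_ref, identity),
                            capture_output=True)
    return result.returncode == 0
```

Building the argument list separately from running it keeps the guard testable without a registry or a Cosign install.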
Layer 3: Software Bill of Materials (SBOMs) — Know Exactly What’s in Your Images
An SBOM is a machine-readable inventory of every component in your software artifact: packages, libraries, licenses, versions. Think of it as a nutrition label for your container image. In 2025, with Executive Order 14028 and frameworks like SLSA pushing SBOM requirements into enterprise procurement, this is no longer optional for teams shipping to regulated industries.
I generate SBOMs using Syft and attach them as attestations to my images using Cosign, making them part of the immutable provenance chain.
Generating and Attesting an SBOM
# Install syft (or use the GitHub Action)
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin
# Generate SBOM in SPDX JSON format (also supports CycloneDX)
syft ghcr.io/myorg/myrepo@sha256:<digest> \
-o spdx-json=sbom.spdx.json
# Inspect the SBOM — look at what packages are present
jq '.packages[] | {name: .name, version: .versionInfo, license: .licenseConcluded}' sbom.spdx.json
# In your GitHub Actions workflow, after building:
- name: Generate SBOM
  run: |
    syft ghcr.io/${{ github.repository }}@${{ steps.build-push.outputs.digest }} \
      -o spdx-json=sbom.spdx.json
- name: Attest SBOM to image
  # Cosign 2.x attests keylessly by default, so COSIGN_EXPERIMENTAL is not needed
  run: |
    cosign attest --yes \
      --predicate sbom.spdx.json \
      --type spdxjson \
      ghcr.io/${{ github.repository }}@${{ steps.build-push.outputs.digest }}
Now your SBOM is cryptographically attached to the image. Anyone pulling the image can verify and download the SBOM:
cosign verify-attestation \
--type spdxjson \
--certificate-identity="..." \
--certificate-oidc-issuer="..." \
ghcr.io/myorg/myrepo@sha256:<digest> | jq '.payload | @base64d | fromjson'
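The jq filter above unwraps a base64-encoded payload from the attestation envelope. The same step can be sketched in Python for scripts that consume the output. The envelope here is built in-process for demonstration, and the predicateType value is illustrative, so check it against what your Cosign version actually emits.

```python
import base64
import json


def decode_attestation(envelope_json: str) -> dict:
    """Decode the base64 `payload` field of one attestation envelope."""
    envelope = json.loads(envelope_json)
    return json.loads(base64.b64decode(envelope["payload"]))


# Hypothetical statement and envelope, standing in for one line of
# `cosign verify-attestation` output.
statement = {
    "predicateType": "https://spdx.dev/Document",
    "predicate": {"packages": []},
}
envelope = json.dumps(
    {"payload": base64.b64encode(json.dumps(statement).encode()).decode()}
)

decoded = decode_attestation(envelope)
print(decoded["predicateType"])
```

Only decode output that came from a successful verify-attestation run; the decoding itself proves nothing about who produced the payload.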
Lesson Learned: The first time I generated an SBOM for what I thought was a lean Alpine-based service, it revealed 214 packages — including several with known CVEs that had been silently baked into a transitive dependency. We never would have found them without the SBOM because nothing was scanning that deep.
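A package count like that is easy to surface on every build. Here is a minimal sketch that summarizes an SPDX JSON document using the same fields as the jq query earlier (name, versionInfo, licenseConcluded). The sample SBOM is hypothetical and far smaller than a real one.

```python
from collections import Counter


def summarize_spdx(sbom: dict) -> dict:
    """Summarize package count, license spread, and packages missing a version."""
    packages = sbom.get("packages", [])
    licenses = Counter(p.get("licenseConcluded", "NOASSERTION") for p in packages)
    return {
        "package_count": len(packages),
        "licenses": dict(licenses),
        "unversioned": sorted(p["name"] for p in packages if not p.get("versionInfo")),
    }


# Hypothetical, hand-written SBOM fragment for illustration.
sample_sbom = {
    "packages": [
        {"name": "musl", "versionInfo": "1.2.4", "licenseConcluded": "MIT"},
        {"name": "busybox", "versionInfo": "1.36.1", "licenseConcluded": "GPL-2.0-only"},
        {"name": "internal-lib", "licenseConcluded": "NOASSERTION"},
    ]
}
summary = summarize_spdx(sample_sbom)
print(summary["package_count"], summary["unversioned"])
```

Tracking the package count over time is a cheap signal: a sudden jump usually means a new transitive dependency tree arrived without anyone noticing.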
Layer 4: Automated Vulnerability Scanning with Trivy and Grype
Signing and SBOMs give you provenance. Vulnerability scanning gives you risk awareness. I run both Trivy and Grype in my pipelines — they use different databases and have different strengths, so running both catches more.
Trivy: Fast, CI-Friendly, Battle-Tested
Trivy by Aqua Security is my go-to for in-pipeline scanning. It’s fast, integrates with Kubernetes through the Trivy Operator, and can scan not just container images but also filesystems, git repositories, and IaC configs.
# Install Trivy
brew install trivy # macOS
# or via GitHub Releases for CI
# Scan a local image — fail on HIGH or CRITICAL vulns
trivy image \
--exit-code 1 \
--severity HIGH,CRITICAL \
--ignore-unfixed \
ghcr.io/myorg/myrepo:latest
# Output SARIF for GitHub Security tab integration
trivy image \
--format sarif \
--output trivy-results.sarif \
ghcr.io/myorg/myrepo:latest
# Full GitHub Actions integration with SARIF upload
# (the job needs the security-events: write permission for the upload step)
- name: Run Trivy vulnerability scanner
  # @master is a mutable ref; pin to a release tag or commit SHA in real pipelines
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: ghcr.io/${{ github.repository }}@${{ steps.build-push.outputs.digest }}
    format: 'sarif'
    output: 'trivy-results.sarif'
    severity: 'CRITICAL,HIGH'
    exit-code: '1'
    ignore-unfixed: true
- name: Upload Trivy scan results to GitHub Security tab
  uses: github/codeql-action/upload-sarif@v3
  if: always()
  with:
    sarif_file: 'trivy-results.sarif'
Grype: The Second Opinion You Need
Grype (by Anchore, the same team that makes Syft) uses a different vulnerability database and tends to catch different things than Trivy. I run it as a complementary check.
# Install grype
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin
# Scan and fail on high/critical — also accepts SBOM as input
grype sbom:./sbom.spdx.json \
--fail-on high \
-o table
Pro Tip: Scan your SBOM rather than the live image when possible. It’s faster (no re-analysis of the image layers), and it’s testing the exact inventory you attested. This is particularly useful in CD pipelines where you want fast feedback.
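To see what the second scanner actually buys you, compare the two reports instead of reading them separately. The sketch below assumes both tools were run with JSON output and that the field names match my understanding of their report layout (Results/Vulnerabilities for Trivy, matches for Grype); verify them against the versions you run. The sample reports are hypothetical.

```python
def trivy_findings(report: dict) -> set:
    """Collect (vulnerability id, package name) pairs from a Trivy JSON report."""
    found = set()
    for result in report.get("Results") or []:
        # Targets with no findings may omit the Vulnerabilities key entirely.
        for vuln in result.get("Vulnerabilities") or []:
            found.add((vuln["VulnerabilityID"], vuln["PkgName"]))
    return found


def grype_findings(report: dict) -> set:
    """Collect (vulnerability id, package name) pairs from a Grype JSON report."""
    return {
        (match["vulnerability"]["id"], match["artifact"]["name"])
        for match in report.get("matches") or []
    }


def compare_scanners(trivy_report: dict, grype_report: dict) -> dict:
    """Split findings into those both tools agree on and those unique to each."""
    trivy, grype = trivy_findings(trivy_report), grype_findings(grype_report)
    return {
        "both": sorted(trivy & grype),
        "trivy_only": sorted(trivy - grype),
        "grype_only": sorted(grype - trivy),
    }


# Hypothetical reports with made-up identifiers.
trivy_report = {"Results": [
    {"Target": "app", "Vulnerabilities": [
        {"VulnerabilityID": "CVE-2024-0001", "PkgName": "openssl"},
        {"VulnerabilityID": "CVE-2024-0002", "PkgName": "zlib"},
    ]},
    {"Target": "clean-layer"},
]}
grype_report = {"matches": [
    {"vulnerability": {"id": "CVE-2024-0001"}, "artifact": {"name": "openssl"}},
    {"vulnerability": {"id": "CVE-2024-0003"}, "artifact": {"name": "busybox"}},
]}
diff = compare_scanners(trivy_report, grype_report)
print(diff)
```

One caveat: the two tools can report the same issue under different identifiers (for example a GHSA id versus a CVE id), so a naive comparison like this will overstate the disagreement.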
Building a .trivyignore That Isn’t a Cop-Out
Every team eventually encounters a CVE they can’t immediately fix — the upstream package hasn’t patched it, or the vulnerability is in a path not exercised at runtime. The temptation is to blanket-ignore it. Resist that. Use a documented, time-bounded ignore with justification:
# .trivyignore
# CVE-2024-XXXXX — libssl vulnerability in OpenSSL 3.x
# Affected code path: TLS renegotiation (not used in our service)
# Upstream fix expected in OpenSSL 3.3.2 (target: Q3 2025 upgrade)
# Reviewed by: @your-security-team, 2025-01-15
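# Expires at the end of Q3 2025 so the exception gets re-reviewed instead of forgotten
CVE-2024-XXXXX exp:2025-09-30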
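The "documented and time-bounded" rule can be enforced with a small linter in CI. This is a minimal sketch, assuming the comment convention shown above (a "Reviewed by: ..., YYYY-MM-DD" line in the comment block directly above each entry). The function name, the 180-day default, and the sample file are all hypothetical.

```python
import re
from datetime import date, timedelta

REVIEW_RE = re.compile(r"Reviewed by:.*?(\d{4}-\d{2}-\d{2})")


def lint_trivyignore(text: str, today: date, max_age_days: int = 180) -> list[str]:
    """Flag ignore entries that lack a dated review comment or whose review is stale."""
    problems = []
    comments = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            comments = []  # a blank line ends the current comment block
        elif line.startswith("#"):
            comments.append(line.lstrip("# "))
        else:
            vuln_id = line.split()[0]
            reviewed = [m.group(1) for c in comments if (m := REVIEW_RE.search(c))]
            if not reviewed:
                problems.append(f"{vuln_id}: no 'Reviewed by: ..., YYYY-MM-DD' comment")
            elif today - date.fromisoformat(reviewed[-1]) > timedelta(days=max_age_days):
                problems.append(
                    f"{vuln_id}: review from {reviewed[-1]} is older than {max_age_days} days"
                )
            comments = []
    return problems


sample = (
    "# Affected code path: TLS renegotiation (not used in our service)\n"
    "# Reviewed by: @your-security-team, 2025-01-15\n"
    "CVE-2024-XXXXX\n"
    "\n"
    "CVE-2024-YYYYY\n"
)
for problem in lint_trivyignore(sample, today=date(2025, 3, 1)):
    print(problem)
```

Passing today in as a parameter keeps the check deterministic in tests; in CI you would pass date.today().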
Bringing It All Together: The Complete Hardened Pipeline
Here’s the full pipeline order I use in production, combining all four layers:
Build (multi-stage Dockerfile with pinned digest)
└─> Push to registry (by digest, not tag)
└─> Generate SBOM (Syft → SPDX JSON)
└─> Scan SBOM for vulnerabilities (Grype)
└─> Scan image layers (Trivy)
└─> Sign image digest (Cosign keyless)
└─> Attest SBOM (Cosign + predicate)
└─> Tag image (only AFTER all checks pass)
└─> Deploy (Kubernetes Policy Controller verifies signature)
Critically, tagging happens last. This means myimage:latest or myimage:v1.2.3 is only ever pointing to an image that has passed signing and scanning. This prevents the common failure mode where a tag gets updated to an unscanned image.
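The ordering guarantee is simple enough to sketch as code: run every check in sequence and only call the tagging step if all of them passed. The step names and callables below are hypothetical stand-ins for the real scan and sign commands.

```python
from typing import Callable

Step = tuple[str, Callable[[], bool]]


def run_gated_pipeline(checks: list[Step], tag_image: Callable[[], None]) -> list[str]:
    """Run checks in order; call tag_image only if every check passed."""
    completed = []
    for name, check in checks:
        if not check():
            print(f"{name} failed; image stays untagged")
            return completed
        completed.append(name)
    tag_image()
    completed.append("tag")
    return completed


# Hypothetical run in which the Trivy scan fails.
tag_log = []
failed_run = run_gated_pipeline(
    [
        ("grype-sbom-scan", lambda: True),
        ("trivy-image-scan", lambda: False),
        ("cosign-sign", lambda: True),
    ],
    tag_image=lambda: tag_log.append("v1.2.3"),
)
print(failed_run, tag_log)
```

Because the failing scan short-circuits the loop, signing never runs and the tag list stays empty, which is exactly the property the pipeline order is meant to guarantee.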
Lessons Learned: What I Wish Someone Had Told Me
- Start with digest pinning, not signing. It’s the highest-ROI change you can make in an afternoon and requires zero new tooling. Everything else builds on it.
- SBOMs are only useful if you store and query them. Set up a lightweight store (even an S3 bucket + Athena will do) to query SBOMs across your fleet. Knowing what packages you have is only valuable when a new CVE drops and you need to answer “are we affected?” in minutes, not days.
- Cosign keyless is superior to key-based in CI. Managing long-lived signing keys in CI is an operational and security burden. Keyless signing with OIDC eliminates the “where is the private key?” problem entirely.
- Trivy false positives are real. Base OS packages often carry CVEs that are patched by the distro maintainer but still show up in vulnerability DBs. Use --ignore-unfixed judiciously and cross-reference with your distro’s security advisory feed.
- Policy enforcement at admission time is the real goal. Scanning in CI is great; enforcing at the Kubernetes admission controller level means even a manual kubectl apply of a rogue image gets blocked. Kyverno and Sigstore’s Policy Controller are both excellent for this.
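The “are we affected?” question from the SBOM lesson above boils down to a lookup across stored SBOMs. Here is a minimal in-memory sketch, assuming SPDX JSON documents keyed by image reference. The image names and package name are hypothetical; 5.6.0 and 5.6.1 are the XZ Utils releases that carried the backdoor.

```python
def affected_images(sboms: dict, package: str, bad_versions: set) -> list[str]:
    """Return image references whose SBOM lists `package` at an affected version."""
    hits = []
    for image_ref, sbom in sboms.items():
        for pkg in sbom.get("packages", []):
            if pkg.get("name") == package and pkg.get("versionInfo") in bad_versions:
                hits.append(image_ref)
                break  # one match is enough to flag the image
    return sorted(hits)


# Hypothetical fleet: SBOMs keyed by (shortened) image reference.
fleet = {
    "ghcr.io/myorg/api@sha256:aaa": {"packages": [{"name": "xz", "versionInfo": "5.6.1"}]},
    "ghcr.io/myorg/web@sha256:bbb": {"packages": [{"name": "xz", "versionInfo": "5.4.6"}]},
}
print(affected_images(fleet, "xz", {"5.6.0", "5.6.1"}))
```

A real setup would run the equivalent query in Athena or a database, but the shape of the question stays the same: package name, set of bad versions, list of images out.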
Common Pitfalls to Avoid
- Using USER root in production Dockerfiles: Even if it “works,” it violates the principle of least privilege. Always drop to a non-root user in the final stage.
- Signing the :latest tag: Tags are mutable. Always sign by digest. Always.
- Treating scanning as a one-time event: New CVEs are published daily. Scan images in your registry on a schedule, not just at build time. Trivy’s --list-all-pkgs flag, which includes every package in the report rather than only the vulnerable ones, gives you a full inventory to diff between scheduled scans.
- Skipping the SBOM for base images: Your base image’s supply chain is also your supply chain. Syft can generate SBOMs for pulled images you don’t own.
- Not testing your verification policy: Write an integration test that deliberately pushes an unsigned image and verifies your admission controller rejects it.
Pro Tips for Senior Engineers
- Use docker buildx build --sbom=true --provenance=true: Docker Buildx now natively supports SBOM and provenance generation via the --sbom and --provenance flags, attaching them as OCI image index attestations automatically.
- Integrate Grype into your editor: The Anchore VS Code extension surfaces CVEs in your Dockerfile and dependency files before you even commit.
- Automate base image upgrades: Use Renovate Bot with a digest-pinning configuration. It will automatically open PRs to update your pinned digests on a schedule, keeping you current without manual effort.
- Consider distroless base images: Google’s distroless images contain no shell, no package manager, and no OS utilities. The attack surface is minimal. They’re my preferred base for compiled languages like Go and Rust.
Conclusion
Container supply chain security is not a checkbox. It’s a posture, one that requires defense-in-depth across your entire build and deploy lifecycle. Multi-stage builds reduce your runtime attack surface. Digest pinning eliminates tag mutability as a risk. Cosign and keyless signing provide cryptographic provenance. SBOMs give you a queryable inventory of what you built and shipped. Trivy and Grype catch known vulnerabilities before they reach production.
The good news: none of this requires a massive platform rewrite. You can layer these controls in incrementally, starting with Dockerfile hygiene this week and reaching full SBOM attestation and policy enforcement within a quarter. Start small, automate aggressively, and build the security culture on your team one PR at a time.
Did I miss something, or have a different approach to container hardening? I’d love to hear it. Drop me a comment or find me on X/Twitter.

