Secrets Management: Stop Shipping API Keys in .env
A .env file is where secrets go to leak. The progression from .env to platform vars to secret managers to OIDC, and how to stop committing keys for good.
I have lost count of how many times I have run git log -p on a client repo and watched a live Stripe key, an AWS access key, and a database URL with the password inline scroll past in plaintext. Usually they are still valid. Usually nobody knows they are there. The pattern is always the same: someone committed .env "just once" during setup, deleted it in a later commit, and assumed that was the end of it. Git remembers forever. That key is in the history, in every clone, in every fork, and increasingly in someone's training scraper.
A .env file is a fine local-dev convenience and a terrible production strategy. Let me walk through why, and the actual maturity ladder you climb to get off it.
Why .env is a dev-only stopgap
The .env file solved a real problem: keeping config out of source code so you could follow the Twelve-Factor App advice to store config in the environment. That part is good. The problem is everything around the file.
- Accidental commits. One missing .gitignore line and the whole thing is in history. The OWASP Top 10 has carried "Security Misconfiguration" and "Cryptographic Failures" categories in some form for years, and leaked credentials are the most boring, most common version of both.
- Sprawl. The same secret lives on your laptop, your coworker's laptop, the CI runner, a Slack DM from when you onboarded someone, and a 1Password note nobody updated. There is no single source of truth, so there is no way to answer "who has this key."
- No rotation. Static files do not rotate. When an employee leaves or a key leaks, you are doing archaeology across machines to find every copy.
- No audit trail. You cannot answer "what read this secret, and when." There is nothing to read. A file does not log access.
None of these are theoretical. They are the post-incident findings on basically every credential-leak retro I have sat in.
The maturity spectrum
Think of this as four rungs. You do not skip rungs for fun, but you should know which one you are standing on and why.
| Stage | Mechanism | Rotation | Audit | Good for |
|---|---|---|---|---|
| 0 | .env file | Manual | None | Local dev only |
| 1 | Platform env vars (Vercel, Fly, etc.) | Manual | Partial | Small teams, single platform |
| 2 | Secret manager (AWS Secrets Manager, Vault, Doppler) | Automated | Full | Real production, multiple services |
| 3 | Workload identity / OIDC | No long-lived secret exists | Full | Cloud-to-cloud, CI |
Stage 0: .env, done correctly
If you are going to use .env locally — and you should — commit a template, never the real thing. Two files, one ignored.
# .gitignore
.env
.env.local
.env.*.local
!.env.example
# .env.example: committed, fake values, documents required keys
DATABASE_URL="postgresql://user:password@localhost:5432/app"
STRIPE_SECRET_KEY="sk_test_replace_me"
JWT_SIGNING_KEY="generate_with_openssl_rand_-hex_32"
SENTRY_DSN=""
The .env.example is the contract. A new developer copies it to .env, fills in real values from your secret manager, and runs. New required key? You add it to the example in the same PR that needs it, so review catches a missing variable instead of a 3 a.m. crash loop.
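The contract can also be enforced at startup rather than only in review. The following is a minimal sketch, assuming the application can read the text of .env.example; the helper names parseEnvKeys and findMissingKeys are hypothetical, and treating an empty string as "set" is an assumption made so that optional values like SENTRY_DSN do not fail the check.

```typescript
// Hypothetical helpers for illustration; not part of any library.

/** Extract variable names from the text of a .env-style file. */
function parseEnvKeys(text: string): string[] {
  const keys: string[] = [];
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    // Skip blank lines and comments.
    if (line === "" || line.startsWith("#")) continue;
    // Accept NAME=value, with an optional leading "export".
    const match = /^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=/.exec(line);
    const name = match?.[1];
    if (name !== undefined) keys.push(name);
  }
  return keys;
}

/** Return the required keys that are not defined in env (empty strings count as set). */
function findMissingKeys(
  required: string[],
  env: Record<string, string | undefined>,
): string[] {
  return required.filter((key) => env[key] === undefined);
}

// Usage with an inline example file and a fake environment.
const exampleFile = [
  "# .env.example",
  'DATABASE_URL="postgresql://user:password@localhost:5432/app"',
  'STRIPE_SECRET_KEY="sk_test_replace_me"',
  'SENTRY_DSN=""',
].join("\n");

const missing = findMissingKeys(parseEnvKeys(exampleFile), {
  DATABASE_URL: "postgresql://localhost/app",
  SENTRY_DSN: "",
});
// missing is ["STRIPE_SECRET_KEY"]
```

In a real service the second argument would be process.env, and a non-empty result would abort startup with the list of missing names (never their values).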
Stage 1: platform environment variables
Once you deploy, the secret should live where the workload runs, not in a file you upload. On Vercel that is project environment variables, scoped per environment:
# Set a production-only secret, never written to disk in the repo
vercel env add STRIPE_SECRET_KEY production
# Pull non-production values into a gitignored local file for dev
vercel env pull .env.local
This gets you off shared files and gives you per-environment scoping (preview keys cannot touch production). It is a real step up. The ceiling: rotation is still manual, the audit trail is whatever your platform happens to log, and if you run on three platforms you now have three sources of truth.
Stage 2: a real secret manager
This is where most production systems should live. A secret manager is a service whose entire job is to store secrets, control who reads them, rotate them, and log every access. You fetch the secret at startup over an authenticated API instead of baking it into the environment.
Here is the pattern with AWS Secrets Manager and the AWS SDK for JavaScript v3, fetching at boot and caching in memory so you are not hitting the API on every request:
import {
  SecretsManagerClient,
  GetSecretValueCommand,
} from "@aws-sdk/client-secrets-manager";
const client = new SecretsManagerClient({ region: "eu-central-1" });
let cache: Record<string, string> | null = null;
export async function loadSecrets(): Promise<Record<string, string>> {
  if (cache) return cache;
  const res = await client.send(
    new GetSecretValueCommand({ SecretId: "prod/app/config" }),
  );
  if (!res.SecretString) {
    throw new Error("SecretString missing: is the secret binary?");
  }
  cache = JSON.parse(res.SecretString) as Record<string, string>;
  return cache;
}
Notice what is missing: there is no AWS access key in this code. The client picks up credentials from the runtime's role (more on that below). The only thing this process knows how to do is ask for prod/app/config, and IAM decides whether it is allowed.
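One gap in the boot-time cache above is that it never refreshes, so a rotated secret is only picked up when the process restarts. The following is a minimal sketch of a time-bounded cache, assuming a refresh interval is acceptable for your workload; createCachedLoader is a hypothetical helper (not part of the AWS SDK), and the fetcher is a stand-in for whatever call actually retrieves the secret.

```typescript
// Hypothetical helper: wraps any async fetcher with a TTL cache.
function createCachedLoader<T>(
  fetcher: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now, // injectable clock, useful in tests
): () => Promise<T> {
  let cached: { value: T; expiresAt: number } | null = null;
  let inFlight: Promise<T> | null = null;

  return function load(): Promise<T> {
    // Serve from memory while the entry is still fresh.
    if (cached && now() < cached.expiresAt) {
      return Promise.resolve(cached.value);
    }
    // Share one in-flight request so concurrent callers do not stampede the API.
    if (!inFlight) {
      const pending: Promise<T> = Promise.resolve()
        .then(fetcher)
        .then((value) => {
          cached = { value, expiresAt: now() + ttlMs };
          return value;
        });
      const clear = (): void => {
        inFlight = null;
      };
      // Clear the marker on success or failure so the next call can retry.
      pending.then(clear, clear);
      inFlight = pending;
    }
    return inFlight;
  };
}

// Usage sketch with a stubbed fetcher; five minutes is an arbitrary choice.
const loadConfig = createCachedLoader(
  async () => ({ STRIPE_SECRET_KEY: "sk_test_replace_me" }),
  5 * 60 * 1000,
);
```

The tradeoff is one extra fetch per interval in exchange for picking up rotated values without a redeploy. A failed refresh rejects for the callers waiting on it and is retried on the next call.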
If you want secrets that live in git but stay encrypted — useful for GitOps and Kubernetes — SOPS (Mozilla's sops, now a CNCF project) encrypts values in place with a KMS key or age key. The file is committed, the values are ciphertext, and only a workload holding the decryption key can read them:
# secrets.enc.yaml: safe to commit; values are KMS-encrypted
stripe_secret_key: ENC[AES256_GCM,data:9KpL...==,type:str]
database_url: ENC[AES256_GCM,data:7Hq2...==,type:str]
sops:
  kms:
    - arn: arn:aws:kms:eu-central-1:111122223333:key/abcd-1234
The honest tradeoff: managers add a network dependency and a small cold-start cost (single-digit milliseconds for a cached client, ~50–200ms for the first uncached fetch). For 99% of services that is invisible. SOPS keeps the git-native workflow but you own key distribution. Doppler and HashiCorp Vault sit in the same tier: Vault if you want dynamic, short-lived database credentials generated on demand; Doppler if you want a managed sync layer with less operational overhead.
Stage 3: stop having a secret at all
The best long-lived secret is the one that does not exist. This is the part most teams have not adopted yet, and it is the single biggest leverage move available.
Workload identity / OIDC replaces a static credential with a short-lived token your platform mints and your cloud verifies. GitHub Actions can present an OIDC token (an RFC 7519 JWT) that AWS, GCP, or Azure trusts. The cloud hands back credentials valid for an hour. Nothing long-lived is ever stored.
Compare the two CI approaches honestly:
| | Static access key | OIDC / workload identity |
|---|---|---|
| Stored in CI | Yes — AWS_SECRET_ACCESS_KEY | No — token minted per run |
| Lifetime | Until you rotate (often: never) | ~1 hour |
| Leak blast radius | Full account access, indefinitely | One run, expires fast |
| Rotation | Manual, easy to forget | Automatic, nothing to rotate |
| Setup cost | Two secrets, five minutes | One IAM role + trust policy |
Here is the GitHub Actions job. There is no access key anywhere — permissions: id-token: write is what lets the runner request the OIDC token:
name: deploy
on:
  push:
    branches: [main]
permissions:
  id-token: write # required to mint the OIDC token
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS credentials via OIDC
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::111122223333:role/gha-deploy
          aws-region: eu-central-1
          # No aws-access-key-id, no aws-secret-access-key.
      - run: aws s3 sync ./dist s3://my-bucket --delete
On the AWS side, the IAM role's trust policy pins it to your exact repo and branch, so a fork or a feature branch cannot assume it. Scope the sub claim to an exact value (repo:my-org/my-repo:ref:refs/heads/main), never a wildcard. The same idea applies in your running infra: an EC2 instance role, an ECS task role, or a Kubernetes service account via IRSA means your app code holds zero credentials and the SDK example above just works.
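To make the trust policy concrete, here is a minimal sketch that builds the policy document as a plain object. It assumes the GitHub OIDC provider is already registered in the account as an IAM identity provider with the audience sts.amazonaws.com; buildGithubTrustPolicy is a hypothetical helper, and the account ID, org, repo, and branch are the placeholders used in this article.

```typescript
// Hypothetical helper: produces an IAM trust policy pinned to one repo and branch.
interface TrustPolicy {
  Version: string;
  Statement: Array<{
    Effect: "Allow";
    Principal: { Federated: string };
    Action: string;
    Condition: { StringEquals: Record<string, string> };
  }>;
}

const GITHUB_OIDC_HOST = "token.actions.githubusercontent.com";

function buildGithubTrustPolicy(
  accountId: string,
  repo: string, // "org/repo"
  branch: string,
): TrustPolicy {
  // Refuse wildcards so the role cannot be widened by accident.
  for (const part of [repo, branch]) {
    if (part.includes("*") || part.includes("?")) {
      throw new Error(`wildcards are not allowed in the sub claim: ${part}`);
    }
  }
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: {
          Federated: `arn:aws:iam::${accountId}:oidc-provider/${GITHUB_OIDC_HOST}`,
        },
        Action: "sts:AssumeRoleWithWebIdentity",
        Condition: {
          // StringEquals (not StringLike) forces an exact match on both claims.
          StringEquals: {
            [`${GITHUB_OIDC_HOST}:aud`]: "sts.amazonaws.com",
            [`${GITHUB_OIDC_HOST}:sub`]: `repo:${repo}:ref:refs/heads/${branch}`,
          },
        },
      },
    ],
  };
}

const policy = buildGithubTrustPolicy("111122223333", "my-org/my-repo", "main");
console.log(JSON.stringify(policy, null, 2));
```

The printed JSON is what you would attach to the gha-deploy role as its trust relationship, whether by hand or through your infrastructure-as-code tool.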
The non-negotiables, regardless of stage
These apply at every rung, and skipping them is how "we use a secret manager" still turns into an incident.
- Least privilege and scoping. A service gets read access to its own secrets and nothing else. The deploy role cannot read the database password unless deploy actually needs it. Separate prod and non-prod paths (prod/app/* vs staging/app/*) so a staging compromise stays in staging.
- Never log secrets. This is where leaks hide in plain sight. Redact in your logger, and never dump process.env into an error report. A surprising number of "secret manager" setups leak the secret straight into Datadog because someone logged the config object on boot.
// "url" is included because connection strings such as DATABASE_URL often embed a password.
const REDACT = /(secret|token|password|key|dsn|url)/i;
function safe(obj: Record<string, unknown>) {
  return Object.fromEntries(
    Object.entries(obj).map(([k, v]) => [k, REDACT.test(k) ? "***" : v]),
  );
}
logger.info("config loaded", safe(config)); // values masked by key name
- Rotation on a schedule and on every offboard. If rotation is hard, it never happens. Managers automate it; OIDC sidesteps it entirely. When someone leaves, rotation should be a button, not an investigation.
- Build-time vs runtime injection. Know the difference. Anything inlined at build time (a NEXT_PUBLIC_* var, a VITE_* var) ships to the browser in plaintext. It is not a secret; it is public config. Real secrets must be injected at runtime, server-side only. I have seen a "private" API key bundled into client JS because someone prefixed it NEXT_PUBLIC_ to silence a build warning. Assume anything in the client bundle is published.
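The build-time rule in the last item can be turned into a cheap CI check. This is a rough sketch, assuming you can enumerate the variable names your build sees; findExposedSecrets and the name-matching heuristic are hypothetical, and a name-based check will produce false positives (some API keys are meant to be public) and false negatives.

```typescript
// Prefixes that the article names as inlined into the client bundle at build time.
const PUBLIC_PREFIXES = ["NEXT_PUBLIC_", "VITE_"];

// Heuristic only: names that suggest the value is meant to stay private.
const SECRET_HINT = /(secret|token|password|private|api_?key)/i;

/** Return variable names that are both client-exposed and secret-looking. */
function findExposedSecrets(env: Record<string, string | undefined>): string[] {
  return Object.keys(env).filter(
    (name) =>
      PUBLIC_PREFIXES.some((prefix) => name.startsWith(prefix)) &&
      SECRET_HINT.test(name),
  );
}

const exposed = findExposedSecrets({
  NEXT_PUBLIC_STRIPE_SECRET_KEY: "sk_test_replace_me",
  NEXT_PUBLIC_SITE_URL: "https://example.com",
  VITE_API_TOKEN: "replace_me",
  DATABASE_PASSWORD: "replace_me",
});
// exposed is ["NEXT_PUBLIC_STRIPE_SECRET_KEY", "VITE_API_TOKEN"]
```

Failing the build when the list is non-empty puts a reviewer in front of the decision, instead of letting a prefix change slip through to silence a warning.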
Leak detection: assume you will slip
You will eventually commit something you should not. Catch it fast.
- gitleaks as a pre-commit hook and a CI gate. It scans staged diffs, or the full history (which --log-opts can narrow to a commit range), against entropy and provider patterns:
# Pre-commit: block the commit if a secret is staged
gitleaks protect --staged --redact --verbose
# CI / audit: scan the entire history
gitleaks detect --source . --redact
- Provider push protection. Turn on GitHub Secret Scanning push protection at the org level. It blocks a push containing a recognized credential pattern before it ever reaches the remote, which makes it the cheapest possible save.
- When it does leak, rotate first, scrub second. The instinct is to rewrite history with git filter-repo. Do that, but only after you have revoked the key. A scrubbed-but-still-valid key in someone's existing clone is still a live key. Revocation is the fix; history rewriting is cleanup.
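For intuition about what "entropy and provider patterns" means, here is a toy scanner. It is a sketch of the general technique only, not gitleaks' actual rule set: the two regexes follow the publicly documented shapes of AWS access key IDs and Stripe live secret keys, and the 32-character and 4.5-bit thresholds are arbitrary assumptions.

```typescript
interface Finding {
  rule: string;
  line: number; // 1-based
}

// Provider patterns: known prefixes plus the expected character set and length.
const RULES: Array<{ name: string; pattern: RegExp }> = [
  { name: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  { name: "stripe-live-secret-key", pattern: /\bsk_live_[0-9A-Za-z]{24,}\b/ },
];

/** Shannon entropy in bits per character; random tokens score high. */
function shannonEntropy(s: string): number {
  const chars = Array.from(s);
  const counts = new Map<string, number>();
  chars.forEach((ch) => counts.set(ch, (counts.get(ch) ?? 0) + 1));
  let bits = 0;
  counts.forEach((n) => {
    const p = n / chars.length;
    bits -= p * Math.log2(p);
  });
  return bits;
}

function scanText(text: string): Finding[] {
  const findings: Finding[] = [];
  text.split(/\r?\n/).forEach((content, index) => {
    for (const rule of RULES) {
      if (rule.pattern.test(content)) {
        findings.push({ rule: rule.name, line: index + 1 });
      }
    }
    // Generic fallback: long tokens that look random.
    const candidates = content.match(/[A-Za-z0-9+\/_-]{32,}/g) ?? [];
    if (candidates.some((c) => shannonEntropy(c) >= 4.5)) {
      findings.push({ rule: "high-entropy-string", line: index + 1 });
    }
  });
  return findings;
}

// AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
const findings = scanText("key = AKIAIOSFODNN7EXAMPLE");
// findings is [{ rule: "aws-access-key-id", line: 1 }]
```

Real scanners add allowlists, many more provider rules, and history traversal, which is why the sensible move is to run one rather than write one.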
This ties directly into supply-chain and CI hardening. Your CI runner is the juiciest target you own — it has, by design, the credentials to deploy to production. Short-lived OIDC tokens, pinned action SHAs, scoped roles, and minimal permissions: blocks are the same discipline as secrets management, applied one layer out.
The decision framework
Run down this list whenever you stand up a new service:
- Local dev only? .env plus a committed .env.example, with .gitignore correct on day one.
- Deployed on one platform? Move secrets into platform env vars, scoped per environment.
- Production, multiple services, or any compliance need? Adopt a secret manager (AWS Secrets Manager, Vault, Doppler) or SOPS if you want git-native encrypted secrets. Fetch at startup, cache in memory.
- Cloud-to-cloud or CI auth? Use OIDC / workload identity so no long-lived credential exists at all. This should be the default for new CI, not an upgrade you get to "later."
- Always, at every stage: least privilege, no secrets in logs, rotation on a schedule and on offboarding, gitleaks in CI, and push protection on.
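The checklist above can be written down as a small function, which is handy if you want a service template or scaffold to print a recommendation. This is one possible reading of the list, not a rule engine: the ServiceProfile fields and the recommend function are hypothetical, and treating "more than one platform" as a trigger for a secret manager is an assumption drawn from the Stage 1 ceiling described earlier.

```typescript
interface ServiceProfile {
  deployed: boolean; // false means local development only
  platformCount: number;
  serviceCount: number;
  hasComplianceNeed: boolean;
  needsCloudOrCiAuth: boolean;
}

interface Recommendation {
  storage: ".env + .env.example" | "platform env vars" | "secret manager or SOPS";
  useWorkloadIdentity: boolean;
}

function recommend(profile: ServiceProfile): Recommendation {
  let storage: Recommendation["storage"];
  if (!profile.deployed) {
    storage = ".env + .env.example";
  } else if (
    profile.platformCount > 1 ||
    profile.serviceCount > 1 ||
    profile.hasComplianceNeed
  ) {
    storage = "secret manager or SOPS";
  } else {
    storage = "platform env vars";
  }
  // OIDC is independent of where the remaining secrets are stored.
  return { storage, useWorkloadIdentity: profile.needsCloudOrCiAuth };
}

const advice = recommend({
  deployed: true,
  platformCount: 1,
  serviceCount: 3,
  hasComplianceNeed: false,
  needsCloudOrCiAuth: true,
});
// advice is { storage: "secret manager or SOPS", useWorkloadIdentity: true }
```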
The goal is not perfect security. It is making the lazy path the safe path — a .gitignore that is right by default, a CI pipeline with no key to steal, and a manager that rotates so you never have to remember to. Get those in place and "we leaked a key" stops being a question of when.
Further reading
- The Twelve-Factor App, factor III (Config) — 12factor.net
- OWASP Top 10 — owasp.org
- GitHub Actions: Security hardening with OpenID Connect — docs.github.com
- RFC 7519 (JSON Web Token) and RFC 6749 (OAuth 2.0) — rfc-editor.org
- gitleaks and sops, both on GitHub under their respective org repos