GitHub Actions Patterns That Save Your Team Hours

CI that is slow, flaky, or insecure taxes every PR. Here are the GitHub Actions patterns I use for fast, cacheable, least-privilege pipelines.

A team I worked with last year was burning 14 minutes per CI run on a mid-sized TypeScript monorepo. Forty engineers, roughly 120 pushes a day. Do the math: that is 28 hours of wall-clock CI every single day, most of it spent reinstalling dependencies that had not changed and rerunning the entire suite on a one-line README edit. We cut it to under 4 minutes without touching the test code. The wins were all configuration: caching the right thing with the right key, cancelling superseded runs, sharding the suite, and skipping jobs that the diff did not touch.

None of this is exotic. The patterns below are the ones I reach for on every repo now. They make CI fast, they make it cheaper, and a couple of them close real supply-chain holes that most pipelines leave wide open.

Cache the right thing, and key it correctly

The single most common caching mistake I see is people thinking actions/setup-node with cache: 'npm' caches their node_modules. It does not. It caches the package manager's download cache (~/.npm), so npm ci still runs and still links every package. That is fine and you should use it, but understand what you are paying for: you skip the network fetch, not the install.

When the install itself is the bottleneck, cache node_modules directly with actions/cache and key it on the lockfile hash.

- uses: actions/setup-node@v4
  with:
    node-version: 22
    cache: npm            # caches ~/.npm download cache
 
- name: Cache node_modules
  id: modules
  uses: actions/cache@v4
  with:
    path: node_modules
    key: modules-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
    restore-keys: |
      modules-${{ runner.os }}-
 
- name: Install
  if: steps.modules.outputs.cache-hit != 'true'
  run: npm ci

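Two details that matter. The key includes a hash of the lockfile, so any dependency change produces a new cache entry instead of silently restoring a stale one. The restore-keys line gives you a partial fallback: if the exact lockfile hash misses, the most recent modules- cache is restored, cache-hit is not 'true', and the install step still runs. Be clear about what that buys you, though. npm ci deletes node_modules before it installs, so it rebuilds from scratch no matter what the fallback restored. The fallback only shortens the install if that step runs npm install, which reconciles the delta against the existing tree. With npm ci, the real saving comes from the exact hit, which skips the install step entirely.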

A warning that has bitten me: never cache node_modules across operating systems or Node major versions. Native modules compile against a specific ABI, and restoring a Linux build onto a Windows runner produces failures that look like application bugs. Put runner.os and the Node version in the key whenever native deps are in play.
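
A minimal sketch of a key that carries both the OS and the Node version. It assumes the node-version output of actions/setup-node, which reports the resolved version including the patch level, so the cache also rolls over on patch upgrades. The step id node is my own naming.

```yaml
- uses: actions/setup-node@v4
  id: node                 # hypothetical id, used only to read the output below
  with:
    node-version: 22
    cache: npm

- name: Cache node_modules
  id: modules
  uses: actions/cache@v4
  with:
    path: node_modules
    # OS + resolved Node version + lockfile hash
    key: modules-${{ runner.os }}-node-${{ steps.node.outputs.node-version }}-${{ hashFiles('package-lock.json') }}

- name: Install
  if: steps.modules.outputs.cache-hit != 'true'
  run: npm ci
```

If rolling the cache on every patch release is too aggressive for you, a matrix value or a hard-coded major version in the key works just as well.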

Cancel work nobody is waiting for

If someone pushes three commits to a PR in five minutes, the default behavior runs three full pipelines to completion. The first two are dead on arrival — nobody cares about CI results for a commit that has already been superseded. A concurrency group fixes this:

concurrency:
  group: ci-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

Group by github.ref so each branch gets its own lane and a new push to one branch never cancels another's run. On busy repos this alone reclaims a meaningful chunk of your runner minutes. The one place you do not want cancel-in-progress: true is deploy workflows: cancelling a half-finished deploy can leave you in a broken state. For deploys, use a concurrency group without cancellation so a new run waits for the running one to finish. Note that GitHub keeps at most one pending run per group; if another run arrives while one is running and one is waiting, the waiting run is cancelled in favor of the newest.
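
For the deploy case, a sketch of the same block with cancellation turned off. The group name deploy-production is an assumption; any stable string shared by the runs that must not overlap will do.

```yaml
concurrency:
  group: deploy-production     # hypothetical name: one lane for all production deploys
  cancel-in-progress: false    # let the running deploy finish; the next run waits
```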

Matrix builds and test sharding

A matrix fans one job definition out across dimensions — Node versions, operating systems, database engines. It is the clean way to prove your code works across the support surface without copy-pasting jobs:

strategy:
  fail-fast: false
  matrix:
    node: [20, 22, 24]
    os: [ubuntu-latest, windows-latest]

fail-fast: false is deliberate. The default stops every matrix leg the moment one fails, which is maddening when you want to know whether a failure is Windows-specific or universal. Turn it off so you get the full picture in one run.
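
A sketch of how a job consumes those matrix values. The job name test and the npm test script are assumptions about your project, not something the matrix requires.

```yaml
jobs:
  test:
    runs-on: ${{ matrix.os }}            # one leg per OS in the matrix
    strategy:
      fail-fast: false
      matrix:
        node: [20, 22, 24]
        os: [ubuntu-latest, windows-latest]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}   # one leg per Node version
          cache: npm
      - run: npm ci
      - run: npm test                    # assumes a "test" script in package.json
```

Three Node versions times two operating systems gives six legs, each running the same steps.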

The same machinery shards a slow suite across parallel runners. If your tests take 12 minutes on one machine, split them four ways and finish in three:

strategy:
  matrix:
    shard: [1, 2, 3, 4]
steps:
  - run: npx jest --shard=${{ matrix.shard }}/4

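Jest, Vitest, Playwright, and most modern runners support native sharding. The economics are close to neutral: GitHub bills per runner-minute, so four runners for three minutes cost roughly the same as one runner for twelve. It is not exactly free, because every shard repeats checkout and install and each job is rounded up to a whole minute, but the overhead is usually small next to the wall-clock time you get back.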
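
Here is a fuller sketch of a sharded job, assuming Jest and a job named test. It reads the shard count from strategy.job-total instead of hard-coding 4, so adding a fifth shard means editing only the matrix list.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm
      - run: npm ci
      # strategy.job-total is the number of legs in this matrix (4 here)
      - run: npx jest --shard=${{ matrix.shard }}/${{ strategy.job-total }}
```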

Skip jobs the diff never touched

In a monorepo, a change to the docs site should not run the backend integration suite. Path filters let a job decide whether it is even relevant. The cleanest approach uses dorny/paths-filter to compute which areas changed, then gates downstream jobs on its output:

jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      backend: ${{ steps.filter.outputs.backend }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@de90cc6fb38fc0963ad72b210f1f284cd68cea36 # v3.0.2
        id: filter
        with:
          filters: |
            backend:
              - 'services/api/**'
 
  backend-tests:
    needs: changes
    if: needs.changes.outputs.backend == 'true'
    runs-on: ubuntu-latest
    steps:
      - run: echo "running backend suite"

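One caveat for required checks, which I will come back to in the checklist: skipping interacts badly with branch protection in two ways. A job skipped by an if: condition reports as skipped, and GitHub counts that as passing for a required check, so a job that was skipped because an upstream job failed can let a PR merge without the suite ever running. A workflow skipped at the trigger level (on.paths or branch filters) is the opposite: its required checks stay pending forever and the PR never becomes mergeable. The fix for both is a small "all checks green" aggregator job that always runs, fails if any dependency failed or was cancelled, and succeeds when its dependencies either passed or were legitimately skipped. Make that job the only required check.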
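
A minimal sketch of such an aggregator, added under the same jobs: key as the example above. The job name ci-ok is hypothetical; the idea is to mark only this job as required in branch protection.

```yaml
  ci-ok:
    if: always()                       # run even when a dependency failed or was skipped
    needs: [changes, backend-tests]
    runs-on: ubuntu-latest
    steps:
      - name: Fail if any dependency failed or was cancelled
        if: contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled')
        run: exit 1
      - name: Report success
        run: echo "all required jobs passed or were skipped"
```

A skipped backend-tests has the result skipped, which matches neither condition, so the aggregator goes green on a docs-only change and red on a real failure.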

Passing artifacts between jobs

Jobs run on isolated machines with no shared filesystem. When a build job produces something a deploy job needs, upload it as an artifact and download it downstream:

  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run build
      - uses: actions/upload-artifact@v4
        with:
          name: dist
          path: dist/
          retention-days: 5
 
  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: dist
          path: dist/

Set retention-days low. The default is 90 days and artifacts count against storage billing; a build output you will never inspect after the deploy completes does not need to live for three months. For passing small values like a version string or a computed flag, skip artifacts entirely and use job outputs — they are cheaper and clearer.
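
A sketch of the job-output route for a small value. The step id meta and the output name version are my own naming, and reading the version from package.json is only an assumption about where it lives.

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      version: ${{ steps.meta.outputs.version }}
    steps:
      - uses: actions/checkout@v4
      - id: meta
        # write key=value to the step output file
        run: |
          echo "version=$(node -p "require('./package.json').version")" >> "$GITHUB_OUTPUT"

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - env:
          VERSION: ${{ needs.build.outputs.version }}
        run: echo "Deploying version $VERSION"
```

Passing the value through env rather than interpolating it straight into the script keeps an unexpected string from being executed as shell.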

OIDC instead of long-lived cloud secrets

This is the pattern I most want mid-level engineers to internalize. The old way to deploy from CI was to mint an AWS access key, paste it into repository secrets, and hope it never leaks. That key is long-lived, broadly scoped, and a single compromised action or malicious PR can exfiltrate it. Rotating it across dozens of repos is a chore everyone avoids.

OIDC removes the standing credential entirely. GitHub mints a short-lived, signed token for each run; your cloud provider trusts GitHub as an identity provider and exchanges that token for temporary credentials scoped to exactly one role. Nothing long-lived ever sits in your secrets.

  deploy:
    runs-on: ubuntu-latest
    environment: production
    permissions:
      id-token: write     # required to mint the OIDC token
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-deploy
          aws-region: eu-central-1
      - run: aws s3 sync dist/ s3://my-bucket --delete

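On the AWS side you create a trust policy that only accepts tokens from your specific repo and context, so a fork or feature branch cannot assume the role. The sub claim is what you match on, and its format depends on the job. Because the deploy job above sets environment: production, GitHub issues repo:my-org/my-repo:environment:production, and the branch restriction comes from the environment's deployment branch rules. A job without an environment gets the ref form, repo:my-org/my-repo:ref:refs/heads/main. Pin the aud claim to sts.amazonaws.com as well. The credentials this produces expire in roughly an hour by default and exist only inside that one job. Azure and Google Cloud offer equivalent federation mechanisms. If you are still pasting cloud keys into GitHub secrets in 2026, this is the highest-leverage change on the list.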
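
A sketch of the role and its trust policy as a CloudFormation fragment. It assumes the GitHub OIDC identity provider already exists in the account and that the role is the github-deploy role from the workflow above. The logical name GitHubDeployRole is hypothetical. Because that deploy job declares environment: production, the sub claim GitHub issues takes the environment form, which is what the condition matches; a job without an environment would need the ref form instead.

```yaml
Resources:
  GitHubDeployRole:                      # hypothetical logical name
    Type: AWS::IAM::Role
    Properties:
      RoleName: github-deploy
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              # assumes the OIDC provider is already registered in this account
              Federated: !Sub "arn:aws:iam::${AWS::AccountId}:oidc-provider/token.actions.githubusercontent.com"
            Action: sts:AssumeRoleWithWebIdentity
            Condition:
              StringEquals:
                "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
                # environment form, because the job sets environment: production
                "token.actions.githubusercontent.com:sub": "repo:my-org/my-repo:environment:production"
```

Use StringEquals rather than a wildcard match wherever you can; a loose pattern on sub is the usual way these policies end up trusting more than intended.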

Least privilege with an explicit permissions block

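Depending on when your repository or organization was created, the GITHUB_TOKEN in a workflow may default to broad read-write access; older repos and orgs often still carry the permissive setting. Do not rely on the default either way. Most jobs need almost none of that access. Set permissions explicitly, deny by default, and grant only what each job uses: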

permissions:
  contents: read          # repo-wide default: read-only
 
jobs:
  release:
    permissions:
      contents: write      # this job tags and pushes
      packages: write

A workflow-level contents: read means a compromised step in a test job cannot push to your default branch or open a release. Each job then widens its own grant only where it must. This is the same least-privilege instinct you apply to IAM, applied to the token CI hands every third-party action you run.
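
If you want to go one step stricter, here is a sketch that starts from no scopes at all. It relies on the documented behavior that any scope not listed in a permissions block is set to none. The job name test and the npm test script are assumptions.

```yaml
permissions: {}              # workflow default: no scopes at all

jobs:
  test:
    runs-on: ubuntu-latest
    permissions:
      contents: read         # checkout needs this; every unlisted scope stays "none"
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test
```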

Pin third-party actions by SHA

Here is the supply-chain hole almost nobody closes. When you write uses: some-org/some-action@v3, that v3 is a mutable git tag. The author, or anyone who compromises their account, can move it to point at new code, and your pipeline will execute that code with whatever token permissions you granted, on your next run, with no PR and no review. There is real history here: in the tj-actions/changed-files compromise in 2025, version tags were retargeted to a commit that dumped runner secrets into build logs across thousands of repos.

Pin to the full commit SHA instead. A SHA is immutable; it cannot be moved.

# Mutable — the tag can be repointed under you:
- uses: actions/checkout@v4
 
# Immutable — pinned to an exact, reviewable commit:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1

Keep the version in a trailing comment so humans still know what they are running. Then let Dependabot manage the bumps — it understands SHA pins and opens PRs that update both the SHA and the comment, so you get an auditable review point for every action upgrade instead of silent mutation. Pin your own org's actions too if they are public; the threat model is identical.

| Reference style | Mutable? | Auditable upgrades | Supply-chain safe |
| --- | --- | --- | --- |
| @main | Yes | No | No |
| @v3 (major tag) | Yes | No | No |
| @v3.1.4 (exact tag) | Yes (tags can move) | Partial | No |
| @<full SHA> | No | Yes (via PR) | Yes |
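
The Dependabot side of this is a few lines of config. A minimal sketch, assuming a weekly cadence suits your team:

```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: github-actions
    directory: /             # "/" covers the workflows under .github/workflows
    schedule:
      interval: weekly
```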

Reusable workflows and composite actions

Once you have three repos with near-identical pipelines, stop copy-pasting. A reusable workflow is an entire job graph another workflow can call; a composite action bundles a sequence of steps you drop into a job. Reach for a reusable workflow when you want to share a whole pipeline, and a composite action when you want to share a chunk of steps.

# Caller:
jobs:
  ci:
    uses: my-org/.github/.github/workflows/node-ci.yml@a1b2c3d4 # placeholder: use the full 40-character commit SHA
    with:
      node-version: 22
    secrets: inherit

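Pin the called workflow by full commit SHA for the same reason you pin actions. Be aware that secrets: inherit hands the called workflow every secret the caller can see; if the callee needs only one or two, pass them by name instead. Centralizing your pipeline this way means a caching or security fix lands once, and every consuming repo picks it up with a one-line SHA bump.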
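
For completeness, a sketch of what the called workflow might look like. The file path matches the caller above; the job name test and the npm test script are assumptions.

```yaml
# .github/workflows/node-ci.yml in the my-org/.github repository
on:
  workflow_call:
    inputs:
      node-version:
        type: number         # workflow_call inputs must declare a type
        default: 22

permissions:
  contents: read

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ inputs.node-version }}
          cache: npm
      - run: npm ci
      - run: npm test        # assumes a "test" script in the consuming repo
```

The workflow_call trigger is what makes the file callable; without it, the uses: line in the caller fails.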

Environment protection for deploys

Wrap production deploys in a GitHub Environment with protection rules: required reviewers, a wait timer, and a branch restriction so only main can deploy to it. Environment secrets are scoped to that environment, so a PR from a feature branch physically cannot read your production OIDC role configuration. This is the guardrail that turns "anyone who can merge can deploy to prod" into "a deploy to prod pauses for a named approver." Combined with OIDC, it is a strong default.

The checklist

Before you call a pipeline done, walk this list:

  • Dependencies cached with a lockfile-hashed key, plus OS and Node version when native modules are involved
  • concurrency group with cancel-in-progress: true on CI (and without it on deploys)
  • Slow suites sharded across parallel runners
  • Path filters skip jobs the diff did not touch
  • Required checks neither hang on a skipped workflow nor pass on a silently skipped job (use an aggregator if needed)
  • Artifacts have a short retention-days; small values use job outputs
  • Top-level permissions: contents: read, widened per job only where needed
  • Cloud deploys use OIDC with a scoped trust policy — zero long-lived keys in secrets
  • Every third-party action pinned to a full commit SHA, with Dependabot managing bumps
  • Shared pipelines extracted into reusable workflows or composite actions
  • Production deploys gated behind a protected Environment with required reviewers

Work top to bottom. The caching and concurrency items pay you back in minutes per run; the OIDC and SHA-pinning items pay you back the one time they stop a credential from leaking, and that one time is worth more than all the minutes combined.