PER-ACTION LOOKUP SEQUENCE ON EVERY CI RUN

The six steps below run on every CI run. On a 5,000-action monorepo with an 80% remote cache hit rate, steps 1–3 execute for all 5,000 actions and steps 4–6 execute for the 4,000 actions that hit. Nothing is ever skipped at the disk layer.

Step 1: Bazel computes action key (~0ms)

Hashes input digests + command + whitelisted env vars for compile_server. Deterministic if the build is hermetic; changes on every run if env vars or timestamps leak in.
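As a rough illustration of why leaked environment variables break caching, here is a simplified stand-in for an action key. It is not Bazel's actual algorithm; the function name, the JSON encoding, and the sample digests and variables are all hypothetical.

```python
import hashlib
import json


def action_key(input_digests, command, env, env_allowlist):
    """Simplified stand-in for an action key: SHA-256 over the input
    digests, the command line, and only the allowlisted env vars."""
    payload = {
        "inputs": sorted(input_digests.items()),
        "command": list(command),
        "env": sorted((k, v) for k, v in env.items() if k in env_allowlist),
    }
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Hypothetical inputs for a single compile action.
inputs = {"server.cc": "d1", "server.h": "d2"}
command = ["clang++", "-c", "server.cc", "-o", "server.o"]

env_run_1 = {"PATH": "/usr/bin", "BUILD_ID": "101"}
env_run_2 = {"PATH": "/usr/bin", "BUILD_ID": "102"}

# Hermetic: BUILD_ID is not allowlisted, so it cannot affect the key.
hermetic_1 = action_key(inputs, command, env_run_1, {"PATH"})
hermetic_2 = action_key(inputs, command, env_run_2, {"PATH"})

# Leaky: BUILD_ID is part of the key, so every run produces a new key.
leaky_1 = action_key(inputs, command, env_run_1, {"PATH", "BUILD_ID"})
leaky_2 = action_key(inputs, command, env_run_2, {"PATH", "BUILD_ID"})
```

In this sketch the two hermetic keys are equal and the two leaky keys differ, which is the difference between a cache entry that stays valid run-to-run and one that never hits.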

Step 2: Check --disk_cache (AC lookup) (miss)

Runner is freshly provisioned. The disk cache directory exists but holds zero entries. Every action misses here, always. Cost per miss: ~0.5ms. Total cost on 5,000 actions: ~2.5s of wasted disk I/O returning nothing.

Step 3: Check --remote_cache (AC lookup over network) (20-100ms)

Round-trip to GCS or S3. Even if Bazel issues these in parallel (32 concurrent requests on a 16-vCPU runner), 5,000 lookups ÷ 32 = 156 batches × 50ms avg = ~8 seconds of pure AC lookup latency per build.
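The estimate above is simple enough to check in a few lines. This sketch assumes lookups go out in fixed-size waves and that every wave costs one average round-trip; the function name is made up for illustration. It rounds the batch count up, so it reports 157 batches rather than 156.

```python
import math


def ac_lookup_seconds(actions, concurrency, latency_ms):
    """Wall-clock estimate when lookups are issued in fixed-size waves,
    each wave costing one round-trip at the given average latency."""
    batches = math.ceil(actions / concurrency)
    return batches * latency_ms / 1000.0


estimate = ac_lookup_seconds(actions=5000, concurrency=32, latency_ms=50)
print(f"{estimate:.2f}s")  # prints 7.85s
```

Swapping in the 20ms and 100ms ends of the range gives roughly 3s and 16s for the same graph.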

Step 4: Remote AC returns hit: digest a3f7... for server.o (hit)

The AC result contains output names mapped to content digests — not the actual file bytes. Bazel now knows what server.o should look like, but doesn't have the file.

Step 5: Check disk for blob a3f7... (miss)

Bazel checks the local CAS (disk cache) for the blob before fetching it over the network. Empty disk = miss. This is a second wasted disk lookup per cached output file.

Step 6: Fetch blob a3f7... from remote CAS (30-200ms per file)

Bazel downloads the actual bytes from GCS/S3. For a Rust compile action, server.rlib can be 5–30MB. At 100MB/s effective throughput, a 10MB blob costs 100ms. Multiply by 4,000 cache hits.
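To see what "multiply by 4,000" comes to, here is a back-of-the-envelope sketch. It assumes one blob per cache hit and that the 100MB/s is an aggregate limit, so parallel fetches share it rather than add to it. The helper name is hypothetical.

```python
def download_minutes(hits, blob_mb, throughput_mb_per_s):
    """Bandwidth-bound time to pull one blob per cache hit, in minutes."""
    total_mb = hits * blob_mb
    return total_mb / throughput_mb_per_s / 60.0


for size_mb in (5, 10, 30):
    minutes = download_minutes(hits=4000, blob_mb=size_mb, throughput_mb_per_s=100)
    print(f"{size_mb}MB blobs: {minutes:.1f} min")
```

Under those assumptions the three sizes work out to about 3.3, 6.7, and 20 minutes of transfer time per build.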

Net result: On every run, step 2 adds ~2.5s of wasted disk lookups that return nothing, and step 5 adds roughly 2s more if each of the 4,000 hits costs one ~0.5ms blob check. Steps 3 and 6 together add 8–15 minutes of network time for a 5,000-action graph with an 80% hit rate. The disk cache flag is set. It contributes nothing.

PER-ACTION LOOKUP SEQUENCE WITH NAMESPACE CACHE VOLUME

Same 5,000-action build. The Cache Volume has been warm since run 2. Steps 2 and 5 now return hits at NVMe speed. Steps 3, 4, 6 execute only for actions that genuinely changed since the last committed cache state.

Step 1: Bazel computes action key (~0ms)

Identical to the ephemeral-runner case. Action key computation doesn't change. If your build is hermetic, the key is stable run-to-run and the Cache Volume entry remains valid.

Step 2: Check mounted Cache Volume (AC lookup) (hit, <1ms)

The Cache Volume persists across job teardown. Bazel's --disk_cache is pointed at the mounted NVMe volume. For unchanged actions, the AC entry is already there from the last committed run. NVMe latency: 0.1–0.5ms per lookup.

Step 5: Check Cache Volume for blob a3f7... (hit, <1ms)

The blob is already in the local CAS on the Cache Volume. Bazel reads it from NVMe directly. No network request. No waiting. Steps 3, 4, and 6 are skipped entirely for this action.

Steps 3–6: Remote cache consulted only for genuinely changed actions (skipped for warm items)

Steps 3, 4, 6 still execute for the ~20% of actions that missed the local cache (i.e., files that actually changed). The remote cache handles the delta; the Cache Volume handles the stable majority.

Net result: For 4,000 cached actions, lookup cost drops from ~8s (network AC) + download time to ~4s total (NVMe reads). Network traffic shifts from the full 80% of the graph to only the 20% that actually changed. Run 2 onwards is where the Cache Volume pays off.
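The two lookup sequences can be compared with a small simulation. This is a toy model that uses dictionaries in place of the disk and remote caches, not Bazel's implementation. It assumes one output blob per action and that a remote hit is also written into the local disk cache; all names are hypothetical.

```python
def build(action_keys, disk_ac, disk_cas, remote_ac, remote_cas):
    """Walk steps 2-6 for each action and count what each tier did."""
    stats = {"disk_misses": 0, "remote_ac_lookups": 0,
             "blob_downloads": 0, "executed": 0}
    for key in action_keys:
        digest = disk_ac.get(key)                    # step 2: local AC lookup
        if digest is None:
            stats["disk_misses"] += 1
            stats["remote_ac_lookups"] += 1
            digest = remote_ac.get(key)              # step 3: remote AC lookup
            if digest is None:
                stats["executed"] += 1               # miss everywhere: run it
                continue
            disk_ac[key] = digest                    # step 4: remember the hit
        if digest not in disk_cas:                   # step 5: local CAS lookup
            stats["disk_misses"] += 1
            stats["blob_downloads"] += 1
            disk_cas[digest] = remote_cas[digest]    # step 6: fetch the blob
    return stats


keys = [f"action-{i}" for i in range(5000)]
# The remote cache knows 80% of the graph.
remote_ac = {k: f"digest-{k}" for k in keys[:4000]}
remote_cas = {d: b"bytes" for d in remote_ac.values()}

# Ephemeral runner: the disk cache starts empty on every run.
cold = build(keys, {}, {}, remote_ac, remote_cas)

# Cache Volume: the disk cache survives from a previous run.
volume_ac, volume_cas = {}, {}
build(keys, volume_ac, volume_cas, remote_ac, remote_cas)   # run 1 warms it
warm = build(keys, volume_ac, volume_cas, remote_ac, remote_cas)
```

In this model the cold run performs 5,000 remote AC lookups and 4,000 blob downloads, while the warm run performs 1,000 remote AC lookups and no downloads. Both runs still execute the same 1,000 uncached actions.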

Cache Volume lifecycle across runs

Run 1 — cold start

Cache Volume is empty. All actions fall through to remote cache. At the end of the run (exit 0), Bazel's output is committed as the new cache state. Volume now holds the full build output.

Run 2 — warm cache, incremental change

Job forks the committed cache state into a private copy. 80% of actions hit the NVMe volume at <1ms. Only actions downstream of changed files hit the remote cache. Committed at exit 0.

Parallel PRs — concurrent forks

PR A and PR B each get their own fork of the last committed state. Their writes don't collide. Last job to complete successfully wins the commit. Deterministic Bazel outputs make this safe.

Failed job — cache unchanged

If a job exits non-zero, its cache writes are discarded. The committed state stays at the last successful run. No partial-build corruption.
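The fork-and-commit lifecycle described above can be sketched as a toy model. This is not Namespace's implementation: the class and method names are hypothetical, and it assumes that a winning commit replaces the whole committed state rather than merging with it.

```python
import copy


class CacheVolume:
    """Toy model: one committed state, one private fork per job."""

    def __init__(self):
        self.committed = {}

    def fork(self):
        # Each job gets its own copy, so concurrent writes cannot collide.
        return copy.deepcopy(self.committed)

    def finish(self, fork, exit_code):
        # Only a successful job (exit 0) may replace the committed state.
        if exit_code == 0:
            self.committed = fork
        # On a non-zero exit the fork is simply dropped.


volume = CacheVolume()

# Run 1: cold start, then commit.
run_1 = volume.fork()
run_1["lib"] = "digest-lib"
volume.finish(run_1, exit_code=0)

# Parallel PRs: both fork the same committed state.
pr_a = volume.fork()
pr_b = volume.fork()
pr_a["feature_a"] = "digest-a"
pr_b["feature_b"] = "digest-b"
volume.finish(pr_a, exit_code=0)
volume.finish(pr_b, exit_code=0)   # last successful job wins the commit

# Failed job: its writes never reach the committed state.
failed = volume.fork()
failed["partial"] = "digest-partial"
volume.finish(failed, exit_code=1)

final_state = volume.committed
```

After these four jobs the committed state holds the entries from run 1 and PR B only. PR A's entry is lost to the later commit, which costs a rebuild but no correctness, as long as outputs are deterministic.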

CACHE LOOKUP LATENCY PER OPERATION

Each AC lookup is one network round-trip (or one disk read). The latency difference between a warm NVMe volume and a cross-region GCS bucket compounds across thousands of actions per build.

Storage tier	Latency per lookup
Namespace NVMe hot tier	< 1ms
Local SSD (same region)	1-4ms
S3 (same region)	20-40ms
GCS (same region)	25-50ms
GCS (cross-region)	50-100ms
S3 (cross-region)	60-120ms

Cumulative AC lookup overhead — 5,000-action build, 80% hit rate

Bazel issues AC lookups in parallel. Assuming 32 concurrent requests on a 16-vCPU runner:

Cache backend	Cumulative AC lookup time
Namespace NVMe	~2s total
GCS same-region	~4-7s
GCS cross-region	~8-16s

AC lookup overhead is separate from blob download time. A 5,000-action build with 80% hit rate also downloads ~4,000 cached blobs over the network on an ephemeral runner. That's the larger cost — and it also disappears with a warm Cache Volume.

WHY actions/cache WITH ~/.cache/bazel DOESN'T CLOSE THE GAP

The standard workaround is to archive the Bazel cache directory at the end of each run and restore it at the start of the next. Each failure mode below is a separate problem; most teams hit several simultaneously.

Failure mode	Cause	Impact
Restore overhead	Single-threaded tar extract before job can start	2-4 min added to every run for large Rust/C++ caches
10GB repo cap	GitHub enforces a 10GB limit per repo	Active Rust monorepos exceed this in days; eviction causes cold starts
Branch key mismatch	Cache key includes branch name; PRs miss `main` cache	Every PR runs cold regardless of how warm the main branch cache is
CPU-bound tar on 2-core runner	Standard GitHub runner has 2 vCPUs; tar is single-threaded by default	Save/restore is slower than the download it's trying to avoid
Stale snapshot analysis	Restored cache is a snapshot; Bazel re-analyses full graph to verify	Full graph re-analysis adds 60-90s even when all outputs are valid
No concurrent write safety	Two parallel jobs restore the same cache key, each writes a full archive	Second writer overwrites first; cache state is non-deterministic

Time lost to restore (Rust monorepo, ~20GB cache): 3-5 min per run, before a single Bazel action fires

Cache Volume restore overhead (Namespace): < 2s (NVMe volume mount, no tar, no network download)

The underlying problem: actions/cache is designed for dependency archives (node_modules, pip packages) that are read once and rarely change. Bazel's output cache is written continuously, grows unboundedly, and needs fine-grained invalidation, not coarse archive snapshots. Fitting it into actions/cache is technically possible and operationally fragile.