These six steps run on every build of your graph. On a 5,000-action monorepo with an 80% remote cache hit rate, steps 1–3 execute 5,000 times (once per action) and steps 4–6 execute 4,000 times (once per cache hit). Nothing is ever skipped at the disk layer.

Step 1: Bazel computes action key (~0ms)
Hashes input digests + command + whitelisted env vars for compile_server. Deterministic if the build is hermetic; changes on every run if env vars or timestamps leak in.

Step 2: Check --disk_cache, AC lookup (miss)
Runner is freshly provisioned. The disk cache directory exists but holds zero entries. Every action misses here, always. Cost per miss: ~0.5ms. Total cost on 5,000 actions: ~2.5s of wasted disk I/O returning nothing.

Step 3: Check --remote_cache, AC lookup over network (20-100ms)
Round-trip to GCS or S3. Even if Bazel issues these in parallel (32 concurrent requests on a 16-vCPU runner), 5,000 lookups ÷ 32 = 156 batches × 50ms avg = ~8 seconds of pure AC lookup latency per build.

Step 4: Remote AC returns digest a3f7... for server.o (hit)
The AC result contains output names mapped to content digests, not the actual file bytes. Bazel now knows what server.o should look like, but doesn't have the file.

Step 5: Check disk for blob a3f7... (miss)
Bazel checks the local CAS (disk cache) for the blob before fetching it over the network. Empty disk = miss. This is a second wasted disk lookup per cached output file.

Step 6: Fetch blob a3f7... from remote CAS (30-200ms per file)
Bazel downloads the actual bytes from GCS/S3. For a Rust compile action, server.rlib can be 5–30MB. At 100MB/s effective throughput, a 10MB blob costs 100ms. Multiply by 4,000 cache hits.
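
To make step 1 concrete, here is a minimal sketch of how an action key can be derived from input digests, the command line, and an allowlisted subset of the environment. It is an illustration only: Bazel's real key is a digest over a serialized action message, not JSON, and the function name `action_key` and its parameters are hypothetical.

```python
import hashlib
import json


def action_key(input_digests, command, env, env_allowlist):
    """Illustrative action key: NOT Bazel's actual wire format.

    input_digests: dict mapping input path -> content digest (hex string)
    command:       list of argv strings
    env:           full process environment (dict)
    env_allowlist: set of env var names allowed to influence the key
    """
    payload = {
        # Sort inputs so dict ordering never changes the key.
        "inputs": dict(sorted(input_digests.items())),
        "command": list(command),
        # Only allowlisted variables are hashed; everything else is ignored.
        "env": {k: env[k] for k in sorted(env_allowlist) if k in env},
    }
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


if __name__ == "__main__":
    inputs = {"server.c": "aa11", "server.h": "bb22"}
    cmd = ["gcc", "-c", "server.c", "-o", "server.o"]
    hermetic = action_key(inputs, cmd, {"PATH": "/usr/bin", "BUILD_TS": "1"}, {"PATH"})
    leaky = action_key(inputs, cmd, {"PATH": "/usr/bin", "BUILD_TS": "1"}, {"PATH", "BUILD_TS"})
    print("hermetic:", hermetic[:12])
    print("leaky:   ", leaky[:12])
```

With only `PATH` allowlisted, a changing `BUILD_TS` does not affect the key. Once a volatile variable is allowed into the hash, the key changes on every run and every cache layer misses.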

Net result: On every run, steps 2 and 5 add roughly 4.5s of wasted disk lookups that return nothing (~2.5s for 5,000 AC misses plus ~2s for 4,000 blob misses at ~0.5ms each). Steps 3 and 6 together add 8–15 minutes of network time for a 5,000-action graph with 80% hit rate. The disk cache flag is set. It contributes nothing.
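
The per-step arithmetic can be reproduced with a small back-of-the-envelope model. This is a sketch under the figures quoted in the steps (0.5ms per disk miss, 50ms average AC round-trip, 32 concurrent requests, 10MB blobs at 100MB/s); it additionally assumes one output blob per cache hit and treats throughput as an aggregate limit. The function name and parameters are hypothetical.

```python
import math


def ephemeral_overhead(actions=5000, hit_rate=0.8, disk_miss_ms=0.5,
                       ac_rtt_ms=50, concurrency=32,
                       blob_mb=10, throughput_mb_s=100):
    """Estimate per-build cache overhead (seconds) on a cold ephemeral runner."""
    hits = round(actions * hit_rate)
    batches = math.ceil(actions / concurrency)
    return {
        # Step 2: every action misses the empty disk cache.
        "disk_ac_miss_s": actions * disk_miss_ms / 1000,
        # Step 3: AC lookups go out in batches of `concurrency`.
        "remote_ac_s": batches * ac_rtt_ms / 1000,
        # Step 5: each remote hit checks the empty local CAS first
        # (assumes one blob per hit).
        "disk_cas_miss_s": hits * disk_miss_ms / 1000,
        # Step 6: total bytes divided by effective throughput.
        "download_s": hits * blob_mb / throughput_mb_s,
    }


if __name__ == "__main__":
    for step, seconds in ephemeral_overhead().items():
        print(f"{step:>16}: {seconds:8.2f}s")
```

The download term dominates and scales linearly with average blob size, so builds with larger outputs (for example 20-30MB Rust artifacts) spend proportionally longer in step 6.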

Same 5,000-action build. The Cache Volume has been warm since run 2. Steps 2 and 5 now return hits at NVMe speed. Steps 3, 4, 6 execute only for actions that genuinely changed since the last committed cache state.

Step 1: Bazel computes action key (~0ms)
Identical to ephemeral. Action key computation doesn't change. If your build is hermetic, the key is stable run-to-run and the Cache Volume entry remains valid.

Step 2: Check mounted Cache Volume, AC lookup (hit, <1ms)
The Cache Volume persists across job teardown. Bazel's --disk_cache is pointed at the mounted NVMe volume. For unchanged actions, the AC entry is already there from the last committed run. NVMe latency: 0.1–0.5ms per lookup.

Step 5 (runs directly after step 2 on a local hit): Check Cache Volume for blob a3f7... (hit, <1ms)
The blob is already in the local CAS on the Cache Volume. Bazel reads it from NVMe directly. No network request. No waiting. Steps 3, 4, and 6 are skipped entirely for this action.

Steps 3–6: Remote cache consulted only for genuinely changed actions (skipped for warm items)
Steps 3, 4, 6 still execute for the ~20% of actions that missed the local cache (i.e., files that actually changed). The remote cache handles the delta; the Cache Volume handles the stable majority.

Net result: For 4,000 cached actions, lookup cost drops from ~8s (network AC) + download time to ~4s total (NVMe reads). Network traffic shifts from the full 80% of the graph to only the 20% that actually changed. Run 2 onwards is where the Cache Volume pays off.
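
The lookup order described above can be sketched as a two-tier cache: consult local disk first, fall back to the network only on a miss, and keep what was fetched. This is a simplified model, not Bazel's implementation: it collapses the AC and CAS lookups into a single key, and the class `TieredCache` is hypothetical.

```python
class TieredCache:
    """Disk-first lookup with remote fallback (illustrative model only)."""

    def __init__(self, disk, remote):
        self.disk = disk            # dict standing in for --disk_cache
        self.remote = remote        # dict standing in for --remote_cache
        self.network_requests = 0   # counts simulated round-trips

    def get(self, key):
        # Local tier: no network involved on a hit.
        if key in self.disk:
            return self.disk[key]
        # Remote tier: one simulated network round-trip.
        self.network_requests += 1
        value = self.remote.get(key)
        if value is not None:
            # Keep a local copy so the next lookup stays on disk.
            self.disk[key] = value
        return value


if __name__ == "__main__":
    remote = {f"action-{i}": b"blob" for i in range(10)}

    cold = TieredCache({}, dict(remote))
    warm = TieredCache({k: remote[k] for k in list(remote)[:8]}, dict(remote))
    for key in remote:
        cold.get(key)
        warm.get(key)

    print("cold runner network requests:", cold.network_requests)
    print("warm volume network requests:", warm.network_requests)
```

On an ephemeral runner the `disk` dict starts empty on every job, so every lookup pays the network cost. With a persistent volume, only keys missing from the previous committed state reach the remote tier.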

Cache Volume lifecycle across runs

Run 1 — cold start
Cache Volume is empty. All actions fall through to remote cache. At the end of the run (exit 0), Bazel's output is committed as the new cache state. Volume now holds the full build output.
Run 2 — warm cache, incremental change
Job forks the committed cache state into a private copy. 80% of actions hit the NVMe volume at <1ms. Only actions downstream of changed files hit the remote cache. Committed at exit 0.
Parallel PRs — concurrent forks
PR A and PR B each get their own fork of the last committed state. Their writes don't collide. Last job to complete successfully wins the commit. Deterministic Bazel outputs make this safe.
Failed job — cache unchanged
If a job exits non-zero, its cache writes are discarded. The committed state remains that of the last successful run. No partial-build corruption.
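
The fork-and-commit semantics above can be modeled in a few lines. This is a toy sketch of the described behavior, not Namespace's API: the class `CacheVolume` and its methods `fork` and `finish` are hypothetical names, and a dict stands in for the volume contents.

```python
class CacheVolume:
    """Toy model: jobs work on private forks; only exit 0 commits."""

    def __init__(self):
        self.committed = {}  # last successfully committed cache state

    def fork(self):
        # Each job gets a private copy of the last committed state.
        return dict(self.committed)

    def finish(self, fork, exit_code):
        # A successful job replaces the committed state with its fork.
        if exit_code == 0:
            self.committed = fork
            return True
        # A failed job's writes are dropped; committed state is untouched.
        return False


if __name__ == "__main__":
    vol = CacheVolume()

    run1 = vol.fork()                 # Run 1: cold start
    run1["server.o"] = "a3f7"
    vol.finish(run1, exit_code=0)

    pr_a, pr_b = vol.fork(), vol.fork()   # Parallel PRs: independent forks
    pr_a["a.o"] = "1111"
    pr_b["b.o"] = "2222"
    vol.finish(pr_a, exit_code=0)
    vol.finish(pr_b, exit_code=0)     # last successful job wins

    failed = vol.fork()               # Failed job: writes discarded
    failed["partial.o"] = "dead"
    vol.finish(failed, exit_code=1)

    print(sorted(vol.committed))
```

Because the last successful commit replaces the whole state, entries written only by PR A are absent after PR B commits. That costs a later cache miss for those entries, not correctness, provided outputs are deterministic for a given action key.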

Each AC lookup is one network round-trip (or one disk read). The latency difference between a warm NVMe volume and a cross-region GCS bucket compounds across thousands of actions per build.

AC lookup latency by storage tier:

Namespace NVMe hot tier: < 1ms
Local SSD (same region): 1-4ms
S3 (same region): 20-40ms
GCS (same region): 25-50ms
GCS (cross-region): 50-100ms
S3 (cross-region): 60-120ms

Bazel issues AC lookups in parallel. Assuming 32 concurrent requests on a 16-vCPU runner:

Total AC lookup time for a 5,000-action build:

Namespace NVMe: ~2s total
GCS same-region: ~4-7s
GCS cross-region: ~8-16s
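
The network totals follow from a simple batch model: lookups go out in waves of 32, and each wave costs one round-trip. The sketch below applies that model to the per-lookup latency ranges quoted earlier. It assumes perfectly uniform waves with no pipelining or connection reuse effects, so it is a rough estimate rather than a measurement; the names `TIER_LATENCY_MS` and `total_lookup_seconds` are hypothetical.

```python
import math

# Per-lookup latency ranges in milliseconds, taken from the figures above.
TIER_LATENCY_MS = {
    "S3 same-region": (20, 40),
    "GCS same-region": (25, 50),
    "GCS cross-region": (50, 100),
    "S3 cross-region": (60, 120),
}


def total_lookup_seconds(latency_ms, actions=5000, concurrency=32):
    """Total AC lookup time if requests are issued in uniform waves."""
    waves = math.ceil(actions / concurrency)
    return waves * latency_ms / 1000


if __name__ == "__main__":
    for tier, (low, high) in TIER_LATENCY_MS.items():
        lo_s = total_lookup_seconds(low)
        hi_s = total_lookup_seconds(high)
        print(f"{tier:<18} {lo_s:5.1f}s to {hi_s:5.1f}s")
```

Doubling concurrency halves the number of waves, but the per-wave latency floor remains, which is why moving lookups to a local volume changes the result more than tuning parallelism does.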

AC lookup overhead is separate from blob download time. A 5,000-action build with 80% hit rate also downloads ~4,000 cached blobs over the network on an ephemeral runner. That's the larger cost — and it also disappears with a warm Cache Volume.

The standard workaround is to archive the Bazel cache directory at the end of each run and restore it at the start of the next. Each failure mode below is a separate problem; most teams hit several simultaneously.

| Failure mode | Cause | Impact |
| --- | --- | --- |
| Restore overhead | Single-threaded tar extract before job can start | 2-4 min added to every run for large Rust/C++ caches |
| 10GB repo cap | GitHub enforces a 10GB limit per repo | Active Rust monorepos exceed this in days; eviction causes cold starts |
| Branch key mismatch | Cache key includes branch name; PRs miss main cache | Every PR runs cold regardless of how warm the main branch cache is |
| CPU-bound tar on 2-core runner | Standard GitHub runner has 2 vCPUs; tar is single-threaded by default | Save/restore is slower than the download it's trying to avoid |
| Stale snapshot analysis | Restored cache is a snapshot; Bazel re-analyses full graph to verify | Full graph re-analysis adds 60-90s even when all outputs are valid |
| No concurrent write safety | Two parallel jobs restore the same cache key, each writes a full archive | Second writer overwrites first; cache state is non-deterministic |

Time lost to restore (Rust monorepo, ~20GB cache): 3-5 min per run, before a single Bazel action fires.
Cache Volume restore overhead (Namespace): < 2s (NVMe volume mount, no tar, no network download).

The underlying problem: actions/cache is designed for dependency archives (node_modules, pip packages) that are read once and rarely change. Bazel's output cache is written continuously, grows unboundedly, and needs fine-grained invalidation, not coarse archive snapshots. Fitting it into actions/cache is technically possible and operationally fragile.