PER-ACTION LOOKUP SEQUENCE ON EVERY CI RUN
The six steps below run per action in your build graph. On a 5,000-action monorepo with an 80% remote cache
hit rate, steps 1–3 execute 5,000 times and steps 4–6 execute 4,000 times, once per remote hit. Nothing is
ever skipped at the disk layer.
Step 1 (action key): Bazel hashes the input digests, the command line, and the whitelisted env vars for
compile_server. The key is deterministic if the build is hermetic; it changes on every run if env vars or
timestamps leak in.
Step 2 (local disk AC lookup): The runner is freshly provisioned. The disk cache directory exists but holds
zero action cache (AC) entries, so every action misses here, always. Cost per miss: ~0.5ms. Total cost on
5,000 actions: ~2.5s of wasted disk I/O returning nothing.
Step 3 (remote AC lookup): One round-trip to GCS or S3. Even if Bazel issues these in parallel (32 concurrent
requests on a 16-vCPU runner), 5,000 lookups ÷ 32 ≈ 156 batches × 50ms avg = ~8 seconds of pure AC lookup
latency per build.
Step 4 (AC result): The AC result contains output names mapped to content digests, not the actual file bytes.
Bazel now knows what server.o should look like, but doesn't have the file.
Step 5 (local CAS lookup): Bazel checks the local CAS (disk cache) for the blob before fetching it over the
network. Empty disk = miss. This is a second wasted disk lookup per cached output file.
Step 6 (remote blob download): Bazel downloads the actual bytes from GCS/S3. For a Rust compile action,
server.rlib can be 5–30MB. At 100MB/s effective throughput, a 10MB blob costs 100ms. Multiply by 4,000 cache
hits.
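The key computation in step 1 can be illustrated with a small sketch. This is a simplified stand-in, not Bazel's actual digest format: the `action_key` helper, the JSON encoding, the file names, and the `BUILD_TIMESTAMP` variable are all assumptions made for illustration. It only shows why identical inputs give a stable key and why a leaked environment variable does not.

```python
import hashlib
import json


def action_key(input_digests, command, env):
    """Return a hex SHA-256 over input digests, command line, and env vars.

    Simplified model: real Bazel hashes a structured action message, not JSON.
    """
    payload = json.dumps(
        {
            "inputs": sorted(input_digests.items()),  # sort so dict order is irrelevant
            "command": list(command),
            "env": sorted(env.items()),
        },
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical inputs: path -> content digest.
inputs = {"server.cc": "d1f0", "server.h": "9ab2"}
command = ["clang++", "-c", "server.cc", "-o", "server.o"]

hermetic_a = action_key(inputs, command, {"PATH": "/usr/bin"})
hermetic_b = action_key(inputs, command, {"PATH": "/usr/bin"})
leaky = action_key(inputs, command, {"PATH": "/usr/bin", "BUILD_TIMESTAMP": "1700000000"})

print(hermetic_a == hermetic_b)  # identical inputs produce the same key
print(hermetic_a == leaky)       # one extra env var produces a different key
```

In this model the first comparison prints `True` and the second prints `False`: any value that varies between runs and reaches the hash turns every lookup for that action into a miss.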
Net result: On every run, steps 2 and 5 together add ~4.5s of wasted disk lookups that return nothing
(~2.5s for 5,000 AC misses plus ~2s for 4,000 CAS misses, at ~0.5ms each). Steps 3 and 6 together add 8–15
minutes of network time for a 5,000-action graph with 80% hit rate. The disk cache flag is set. It
contributes nothing.
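The arithmetic behind these figures can be reproduced with a rough cost model. It is a back-of-the-envelope sketch, not a measurement: the `ephemeral_cost` function is hypothetical, it assumes one output blob per cached action, and it assumes downloads are serialized at the quoted effective throughput. Rounding the last partial batch up gives 157 batches where the text rounds to 156.

```python
import math


def ephemeral_cost(actions=5_000, hit_rate=0.80, disk_miss_ms=0.5,
                   concurrency=32, ac_rtt_ms=50, blob_mb=10, throughput_mb_s=100):
    """Estimate per-build overhead in seconds for steps 2, 3, and 6."""
    hits = round(actions * hit_rate)
    # The final partial batch still costs a full round-trip, so round up.
    batches = math.ceil(actions / concurrency)
    return {
        "disk_miss_s": actions * disk_miss_ms / 1000,     # step 2: empty disk cache
        "ac_lookup_s": batches * ac_rtt_ms / 1000,        # step 3: remote AC round-trips
        "download_s": hits * blob_mb / throughput_mb_s,   # step 6: serialized blob fetches
    }


cost = ephemeral_cost()
for name, seconds in cost.items():
    print(f"{name}: {seconds:.2f}s")

# Average blob size is the dominant variable; try a value inside the 5-30MB range.
print(f"download at 15MB average: {ephemeral_cost(blob_mb=15)['download_s'] / 60:.0f} min")
```

With the defaults, the model yields 2.5s of disk misses, about 7.85s of AC lookups, and 400s of downloads. Under the same assumptions, an average blob size between roughly 12MB and 22MB lands in the 8–15 minute range quoted above.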
PER-ACTION LOOKUP SEQUENCE WITH NAMESPACE CACHE VOLUME
Same 5,000-action build. The Cache Volume has been warm since run 2. Steps 2
and 5 now return hits at NVMe speed. Steps 3, 4, 6 execute only for actions that genuinely changed since the
last committed cache state.
Step 1 (action key): Identical to the ephemeral case. Action key computation doesn't change. If your build is
hermetic, the key is stable run-to-run and the Cache Volume entry remains valid.
Step 2 (local disk AC lookup): The Cache Volume persists across job teardown. Bazel's --disk_cache is pointed
at the mounted NVMe volume. For unchanged actions, the AC entry is already there from the last committed run.
NVMe latency: 0.1–0.5ms per lookup.
Step 5 (local CAS lookup): The blob is already in the local CAS on the Cache Volume. Bazel reads it from NVMe
directly. No network request. No waiting. Steps 3, 4, and 6 are skipped entirely for this action.
Steps 3, 4, 6 (remote path): These still execute for the ~20% of actions that missed the local cache (i.e.,
files that actually changed). The remote cache handles the delta; the Cache Volume handles the stable
majority.
Net result: For 4,000 cached actions, lookup cost drops from ~8s (network
AC) + download time to ~4s total (NVMe reads). Network traffic shifts from the full 80% of the graph to only
the 20% that actually changed. Run 2 onwards is where the Cache Volume pays off.
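The difference between the two sequences can be made concrete with a toy simulation of the lookup order. This is a sketch under stated assumptions, not Bazel's implementation: caches are plain dictionaries, every action has exactly one output, remote hits are written back to the local disk cache, and the names `run_build` and `Stats` are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    disk_ac_hits: int = 0
    remote_ac_lookups: int = 0
    blob_downloads: int = 0
    executed: int = 0


def run_build(action_keys, disk_ac, disk_cas, remote_ac, remote_cas):
    """Walk the six-step lookup for each action and count where work happens."""
    stats = Stats()
    for key in action_keys:                  # step 1: key assumed already computed
        digest = disk_ac.get(key)            # step 2: local disk AC lookup
        if digest is not None:
            stats.disk_ac_hits += 1
        else:
            stats.remote_ac_lookups += 1     # step 3: remote AC round-trip
            digest = remote_ac.get(key)
            if digest is None:
                stats.executed += 1          # miss everywhere: the action must run
                continue
            disk_ac[key] = digest            # step 4: AC result maps output to digest
        if digest not in disk_cas:           # step 5: local CAS lookup
            stats.blob_downloads += 1        # step 6: fetch bytes from remote CAS
            disk_cas[digest] = remote_cas[digest]
    return stats


keys = [f"action-{i}" for i in range(5_000)]
remote_ac = {k: f"digest-{i}" for i, k in enumerate(keys[:4_000])}  # 80% hit rate
remote_cas = {d: b"blob" for d in remote_ac.values()}

disk_ac, disk_cas = {}, {}                   # ephemeral runner: empty disk cache
ephemeral = run_build(keys, disk_ac, disk_cas, remote_ac, remote_cas)
warm = run_build(keys, disk_ac, disk_cas, remote_ac, remote_cas)  # same volume, now warm

print("ephemeral:", ephemeral)
print("warm:     ", warm)
```

In this model the ephemeral run performs 5,000 remote AC lookups and 4,000 blob downloads, while the warm run performs 1,000 remote AC lookups and no downloads. Real builds will differ because changed actions may still hit the remote cache, but the shape of the saving is the same.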
Cache Volume lifecycle across runs
Run 1 (cold): The Cache Volume is empty. All actions fall through to the remote cache. At the end of the run
(exit 0), Bazel's output is committed as the new cache state. The volume now holds the full build output.
Run 2 onward (warm): The job forks the committed cache state into a private copy. 80% of actions hit the NVMe
volume at <1ms. Only actions downstream of changed files hit the remote cache. The fork is committed at
exit 0.
Concurrent PRs: PR A and PR B each get their own fork of the last committed state. Their writes don't
collide. The last job to complete successfully wins the commit. Deterministic Bazel outputs make this safe.
Failed job: If a job exits non-zero, its cache writes are discarded. The committed state stays at the last
successful run. No partial-build corruption.
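The fork-and-commit behaviour described above can be modelled in a few lines. This is a toy model for reasoning about the semantics, not Namespace's implementation: the `CacheVolume` class, its `fork` and `finish` methods, and the dictionary-backed storage are all hypothetical.

```python
class CacheVolume:
    """Toy model: one committed snapshot, one private fork per job."""

    def __init__(self):
        self.committed = {}

    def fork(self):
        # Each job gets a private copy; its writes never touch committed state.
        return dict(self.committed)

    def finish(self, fork, exit_code):
        # Commit only on success; the most recent successful job wins.
        if exit_code == 0:
            self.committed = fork
            return True
        return False  # failed job: the fork is simply dropped


volume = CacheVolume()

run1 = volume.fork()                 # run 1 starts from an empty volume
run1["action-a"] = "digest-a"
volume.finish(run1, exit_code=0)

pr_a = volume.fork()                 # concurrent PRs fork the same committed state
pr_b = volume.fork()
pr_a["action-b"] = "digest-b"
pr_b["action-c"] = "digest-c"
volume.finish(pr_a, exit_code=0)
volume.finish(pr_b, exit_code=1)     # PR B fails, so its writes are discarded

print(sorted(volume.committed))
```

Here the committed state ends up holding `action-a` and `action-b`. In this model a later successful commit replaces the snapshot wholesale, so entries unique to an earlier fork would be dropped rather than merged; with deterministic outputs that costs a few re-fetches, not correctness.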
CACHE LOOKUP LATENCY PER OPERATION
Each AC lookup is one network round-trip (or one disk read). The latency
difference between a warm NVMe volume and a cross-region GCS bucket compounds across thousands of actions per
build.
| Storage tier | Latency per lookup |
| --- | --- |
| Namespace NVMe hot tier | < 1ms |
| Local SSD (same region) | 1-4ms |
| GCS (same region) | 25-50ms |
| GCS (cross-region) | 50-100ms |
| S3 (cross-region) | 60-120ms |
Cumulative AC lookup overhead — 5,000-action build, 80% hit rate
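Bazel issues AC lookups in parallel. With 32 concurrent requests on a 16-vCPU runner, 5,000 lookups complete
in roughly 156 sequential batches, so total overhead is about 156 times the per-lookup latency of the storage
tier.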
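A short calculation turns the per-operation latencies into cumulative overhead per tier. It is an estimate under simplifying assumptions: every tier is treated as if lookups were issued in the same 32-wide batches, "< 1ms" is taken as an upper bound of 1ms, and the `cumulative_seconds` helper is introduced only for this example.

```python
import math

ACTIONS = 5_000
CONCURRENCY = 32

# (low, high) latency per lookup in milliseconds, from the figures above.
# "< 1ms" is modelled as 0 to 1 ms.
TIERS_MS = {
    "Namespace NVMe hot tier": (0, 1),
    "Local SSD (same region)": (1, 4),
    "GCS (same region)": (25, 50),
    "GCS (cross-region)": (50, 100),
    "S3 (cross-region)": (60, 120),
}


def cumulative_seconds(latency_ms, actions=ACTIONS, concurrency=CONCURRENCY):
    """Total wall-clock time if lookups run in fixed-size parallel batches."""
    batches = math.ceil(actions / concurrency)  # partial last batch rounds up
    return batches * latency_ms / 1000


for tier, (low, high) in TIERS_MS.items():
    print(f"{tier}: {cumulative_seconds(low):.2f}s to {cumulative_seconds(high):.2f}s")
```

The 50ms case reproduces the ~8 seconds quoted earlier, and the cross-region tiers roughly double it.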
AC lookup overhead is separate from blob download time. A 5,000-action build with 80% hit
rate also downloads ~4,000 cached blobs over the network on an ephemeral runner. That's the larger cost — and
it also disappears with a warm Cache Volume.
WHY ACTIONS/CACHE WITH ~/.CACHE/BAZEL DOESN'T CLOSE THE GAP
The standard workaround is to archive the Bazel cache directory at the end of
each run and restore it at the start of the next. Each failure mode below is a separate problem; most teams hit
several simultaneously.
| Failure mode | Cause | Impact |
| --- | --- | --- |
| Restore overhead | Single-threaded tar extract before job can start | 2-4 min added to every run for large Rust/C++ caches |
| 10GB repo cap | GitHub enforces a 10GB limit per repo | Active Rust monorepos exceed this in days; eviction causes cold starts |
| Branch key mismatch | Cache key includes branch name; PRs miss main cache | Every PR runs cold regardless of how warm the main branch cache is |
| CPU-bound tar on 2-core runner | Standard GitHub runner has 2 vCPUs; tar is single-threaded by default | Save/restore is slower than the download it's trying to avoid |
| Stale snapshot analysis | Restored cache is a snapshot; Bazel re-analyses full graph to verify | Full graph re-analysis adds 60-90s even when all outputs are valid |
| No concurrent write safety | Two parallel jobs restore the same cache key, each writes a full archive | Second writer overwrites first; cache state is non-deterministic |
Time lost to restore (Rust monorepo, ~20GB cache): 3-5 min per run, before a single Bazel action fires.
Cache Volume restore overhead (Namespace): < 2s. NVMe volume mount, no tar, no network download.
The underlying problem: actions/cache is designed for dependency archives (node_modules, pip packages) that
are read once and rarely change. Bazel's output cache is written continuously, grows without bound, and needs
fine-grained invalidation, not coarse archive snapshots. Fitting it into actions/cache is technically possible
but operationally fragile.