Skip to content

ci: stop refetching translations and rebuilding JS in downstream jobs#513

Open
jkmassel wants to merge 1 commit into
trunkfrom
jkmassel/prime-translations-cache
Open

ci: stop refetching translations and rebuilding JS in downstream jobs#513
jkmassel wants to merge 1 commit into
trunkfrom
jkmassel/prime-translations-cache

Conversation

@jkmassel
Copy link
Copy Markdown
Contributor

@jkmassel jkmassel commented May 13, 2026

Summary

  • Cut PR build wall-time from 13:15 → 9:00 (~32% faster) and per-build agent-time by ~10 minutes.
  • Single source of truth for translation fetching and JS compilation: only Build React App does either, and every other job consumes the dist.tar.gz artifact.
  • Drop node-fetch in favour of Node 20's built-in fetch so the translation fetch can run in parallel with npm ci.

Speedup

Compared against trunk #2251 (last clean run before this PR):

Step Trunk #2251 This PR (#2307) Δ
Build React App 2:43 1:50 −53s
Test Android Library 4:19 2:43 −1:36
iOS Simulator Tests 4:46 2:57 −1:49
Library Tests 3:22 1:18 −2:04
Build XCFramework 4:55 1:29 −3:26
Test Android Library Instrumented 2:11 0:57 −1:14
Test iOS E2E 5:16 5:22 similar
Test Web E2E 6:00 6:07 similar
Other steps (no change) (no change) noise
Total wall-time 13:15 9:00 −4:15

Build XCFramework was on the wall-time critical path; trimming npm ci + JS build + translation fetch from it accounts for most of the headline win.

Root Cause

Three compounding inefficiencies:

1. build-react re-fetched translations every build

make build REFRESH_L10N=1 STRICT_L10N=1 defeated the Makefile's existence-based cache (Makefile:48), forcing all ~50 locales to be re-fetched from translate.wordpress.org on every CI build. Buildkite agents don't share state, so every build paid the full GlotPress round-trip cost.

2. Every downstream `: build` consumer re-fetched translations and re-ran `npm ci`

test-swift-library, test-swift-package, test-android, test-android-library-e2e, and build-resources-xcframework all declare build as a Make prereq. build: npm-dependencies prep-translations, so both ran first. On agents that just extracted an upstream dist.tar.gz:

  • dist/ is populated → build's recipe short-circuited the JS build
  • src/translations/ is empty (not in dist.tar.gz) → prep-translations ran the full fetcher
  • node_modules is empty → npm-dependencies ran npm ci
  • All of that work was then thrown away because dist/ already had everything baked in

3. Two Android jobs rebuilt dist/ from scratch

Test Android Library and Test Android Library Instrumented had no upstream dependency, so they walked the full : build chain — fetch all locales, npm ci, npm run build, copy assets — before finally running gradle.

Fix

.buildkite/pipeline.yml

  • Build React App: backgrounds make prep-translations STRICT_L10N=1 while running make npm-dependencies in the foreground, then waits on the fetch before make build REFRESH_JS_BUILD=1. Shell vars escaped with $$ for Buildkite interpolation.
  • Test Android Library and Test Android Library Instrumented: gain depends_on: build-react and download dist.tar.gz.

Makefile

  • prep-translations: gains a "skip if dist/ exists" clause. Translations are baked into the bundle at JS build time, so downstream consumers don't need a fresh fetch. REFRESH_L10N=1 and direct invocation still force the fetch.
  • build: drops npm-dependencies from its prereq list and invokes it via $(MAKE) inside the rebuild branch. Downstream : build consumers (which run gradle / xcodebuild / swift, not anything from node_modules/.bin) no longer pay for an npm ci they don't need.
  • npm-dependencies: drops the "direct invocation forces install" clause — build's recipe now invokes it via $(MAKE) which would otherwise trigger the force. REFRESH_DEPS=1 remains as the explicit force.
  • test-android and test-android-library-e2e: copy dist/ into Android assets unconditionally. build's recipe short-circuits copy-dist-android when dist/ already exists, so CI agents that download a dist artifact would otherwise have stale assets. Matches what test-android-e2e:329-331 and build-resources-xcframework:131 already do.

bin/prep-translations.js

Drop import fetch from 'node-fetch' in favour of Node 20's built-in global fetch. The script's surface (response.ok, response.status, response.headers.get, response.json()) is API-compatible with WHATWG Fetch in both implementations. Also drops node-fetch from package.json and package-lock.json (98 lines of lockfile).

Decision matrix for prep-translations

dist/ src/translations/ MAKECMDGOALS REFRESH_L10N Outcome
absent empty build (or anything) unset fetch
absent populated build (or anything) unset skip (bundles present)
present either not prep-translations unset skip (dist built) — new
either either prep-translations unset fetch (direct invoke)
either either anything 1 fetch

What we explored

1. CI-level cache with a weekly key ❌

Cache src/translations/ with a key that rolls weekly. Faster on most builds but accepts up to a week of staleness in shipped strings. Rejected — fresh-per-build is cheap once we stop fetching per-job.

2. Run prep-translations as a separate Buildkite step ❌

The first iteration of this PR did exactly that. It bought the agent-time win but added 2:39 of serial wall-time on the critical path because the fetch waited for its own agent startup + npm install. Replaced by the in-step parallel approach (see commit 9b4e35c1).

3. Swap hand-rolled fetcher for Fastlane::Helper::GlotpressDownloader

Separate concern — doesn't change CI speed. Tracked as a follow-up; see the release-toolkit helper.

Test plan

  • CI green on this PR (build #2307, 9:00 total)
  • Per-job wall-time savings on Library Tests, iOS Simulator Tests, Test Android Library, Build XCFramework, and Test Android Library Instrumented — confirms the dist/-based skip is firing
  • Local: make prep-translations still fetches when invoked directly (explicit user intent)
  • Local: make test-swift-library after a successful make build logs both "Skipping translations fetch (dist/ already built…)" and "Skipping NPM dependencies installation (node_modules already exists)" — and does not invoke npm ci
  • Spot-check that STRICT_L10N=1 still fails the prep step on a real GlotPress error (not exercised by this PR's builds — verified previously)

@github-actions github-actions Bot added the [Type] Build Tooling Issues or PRs related to build tooling label May 13, 2026
@wpmobilebot
Copy link
Copy Markdown

wpmobilebot commented May 13, 2026

XCFramework Build

This PR's XCFramework is available for testing. Add the following to your Package.swift:

.package(url: "https://github.com/wordpress-mobile/GutenbergKit", branch: "pr-build/513")

Built from 825a86c

@jkmassel jkmassel changed the title ci: fetch translations once per build, not once per job ci: stop refetching translations and rebuilding JS in downstream jobs May 13, 2026
@jkmassel jkmassel marked this pull request as ready for review May 13, 2026 22:29
Cuts PR build wall-time from 13:15 → 9:00 (32% faster) and per-build
agent-time by ~10 minutes. Single source of truth for translation
fetching and JS compilation: only `Build React App` does either, and
every other job consumes the `dist.tar.gz` artifact.

Three compounding inefficiencies removed:

1. `build-react` re-fetched translations every build. `make build
   REFRESH_L10N=1` defeated the Makefile's existence cache, forcing
   all ~50 locales to be re-fetched from translate.wordpress.org
   on every build.

2. Every downstream `: build` consumer (`test-swift-library`,
   `test-swift-package`, `test-android`, `test-android-library-e2e`,
   `build-resources-xcframework`) re-fetched translations *and* re-ran
   `npm ci`, only to throw the work away when `build`'s recipe
   short-circuited on existing `dist/`.

3. `Test Android Library` and `Test Android Library Instrumented`
   had no upstream dependency, so they walked the full `: build`
   chain — fetch all locales, `npm ci`, `npm run build`, copy assets —
   before finally running gradle.

Pipeline (.buildkite/pipeline.yml):
- `Build React App` backgrounds `make prep-translations` while running
  `make npm-dependencies` in the foreground, then `wait`s on the fetch
  before `make build`. `set -euo pipefail`, plus log redirect/replay so
  the backgrounded fetch's section markers don't interleave with
  `npm ci`'s and confuse Buildkite log folding.
- `Test Android Library` and `Test Android Library Instrumented` now
  `depends_on: build-react` and download `dist.tar.gz`.

Makefile:
- `prep-translations` skips when `dist/` exists *and*
  `src/translations/` is populated. `REFRESH_L10N=1` and direct
  invocation still force the fetch.
- `build` drops `npm-dependencies` from its prereq list and invokes it
  via `$(MAKE) _RECURSIVE_INVOKE=1 npm-dependencies` inside the
  rebuild branch. Downstream `: build` consumers no longer pay for
  an `npm ci` they don't need.
- `npm-dependencies` uses the `_RECURSIVE_INVOKE=1` sentinel to
  distinguish recursive calls from user-direct invocations, so
  `make npm-dependencies` still installs even when node_modules
  exists (matches existing user expectation), while the recursive
  call from `build` short-circuits.
- `test-android` and `test-android-library-e2e` copy `dist/` into
  Android assets unconditionally — `build`'s recipe short-circuits
  `copy-dist-android` when `dist/` exists, so CI agents that
  download a dist artifact would otherwise have stale assets.

bin/prep-translations.js:
- Drop `import fetch from 'node-fetch'` in favour of Node 20's
  built-in global `fetch`. Lets the script run before `npm ci`
  populates `node_modules`, enabling the parallel-with-npm-ci pattern
  in the CI step. Also drops `node-fetch` from package.json /
  package-lock.json (98 lines of lockfile).
@jkmassel jkmassel force-pushed the jkmassel/prime-translations-cache branch from 8adc99c to 825a86c Compare May 13, 2026 23:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

[Type] Build Tooling Issues or PRs related to build tooling

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants