refactor: extract materialized_artifact crate out of fspy#344
refactor: extract materialized_artifact crate out of fspy#344branchseer merged 19 commits intomainfrom
Conversation
Split the embedded-binary helper out of fspy so other crates (upcoming vite_task dylib embedding) can reuse it. No behavior change; the Artifact struct, write_to logic, and artifact! macro move verbatim. Minor: Artifact::new now carries #[must_use], write_to gets an `# Errors` section for clippy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move `formatcp!` + `xxh3_128` into `embedded_artifact` behind two macros:
- `artifact_of!(name, bytes)` — new; hashes the bytes at compile time.
Used for small embedded artifacts (preload dylibs).
- `artifact!(name)` — unchanged; still reads `{name}.hash` from OUT_DIR.
Used for large artifacts where the build script has the bytes and can
hash them cheaply, avoiding slow const-time hashing at compile time.
Fspy's preload-dylib call sites now use `artifact_of!` and drop the
direct `const_format` + `xxhash-rust` regular deps. macOS download
script keeps writing `.hash` files for `artifact!`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New `embedded_artifact_build` crate exposes `hash()` and
`write_artifact(out_dir, name, bytes)` so callers' build scripts no
longer need a direct `xxhash-rust` dep — the hashing logic needed by
`embedded_artifact`'s `artifact!` macro is enclosed here.
Fspy's build.rs switches to the helper:
- [build-dependencies]: drop `xxhash-rust`, add `embedded_artifact_build`.
- Use `embedded_artifact_build::hash` for download verification.
- Use `embedded_artifact_build::write_artifact` instead of manually writing
the bytes + {name}.hash files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Privatize `embedded_artifact_build::hash` — only `write_artifact` is a public API now. - Switch fspy's macOS binary download verification from xxh3_128 of the extracted binary to sha256 of the tarball (the natural unit of a GitHub release asset). - Record each expected sha256 next to a comment pointing at the GitHub release page it came from, with a one-liner showing how to regenerate the value after a release bump. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rework the artifact API so names and hashes are published through
`cargo::rustc-env=…` by `embedded_artifact_build::register(name, path)`
and consumed at compile time by `artifact!($name)` (no const-eval
hashing, no hash sidecar files, no byte copies for cdylibs).
- `embedded_artifact_build::register` hashes with xxh3-128 and emits
`EMBEDDED_ARTIFACT_{name}_PATH` + `_HASH` plus `rerun-if-changed`.
- `artifact!($name)` reads both env vars via `include_bytes!(env!(…))`
and `env!(…)`.
- Preload cdylibs are artifact deps in `[build-dependencies]`;
cfg-gating the section triggers a cargo resolver panic on
cross-compile, so both preload crates are listed unconditionally
with `target = "target"` and each cfg-gates itself to an empty
crate on non-applicable targets.
- fspy's build.rs emits `rerun-if-env-changed` for the
`CARGO_CDYLIB_FILE_FSPY_PRELOAD_*` env var it consumes.
- `BinaryDownload` becomes a named struct.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…te_to -> ensure_in Rename the pair of crates to `bundled_artifact` + `bundled_artifact_build` to cover both the current embed-and-extract mode and a future check-only mode under the same name. Rename `Artifact::write_to` to `ensure_in` since the call skips the write when a file with the same content-addressed name already exists. Update env prefix from `EMBEDDED_ARTIFACT_` to `BUNDLED_ARTIFACT_`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 77bdc8dc79
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| fs::write(&hash_path, format!("{expected_hash:x}"))?; | ||
| for BinaryDownload { name, url, path_in_targz, expected_sha256 } in downloads { | ||
| let dest = out_dir.join(name); | ||
| let tarball = download(url).context(format!("Failed to download {url}"))?; |
There was a problem hiding this comment.
Reuse cached macOS artifacts before downloading
fetch_macos_binaries now unconditionally calls download(url) for every artifact, so any macOS build-script rerun (for example after touching build.rs or when the preload artifact path changes) requires live network access even when the same verified file is already present in OUT_DIR. This is a regression from the previous behavior that reused existing files and will cause avoidable build failures in offline/restricted CI environments and adds unnecessary external dependency to local incremental builds.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed on claude/address-pr-344-comments-Z5Zc6: fetch_macos_binaries now skips download+unpack+write when the extracted binary is already present in OUT_DIR, restoring the previous caching behavior.
Generated by Claude Code
There was a problem hiding this comment.
Reworked in 0d9eb15 to match main's caching workflow: fetch_macos_binaries now hashes the cached file on every rebuild and reuses it when it matches expected_sha256. The hash is now over the extracted binary (not the tarball), so the cached file vouches for its own integrity. Hash constants updated accordingly.
Generated by Claude Code
There was a problem hiding this comment.
Pull request overview
This PR refactors how fspy bundles and materializes runtime artifacts (preload shared libraries and macOS helper binaries) by introducing a reusable bundled_artifact mechanism and updating platform implementations to use it.
Changes:
- Replace
fspy’s bespokeartifactmodule with new workspace cratesbundled_artifactandbundled_artifact_build. - Update Unix/macOS and Windows implementations to materialize artifacts via
Artifact::ensure_in+artifact!(...). - Rework
crates/fspy/build.rsto download/register macOS binaries and register preload cdylibs for embedding.
Reviewed changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/fspy_preload_unix/src/lib.rs | Tightens cfg gating so the crate is effectively empty on non-Unix and musl targets while keeping feature gates scoped. |
| crates/fspy/src/windows/mod.rs | Switches Windows interpose DLL handling to bundled_artifact and materializes via ensure_in. |
| crates/fspy/src/unix/mod.rs | Switches Unix preload dylib + macOS helper materialization to bundled_artifact and ensure_in. |
| crates/fspy/src/unix/macos_artifacts.rs | Migrates macOS helper artifact definitions and test to bundled_artifact naming/materialization. |
| crates/fspy/src/lib.rs | Removes the old artifact module from the fspy crate. |
| crates/fspy/src/artifact.rs | Deletes the old artifact implementation now superseded by bundled_artifact. |
| crates/fspy/build.rs | Adds macOS download+SHA256 verification and registers bundled artifacts via bundled_artifact_build. |
| crates/fspy/Cargo.toml | Updates dependencies/build-dependencies to support artifact registration and embedding; adds cargo-shear ignores. |
| crates/bundled_artifact_build/src/lib.rs | New build-script helper for publishing artifact path+hash to the compile via env vars. |
| crates/bundled_artifact_build/README.md | New crate README. |
| crates/bundled_artifact_build/Cargo.toml | New crate manifest. |
| crates/bundled_artifact/src/lib.rs | New runtime crate that embeds files via artifact! and materializes them with ensure_in. |
| crates/bundled_artifact/README.md | New crate README. |
| crates/bundled_artifact/Cargo.toml | New crate manifest. |
| Cargo.toml | Adds the new crates to workspace dependencies. |
| Cargo.lock | Records dependency graph changes from the new crates and removed hashing/formatting deps in fspy. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Emits three `cargo::…` directives: | ||
| /// `rerun-if-changed={path}`, | ||
| /// `rustc-env=BUNDLED_ARTIFACT_{name}_PATH={path}`, and | ||
| /// `rustc-env=BUNDLED_ARTIFACT_{name}_HASH={hex}`. The runtime resolves these | ||
| /// at compile time via `include_bytes!(env!(…))` and `env!(…)`. | ||
| /// | ||
| /// `name` is used both as the env-var key and as the on-disk filename prefix | ||
| /// (in `Artifact::ensure_in`), so it must be a valid identifier-like string | ||
| /// that matches the one passed to `artifact!`. | ||
| /// | ||
| /// # Panics | ||
| /// | ||
| /// Panics if `path` is not valid UTF-8 or cannot be read. | ||
| pub fn register(name: &str, path: &Path) { | ||
| let path_str = path.to_str().expect("artifact path must be valid UTF-8"); | ||
| #[expect(clippy::print_stdout, reason = "cargo build-script directives")] | ||
| { | ||
| // Emit rerun-if-changed before reading so cargo still sees it even if | ||
| // reading the file below panics. | ||
| println!("cargo::rerun-if-changed={path_str}"); | ||
| let bytes = | ||
| fs::read(path).unwrap_or_else(|e| panic!("failed to read artifact at {path_str}: {e}")); | ||
| let hash = format!("{:x}", xxhash_rust::xxh3::xxh3_128(&bytes)); | ||
| println!("cargo::rustc-env={ENV_PREFIX}{name}_PATH={path_str}"); | ||
| println!("cargo::rustc-env={ENV_PREFIX}{name}_HASH={hash}"); |
There was a problem hiding this comment.
register prints build-script directives with the cargo:: prefix (e.g. cargo::rerun-if-changed / cargo::rustc-env). Cargo expects cargo: (single colon); with the current strings these directives may be ignored, causing the BUNDLED_ARTIFACT_* env vars not to be set and artifact!(...) to fail at compile time. Change these println! lines (and the doc comment mentioning cargo::…) to use cargo:.
There was a problem hiding this comment.
Switched all three directives (and the doc comment) to the cargo: single-colon prefix on claude/address-pr-344-comments-Z5Zc6, matching the rest of the build scripts in the repo.
(Minor nit for the reviewer: cargo::KEY=VALUE was stabilized in Rust 1.77 and would also work given our MSRV of 1.89; but single-colon is more consistent with the rest of the codebase.)
Generated by Claude Code
| }; | ||
| // The cdylib path is content-addressed by cargo; when its content changes | ||
| // the path changes. Track it so we re-publish the hash on update. | ||
| println!("cargo::rerun-if-env-changed={env_name}"); |
There was a problem hiding this comment.
This build script prints cargo::rerun-if-env-changed=.... Cargo build-script directives use the cargo: prefix; with cargo:: Cargo may ignore the directive, so changes to the preload cdylib env var may not trigger reruns and the bundled artifact hash/path may not be refreshed. Use cargo:rerun-if-env-changed={env_name} instead.
| println!("cargo::rerun-if-env-changed={env_name}"); | |
| println!("cargo:rerun-if-env-changed={env_name}"); |
There was a problem hiding this comment.
| impl SpyImpl { | ||
| pub fn init_in(path: &Path) -> io::Result<Self> { | ||
| let dll_path = INTERPOSE_CDYLIB.write_to(path, ".dll").unwrap(); | ||
| let dll_path = INTERPOSE_CDYLIB.ensure_in(path, ".dll", false).unwrap(); |
There was a problem hiding this comment.
init_in returns io::Result<Self>, but the ensure_in(...).unwrap() will panic on I/O errors (e.g. non-existent directory, permission denied) instead of returning the error to the caller. Prefer propagating the error with ? (or mapping it into an io::Error) so callers can handle failures gracefully.
| let dll_path = INTERPOSE_CDYLIB.ensure_in(path, ".dll", false).unwrap(); | |
| let dll_path = INTERPOSE_CDYLIB.ensure_in(path, ".dll", false)?; |
There was a problem hiding this comment.
Applied on claude/address-pr-344-comments-Z5Zc6: ensure_in now propagates via ? so I/O errors surface to the caller of init_in instead of panicking.
Generated by Claude Code
- Reuse extracted macOS binaries in OUT_DIR instead of re-downloading on every build-script rerun (previous regression would break offline/CI). - Switch build-script directives from `cargo::` back to `cargo:` for consistency with the rest of the build scripts. - Propagate I/O errors from `Artifact::ensure_in` instead of unwrapping in `SpyImpl::init_in`. - Inherit `target = "target"` for the fspy_preload cdylib artifact deps through the workspace so `cargo autoinherit` is idempotent.
- Reuse extracted macOS binaries in OUT_DIR instead of re-downloading on every build-script rerun (previous regression would break offline/CI). - Switch build-script directives from `cargo::` back to `cargo:` for consistency with the rest of the build scripts. - Propagate I/O errors from `Artifact::ensure_in` instead of unwrapping in `SpyImpl::init_in`. - Inherit `target = "target"` for the fspy_preload cdylib artifact deps through the workspace so `cargo autoinherit` is idempotent.
- Drop `rand` from fspy (was used by the extracted `artifact.rs`, replaced by tempfile in the new `bundled_artifact` crate). - Drop workspace-level `rand` and `const_format` (no longer used by any member). - Set `test = false` on the lib targets of `bundled_artifact` and `bundled_artifact_build` (no unit tests yet).
Port main's `fetch_macos_binaries` cache logic: hash the cached file on every build, skip download when it matches `expected_sha256`, and verify the freshly-extracted binary against the same value on a cache miss. The hash is now over the extracted binary (not the tarball), which lets the cached file itself vouch for its own integrity without needing the original tarball on disk.
"Bundled" clashes with `include_bytes!`, which also bundles. The distinctive feature of this crate is that it gives you a real path on disk for APIs that need one (`LoadLibrary`, `LD_PRELOAD`, helper binaries) — the value-add over `include_bytes!` is the materialization step. The env-var prefix moves to `MATERIALIZED_ARTIFACT_` accordingly.
The verb mirrors the crate name and makes the value-add over `include_bytes!` obvious at the call site.
Spell out why the temp file lives in the destination directory (single-filesystem requirement for atomic rename), why the Unix mode is set via `Builder::permissions` (no window with wrong bits), and why we use `persist_noclobber` (atomic link-or-fail, race-safe).
Replace `Artifact::materialize_in(dir, suffix, executable)` with a chain
that reads like prose and labels the boolean at the call site:
INTERPOSE_CDYLIB.materialize().suffix(".dll").at(path)?
COREUTILS_BINARY.materialize().executable().at(dir)?
`Artifact` becomes `Copy` so the builder can own it by value — no
`&'a Artifact` in the type. The `executable` field is `#[cfg(unix)]`
so it doesn't take up space on Windows; the method stays unconditional
so cross-platform call sites don't need cfg'd guards.
…suffix lifetime - `.executable()` was `fn(mut self)` with the write to `self.executable` inside `#[cfg(unix)]` — on Windows the `mut` was unused and tripped `-D unused_mut`. Added a `#[cfg_attr(not(unix), expect(unused_mut, ...))]`. - `.suffix(&str)` now re-parameterizes the builder's lifetime to the new suffix's so a `Materialize<'static>` can still accept a short-lived suffix. Lifetimes are elided to keep the signature tidy.
|
@codex review |
vite-task PR voidzero-dev/vite-task#344 made `fspy_preload_unix` (a cdylib) an unconditional build-dep so cargo's artifact resolver doesn't panic during cross-compile. On musl, building a cdylib requires dynamically-linked libc; the default is static, so the build fails with: error: cannot produce cdylib for `fspy_preload_unix` as the target `x86_64-unknown-linux-musl` does not support these crate types Mirror vite-task's own musl CI: set `-C target-feature=-crt-static` in `RUSTFLAGS`. vite+ ships as a NAPI module that links musl libc dynamically, so this matches production.
Why
The helper in fspy that embeds a file at compile time and writes it to disk on demand isn't really about filesystem tracing. It's the pattern you reach for whenever a Windows API wants a DLL path,
LD_PRELOADwants a real file, or a helper binary shipped inside the executable needs to be spawned.Pulling it into its own crate labels that responsibility and lets future callers use it without reaching into fspy internals.
What
The extracted crate is
materialized_artifact, with a companionmaterialized_artifact_buildfor the build-script side. The API gets a minor cleanup along the way —ARTIFACT.materialize().executable().at(dir)?— so call sites read in order instead of ending in an unlabelled boolean.