release: 0.33.0 - GRU, upsample2d Bilinear export, autodiff coverage fix #775
Merged
Bumps VERSION_NAME 0.32.4 -> 0.33.0.

Bundles the develop changes since 0.32.4: GRU layer (#772/#217), upsample2d Bilinear + StableHLO export (#771), the autodiff dispatch correctness fix + 7 newly-differentiable ops + KSP coverage guard (#774), and norm converters lowering to real stablehlo.reduce (#769).

Minor bump (not patch): TensorOps.sin/cos/convTranspose1d became abstract, a source/binary-incompatible change for downstream TensorOps implementers.

Validated: full conformance suite (12/12 models + 33/33 ops) green end-to-end on IREE llvm-cpu against this tree (via local-maven 0.32.5-localdev1).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
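To make the autodiff dispatch fix from #774 concrete, here is a minimal Python sketch of the failure mode and of a coverage guard. It is an illustration under stated assumptions, not the project's code: `DECLARED`, `BACKWARD_RULES`, `make_dispatch`, `backward`, and `unwired_ops` are hypothetical names, the ops work on scalars for brevity, and the real guard is generated by KSP rather than written by hand.

```python
import math
from typing import Callable, Dict, List, Optional

Rule = Callable[[float, float], float]

# Hypothetical op table (name -> declared differentiable), loosely modelled on
# an isDifferentiable flag. Illustrative only, not the project's API.
DECLARED: Dict[str, bool] = {"elu": True, "leakyRelu": True, "argmax": False}

# Backward rules that exist: (input x, upstream gradient g) -> input gradient.
BACKWARD_RULES: Dict[str, Rule] = {
    "elu": lambda x, g: g * (1.0 if x > 0 else math.exp(x)),  # alpha = 1
    "leakyRelu": lambda x, g: g * (1.0 if x > 0 else 0.01),   # slope = 0.01
}


def make_dispatch(wired: List[str]) -> Dict[str, Rule]:
    """Build the dispatch table from the subset of rules that are wired in."""
    return {op: BACKWARD_RULES[op] for op in wired}


def backward(dispatch: Dict[str, Rule], op: str, x: float, g: float) -> Optional[float]:
    """Look up the rule; an unwired op silently yields None (the bug)."""
    rule = dispatch.get(op)
    return None if rule is None else rule(x, g)


def unwired_ops(declared: Dict[str, bool], dispatch: Dict[str, Rule]) -> List[str]:
    """Coverage guard: ops declared differentiable but absent from dispatch."""
    return sorted(op for op, diff in declared.items() if diff and op not in dispatch)


buggy = make_dispatch(["elu"])               # leakyRelu rule exists but is not wired
fixed = make_dispatch(["elu", "leakyRelu"])  # every declared op is wired

print(backward(buggy, "leakyRelu", -2.0, 1.0))  # gradient silently missing
print(unwired_ops(DECLARED, buggy))             # guard names the offender
print(unwired_ops(DECLARED, fixed))             # guard passes
```

The point of the guard is that it compares the declared set against the dispatch table itself, so a rule that exists but is never reached fails a test instead of producing a null gradient at training time.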
Release 0.33.0: bundles everything merged to `develop` since `0.32.4`.

## Highlights (user value)

- GRU layer (#772/#217), including the `gru(…)` network-DSL builder.
- `upsample2d` Bilinear + StableHLO export (#771, "Add upsample2d Bilinear (eager+autodiff) and traceable StableHLO lowering (Nearest + Bilinear)"): Nearest and Bilinear lower to fixed `reshape`/`broadcast`/`dot_general` (no `custom_call`); unblocks resize/FPN export.
- Autodiff dispatch correctness fix (#774): `elu`/`leakyRelu`/`permute` had backward rules that were never wired into the trace dispatch, so their gradients were silently dropped to null. They are now wired and guarded by a KSP-generated coverage test. The same change makes `cos`/`sin`/`tril`/`gather`/`indexSelect`/`unfold`/`convTranspose1d` differentiable and surfaces `isDifferentiable` in `operators.json`.
- Norm converters (#769): `layerNorm`/`rmsNorm`/`batchNorm` lower to real `stablehlo.reduce` instead of export-only `custom_call`s.

`TensorOps.sin`/`cos`/`convTranspose1d` are now abstract (were default-throwing) so they trace/export and become differentiable. Anyone implementing `TensorOps` directly must override them; both bundled backends (`DefaultCpuOpsBase`, `VoidTensorOps`) already do. Hence the minor bump.

## Validation
Full downstream conformance suite (12/12 models + 33/33 ops) exports, compiles, runs, and validates green end-to-end on IREE (llvm-cpu) against this exact tree (driven via a local-maven `0.32.5-localdev1` build of this branch). Includes the new GRU probe (max_abs_err 5.96e-08 vs the SKaiNET CPU oracle) and upsample2d (nearest 0.0, bilinear 1.19e-07 vs numpy).

## Contents (`gradle.properties` `0.32.4` → `0.33.0`, `CHANGELOG.md`, README "What's New")

This branch only bumps the version + docs. The code changes are already on `develop` via #769, #771, #772, #774.

## Reviewer notes
`develop` also carries #768 ("perf(native-cpu): Q6_K NEON matmul kernel") from a separate workstream. It was not authored here; please confirm its changelog wording if you want it called out.

🤖 Generated with Claude Code
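For reviewers who want intuition for how upsample2d can lower to fixed `reshape`/`broadcast`/`dot_general` without a `custom_call`, the following numpy sketch may help. It is not the SKaiNET lowering: the function names are hypothetical, NCHW layout and an integer scale are assumed, and the bilinear weights use the half-pixel-centre convention, which the PR does not specify.

```python
import numpy as np


def upsample2d_nearest(x: np.ndarray, scale: int) -> np.ndarray:
    """Nearest-neighbour upsampling using only broadcast + reshape."""
    n, c, h, w = x.shape
    # Insert unit axes after H and W, broadcast them to `scale`, then fold back.
    tiled = np.broadcast_to(x[:, :, :, None, :, None], (n, c, h, scale, w, scale))
    return tiled.reshape(n, c, h * scale, w * scale)


def bilinear_matrix(n_in: int, n_out: int) -> np.ndarray:
    """(n_out, n_in) interpolation weights; assumes half-pixel centres."""
    # Source coordinate of each output sample, clamped to the valid range.
    src = np.clip((np.arange(n_out) + 0.5) * n_in / n_out - 0.5, 0.0, n_in - 1)
    lo = np.floor(src).astype(int)
    hi = np.minimum(lo + 1, n_in - 1)
    frac = src - lo
    weights = np.zeros((n_out, n_in))
    rows = np.arange(n_out)
    np.add.at(weights, (rows, lo), 1.0 - frac)  # weight on the left neighbour
    np.add.at(weights, (rows, hi), frac)        # weight on the right neighbour
    return weights


def upsample2d_bilinear(x: np.ndarray, scale: int) -> np.ndarray:
    """Bilinear upsampling as two contractions with constant matrices."""
    _, _, h, w = x.shape
    wh = bilinear_matrix(h, h * scale)  # acts on the height axis
    ww = bilinear_matrix(w, w * scale)  # acts on the width axis
    return np.einsum("oh,nchw,pw->ncop", wh, x, ww)


x = np.arange(8, dtype=np.float64).reshape(1, 2, 2, 2)
print(upsample2d_nearest(x, 2).shape)
print(upsample2d_bilinear(x, 2)[0, 0])
```

Because the weight matrices depend only on the static input and output sizes, they can be baked in as constants at export time, which is what makes the op expressible as plain contractions.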