diff --git a/CHANGELOG.md b/CHANGELOG.md
index cf033410..3aae584e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,47 @@
 
 ## [Unreleased]
 
+## [0.33.0] - 2026-06-29
+
+### Added
+
+- **GRU layer (`sk.ainet.lang.nn.Gru`).** SKaiNET's first recurrent layer (issue #217): single-layer,
+  unidirectional, batch-first `[B, S, D] -> [B, S, H]`, PyTorch gate order (reset, update, new). Built
+  by composing existing primitives (matmul/add/sigmoid/tanh/narrow/concat) **unrolled over the static
+  sequence length at trace time** — StableHLO has no loop construct, so any recurrence must unroll. It
+  runs eagerly, is trainable through the standard tape, and exports to StableHLO with no dedicated
+  converter. Also adds a `gru(hiddenSize) { … }` network-DSL builder. (PR #772)
+- **`upsample2d` Bilinear + StableHLO export.** Adds the Bilinear forward (PyTorch coord map, 4-neighbour
+  blend) and its autodiff backward, and a traceable StableHLO lowering for **both** Nearest and Bilinear
+  (scale is static at trace time, so everything lowers to fixed reshape/broadcast/`dot_general` — no
+  runtime index math, no `custom_call`). Unblocks export of resize/FPN-style paths. (PR #771)
+- **Seven newly-differentiable ops.** `cos`, `sin`, `tril`, `gather`, `indexSelect`, `unfold`,
+  `convTranspose1d` now carry `@Diff` and have backward rules (with finite-difference parity tests):
+  trig for RoPE, `gather` for embedding lookup, `tril` for causal masks, the rest structural. (PR #774)
+- **KSP-generated autodiff-coverage guard.** The tracing-wrapper processor now emits
+  `DifferentiableTensorOpsRules.ruleNames` (the authoritative `@Diff` op set); a unit test asserts the
+  execution tape's dispatch covers it, so a differentiable op can no longer ship with a backward rule
+  that is never wired. `operators.json` now records `isDifferentiable` (+ optional `diffRuleName`),
+  schema-validated. (PR #774)
+
+### Fixed
+
+- **Silent gradient drop for `elu`, `leakyRelu`, `permute`.** These were `@Diff` and had correct
+  backward formulas, but had no arm in the execution tape's trace dispatch, so their gradients fell
+  through to `null` and were silently discarded. Now wired (and guarded by the coverage test above);
+  `permuteBackward` also fixed to decode its `axes` attribute as the traced `List`. (PR #774)
+- **`layerNorm` / `rmsNorm` / `batchNorm` lower to real `stablehlo.reduce`.** The norm converters
+  previously emitted non-compilable `reduce_mean` / `reduce_variance` `custom_call`s (export-only); they
+  now decompose to real `stablehlo.reduce`, so all three compile and run on stock IREE (llvm-cpu). (PR #769)
+
+### Changed
+
+- **BREAKING: `TensorOps.sin`, `TensorOps.cos`, `TensorOps.convTranspose1d` are now abstract.** They
+  previously had default `throw NotImplementedError(...)` bodies; they are abstract so the tracing
+  wrapper records them (and they become differentiable/exportable). Any type implementing `TensorOps`
+  directly must now override them — both bundled backends (`DefaultCpuOpsBase`, `VoidTensorOps`) already
+  do. (PR #774)
+
 ## [0.32.4] - 2026-06-26
 
 ### Fixed
diff --git a/README.md b/README.md
index 59beabfe..6ff3cdaa 100644
--- a/README.md
+++ b/README.md
@@ -36,17 +36,13 @@ Add the core dependencies (Gradle Kotlin DSL):
 ```kotlin
 dependencies {
     // Recommended: import the umbrella BOM and drop versions on the engine modules.
-    implementation(platform("sk.ainet:skainet-bom:0.32.4"))
+    implementation(platform("sk.ainet:skainet-bom:0.33.0"))
 
     implementation("sk.ainet.core:skainet-lang-core")
     implementation("sk.ainet.core:skainet-backend-cpu")
 }
 ```
 
-> The BOM was first correctly published to Maven Central in 0.22.2 — earlier versions
-> shipped at the wrong coordinates and could not be imported. Pin versions directly if
-> you need an older release.
-
 ### Hello Neural Net
 
 ```kotlin
@@ -241,6 +237,23 @@ Runnable examples:
 
 ---
 
+## What's New in 0.33.0
+
+- **GRU — the first recurrent layer.** `nn.Gru` (`[B,S,D]->[B,S,H]`, PyTorch gate order) composed from
+  existing primitives and unrolled over the static sequence at trace time, so it runs eagerly, trains
+  through the standard tape, and exports to StableHLO with no dedicated converter. Plus a `gru(…)`
+  network-DSL builder. (PR #772, issue #217)
+- **`upsample2d` Bilinear + StableHLO export** for both Nearest and Bilinear — everything lowers to fixed
+  reshape/broadcast/`dot_general` (no `custom_call`), unblocking resize/FPN-style export. (PR #771)
+- **Autodiff correctness + coverage.** Fixes a silent gradient-drop for `elu`/`leakyRelu`/`permute`
+  (backward rules existed but were never wired into the trace dispatch), makes `cos`/`sin`/`tril`/
+  `gather`/`indexSelect`/`unfold`/`convTranspose1d` differentiable, and adds a KSP-generated coverage
+  guard so a differentiable op can no longer ship without a wired backward. (PR #774)
+- **Norms compile on stock IREE.** `layerNorm`/`rmsNorm`/`batchNorm` now lower to real `stablehlo.reduce`
+  instead of export-only `custom_call`s. (PR #769)
+- **Breaking:** `TensorOps.sin`/`cos`/`convTranspose1d` are now abstract — backends implementing
+  `TensorOps` directly must override them (both bundled backends already do).
+
 ## What's New in 0.32.4
 
 - **Streaming detokenization keeps word spaces (`Tokenizer.decodeToken`).** Decoding generated tokens
diff --git a/gradle.properties b/gradle.properties
index ce9dd46e..e579840a 100644
--- a/gradle.properties
+++ b/gradle.properties
@@ -1,5 +1,5 @@
 GROUP=sk.ainet.core
-VERSION_NAME=0.32.4
+VERSION_NAME=0.33.0
 
 POM_DESCRIPTION=SKaiNET
 POM_URL=https://github.com/SKaiNET-developers/skainet/