diff --git a/CHANGELOG.md b/CHANGELOG.md
index 0b455d40..4c8268d4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,19 @@
 
 ## [Unreleased]
 
+## [0.31.2] - 2026-06-18
+
+### Added
+
+- **`RowDequantSource` + `ops.gather` row-dequant path.** Adds `RowDequantSource` (a `TensorData` marker,
+  `dequantRow(rowIdx): FloatArray`) to `skainet-lang-core`, and teaches `DefaultCpuOps.gather` to use it:
+  when the gathered table implements `RowDequantSource`, only the rows actually touched are dequantised
+  (each unique row once, cached) instead of the generic element path — which calls `get()` (unsupported on
+  such tensors) and would otherwise force a full FP32 materialise of the table. The table declares logical
+  dtype `FP32`, so `gather` returns FP32 with no typing change. This lets a packed/oversized embedding (a
+  Q-quantised `token_embd`) stay packed and be looked up via `ops.gather` directly — generalising the
+  per-row-dequant trick out of the model layer. Adds `GatherRowDequantTest` (commonTest). (PR #741)
+
 ## [0.31.0] - 2026-06-15
 
 ### Fixed
diff --git a/README.md b/README.md
index b5049c92..39ac917a 100644
--- a/README.md
+++ b/README.md
@@ -36,7 +36,7 @@ Add the core dependencies (Gradle Kotlin DSL):
 ```kotlin
 dependencies {
     // Recommended: import the umbrella BOM and drop versions on the engine modules.
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
 
     implementation("sk.ainet.core:skainet-lang-core")
     implementation("sk.ainet.core:skainet-backend-cpu")
@@ -227,6 +227,15 @@ Runnable examples:
 
 ---
 
+## What's New in 0.31.2
+
+- **`RowDequantSource` + `ops.gather` row-dequant.** A `TensorData` can now mark itself `RowDequantSource`
+  (`dequantRow(rowIdx): FloatArray`); `ops.gather` then dequantises only the rows it touches instead of
+  materialising the whole table (and instead of the `get()` path, which such tensors don't support). The
+  table presents as logical FP32, so a packed/oversized embedding (a Q-quantised `token_embd`) can stay
+  packed and be looked up via `ops.gather` directly — moving the per-row-dequant trick out of model code
+  into the engine. (PR #741)
+
 ## What's New in 0.31.0
 
 - **`ops.transpose` lazily handles every packed matmul dtype.** The CPU backend rewraps packed bytes with a flipped shape (metadata-only "lazy transpose") so a packed weight survives `linearProject`'s `matmul(x, transpose(W))` instead of inflating to FP32 — but **Q8_0 and Q4_0** were missing and threw `Byte → Float ClassCastException`. Now the full dispatch set (Q4_K/Q5_K/Q6_K/Q5_0/Q5_1/Q8_0/Q4_0) transposes lazily, so a packed Q8_0/Q4_0 matmul weight (e.g. a tied Q8_0 `lm_head`) stays packed end-to-end on its NEON/SIMD kernel. Regression-tested across all seven packed types. (PRs #736, #737)
diff --git a/docs/modules/ROOT/pages/how-to/io-readers.adoc b/docs/modules/ROOT/pages/how-to/io-readers.adoc
index 0f97978b..ef0ea83c 100644
--- a/docs/modules/ROOT/pages/how-to/io-readers.adoc
+++ b/docs/modules/ROOT/pages/how-to/io-readers.adoc
@@ -20,7 +20,7 @@ Add the following dependencies to your `build.gradle.kts`:
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
 
     implementation("sk.ainet.core:skainet-io-gguf")
     implementation("org.jetbrains.kotlinx:kotlinx-io-core:0.8.2")
@@ -32,7 +32,7 @@ dependencies {
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
 
     implementation("sk.ainet.core:skainet-io-onnx")
     implementation("org.jetbrains.kotlinx:kotlinx-io-core:0.8.2")
diff --git a/docs/modules/ROOT/pages/how-to/minerva-export.adoc b/docs/modules/ROOT/pages/how-to/minerva-export.adoc
index f07853f6..5c085625 100644
--- a/docs/modules/ROOT/pages/how-to/minerva-export.adoc
+++ b/docs/modules/ROOT/pages/how-to/minerva-export.adoc
@@ -38,7 +38,7 @@ For a published application, use the SKaiNET BOM and the Minerva artifact:
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
     implementation("sk.ainet.core:skainet-compile-minerva")
 }
 ----
diff --git a/docs/modules/ROOT/pages/reference/kernel-support-matrix.adoc b/docs/modules/ROOT/pages/reference/kernel-support-matrix.adoc
index 9f61d833..0a1f44f8 100644
--- a/docs/modules/ROOT/pages/reference/kernel-support-matrix.adoc
+++ b/docs/modules/ROOT/pages/reference/kernel-support-matrix.adoc
@@ -1,7 +1,7 @@
 = Kernel × platform support matrix
 :description: Which compute-kernel provider serves each weight format on each KMP target.
 
-Generated from `kernel-support.json` (version `0.31.0`) by `KernelSupportMatrixTest` — registry introspection of the registered `KernelProvider` implementations. Do not edit by hand; run `./gradlew generateKernelMatrix` to refresh.
+Generated from `kernel-support.json` (version `0.31.2`) by `KernelSupportMatrixTest` — registry introspection of the registered `KernelProvider` implementations. Do not edit by hand; run `./gradlew generateKernelMatrix` to refresh.
 
 Each cell is the best (highest-priority) provider that serves `Float32 × format` `matmul` on that platform: *native-ffm* (100) → *panama-vector* (50) → *scalar* (0). An empty cell (`—`) means no provider carries a kernel there (the format is dequant-to-FP32 only).
 
diff --git a/docs/modules/ROOT/pages/tutorials/image-data-getting-started.adoc b/docs/modules/ROOT/pages/tutorials/image-data-getting-started.adoc
index 6d67a062..8d62a167 100644
--- a/docs/modules/ROOT/pages/tutorials/image-data-getting-started.adoc
+++ b/docs/modules/ROOT/pages/tutorials/image-data-getting-started.adoc
@@ -32,7 +32,7 @@ For a JVM project, add the image/data modules alongside the CPU backend:
 [source,kotlin]
 ----
 dependencies {
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
 
     implementation("sk.ainet:skainet-backend-cpu-jvm")
     implementation("sk.ainet:skainet-io-image-jvm")
diff --git a/docs/modules/ROOT/pages/tutorials/java-getting-started.adoc b/docs/modules/ROOT/pages/tutorials/java-getting-started.adoc
index b6b6a4e3..d72327e6 100644
--- a/docs/modules/ROOT/pages/tutorials/java-getting-started.adoc
+++ b/docs/modules/ROOT/pages/tutorials/java-getting-started.adoc
@@ -144,7 +144,7 @@ repositories {
 
 dependencies {
     // Import BOM for version alignment
-    implementation(platform("sk.ainet:skainet-bom:0.31.0"))
+    implementation(platform("sk.ainet:skainet-bom:0.31.2"))
 
     // Core tensor library
     implementation("sk.ainet:skainet-lang-core-jvm")
diff --git a/gradle.properties b/gradle.properties
index a5f52357..61a4823b 100644
--- a/gradle.properties
+++ b/gradle.properties
@@ -1,5 +1,5 @@
 GROUP=sk.ainet.core
-VERSION_NAME=0.31.0
+VERSION_NAME=0.31.2
 POM_DESCRIPTION=SKaiNET
 POM_URL=https://github.com/SKaiNET-developers/skainet/