diff --git a/CHANGELOG.md b/CHANGELOG.md index 3fdc28d..a4a916a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,6 +7,39 @@ version line is kept in lock-step with the underlying SKaiNET engine The format roughly follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.32.0] — 2026-06-25 + +Brings the real-GGUF **Llama** eager path up to the Gemma standard (packed +`NATIVE_OPTIMIZED`) and **unblocks StableHLO/IREE export for Llama-family models** +(traceable interleaved RoPE). Ships against engine **0.32.0**. + +### Added + +- **Eager `NATIVE_OPTIMIZED` packed path for Llama.** `LlamaNetworkLoader.fromGguf(NATIVE_OPTIMIZED)` + keeps `Q4_K`/`Q6_K` weights packed and runs them through `OptimizedLLMRuntime` — new `LlamaQuantLayout` + + `LlamaPackedWeights.convertLlamaWeightsPacked`, mirroring `convertGemmaWeightsPacked`. Coherent + output matching llama.cpp; the low-footprint path real-GGUF Llama inference on constrained ARM was + missing. (ccbd87e) + +### Changed + +- **Fused decode-attention fast path.** `MultiHeadAttention`'s decode step (`seqQ == 1`) now computes + scores → softmax → GQA-weighted-V directly from the cached K/V, bypassing the `repeatKVHeads` concat + and the `unsqueeze → SDPA → squeeze → permute` chain — ~1.5× decode throughput, bit-identical output. + Prefill (`seqLen > 1`) keeps the general SDPA path. (3791f88) +- **Engine pin `skainet 0.31.0 → 0.32.0`.** + +### Fixed + +- **Packed token-embedding gather for Llama** — `fromGguf(NATIVE_OPTIMIZED)` no longer fails with + `gather: unsupported input rank 1`; the packed embedding is wired through the canonical loader. 
(ccbd87e) +- **Interleaved RoPE is now traceable.** In `INTERLEAVED` mode (Llama / Mistral / most GGUF) the rotation + used a raw float-array path (`copyToFloatArray` / `fromFloatArray`) that, under graph tracing, baked the + rotated Q/K as a *disconnected constant* — severing them from the projection weights and crashing + `iree-compile` (null-deref in constant folding) on the exported graph. `RoPE` now records the rotation + as tensor ops when running under the tracing wrapper; eager execution keeps the byte-identical raw-array + fast path. Unblocks Llama/Mistral/GGUF StableHLO/IREE export. (019b049) + ## [0.31.1] — 2026-06-17 Adds **`transformer-core`** — the framework NN primitives (attention, the KV-cache family, embedding, diff --git a/README.md b/README.md index 8505735..d9c49e3 100644 --- a/README.md +++ b/README.md @@ -103,25 +103,27 @@ Honest status — see the project-status note at the top of this README. ## Current release -The current release is **0.31.1** (against **SKaiNET 0.31.0**). It adds -**`transformer-core`** — the framework NN primitives (attention, KV-cache family, -embedding, norms, RoPE, FFNs, linear projection) extracted out of `llm-core` so they -build on the **full target matrix including `androidNative`** (32-bit + 64-bit ARM); -`llm-core` re-exports it, so nothing changes for existing consumers, and ARM-native -downstreams (e.g. on-device whisper) can reuse the primitives instead of reimplementing -them. The 0.31.0 highlights still apply: the eager `NATIVE_OPTIMIZED` Gemma path keeps the -**tied Q8_0 lm_head packed** (paired with SKaiNET 0.31.0's `ops.transpose` fix -for all packed dtypes), and `GemmaNetworkLoader.load()` takes an optional -`maxInferenceLen` to cap the KV cache for constrained devices — together -dropping FunctionGemma-270M's footprint enough to load eagerly on the 1.9 GB -Astra Machina SL2610. 
FunctionGemma (`Q5_K_M`) still decodes byte-identically -across the FP32 baseline and both packed paths (`GemmaQ5KPackedParityTest`). +The current release is **0.32.0** (against **SKaiNET 0.32.0**). It brings the +real-GGUF **Llama** eager path up to the Gemma standard and **unblocks StableHLO/IREE +export for Llama-family models**: + +- The eager **`NATIVE_OPTIMIZED` path now works for Llama** (`Q4_K`/`Q6_K`): weights stay + packed and `LlamaNetworkLoader.fromGguf(NATIVE_OPTIMIZED) + OptimizedLLMRuntime` decodes + coherently, matching llama.cpp — fixing the packed token-embedding + `gather: unsupported input rank 1`. +- **Fused decode-attention** (`seqQ == 1`) skips the `repeatKVHeads` concat + SDPA plumbing + for a faster decode loop (~1.5×), bit-identical output. +- **Interleaved RoPE is now traceable**, so Llama/Mistral/GGUF graphs export to StableHLO + (and `iree-compile` to a `vmfb`) instead of baking a disconnected constant. + +The earlier `transformer-core` extraction (0.31.1) and the Gemma `NATIVE_OPTIMIZED` +footprint work (0.31.0) still apply. The recommended way to consume is via the BOM. It pins every published `skainet-transformers-*` artifact and re-exports the upstream `sk.ainet:skainet-bom`, so the engine-side `sk.ainet.core:skainet-*` artifacts get the matching version too — you only need to declare the BOM version in one place. ```kotlin dependencies { - implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.31.1")) + implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.32.0")) // Versions resolved from the BOM: implementation("sk.ainet.transformers:skainet-transformers-core") @@ -199,6 +201,24 @@ try (KLlamaSession session = KLlamaJava.loadGGUF(modelPath, /* systemPrompt */ n See `llm-test/llm-test-java/src/test/java/.../KLlamaJavaToolCallingTest.java` for a runnable reference. 
+## What's new in 0.32.0 + +- **Eager `NATIVE_OPTIMIZED` for real-GGUF Llama.** `LlamaNetworkLoader.fromGguf(NATIVE_OPTIMIZED)` + now keeps `Q4_K`/`Q6_K` weights packed and runs them through `OptimizedLLMRuntime`, mirroring the + Gemma path (new `LlamaQuantLayout` + `LlamaPackedWeights.convertLlamaWeightsPacked`). Output is + coherent and matches llama.cpp; fixes the packed token-embedding `gather: unsupported input rank 1`. + This is the low-footprint path real-GGUF Llama inference on constrained ARM was missing. (ccbd87e) +- **Fused decode-attention fast path.** For the decode step (`seqQ == 1`), `MultiHeadAttention` runs + scores → softmax → GQA-weighted-V straight from the cached K/V, bypassing the `repeatKVHeads` concat + and the `unsqueeze → SDPA → squeeze → permute` chain. ~1.5× decode throughput on the JVM eager path; + bit-for-bit-equivalent output. Prefill keeps the general SDPA path. (3791f88) +- **Traceable interleaved RoPE (graph export).** `RoPE` in `INTERLEAVED` mode (Llama / Mistral / most + GGUF) used a raw-array path (`copyToFloatArray` / `fromFloatArray`) that, under graph tracing, recorded + the rotated Q/K as a *disconnected constant* — severing them from the projection weights and crashing + `iree-compile` downstream. It now records the rotation as tensor ops when tracing (gated on the tracing + wrapper; eager keeps the fast raw-array path byte-identical). Unblocks TinyLlama → StableHLO → IREE. (019b049) +- **Engine pin `skainet 0.31.0 → 0.32.0`.** + ## What's new in 0.31.1 - **`transformer-core` module — NN primitives reusable on all targets incl. 
`androidNative`.** The diff --git a/docs/modules/ROOT/pages/tutorials/getting-started-java.adoc b/docs/modules/ROOT/pages/tutorials/getting-started-java.adoc index 47da0d6..530d47f 100644 --- a/docs/modules/ROOT/pages/tutorials/getting-started-java.adoc +++ b/docs/modules/ROOT/pages/tutorials/getting-started-java.adoc @@ -25,7 +25,7 @@ In your `build.gradle.kts`: [source,kotlin] ---- dependencies { - implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.31.1")) + implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.32.0")) implementation("sk.ainet.transformers:skainet-transformers-runtime-kllama") implementation("sk.ainet.transformers:skainet-transformers-agent") @@ -41,7 +41,7 @@ Or in Maven (Maven needs the `-jvm` classifier suffix on platform artifacts): <dependency> <groupId>sk.ainet.transformers</groupId> <artifactId>skainet-transformers-bom</artifactId> - <version>0.31.1</version> + <version>0.32.0</version> <type>pom</type> <scope>import</scope> </dependency> diff --git a/docs/modules/ROOT/pages/tutorials/llama3-tool-calling.adoc b/docs/modules/ROOT/pages/tutorials/llama3-tool-calling.adoc index cb9ebf5..fbe5d0d 100644 --- a/docs/modules/ROOT/pages/tutorials/llama3-tool-calling.adoc +++ b/docs/modules/ROOT/pages/tutorials/llama3-tool-calling.adoc @@ -52,7 +52,7 @@ The pieces you need live in three modules: [source,kotlin] ---- dependencies { - implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.31.1")) + implementation(platform("sk.ainet.transformers:skainet-transformers-bom:0.32.0")) implementation("sk.ainet.transformers:skainet-transformers-runtime-kllama") implementation("sk.ainet.transformers:skainet-transformers-agent") diff --git a/gradle.properties b/gradle.properties index a1bd7ef..eb1c82f 100644 --- a/gradle.properties +++ b/gradle.properties @@ -1,5 +1,5 @@ GROUP=sk.ainet.transformers -VERSION_NAME=0.31.1 +VERSION_NAME=0.32.0 POM_DESCRIPTION=SKaiNET-transformers diff --git a/gradle/libs.versions.toml b/gradle/libs.versions.toml index 462b011..d7c4c71 100644 --- a/gradle/libs.versions.toml +++ b/gradle/libs.versions.toml 
@@ -1,5 +1,5 @@ [versions] -skainet = "0.31.0" +skainet = "0.32.0" agp = "9.2.1" jacksonDatabind = "2.22.0" jsonSchemaValidator = "3.0.4" diff --git a/llm-core/api/jvm/llm-core.api b/llm-core/api/jvm/llm-core.api index aecfb28..2275bca 100644 --- a/llm-core/api/jvm/llm-core.api +++ b/llm-core/api/jvm/llm-core.api @@ -531,56 +531,6 @@ public final class sk/ainet/apps/llm/weights/LlamaSafeTensorsNameResolver : sk/a public fun resolve (Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String; } -public abstract interface class sk/ainet/lang/nn/dsl/ATTENTION : sk/ainet/lang/nn/dsl/NetworkDslItem { - public abstract fun kvCache (III)V - public abstract fun kvCache (Lsk/ainet/lang/nn/transformer/KVCache;)V - public abstract fun rope (IILsk/ainet/lang/nn/transformer/RoPEMode;FLsk/ainet/lang/nn/transformer/RoPEScaling;FF)V - public static synthetic fun rope$default (Lsk/ainet/lang/nn/dsl/ATTENTION;IILsk/ainet/lang/nn/transformer/RoPEMode;FLsk/ainet/lang/nn/transformer/RoPEScaling;FFILjava/lang/Object;)V -} - -public final class sk/ainet/lang/nn/dsl/ATTENTION$DefaultImpls { - public static synthetic fun rope$default (Lsk/ainet/lang/nn/dsl/ATTENTION;IILsk/ainet/lang/nn/transformer/RoPEMode;FLsk/ainet/lang/nn/transformer/RoPEScaling;FFILjava/lang/Object;)V -} - -public final class sk/ainet/lang/nn/dsl/AttentionImpl : sk/ainet/lang/nn/dsl/ATTENTION { - public fun (Lsk/ainet/context/ExecutionContext;IIIZZZDLjava/lang/Float;ZZLjava/lang/String;Ljava/lang/Integer;Lkotlin/reflect/KClass;)V - public synthetic fun (Lsk/ainet/context/ExecutionContext;IIIZZZDLjava/lang/Float;ZZLjava/lang/String;Ljava/lang/Integer;Lkotlin/reflect/KClass;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public final fun create ()Lsk/ainet/lang/nn/transformer/MultiHeadAttention; - public fun getExecutionContext ()Lsk/ainet/context/ExecutionContext; - public fun kvCache (III)V - public fun kvCache (Lsk/ainet/lang/nn/transformer/KVCache;)V - public fun rope 
(IILsk/ainet/lang/nn/transformer/RoPEMode;FLsk/ainet/lang/nn/transformer/RoPEScaling;FF)V -} - -public final class sk/ainet/lang/nn/dsl/TransformerDslKt { - public static final fun embedding (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IILjava/lang/String;)V - public static final fun embedding (Lsk/ainet/lang/nn/dsl/StageImpl;IILjava/lang/String;)V - public static synthetic fun embedding$default (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IILjava/lang/String;ILjava/lang/Object;)V - public static synthetic fun embedding$default (Lsk/ainet/lang/nn/dsl/StageImpl;IILjava/lang/String;ILjava/lang/Object;)V - public static final fun geGluFFN (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IILjava/lang/String;)V - public static final fun geGluFFN (Lsk/ainet/lang/nn/dsl/StageImpl;IILjava/lang/String;)V - public static synthetic fun geGluFFN$default (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IILjava/lang/String;ILjava/lang/Object;)V - public static synthetic fun geGluFFN$default (Lsk/ainet/lang/nn/dsl/StageImpl;IILjava/lang/String;ILjava/lang/Object;)V - public static final fun multiHeadAttention (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IIIZZFZLjava/lang/String;Ljava/lang/Integer;Lkotlin/jvm/functions/Function1;)V - public static final fun multiHeadAttention (Lsk/ainet/lang/nn/dsl/StageImpl;IIIZZZFLjava/lang/Float;ZZLjava/lang/String;Ljava/lang/Integer;Lkotlin/jvm/functions/Function1;)V - public static synthetic fun multiHeadAttention$default (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IIIZZFZLjava/lang/String;Ljava/lang/Integer;Lkotlin/jvm/functions/Function1;ILjava/lang/Object;)V - public static synthetic fun multiHeadAttention$default (Lsk/ainet/lang/nn/dsl/StageImpl;IIIZZZFLjava/lang/Float;ZZLjava/lang/String;Ljava/lang/Integer;Lkotlin/jvm/functions/Function1;ILjava/lang/Object;)V - public static final fun residual (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;)V - public static final fun residual (Lsk/ainet/lang/nn/dsl/StageImpl;)V - public static final fun rmsNorm 
(Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IFLjava/lang/String;Z)V - public static final fun rmsNorm (Lsk/ainet/lang/nn/dsl/StageImpl;IFLjava/lang/String;Z)V - public static synthetic fun rmsNorm$default (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IFLjava/lang/String;ZILjava/lang/Object;)V - public static synthetic fun rmsNorm$default (Lsk/ainet/lang/nn/dsl/StageImpl;IFLjava/lang/String;ZILjava/lang/Object;)V - public static final fun swiGluFFN (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IILjava/lang/String;)V - public static final fun swiGluFFN (Lsk/ainet/lang/nn/dsl/StageImpl;IILjava/lang/String;)V - public static synthetic fun swiGluFFN$default (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IILjava/lang/String;ILjava/lang/Object;)V - public static synthetic fun swiGluFFN$default (Lsk/ainet/lang/nn/dsl/StageImpl;IILjava/lang/String;ILjava/lang/Object;)V - public static final fun xielu (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;Ljava/lang/String;)V - public static final fun xielu (Lsk/ainet/lang/nn/dsl/StageImpl;Ljava/lang/String;)V - public static synthetic fun xielu$default (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;Ljava/lang/String;ILjava/lang/Object;)V - public static synthetic fun xielu$default (Lsk/ainet/lang/nn/dsl/StageImpl;Ljava/lang/String;ILjava/lang/Object;)V -} - public abstract interface class sk/ainet/lang/nn/dsl/decoder/DecoderModelMetadata { public abstract fun getBlockCount ()I public abstract fun getBosTokenId ()I @@ -596,272 +546,3 @@ public abstract interface class sk/ainet/lang/nn/dsl/decoder/DecoderModelMetadat public abstract fun getVocabSize ()I } -public final class sk/ainet/lang/nn/layers/Embedding : sk/ainet/lang/nn/DualModule, sk/ainet/lang/nn/topology/ModuleParameters { - public static final field Companion Lsk/ainet/lang/nn/layers/Embedding$Companion; - public fun (IILsk/ainet/lang/tensor/Tensor;Ljava/lang/Integer;Ljava/lang/String;)V - public synthetic fun 
(IILsk/ainet/lang/tensor/Tensor;Ljava/lang/Integer;Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public fun (Lsk/ainet/context/ExecutionContext;Lkotlin/reflect/KClass;Lsk/ainet/lang/nn/layers/EmbeddingParams;Ljava/lang/String;FFLkotlin/random/Random;)V - public synthetic fun (Lsk/ainet/context/ExecutionContext;Lkotlin/reflect/KClass;Lsk/ainet/lang/nn/layers/EmbeddingParams;Ljava/lang/String;FFLkotlin/random/Random;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public fun forward (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor; - public final fun forward ([ILsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor; - public final fun forward ([JLsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor; - public final fun forwardAny (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;Z)Lsk/ainet/lang/tensor/Tensor; - public static synthetic fun forwardAny$default (Lsk/ainet/lang/nn/layers/Embedding;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;ZILjava/lang/Object;)Lsk/ainet/lang/tensor/Tensor; - public final fun getEmbeddingDim ()I - public fun getModules ()Ljava/util/List; - public fun getName ()Ljava/lang/String; - public final fun getNumEmbeddings ()I - public final fun getPaddingIdx ()Ljava/lang/Integer; - public fun getParams ()Ljava/util/List; -} - -public final class sk/ainet/lang/nn/layers/Embedding$Companion { -} - -public final class sk/ainet/lang/nn/layers/EmbeddingAdapter : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters { - public fun (Lsk/ainet/lang/nn/layers/Embedding;)V - public final fun getEmbedding ()Lsk/ainet/lang/nn/layers/Embedding; - public final fun getEmbeddingDim ()I - public fun getModules ()Ljava/util/List; - public fun getName ()Ljava/lang/String; - public final fun getNumEmbeddings ()I - public fun getParams ()Ljava/util/List; -} - -public final class sk/ainet/lang/nn/layers/EmbeddingParams { - 
public fun (IILjava/lang/Integer;Ljava/lang/Float;Z)V - public synthetic fun (IILjava/lang/Integer;Ljava/lang/Float;ZILkotlin/jvm/internal/DefaultConstructorMarker;)V - public final fun component1 ()I - public final fun component2 ()I - public final fun component3 ()Ljava/lang/Integer; - public final fun component4 ()Ljava/lang/Float; - public final fun component5 ()Z - public final fun copy (IILjava/lang/Integer;Ljava/lang/Float;Z)Lsk/ainet/lang/nn/layers/EmbeddingParams; - public static synthetic fun copy$default (Lsk/ainet/lang/nn/layers/EmbeddingParams;IILjava/lang/Integer;Ljava/lang/Float;ZILjava/lang/Object;)Lsk/ainet/lang/nn/layers/EmbeddingParams; - public fun equals (Ljava/lang/Object;)Z - public final fun getEmbeddingDim ()I - public final fun getMaxNorm ()Ljava/lang/Float; - public final fun getNumEmbeddings ()I - public final fun getPaddingIdx ()Ljava/lang/Integer; - public final fun getScaleGradByFreq ()Z - public fun hashCode ()I - public fun toString ()Ljava/lang/String; -} - -public abstract interface class sk/ainet/lang/nn/normalization/FusedRmsNormOps { - public abstract fun fusedRmsNorm (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;F)Lsk/ainet/lang/tensor/Tensor; -} - -public final class sk/ainet/lang/nn/normalization/RMSNormalization : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters { - public fun ([IDLjava/lang/String;Lsk/ainet/lang/tensor/Tensor;ZLkotlin/reflect/KClass;)V - public synthetic fun ([IDLjava/lang/String;Lsk/ainet/lang/tensor/Tensor;ZLkotlin/reflect/KClass;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public fun forward (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor; - public fun getModules ()Ljava/util/List; - public fun getName ()Ljava/lang/String; - public fun getParams ()Ljava/util/List; -} - -public final class sk/ainet/lang/nn/transformer/AppendKVCache : sk/ainet/lang/nn/transformer/KVCache { - public fun (IIILjava/lang/String;)V - public 
synthetic fun (IIILjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public fun getPosition ()I - public fun reset ()V - public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair; -} - -public final class sk/ainet/lang/nn/transformer/GeGLUFFN : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters { - public fun (IILjava/lang/String;Lkotlin/reflect/KClass;)V - public synthetic fun (IILjava/lang/String;Lkotlin/reflect/KClass;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public final fun getDim ()I - public final fun getHiddenDim ()I - public fun getModules ()Ljava/util/List; - public fun getName ()Ljava/lang/String; - public fun getParams ()Ljava/util/List; -} - -public abstract class sk/ainet/lang/nn/transformer/KVCache : sk/ainet/lang/nn/Module { - public fun (IIILjava/lang/String;)V - public synthetic fun (IIILjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public final fun getHeadDim ()I - public final fun getMaxSeqLen ()I - public fun getModules ()Ljava/util/List; - public final fun getNKVHeads ()I - public fun getName ()Ljava/lang/String; - public abstract fun getPosition ()I - protected fun onForward (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor; - public abstract fun reset ()V - public abstract fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair; -} - -public final class sk/ainet/lang/nn/transformer/LayerScalarMul : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters { - public fun ()V - public fun (Ljava/lang/String;Lkotlin/reflect/KClass;)V - public synthetic fun (Ljava/lang/String;Lkotlin/reflect/KClass;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public fun getModules ()Ljava/util/List; - public fun getName ()Ljava/lang/String; - public fun getParams ()Ljava/util/List; -} - -public final 
class sk/ainet/lang/nn/transformer/LinearProjectionKt { - public static final fun linearProject (Lsk/ainet/lang/tensor/ops/TensorOps;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;)Lsk/ainet/lang/tensor/Tensor; -} - -public final class sk/ainet/lang/nn/transformer/MultiHeadAttention : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters { - public fun (IIIZZZDLjava/lang/Float;ZZLjava/lang/String;Lsk/ainet/lang/nn/transformer/RoPE;Lsk/ainet/lang/nn/transformer/KVCache;Ljava/lang/Integer;Ljava/lang/Integer;Lkotlin/reflect/KClass;)V - public synthetic fun (IIIZZZDLjava/lang/Float;ZZLjava/lang/String;Lsk/ainet/lang/nn/transformer/RoPE;Lsk/ainet/lang/nn/transformer/KVCache;Ljava/lang/Integer;Ljava/lang/Integer;Lkotlin/reflect/KClass;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public final fun forward (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor; - public final fun getAttentionScale ()Ljava/lang/Float; - public final fun getBias ()Z - public final fun getCausal ()Z - public final fun getDim ()I - public final fun getHeadDim ()I - public final fun getKNorm ()Lsk/ainet/lang/nn/normalization/RMSNormalization; - public final fun getKvCache ()Lsk/ainet/lang/nn/transformer/KVCache; - public final fun getKvDim ()I - public fun getModules ()Ljava/util/List; - public final fun getNHeads ()I - public final fun getNKVHeads ()I - public fun getName ()Ljava/lang/String; - public fun getParams ()Ljava/util/List; - public final fun getQDim ()I - public final fun getQNorm ()Lsk/ainet/lang/nn/normalization/RMSNormalization; - public final fun getQkNorm ()Z - public final fun getQkNormEps ()D - public final fun getQkNormUnitOffset ()Z - public final fun getRope ()Lsk/ainet/lang/nn/transformer/RoPE; - public final fun getSlidingWindow ()Ljava/lang/Integer; - public final fun getVNormNoScale ()Z - public final fun setKvCache (Lsk/ainet/lang/nn/transformer/KVCache;)V - public 
final fun setRope (Lsk/ainet/lang/nn/transformer/RoPE;)V -} - -public final class sk/ainet/lang/nn/transformer/MultiHeadAttentionDiag { - public static final field INSTANCE Lsk/ainet/lang/nn/transformer/MultiHeadAttentionDiag; - public final fun getShouldDumpThisCall ()Z - public final fun setShouldDumpThisCall (Z)V -} - -public final class sk/ainet/lang/nn/transformer/OwnerReadOnlyKVCache : sk/ainet/lang/nn/transformer/KVCache { - public fun (Lsk/ainet/lang/nn/transformer/PositionalKVCache;Ljava/lang/String;)V - public synthetic fun (Lsk/ainet/lang/nn/transformer/PositionalKVCache;Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public final fun getDelegate ()Lsk/ainet/lang/nn/transformer/PositionalKVCache; - public fun getPosition ()I - public fun reset ()V - public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair; -} - -public final class sk/ainet/lang/nn/transformer/PaddedSharedPositionalKVCache : sk/ainet/lang/nn/transformer/KVCache { - public fun (Lsk/ainet/lang/nn/transformer/PositionalKVCache;ILjava/lang/String;)V - public synthetic fun (Lsk/ainet/lang/nn/transformer/PositionalKVCache;ILjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public final fun getDelegate ()Lsk/ainet/lang/nn/transformer/PositionalKVCache; - public final fun getLayerHeadDim ()I - public fun getPosition ()I - public fun reset ()V - public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair; -} - -public final class sk/ainet/lang/nn/transformer/PositionalKVCache : sk/ainet/lang/nn/transformer/KVCache { - public fun (IIILjava/lang/String;)V - public synthetic fun (IIILjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public fun getPosition ()I - public fun reset ()V - public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair; 
-} - -public final class sk/ainet/lang/nn/transformer/ResidualAdd : sk/ainet/lang/nn/Module { - public fun ()V - public fun (Ljava/lang/String;)V - public synthetic fun (Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public fun getModules ()Ljava/util/List; - public fun getName ()Ljava/lang/String; - public final fun getSavedInput ()Lsk/ainet/lang/tensor/Tensor; - public final fun setSavedInput (Lsk/ainet/lang/tensor/Tensor;)V -} - -public final class sk/ainet/lang/nn/transformer/RoPE : sk/ainet/lang/nn/Module { - public fun (IIFLsk/ainet/lang/nn/transformer/RoPEMode;Lsk/ainet/lang/nn/transformer/RoPEScaling;FFLjava/lang/String;)V - public synthetic fun (IIFLsk/ainet/lang/nn/transformer/RoPEMode;Lsk/ainet/lang/nn/transformer/RoPEScaling;FFLjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public final fun forward (Lsk/ainet/lang/tensor/Tensor;ILsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor; - public final fun getHeadDim ()I - public final fun getMaxSeqLen ()I - public final fun getMode ()Lsk/ainet/lang/nn/transformer/RoPEMode; - public fun getModules ()Ljava/util/List; - public fun getName ()Ljava/lang/String; - public final fun getPartialRotaryFactor ()F - public final fun getRotaryDim ()I - public final fun getScaling ()Lsk/ainet/lang/nn/transformer/RoPEScaling; - public final fun getScalingFactor ()F -} - -public final class sk/ainet/lang/nn/transformer/RoPEMode : java/lang/Enum { - public static final field INTERLEAVED Lsk/ainet/lang/nn/transformer/RoPEMode; - public static final field SPLIT_HALF Lsk/ainet/lang/nn/transformer/RoPEMode; - public static fun getEntries ()Lkotlin/enums/EnumEntries; - public static fun valueOf (Ljava/lang/String;)Lsk/ainet/lang/nn/transformer/RoPEMode; - public static fun values ()[Lsk/ainet/lang/nn/transformer/RoPEMode; -} - -public final class sk/ainet/lang/nn/transformer/RoPEScaling : java/lang/Enum { - public static final field NONE 
Lsk/ainet/lang/nn/transformer/RoPEScaling; - public static final field PROPORTIONAL Lsk/ainet/lang/nn/transformer/RoPEScaling; - public static fun getEntries ()Lkotlin/enums/EnumEntries; - public static fun valueOf (Ljava/lang/String;)Lsk/ainet/lang/nn/transformer/RoPEScaling; - public static fun values ()[Lsk/ainet/lang/nn/transformer/RoPEScaling; -} - -public final class sk/ainet/lang/nn/transformer/SharedKVCache : sk/ainet/lang/nn/transformer/KVCache { - public fun (Lsk/ainet/lang/nn/transformer/KVCache;Ljava/lang/String;)V - public synthetic fun (Lsk/ainet/lang/nn/transformer/KVCache;Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public final fun getDelegate ()Lsk/ainet/lang/nn/transformer/KVCache; - public fun getPosition ()I - public fun reset ()V - public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair; -} - -public final class sk/ainet/lang/nn/transformer/SharedPositionalKVCache : sk/ainet/lang/nn/transformer/KVCache { - public fun (Lsk/ainet/lang/nn/transformer/PositionalKVCache;Ljava/lang/String;)V - public synthetic fun (Lsk/ainet/lang/nn/transformer/PositionalKVCache;Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public final fun getDelegate ()Lsk/ainet/lang/nn/transformer/PositionalKVCache; - public fun getPosition ()I - public fun reset ()V - public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair; -} - -public final class sk/ainet/lang/nn/transformer/SlidingWindowKVCache : sk/ainet/lang/nn/transformer/KVCache { - public fun (IIIILjava/lang/String;)V - public synthetic fun (IIIILjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public fun getPosition ()I - public final fun getWindow ()I - public fun reset ()V - public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair; -} - -public 
final class sk/ainet/lang/nn/transformer/SwiGLUFFN : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters { - public fun (IILjava/lang/String;)V - public synthetic fun (IILjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public final fun getDim ()I - public final fun getHiddenDim ()I - public fun getModules ()Ljava/util/List; - public fun getName ()Ljava/lang/String; - public fun getParams ()Ljava/util/List; -} - -public final class sk/ainet/lang/nn/transformer/VoidDense : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters { - public fun (Ljava/lang/String;IILkotlin/reflect/KClass;)V - public synthetic fun (Ljava/lang/String;IILkotlin/reflect/KClass;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public final fun getInDim ()I - public fun getModules ()Ljava/util/List; - public fun getName ()Ljava/lang/String; - public final fun getOutDim ()I - public fun getParams ()Ljava/util/List; -} - -public final class sk/ainet/lang/nn/transformer/XIELUActivation : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters { - public fun ()V - public fun (Ljava/lang/String;)V - public synthetic fun (Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V - public fun getModules ()Ljava/util/List; - public fun getName ()Ljava/lang/String; - public fun getParams ()Ljava/util/List; -} - diff --git a/llm-inference/llama/api/jvm/llama.api b/llm-inference/llama/api/jvm/llama.api index 8151263..314193e 100644 --- a/llm-inference/llama/api/jvm/llama.api +++ b/llm-inference/llama/api/jvm/llama.api @@ -245,6 +245,10 @@ public final class sk/ainet/models/llama/LlamaNetworkLoader$WeightsProvider$Safe public fun toString ()Ljava/lang/String; } +public final class sk/ainet/models/llama/LlamaPackedWeightsKt { + public static final fun convertLlamaWeightsPacked (Lsk/ainet/models/llama/DecoderGgufWeights;Lsk/ainet/context/ExecutionContext;)Lsk/ainet/models/llama/DecoderGgufWeights; +} + public final class 
sk/ainet/models/llama/LlamaRuntime : sk/ainet/apps/llm/DecoderRuntime, sk/ainet/models/llama/LlamaRuntimeInterface {
  public static final field DEFAULT_BOS_TOKEN I
  public fun <init> (Lsk/ainet/context/ExecutionContext;Lsk/ainet/models/llama/LlamaRuntimeWeights;Lsk/ainet/models/llama/AttentionBackend;Lkotlin/reflect/KClass;FLkotlin/random/Random;Lsk/ainet/models/llama/GraphAccelerator;)V
diff --git a/transformer-core/api/jvm/transformer-core.api b/transformer-core/api/jvm/transformer-core.api
new file mode 100644
index 0000000..97870ad
--- /dev/null
+++ b/transformer-core/api/jvm/transformer-core.api
@@ -0,0 +1,324 @@
+public abstract interface class sk/ainet/lang/nn/dsl/ATTENTION : sk/ainet/lang/nn/dsl/NetworkDslItem {
+ public abstract fun kvCache (III)V
+ public abstract fun kvCache (Lsk/ainet/lang/nn/transformer/KVCache;)V
+ public abstract fun rope (IILsk/ainet/lang/nn/transformer/RoPEMode;FLsk/ainet/lang/nn/transformer/RoPEScaling;FF)V
+ public static synthetic fun rope$default (Lsk/ainet/lang/nn/dsl/ATTENTION;IILsk/ainet/lang/nn/transformer/RoPEMode;FLsk/ainet/lang/nn/transformer/RoPEScaling;FFILjava/lang/Object;)V
+}
+
+public final class sk/ainet/lang/nn/dsl/ATTENTION$DefaultImpls {
+ public static synthetic fun rope$default (Lsk/ainet/lang/nn/dsl/ATTENTION;IILsk/ainet/lang/nn/transformer/RoPEMode;FLsk/ainet/lang/nn/transformer/RoPEScaling;FFILjava/lang/Object;)V
+}
+
+public final class sk/ainet/lang/nn/dsl/AttentionImpl : sk/ainet/lang/nn/dsl/ATTENTION {
+ public fun <init> (Lsk/ainet/context/ExecutionContext;IIIZZZDLjava/lang/Float;ZZLjava/lang/String;Ljava/lang/Integer;Lkotlin/reflect/KClass;)V
+ public synthetic fun <init> (Lsk/ainet/context/ExecutionContext;IIIZZZDLjava/lang/Float;ZZLjava/lang/String;Ljava/lang/Integer;Lkotlin/reflect/KClass;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public final fun create ()Lsk/ainet/lang/nn/transformer/MultiHeadAttention;
+ public fun getExecutionContext ()Lsk/ainet/context/ExecutionContext;
+ public fun kvCache (III)V
+ public fun kvCache (Lsk/ainet/lang/nn/transformer/KVCache;)V
+ public fun rope (IILsk/ainet/lang/nn/transformer/RoPEMode;FLsk/ainet/lang/nn/transformer/RoPEScaling;FF)V
+}
+
+public final class sk/ainet/lang/nn/dsl/TransformerDslKt {
+ public static final fun embedding (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IILjava/lang/String;)V
+ public static final fun embedding (Lsk/ainet/lang/nn/dsl/StageImpl;IILjava/lang/String;)V
+ public static synthetic fun embedding$default (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IILjava/lang/String;ILjava/lang/Object;)V
+ public static synthetic fun embedding$default (Lsk/ainet/lang/nn/dsl/StageImpl;IILjava/lang/String;ILjava/lang/Object;)V
+ public static final fun geGluFFN (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IILjava/lang/String;)V
+ public static final fun geGluFFN (Lsk/ainet/lang/nn/dsl/StageImpl;IILjava/lang/String;)V
+ public static synthetic fun geGluFFN$default (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IILjava/lang/String;ILjava/lang/Object;)V
+ public static synthetic fun geGluFFN$default (Lsk/ainet/lang/nn/dsl/StageImpl;IILjava/lang/String;ILjava/lang/Object;)V
+ public static final fun multiHeadAttention (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IIIZZFZLjava/lang/String;Ljava/lang/Integer;Lkotlin/jvm/functions/Function1;)V
+ public static final fun multiHeadAttention (Lsk/ainet/lang/nn/dsl/StageImpl;IIIZZZFLjava/lang/Float;ZZLjava/lang/String;Ljava/lang/Integer;Lkotlin/jvm/functions/Function1;)V
+ public static synthetic fun multiHeadAttention$default (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IIIZZFZLjava/lang/String;Ljava/lang/Integer;Lkotlin/jvm/functions/Function1;ILjava/lang/Object;)V
+ public static synthetic fun multiHeadAttention$default (Lsk/ainet/lang/nn/dsl/StageImpl;IIIZZZFLjava/lang/Float;ZZLjava/lang/String;Ljava/lang/Integer;Lkotlin/jvm/functions/Function1;ILjava/lang/Object;)V
+ public static final fun residual (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;)V
+ public static final fun residual (Lsk/ainet/lang/nn/dsl/StageImpl;)V
+ public static final fun rmsNorm (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IFLjava/lang/String;Z)V
+ public static final fun rmsNorm (Lsk/ainet/lang/nn/dsl/StageImpl;IFLjava/lang/String;Z)V
+ public static synthetic fun rmsNorm$default (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IFLjava/lang/String;ZILjava/lang/Object;)V
+ public static synthetic fun rmsNorm$default (Lsk/ainet/lang/nn/dsl/StageImpl;IFLjava/lang/String;ZILjava/lang/Object;)V
+ public static final fun swiGluFFN (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IILjava/lang/String;)V
+ public static final fun swiGluFFN (Lsk/ainet/lang/nn/dsl/StageImpl;IILjava/lang/String;)V
+ public static synthetic fun swiGluFFN$default (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;IILjava/lang/String;ILjava/lang/Object;)V
+ public static synthetic fun swiGluFFN$default (Lsk/ainet/lang/nn/dsl/StageImpl;IILjava/lang/String;ILjava/lang/Object;)V
+ public static final fun xielu (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;Ljava/lang/String;)V
+ public static final fun xielu (Lsk/ainet/lang/nn/dsl/StageImpl;Ljava/lang/String;)V
+ public static synthetic fun xielu$default (Lsk/ainet/lang/nn/dsl/NeuralNetworkDslImpl;Ljava/lang/String;ILjava/lang/Object;)V
+ public static synthetic fun xielu$default (Lsk/ainet/lang/nn/dsl/StageImpl;Ljava/lang/String;ILjava/lang/Object;)V
+}
+
+public final class sk/ainet/lang/nn/layers/Embedding : sk/ainet/lang/nn/DualModule, sk/ainet/lang/nn/topology/ModuleParameters {
+ public static final field Companion Lsk/ainet/lang/nn/layers/Embedding$Companion;
+ public fun <init> (IILsk/ainet/lang/tensor/Tensor;Ljava/lang/Integer;Ljava/lang/String;)V
+ public synthetic fun <init> (IILsk/ainet/lang/tensor/Tensor;Ljava/lang/Integer;Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public fun <init> (Lsk/ainet/context/ExecutionContext;Lkotlin/reflect/KClass;Lsk/ainet/lang/nn/layers/EmbeddingParams;Ljava/lang/String;FFLkotlin/random/Random;)V
+ public synthetic fun <init> (Lsk/ainet/context/ExecutionContext;Lkotlin/reflect/KClass;Lsk/ainet/lang/nn/layers/EmbeddingParams;Ljava/lang/String;FFLkotlin/random/Random;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public fun forward (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor;
+ public final fun forward ([ILsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor;
+ public final fun forward ([JLsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor;
+ public final fun forwardAny (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;Z)Lsk/ainet/lang/tensor/Tensor;
+ public static synthetic fun forwardAny$default (Lsk/ainet/lang/nn/layers/Embedding;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;ZILjava/lang/Object;)Lsk/ainet/lang/tensor/Tensor;
+ public final fun getEmbeddingDim ()I
+ public fun getModules ()Ljava/util/List;
+ public fun getName ()Ljava/lang/String;
+ public final fun getNumEmbeddings ()I
+ public final fun getPaddingIdx ()Ljava/lang/Integer;
+ public fun getParams ()Ljava/util/List;
+}
+
+public final class sk/ainet/lang/nn/layers/Embedding$Companion {
+}
+
+public final class sk/ainet/lang/nn/layers/EmbeddingAdapter : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters {
+ public fun <init> (Lsk/ainet/lang/nn/layers/Embedding;)V
+ public final fun getEmbedding ()Lsk/ainet/lang/nn/layers/Embedding;
+ public final fun getEmbeddingDim ()I
+ public fun getModules ()Ljava/util/List;
+ public fun getName ()Ljava/lang/String;
+ public final fun getNumEmbeddings ()I
+ public fun getParams ()Ljava/util/List;
+}
+
+public final class sk/ainet/lang/nn/layers/EmbeddingParams {
+ public fun <init> (IILjava/lang/Integer;Ljava/lang/Float;Z)V
+ public synthetic fun <init> (IILjava/lang/Integer;Ljava/lang/Float;ZILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public final fun component1 ()I
+ public final fun component2 ()I
+ public final fun component3 ()Ljava/lang/Integer;
+ public final fun component4 ()Ljava/lang/Float;
+ public final fun component5 ()Z
+ public final fun copy (IILjava/lang/Integer;Ljava/lang/Float;Z)Lsk/ainet/lang/nn/layers/EmbeddingParams;
+ public static synthetic fun copy$default (Lsk/ainet/lang/nn/layers/EmbeddingParams;IILjava/lang/Integer;Ljava/lang/Float;ZILjava/lang/Object;)Lsk/ainet/lang/nn/layers/EmbeddingParams;
+ public fun equals (Ljava/lang/Object;)Z
+ public final fun getEmbeddingDim ()I
+ public final fun getMaxNorm ()Ljava/lang/Float;
+ public final fun getNumEmbeddings ()I
+ public final fun getPaddingIdx ()Ljava/lang/Integer;
+ public final fun getScaleGradByFreq ()Z
+ public fun hashCode ()I
+ public fun toString ()Ljava/lang/String;
+}
+
+public abstract interface class sk/ainet/lang/nn/normalization/FusedRmsNormOps {
+ public abstract fun fusedRmsNorm (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;F)Lsk/ainet/lang/tensor/Tensor;
+}
+
+public final class sk/ainet/lang/nn/normalization/RMSNormalization : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters {
+ public fun <init> ([IDLjava/lang/String;Lsk/ainet/lang/tensor/Tensor;ZLkotlin/reflect/KClass;)V
+ public synthetic fun <init> ([IDLjava/lang/String;Lsk/ainet/lang/tensor/Tensor;ZLkotlin/reflect/KClass;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public fun forward (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor;
+ public fun getModules ()Ljava/util/List;
+ public fun getName ()Ljava/lang/String;
+ public fun getParams ()Ljava/util/List;
+}
+
+public final class sk/ainet/lang/nn/transformer/AppendKVCache : sk/ainet/lang/nn/transformer/KVCache {
+ public fun <init> (IIILjava/lang/String;)V
+ public synthetic fun <init> (IIILjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public fun getPosition ()I
+ public fun reset ()V
+ public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair;
+}
+
+public final class sk/ainet/lang/nn/transformer/GeGLUFFN : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters {
+ public fun <init> (IILjava/lang/String;Lkotlin/reflect/KClass;)V
+ public synthetic fun <init> (IILjava/lang/String;Lkotlin/reflect/KClass;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public final fun getDim ()I
+ public final fun getHiddenDim ()I
+ public fun getModules ()Ljava/util/List;
+ public fun getName ()Ljava/lang/String;
+ public fun getParams ()Ljava/util/List;
+}
+
+public abstract class sk/ainet/lang/nn/transformer/KVCache : sk/ainet/lang/nn/Module {
+ public fun <init> (IIILjava/lang/String;)V
+ public synthetic fun <init> (IIILjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public final fun getHeadDim ()I
+ public final fun getMaxSeqLen ()I
+ public fun getModules ()Ljava/util/List;
+ public final fun getNKVHeads ()I
+ public fun getName ()Ljava/lang/String;
+ public abstract fun getPosition ()I
+ protected fun onForward (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor;
+ public abstract fun reset ()V
+ public abstract fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair;
+}
+
+public final class sk/ainet/lang/nn/transformer/LayerScalarMul : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters {
+ public fun <init> ()V
+ public fun <init> (Ljava/lang/String;Lkotlin/reflect/KClass;)V
+ public synthetic fun <init> (Ljava/lang/String;Lkotlin/reflect/KClass;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public fun getModules ()Ljava/util/List;
+ public fun getName ()Ljava/lang/String;
+ public fun getParams ()Ljava/util/List;
+}
+
+public final class sk/ainet/lang/nn/transformer/LinearProjectionKt {
+ public static final fun linearProject (Lsk/ainet/lang/tensor/ops/TensorOps;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;)Lsk/ainet/lang/tensor/Tensor;
+}
+
+public final class sk/ainet/lang/nn/transformer/MultiHeadAttention : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters {
+ public fun <init> (IIIZZZDLjava/lang/Float;ZZLjava/lang/String;Lsk/ainet/lang/nn/transformer/RoPE;Lsk/ainet/lang/nn/transformer/KVCache;Ljava/lang/Integer;Ljava/lang/Integer;Lkotlin/reflect/KClass;)V
+ public synthetic fun <init> (IIIZZZDLjava/lang/Float;ZZLjava/lang/String;Lsk/ainet/lang/nn/transformer/RoPE;Lsk/ainet/lang/nn/transformer/KVCache;Ljava/lang/Integer;Ljava/lang/Integer;Lkotlin/reflect/KClass;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public final fun forward (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor;
+ public final fun getAttentionScale ()Ljava/lang/Float;
+ public final fun getBias ()Z
+ public final fun getCausal ()Z
+ public final fun getDim ()I
+ public final fun getHeadDim ()I
+ public final fun getKNorm ()Lsk/ainet/lang/nn/normalization/RMSNormalization;
+ public final fun getKvCache ()Lsk/ainet/lang/nn/transformer/KVCache;
+ public final fun getKvDim ()I
+ public fun getModules ()Ljava/util/List;
+ public final fun getNHeads ()I
+ public final fun getNKVHeads ()I
+ public fun getName ()Ljava/lang/String;
+ public fun getParams ()Ljava/util/List;
+ public final fun getQDim ()I
+ public final fun getQNorm ()Lsk/ainet/lang/nn/normalization/RMSNormalization;
+ public final fun getQkNorm ()Z
+ public final fun getQkNormEps ()D
+ public final fun getQkNormUnitOffset ()Z
+ public final fun getRope ()Lsk/ainet/lang/nn/transformer/RoPE;
+ public final fun getSlidingWindow ()Ljava/lang/Integer;
+ public final fun getVNormNoScale ()Z
+ public final fun setKvCache (Lsk/ainet/lang/nn/transformer/KVCache;)V
+ public final fun setRope (Lsk/ainet/lang/nn/transformer/RoPE;)V
+}
+
+public final class sk/ainet/lang/nn/transformer/MultiHeadAttentionDiag {
+ public static final field INSTANCE Lsk/ainet/lang/nn/transformer/MultiHeadAttentionDiag;
+ public final fun getShouldDumpThisCall ()Z
+ public final fun setShouldDumpThisCall (Z)V
+}
+
+public final class sk/ainet/lang/nn/transformer/MultiHeadAttentionKt {
+ public static final fun getMhaStatSink ()Lkotlin/jvm/functions/Function2;
+ public static final fun setMhaStatSink (Lkotlin/jvm/functions/Function2;)V
+}
+
+public final class sk/ainet/lang/nn/transformer/OwnerReadOnlyKVCache : sk/ainet/lang/nn/transformer/KVCache {
+ public fun <init> (Lsk/ainet/lang/nn/transformer/PositionalKVCache;Ljava/lang/String;)V
+ public synthetic fun <init> (Lsk/ainet/lang/nn/transformer/PositionalKVCache;Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public final fun getDelegate ()Lsk/ainet/lang/nn/transformer/PositionalKVCache;
+ public fun getPosition ()I
+ public fun reset ()V
+ public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair;
+}
+
+public final class sk/ainet/lang/nn/transformer/PaddedSharedPositionalKVCache : sk/ainet/lang/nn/transformer/KVCache {
+ public fun <init> (Lsk/ainet/lang/nn/transformer/PositionalKVCache;ILjava/lang/String;)V
+ public synthetic fun <init> (Lsk/ainet/lang/nn/transformer/PositionalKVCache;ILjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public final fun getDelegate ()Lsk/ainet/lang/nn/transformer/PositionalKVCache;
+ public final fun getLayerHeadDim ()I
+ public fun getPosition ()I
+ public fun reset ()V
+ public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair;
+}
+
+public final class sk/ainet/lang/nn/transformer/PositionalKVCache : sk/ainet/lang/nn/transformer/KVCache {
+ public fun <init> (IIILjava/lang/String;)V
+ public synthetic fun <init> (IIILjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public fun getPosition ()I
+ public fun reset ()V
+ public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair;
+}
+
+public final class sk/ainet/lang/nn/transformer/ResidualAdd : sk/ainet/lang/nn/Module {
+ public fun <init> ()V
+ public fun <init> (Ljava/lang/String;)V
+ public synthetic fun <init> (Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public fun getModules ()Ljava/util/List;
+ public fun getName ()Ljava/lang/String;
+ public final fun getSavedInput ()Lsk/ainet/lang/tensor/Tensor;
+ public final fun setSavedInput (Lsk/ainet/lang/tensor/Tensor;)V
+}
+
+public final class sk/ainet/lang/nn/transformer/RoPE : sk/ainet/lang/nn/Module {
+ public fun <init> (IIFLsk/ainet/lang/nn/transformer/RoPEMode;Lsk/ainet/lang/nn/transformer/RoPEScaling;FFLjava/lang/String;)V
+ public synthetic fun <init> (IIFLsk/ainet/lang/nn/transformer/RoPEMode;Lsk/ainet/lang/nn/transformer/RoPEScaling;FFLjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public final fun forward (Lsk/ainet/lang/tensor/Tensor;ILsk/ainet/context/ExecutionContext;)Lsk/ainet/lang/tensor/Tensor;
+ public final fun getHeadDim ()I
+ public final fun getMaxSeqLen ()I
+ public final fun getMode ()Lsk/ainet/lang/nn/transformer/RoPEMode;
+ public fun getModules ()Ljava/util/List;
+ public fun getName ()Ljava/lang/String;
+ public final fun getPartialRotaryFactor ()F
+ public final fun getRotaryDim ()I
+ public final fun getScaling ()Lsk/ainet/lang/nn/transformer/RoPEScaling;
+ public final fun getScalingFactor ()F
+}
+
+public final class sk/ainet/lang/nn/transformer/RoPEMode : java/lang/Enum {
+ public static final field INTERLEAVED Lsk/ainet/lang/nn/transformer/RoPEMode;
+ public static final field SPLIT_HALF Lsk/ainet/lang/nn/transformer/RoPEMode;
+ public static fun getEntries ()Lkotlin/enums/EnumEntries;
+ public static fun valueOf (Ljava/lang/String;)Lsk/ainet/lang/nn/transformer/RoPEMode;
+ public static fun values ()[Lsk/ainet/lang/nn/transformer/RoPEMode;
+}
+
+public final class sk/ainet/lang/nn/transformer/RoPEScaling : java/lang/Enum {
+ public static final field NONE Lsk/ainet/lang/nn/transformer/RoPEScaling;
+ public static final field PROPORTIONAL Lsk/ainet/lang/nn/transformer/RoPEScaling;
+ public static fun getEntries ()Lkotlin/enums/EnumEntries;
+ public static fun valueOf (Ljava/lang/String;)Lsk/ainet/lang/nn/transformer/RoPEScaling;
+ public static fun values ()[Lsk/ainet/lang/nn/transformer/RoPEScaling;
+}
+
+public final class sk/ainet/lang/nn/transformer/SharedKVCache : sk/ainet/lang/nn/transformer/KVCache {
+ public fun <init> (Lsk/ainet/lang/nn/transformer/KVCache;Ljava/lang/String;)V
+ public synthetic fun <init> (Lsk/ainet/lang/nn/transformer/KVCache;Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public final fun getDelegate ()Lsk/ainet/lang/nn/transformer/KVCache;
+ public fun getPosition ()I
+ public fun reset ()V
+ public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair;
+}
+
+public final class sk/ainet/lang/nn/transformer/SharedPositionalKVCache : sk/ainet/lang/nn/transformer/KVCache {
+ public fun <init> (Lsk/ainet/lang/nn/transformer/PositionalKVCache;Ljava/lang/String;)V
+ public synthetic fun <init> (Lsk/ainet/lang/nn/transformer/PositionalKVCache;Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public final fun getDelegate ()Lsk/ainet/lang/nn/transformer/PositionalKVCache;
+ public fun getPosition ()I
+ public fun reset ()V
+ public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair;
+}
+
+public final class sk/ainet/lang/nn/transformer/SlidingWindowKVCache : sk/ainet/lang/nn/transformer/KVCache {
+ public fun <init> (IIIILjava/lang/String;)V
+ public synthetic fun <init> (IIIILjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public fun getPosition ()I
+ public final fun getWindow ()I
+ public fun reset ()V
+ public fun update (Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/lang/tensor/Tensor;Lsk/ainet/context/ExecutionContext;)Lkotlin/Pair;
+}
+
+public final class sk/ainet/lang/nn/transformer/SwiGLUFFN : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters {
+ public fun <init> (IILjava/lang/String;)V
+ public synthetic fun <init> (IILjava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public final fun getDim ()I
+ public final fun getHiddenDim ()I
+ public fun getModules ()Ljava/util/List;
+ public fun getName ()Ljava/lang/String;
+ public fun getParams ()Ljava/util/List;
+}
+
+public final class sk/ainet/lang/nn/transformer/VoidDense : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters {
+ public fun <init> (Ljava/lang/String;IILkotlin/reflect/KClass;)V
+ public synthetic fun <init> (Ljava/lang/String;IILkotlin/reflect/KClass;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public final fun getInDim ()I
+ public fun getModules ()Ljava/util/List;
+ public fun getName ()Ljava/lang/String;
+ public final fun getOutDim ()I
+ public fun getParams ()Ljava/util/List;
+}
+
+public final class sk/ainet/lang/nn/transformer/XIELUActivation : sk/ainet/lang/nn/Module, sk/ainet/lang/nn/topology/ModuleParameters {
+ public fun <init> ()V
+ public fun <init> (Ljava/lang/String;)V
+ public synthetic fun <init> (Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
+ public fun getModules ()Ljava/util/List;
+ public fun getName ()Ljava/lang/String;
+ public fun getParams ()Ljava/util/List;
+}
+
diff --git a/transformer-core/build.gradle.kts b/transformer-core/build.gradle.kts
index 28eaa16..feff4d7 100644
--- a/transformer-core/build.gradle.kts
+++ b/transformer-core/build.gradle.kts
@@ -5,6 +5,9 @@ plugins {
     alias(libs.plugins.kotlinMultiplatform)
     alias(libs.plugins.androidMultiplatformLibrary)
     alias(libs.plugins.vanniktech.mavenPublish)
+    // Track the public API of the NN primitives here (they live in this module since the 0.31.1
+    // extraction). Before this, they were only listed in the stale llm-core.api re-export.
+ alias(libs.plugins.binary.compatibility.validator) } // Framework NN primitives (attention, KV-cache family, embedding, norms, RoPE, FFNs) extracted from