release: SKaiNET-transformers 0.32.0 #197
Merged
Ships the real-GGUF Llama eager path (packed NATIVE_OPTIMIZED) and unblocks StableHLO/IREE export for Llama-family models (traceable interleaved RoPE). Against engine 0.32.0.

- VERSION_NAME 0.31.1 -> 0.32.0; engine pin skainet 0.31.0 -> 0.32.0.
- CHANGELOG: [0.32.0] section (NATIVE_OPTIMIZED Llama / fused decode-attention / traceable RoPE / packed-embedding gather fix).
- README: Current release + "What's new in 0.32.0".
- antora samples: BOM coordinate 0.31.1 -> 0.32.0 (getting-started-java, llama3-tool-calling).

Features (since 0.31.1): ccbd87e NATIVE_OPTIMIZED Llama, 3791f88 fused decode-attention, 019b049 traceable interleaved RoPE.

Verified: transformer-core + llm-inference:llama compile against published engine 0.32.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
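For reviewers unfamiliar with the term, "interleaved" RoPE is commonly taken to mean that adjacent element pairs (x[2i], x[2i+1]) of a head vector are rotated together, as opposed to the half-split pairing (x[i], x[i+d/2]). The following is a minimal Python sketch of that standard formulation only. It is not SKaiNET's implementation and says nothing about how the traced version is built; the function name `rope_interleaved` and the default `base=10000.0` are assumptions made for illustration.

```python
import math

def rope_interleaved(x, pos, base=10000.0):
    """Apply interleaved rotary position embedding to one head vector.

    Hypothetical helper for illustration: rotates each adjacent pair
    (x[2i], x[2i+1]) by the angle pos * base**(-2i/d), where d = len(x).
    """
    d = len(x)
    if d % 2 != 0:
        raise ValueError("head dimension must be even")
    out = [0.0] * d
    for i in range(d // 2):
        theta = pos * base ** (-2.0 * i / d)  # per-pair rotation angle
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s       # standard 2-D rotation of the pair
        out[2 * i + 1] = a * s + b * c
    return out

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))
```

Because each pair undergoes a plain rotation, the dot product of a rotated query and a rotated key depends only on the difference of their positions, which is the property attention relies on.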
apiCheck was red: llm-core.api still listed the NN primitives that the 0.31.1 transformer-core extraction moved out, and transformer-core had no API validation at all, so those public types were tracked nowhere.

- Add binary-compatibility-validator to transformer-core; apiDump creates transformer-core/api/jvm/transformer-core.api (the 30 moved primitives: attention, KV-cache family, embedding, norms, RoPE, FFNs, linear projection).
- Regenerate llm-core.api (drop the moved primitives; they're now tracked in transformer-core, not lost).
- llama.api: + convertLlamaWeightsPacked (the 0.32.0 NATIVE_OPTIMIZED feature).

apiCheck now passes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PR 3 of 3 — IREE/TinyLlama fix-stack (merge in order)
Stacked on #196. Base will auto-retarget to develop once #196 merges. Merge this last.

Commits
- a91f57d chore(release): prepare 0.32.0 - bumps VERSION_NAME 0.31.1→0.32.0 (gradle.properties, libs.versions.toml), CHANGELOG, README, docs
- 068647f chore(release): sync API dumps for 0.32.0 + validate transformer-core

Notes
- develop advanced since this branch was cut (dependabot bumps + the kvcache fix #193, "fix(kvcache): trace-faithful PositionalKVCache.update (#763)"). Test-merge into current develop is clean: libs.versions.toml auto-merges, no conflicts.
- Run ./gradlew apiCheck after the rebase/retarget, in case develop's kvcache change shifted the public API surface vs. the dumps generated here.

Stack (merge order)
Merge with a merge-commit / rebase (do NOT squash).