release: 0.32.4 — streaming detokenization preserves word spaces#767
Merged
Tokenizer.decodeToken(id): per-token streaming decode that keeps each
SentencePiece piece's leading word-boundary space (llama.cpp token_to_piece
semantics), so a generation loop decoding one token at a time no longer runs
words together ("the process" -> "theprocess"). SentencePieceTokenizer
overrides it to skip the sequence-level addSpacePrefix strip; adds
decode(ids, stripLeadingSpace). Backward-compatible (decode(IntArray) unchanged).
Version bump + CHANGELOG/README/docs version snippets -> 0.32.4. antora.yml is
version: ~ (branch-tracked). skainet-io-core is not API-tracked, no dump change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
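
The change described in this commit message can be illustrated with a minimal, self-contained sketch. Everything here is a simplified stand-in, not the actual skainet source: `ToySentencePieceTokenizer`, its `pieces` list, and the `▁` (U+2581) handling are assumptions made for illustration. Only the names `Tokenizer`, `decodeToken`, `decode(ids, stripLeadingSpace)` and `addSpacePrefix` come from the text above.

```kotlin
// Simplified stand-in for the interface named in the commit message.
interface Tokenizer {
    fun decode(ids: IntArray): String

    // Default delegates to the sequence-level decode, so existing
    // implementations keep compiling and behaving as before.
    fun decodeToken(id: Int): String = decode(intArrayOf(id))
}

// Hypothetical toy tokenizer: pieces use U+2581 as the word-boundary marker.
class ToySentencePieceTokenizer(
    private val pieces: List<String>,
    private val addSpacePrefix: Boolean = true,
) : Tokenizer {

    // Sequence-level decode: strips the single leading space that the
    // space prefix introduces at the start of a sequence.
    override fun decode(ids: IntArray): String =
        decode(ids, stripLeadingSpace = addSpacePrefix)

    fun decode(ids: IntArray, stripLeadingSpace: Boolean): String {
        val text = ids.joinToString("") { pieces[it] }.replace('\u2581', ' ')
        return if (stripLeadingSpace) text.removePrefix(" ") else text
    }

    // Streaming decode: keeps the piece's leading word-boundary space.
    override fun decodeToken(id: Int): String =
        decode(intArrayOf(id), stripLeadingSpace = false)
}

fun main() {
    val tok = ToySentencePieceTokenizer(listOf("\u2581the", "\u2581process"))
    val ids = intArrayOf(0, 1)

    // Per-token decoding through the sequence-level path strips every piece.
    println(ids.joinToString("") { tok.decode(intArrayOf(it)) }) // theprocess

    // Per-token decoding through decodeToken keeps the word boundaries.
    println(ids.joinToString("") { tok.decodeToken(it) }) // " the process" (leading space kept)

    // Whole-sequence decoding is unchanged.
    println(tok.decode(ids)) // the process
}
```

In this sketch the default method keeps other `Tokenizer` implementations unchanged, while the override is the only place where the strip is skipped.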
Merges the 0.32.4 release back into develop. The tag 0.32.4 is published to Maven Central (deployment succeeded); this brings the release commit onto develop so the version (0.32.4) and CHANGELOG/docs reflect the release.

What 0.32.4 ships

Fix: streaming detokenization preserves word-boundary spaces. A generation loop that decodes one token at a time (decode(tokenId)) ran words together ("the process" → "theprocess") because the single-token path delegated to the sequence-level SentencePieceTokenizer.decode(IntArray), whose addSpacePrefix leading-space strip is only correct once per sequence.

- Tokenizer.decodeToken(id): new interface method, default = decode(intArrayOf(id)) (backward-compatible); gives the upstream tokenizer the streaming single-token decode it lacked.
- SentencePieceTokenizer overrides decodeToken to decode without the leading strip (llama.cpp token_to_piece semantics); adds decode(ids, stripLeadingSpace). decode(IntArray) behaviour unchanged.
- "Helloworld" behaviour.

Fixes correct-but-spaceless output in every streaming consumer (kllama, agent loops, any decode(Int) caller).

Notes

- skainet-io-core is not API-tracked, so no .api dump change is needed.
- antora.yml is version: ~ (branch-tracked).
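
Because the leading-space strip is only correct once per sequence, a streaming consumer that wants output identical to the whole-sequence decode could apply it to the first piece only. The sketch below shows one way to do that. `streamText` and the `pieces` map are hypothetical and not part of skainet; the `decodeToken` parameter stands in for `Tokenizer.decodeToken(id)`.

```kotlin
// Hypothetical helper: streams decoded text for a sequence of token ids,
// stripping the sequence-level leading space exactly once.
fun streamText(
    ids: Sequence<Int>,
    decodeToken: (Int) -> String,
    emit: (String) -> Unit,
) {
    var first = true
    for (id in ids) {
        var piece = decodeToken(id)
        if (first) {
            // Only the first piece of a sequence loses its leading space.
            piece = piece.removePrefix(" ")
            first = false
        }
        emit(piece)
    }
}

fun main() {
    // Assumed per-token pieces, as a space-preserving decodeToken would return them.
    val pieces = mapOf(0 to " the", 1 to " process", 2 to "es")
    val out = StringBuilder()
    streamText(sequenceOf(0, 1, 2), { pieces.getValue(it) }) { out.append(it) }
    println(out) // the processes
}
```

Keeping the strip in the consumer rather than in the per-token decode means continuation pieces such as "es" are appended unchanged, while word-initial pieces keep their boundary space.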