
Update new FTS tokenizer language support #272

Merged: prrao87 merged 1 commit into main from codex/fts-tokenizer-docs on Jun 12, 2026

@prrao87 prrao87 commented Jun 12, 2026

Contributor

Summary

  • document ICU as the bundled mixed-language FTS tokenizer and link to the canonical ICU reference
  • clarify Jieba, Lindera, Korean ko-dic, and model-home expectations
  • distinguish base tokenization from stemming and stop-word language filters
  • remove wide mode from the FTS index page
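
To illustrate the third bullet, here is a minimal sketch of how base tokenization differs from the stemming and stop-word filters layered on top of it. This is a toy in plain Python, not the library's implementation: the function names, the stop-word list, and the suffix-stripping rule are all hypothetical and only show that the stages are separate.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "are"}


def base_tokenize(text):
    """Base tokenization: split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def remove_stop_words(tokens):
    """Language filter: drop tokens found in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Toy suffix stripper; NOT a real stemming algorithm."""
    for suffix in ("ing", "es", "s"):
        # Keep at least three characters so short words stay intact.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text, stem=False, drop_stop_words=False):
    """Run base tokenization, then apply the optional language filters."""
    tokens = base_tokenize(text)
    if drop_stop_words:
        tokens = remove_stop_words(tokens)
    if stem:
        tokens = [naive_stem(t) for t in tokens]
    return tokens
```

Calling `analyze("The indexes are searching documents")` returns all five lowercase tokens, while passing `stem=True, drop_stop_words=True` reduces them to `["index", "search", "document"]`. The base tokenizer only decides where the token boundaries are; the filters decide which tokens are kept and in what form.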

Validation

  • Ran mint dev locally to verify the changes

@prrao87 changed the title from "Document FTS tokenizer language support" to "Update new FTS tokenizer language support" on Jun 12, 2026
@prrao87 merged commit 600a380 into main on Jun 12, 2026
1 check passed
@prrao87 deleted the codex/fts-tokenizer-docs branch on June 12, 2026, 20:16

mintlify Bot commented Jun 12, 2026

Contributor

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| lancedb-bcbb4faf | 🔴 Failed | | Jun 12, 2026, 8:32 PM |
