docs(advance): add Add a New Speculative Decoding Method guide #4639
Closed
Oxygen56 wants to merge 1 commit into
Add a step-by-step documentation guide explaining how to add a new speculative decoding method to LMDeploy. Covers the SPEC_PROPOSERS registry system, BaseSpecProposer base class, the 3-tuple get_outputs contract, registration and import requirements, build_model override patterns, and a shipping checklist.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Collaborator
It covers the same part as in #4589.
Motivation
The PyTorch engine has a clean plug-in surface for speculative decoding (`BaseSpecProposer` + `SPEC_PROPOSERS` registry in `lmdeploy/pytorch/spec_decode/proposers/base.py`), and four shipped methods register against it: `eagle`, `eagle3`, `deepseek_mtp`, `qwen3_5_mtp`. The user-facing `docs/en/advance/spec_decoding.md` teaches usage of those four names but never explains how to add a fifth, so users have asked the question externally.

Closes (partially) the docs side of #1738 and #4530.
Modification
Adds `docs/en/advance/spec_decoding_new_method.md` and a toctree entry for it in `docs/en/index.rst`, right next to `spec_decoding.md`.

The page covers:

- The `method` string triad.
- The `build_specdecode_proposer` entry point and why `proposers/__init__.py` must import the new class.
- What `BaseSpecProposer` already provides, so contributors don't re-implement weight loading, draft forward, decoding-input update, or fallbacks.
- A `MyMethod(BaseSpecProposer)` skeleton with `@SPEC_PROPOSERS.register_module(name='my_method')`.
- The 3-tuple `get_outputs` contract (draft token ids, `model_metas`, `target_hidden_states`).
- `build_model` override patterns, illustrated with the two in-tree precedents (`Qwen3_5MTP` shares the target embeddings; `Eagle3` swaps embeddings conditionally and widens `get_target_hidden_size`).

No code changes. All snippets and references point to symbols that exist in `lmdeploy/pytorch/spec_decode/proposers/`.

BC-breaking
None: docs only.
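For reviewers who want a quick picture of the registration flow listed under Modification, here is a minimal, self-contained sketch. It does not import LMDeploy: `Registry`, the toy `BaseSpecProposer`, the `logits` argument, and `build_proposer` are hypothetical stand-ins, and the real signatures in `lmdeploy/pytorch/spec_decode/proposers/` may differ. It only illustrates why a class must be imported before its `method` name resolves, and the shape of the 3-tuple returned by `get_outputs`.

```python
# Toy stand-ins only: nothing here is the LMDeploy API. The names mirror the
# PR description, but every signature below is an assumption for illustration.
from typing import Any, Callable, Dict, List, Optional, Tuple


class Registry:
    """Minimal name -> class registry (hypothetical)."""

    def __init__(self) -> None:
        self._modules: Dict[str, type] = {}

    def register_module(self, name: str) -> Callable[[type], type]:
        def decorator(cls: type) -> type:
            if name in self._modules:
                raise KeyError(f'{name!r} is already registered')
            self._modules[name] = cls
            return cls
        return decorator

    def get(self, name: str) -> type:
        # Registration runs at import time, so a class whose module was never
        # imported is simply absent from the table.
        if name not in self._modules:
            raise KeyError(f'unknown method {name!r}; was its module imported?')
        return self._modules[name]


SPEC_PROPOSERS = Registry()


class BaseSpecProposer:
    """Toy base class; per the PR, the real one also covers weight loading,
    draft forward, decoding-input update, and fallbacks."""

    def get_outputs(
        self, logits: List[List[float]]
    ) -> Tuple[List[int], Optional[Any], Optional[Any]]:
        raise NotImplementedError


@SPEC_PROPOSERS.register_module(name='my_method')
class MyMethod(BaseSpecProposer):

    def get_outputs(self, logits):
        # Greedy pick: index of the largest value in each row.
        draft_token_ids = [
            max(range(len(row)), key=row.__getitem__) for row in logits
        ]
        model_metas = None           # placeholder in this sketch
        target_hidden_states = None  # placeholder in this sketch
        return draft_token_ids, model_metas, target_hidden_states


def build_proposer(method: str) -> BaseSpecProposer:
    """Hypothetical analogue of an entry point that looks up by name."""
    return SPEC_PROPOSERS.get(method)()
```

In this sketch, `build_proposer('my_method')` succeeds only because the decorated class was defined (that is, its module was executed) before the lookup, which is the role the import in `proposers/__init__.py` plays according to the description above.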