Fix dequant_mixed #4657

Open
irexyc wants to merge 2 commits into InternLM:main from irexyc:fix_dequant_mixed

irexyc (Collaborator) commented Jun 8, 2026

Motivation

Fix loading and inference for the AWQ model Qwen3.6-27B-AWQ (local path: /mnt/cfs/chenxin/Qwen3.6-27B-AWQ).

https://www.modelscope.cn/models/tclf90/Qwen3.6-27B-AWQ/file/view/master/config.json?status=1

Copilot AI review requested due to automatic review settings June 8, 2026 07:48

Copilot AI (Contributor) left a comment

Pull request overview

This PR fixes TurboMind model loading/inference for an AWQ checkpoint (Qwen3.6-27B-AWQ) by correcting dequantization behavior when weights/zeros are not in the expected packed int32 form, and by improving dtype auto-resolution for configs that wrap the language model under text_config.

Changes:

  • Update AWQFormat.dequant to dequantize both packed (int32) and already-unpacked tensor layouts.
  • When dtype="auto", resolve dtype against hf_model_cfg.text_config when present (common for VLM/aggregate configs).
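The second change can be pictured with a small sketch. This is not the code in converter.py: the helper name `resolve_auto_dtype`, the `torch_dtype` attribute lookup, and the `float16` fallback are assumptions made for illustration. Only `hf_model_cfg`, `text_config`, and `dtype="auto"` come from the PR summary.

```python
from types import SimpleNamespace
from typing import Any


def resolve_auto_dtype(hf_model_cfg: Any, dtype: str = "auto",
                       default: str = "float16") -> str:
    """Hypothetical helper: pick the dtype to use when dtype == "auto".

    Assumes the config objects expose a `torch_dtype` attribute and that
    `float16` is an acceptable fallback; both are illustrative choices.
    """
    if dtype != "auto":
        # An explicit user choice always wins.
        return dtype
    # Prefer the nested language-model config, then the top-level config.
    candidates = (getattr(hf_model_cfg, "text_config", None), hf_model_cfg)
    for cfg in candidates:
        value = getattr(cfg, "torch_dtype", None) if cfg is not None else None
        if value is not None:
            return str(value)
    return default


# An aggregate (VLM-style) config whose dtype lives under text_config.
wrapped = SimpleNamespace(torch_dtype=None,
                          text_config=SimpleNamespace(torch_dtype="bfloat16"))
# A plain language-model config with the dtype at the top level.
flat = SimpleNamespace(torch_dtype="float16")

print(resolve_auto_dtype(wrapped))
print(resolve_auto_dtype(flat))
```

Checking `text_config` first means a wrapper config that carries no dtype of its own, or a different one, does not override the language model's dtype.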

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| lmdeploy/turbomind/weight_format.py | Fixes AWQ dequantization to support non-int32 (already-unpacked) tensors, preventing incorrect calls into dequantize_gemm. |
| lmdeploy/turbomind/converter.py | Ensures dtype="auto" consults text_config for the language model dtype when the HF config is an aggregate wrapper. |
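As a rough illustration of the weight_format.py change, the sketch below dequantizes one row of 4-bit weights whether the inputs arrive packed into 32-bit words or already unpacked. It is a pure-Python stand-in, not the PR's implementation. The function names, the explicit `weight_packed`/`zeros_packed` flags (standing in for a check on the tensor dtype), and the lowest-nibble-first order are all assumptions, and real AWQ checkpoints may use a different nibble order.

```python
from typing import List, Sequence

BITS = 4               # assumed 4-bit quantization
PER_WORD = 32 // BITS  # eight 4-bit values fit in one 32-bit word


def unpack_int32(words: Sequence[int]) -> List[int]:
    """Split each 32-bit word into eight 4-bit values, lowest nibble first.

    The nibble order is an illustrative assumption.
    """
    out: List[int] = []
    for word in words:
        word &= 0xFFFFFFFF  # read a negative int32 as its unsigned bit pattern
        for i in range(PER_WORD):
            out.append((word >> (BITS * i)) & 0xF)
    return out


def dequant_row(qweight: Sequence[int], qzeros: Sequence[int],
                scales: Sequence[float],
                weight_packed: bool, zeros_packed: bool) -> List[float]:
    """Hypothetical helper: w = (q - zero) * scale for one row of columns.

    Each input is unpacked only if it is flagged as packed, so mixed
    layouts (packed weights with unpacked zeros, or the reverse) work.
    """
    q = unpack_int32(qweight) if weight_packed else list(qweight)
    z = unpack_int32(qzeros) if zeros_packed else list(qzeros)
    if not (len(q) == len(z) == len(scales)):
        raise ValueError("weights, zeros and scales must cover the same columns")
    return [(qi - zi) * si for qi, zi, si in zip(q, z, scales)]


scales = [0.5] * 8
# Packed weights combined with already-unpacked zeros: a mixed layout.
print(dequant_row([0x87654321], [1] * 8, scales,
                  weight_packed=True, zeros_packed=False))
```

Deciding per tensor whether to unpack is the point of the sketch: a path that always assumes packed int32 input would misread values that are already unpacked.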

Comment thread lmdeploy/turbomind/converter.py

3 participants