refactor: simplify multimodal preprocessing expansion #4663

Open
CUHKSZzxy wants to merge 3 commits into InternLM:main from CUHKSZzxy:refactor/preprocess-utils

@CUHKSZzxy

Collaborator

Summary

  • Split bundled multimodal expansion into image, video, and audio helpers.
  • Make bundled video expansion explicit for per-video and per-frame offset layouts.
  • Keep Qwen3 Omni whole-video items and Qwen3VL/Qwen3.5 per-frame items aligned with the one-span MultiModalData contract.
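
To make the two video offset layouts concrete, here is a minimal sketch. It is not the code in this PR: `VideoItem`, `expand_bundled_video`, and the `frames_per_video` argument are hypothetical names, and the real helpers may use different signatures. It only illustrates how one offset per video and one offset per frame can both end up as items with exactly one span each.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VideoItem:
    """Hypothetical stand-in for an expanded item (not lmdeploy's MultiModalData)."""
    span: Tuple[int, int]  # exactly one (start, end) token span per item
    frames: List[int]      # global indices of the frames carried by this item


def expand_bundled_video(offsets: List[Tuple[int, int]],
                         frames_per_video: List[int]) -> List[VideoItem]:
    """Expand bundled video output into one-span items (illustrative only)."""
    num_videos = len(frames_per_video)
    num_frames = sum(frames_per_video)
    items: List[VideoItem] = []

    if len(offsets) == num_videos:
        # Per-video layout: each span covers a whole video and all of its frames.
        first = 0
        for span, count in zip(offsets, frames_per_video):
            items.append(VideoItem(span, list(range(first, first + count))))
            first += count
    elif len(offsets) == num_frames:
        # Per-frame layout: each span covers a single frame.
        for index, span in enumerate(offsets):
            items.append(VideoItem(span, [index]))
    else:
        raise ValueError(
            f"got {len(offsets)} offsets for {num_videos} videos "
            f"and {num_frames} frames")
    return items
```

When every video has exactly one frame the two layouts coincide, and both branches would produce the same items, so the order of the checks does not matter in this sketch.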

Validation

  • Focused VLM preprocessing tests passed.
  • Qwen3 Omni processor tests passed.
  • Qwen3.5 single-video pipeline smoke test passed with a locally cached model and media setup.
  • Pre-commit hooks passed for the touched file.

Assistance

Assisted with Codex + GPT-5.5 xHigh Fast

Copilot AI review requested due to automatic review settings June 9, 2026 10:48

Copilot AI left a comment

Contributor

Pull request overview

Refactors the get_expanded_mm_items multimodal preprocessing path so that bundled HF-processor outputs are expanded by dedicated image, video, and audio helpers. It also clarifies the intended offset layouts for whole-video and per-frame video items, so that both keep MultiModalData’s one-span-per-item contract.

Changes:

  • Added _expand_bundled_image_items, _expand_bundled_video_items, and _expand_bundled_audio_items helpers and delegated bundled expansion to them.
  • Made video expansion distinguish “one offset per video” vs “one offset per frame” paths more explicitly.
  • Simplified non-bundled video_second_per_grid scalar extraction.

Comment thread lmdeploy/vl/model/preprocess_utils.py
Comment thread lmdeploy/vl/model/preprocess_utils.py
2 participants