[AMD] perf: enable FlyDSL w4a16 MoE for Kimi INT4 by amd-asalykov · Pull Request #1777 · SemiAnalysisAI/InferenceX

amd-asalykov · 2026-06-15T10:20:06Z

Replace the default Triton w4a16 MoE kernel with the more performant FlyDSL implementation for Kimi INT4 on MI355X.

Note

Low Risk
Benchmark and serving-flag changes only; no application auth or data paths. Risk is limited to reproducibility and CI cost from expanded sweeps and a nightly container pin.

Overview
Updates the Kimi K2.5 INT4 vLLM MI355X benchmark to use FlyDSL for w4a16 MoE instead of the default Triton path, and pins a digest-suffixed ROCm nightly image (vllm-openai-rocm:nightly-b8336c3…).

The runner script kimik2.5_int4_mi355x.sh adds --moe-backend flydsl and a compilation pass that sets fuse_allreduce_rms to false. CI config expands the fixed-seq-len sweep: concurrency up to 128 (from 64) and an additional TP=4 row for both 1k/1k and 8k/1k scenarios.
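As a rough illustration of the serving-flag change described above, the sketch below assembles a vLLM launch command in Python. Only `--moe-backend flydsl` and the `fuse_allreduce_rms` setting come from this PR. The helper name `build_serve_args`, the `MODEL_ID` placeholder, and the assumption that the setting is passed as `--compilation-config` JSON nested under `pass_config` are illustrative and may not match the actual runner script.

```python
import json
import shlex


def build_serve_args(model: str, tp: int) -> list[str]:
    """Assemble a hypothetical `vllm serve` command line for the benchmark."""
    # Assumption: the "compilation pass" setting is passed as JSON and
    # nested under "pass_config"; the PR text only names fuse_allreduce_rms.
    compilation_config = {"pass_config": {"fuse_allreduce_rms": False}}
    return [
        "vllm", "serve", model,
        "--tensor-parallel-size", str(tp),
        # From the PR: select the FlyDSL w4a16 MoE path instead of Triton.
        "--moe-backend", "flydsl",
        "--compilation-config", json.dumps(compilation_config),
    ]


# MODEL_ID is a placeholder; the PR page does not state the model path.
args = build_serve_args("MODEL_ID", tp=4)
print(shlex.join(args))
```

Building the argument list in one place keeps the backend choice and the fusion toggle together, so reverting to the Triton path for an A/B comparison would mean changing a single entry.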

perf-changelog.yaml records the config-key change for PR #1777.

Reviewed by Cursor Bugbot for commit be23347.

enable FlyDSL MoE for Kimi int4

9f2165e

amd-asalykov requested a review from a team June 15, 2026 10:20

github-project-automation Bot added this to InferenceMAX Board Jun 15, 2026

amd-asalykov requested review from 1am9trash, billishyahao, chunfangamd, seungrokj and yctseng0211 as code owners June 15, 2026 10:20

add PR link

907dee9

seungrokj reviewed Jun 15, 2026

Comment thread .github/configs/amd-master.yaml Outdated

seungrokj reviewed Jun 15, 2026

Comment thread perf-changelog.yaml

update

be23347

chunfangamd added the full-sweep-enabled label Jun 15, 2026

amd-asalykov wants to merge 3 commits into SemiAnalysisAI:main from amd-asalykov:flydsl-moe

3 participants
