examples/qwen3: replace ark.rope with pure-torch RoPE fallback #266
Open
chhwang wants to merge 10 commits into
added 2 commits
June 12, 2026 09:39
…ence-test helper, and microbench helper under examples/qwen3/
… with equivalence test and microbench against torch reference
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff
            main     #266     +/-
Coverage    85.70%   85.70%
Files       129      129
Lines       6457     6457
Hits        5534     5534
Misses      923      923
added 8 commits
June 12, 2026 11:28
… with equivalence test and microbench against torch reference
… with equivalence test and microbench against torch reference
… with equivalence test and microbench against torch reference
…ttn, base main): diagnose the failing test(s) from the CI log and fix the root cause in the attention component or test code
…en ark_rmsnorm input to 2D before the composed graph to avoid the 4D shape-dependent cudaErrorMisalignedAddress crash at (1,4,128,32); fall back to torch for QK-norm or RoPE if 2D also fails
…en ark_rmsnorm input to 2D before the composed graph to avoid the 4D shape-dependent cudaErrorMisalignedAddress crash at (1,4,128,32); fall back to torch for QK-norm or RoPE if 2D also fails
…main): replace composed ARK RMSNorm graph with torch-based RMSNorm for QK-norm to avoid the upstream executor crash; keep ARK only for ark.rope; run black to fix formatting
…ope produces wrong output at 4D shape (1,4,128,32) in CI — flatten x and freqs to 2D before calling ark.rope then reshape back, or replace with a pure-torch RoPE fallback; update test_rope and test_attention_prefill accordingly
examples/qwen3: replace ark.rope with pure-torch RoPE fallback
`ark.rope` produces wrong output at 4-D shape (1,4,128,32), with max diff 6.9 vs. tolerance 0.005,
due to the same upstream composed-graph planner bug that affected rmsnorm.

Replace it with `torch_rope`: complex-view multiply in fp32, matching the reference
`apply_rope` exactly. Add `precompute_torch_rope_freqs` (complex64 format). Update
`test_rope` and `test_attention_prefill` to exercise the torch code path. Retain
`ark_rope` and `ark_rmsnorm` dormant for re-enablement in Q6 after the upstream fix lands.
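
For reviewers unfamiliar with the complex-view formulation, here is a minimal sketch of the technique, not the code in this PR. The names `sketch_precompute_freqs` and `sketch_rope`, their signatures, the (batch, heads, seq, head_dim) layout, the adjacent-pair channel convention, and the `theta` default are all assumptions made for illustration. The actual `torch_rope` and `precompute_torch_rope_freqs` may differ.

```python
import torch


def sketch_precompute_freqs(seq_len: int, head_dim: int, theta: float = 10000.0) -> torch.Tensor:
    """Hypothetical helper: complex64 rotations of shape (seq_len, head_dim // 2)."""
    exponents = torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim
    inv_freq = 1.0 / (theta ** exponents)
    angles = torch.outer(torch.arange(seq_len, dtype=torch.float32), inv_freq)
    # polar(1, angle) = cos(angle) + i*sin(angle); float32 inputs yield complex64.
    return torch.polar(torch.ones_like(angles), angles)


def sketch_rope(x: torch.Tensor, freqs: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: rotate adjacent channel pairs of x in fp32."""
    # Assumed layout: x is (batch, heads, seq, head_dim) with an even head_dim.
    pairs = x.float().contiguous().reshape(*x.shape[:-1], -1, 2)
    xc = torch.view_as_complex(pairs)  # (batch, heads, seq, head_dim // 2)
    # freqs is (seq, head_dim // 2) and broadcasts over batch and heads.
    out = torch.view_as_real(xc * freqs).flatten(-2)
    return out.type_as(x)


torch.manual_seed(0)
x = torch.randn(1, 4, 128, 32)  # the 4-D shape named in the description
freqs = sketch_precompute_freqs(seq_len=128, head_dim=32)
y = sketch_rope(x, freqs)

# Cross-check against an explicit cos/sin rotation of each (even, odd) pair.
p = x.reshape(1, 4, 128, 16, 2)
cos, sin = freqs.real, freqs.imag
ref = torch.stack(
    (p[..., 0] * cos - p[..., 1] * sin, p[..., 0] * sin + p[..., 1] * cos),
    dim=-1,
).flatten(-2)
max_diff = (y - ref).abs().max().item()
```

Under these assumptions, `max_diff` can be compared against the 0.005 tolerance quoted above, which is the same style of check an equivalence test would apply to a candidate kernel.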
Q4 now ships a verified attention pipeline (QK-norm + RoPE + GQA attention)
with torch-only ops. ARK-native kernels deferred to Q6.