examples/qwen3: replace ark.rope with pure-torch RoPE fallback#266

Open: chhwang wants to merge 10 commits into main from qwen3-q4-attn

chhwang commented Jun 12, 2026

Contributor

examples/qwen3: replace ark.rope with pure-torch RoPE fallback

ark.rope produces wrong output for the 4-D shape (1,4,128,32): the max diff
is 6.9 against a tolerance of 0.005. The cause is the same upstream
composed-graph planner bug that affected rmsnorm.
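
For readers unfamiliar with this kind of check, here is a minimal sketch of a max-diff comparison against a tolerance. The helper names `max_diff` and `check_equivalence` are hypothetical and are not taken from the PR's equivalence-test helper; only the 0.005 tolerance comes from the description above.

```python
import torch


def max_diff(actual: torch.Tensor, expected: torch.Tensor) -> float:
    """Largest elementwise absolute difference, computed in fp32."""
    return (actual.float() - expected.float()).abs().max().item()


def check_equivalence(actual: torch.Tensor, expected: torch.Tensor, tol: float = 0.005):
    """Return (passed, diff); passed is True when diff is within tol."""
    diff = max_diff(actual, expected)
    return diff <= tol, diff
```

Under this sketch, a candidate kernel whose output deviates by 6.9 anywhere in the tensor would fail the check, whatever the remaining elements look like.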

Replace with torch_rope: complex-view multiply in fp32, matching the
reference apply_rope exactly. Add precompute_torch_rope_freqs (complex64
format). Update test_rope and test_attention_prefill to exercise the
torch code path. Retain ark_rope and ark_rmsnorm dormant for
re-enablement in Q6 after the upstream fix lands.
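
As an illustration of the complex-view approach, the sketch below shows one way such a fallback could look. It is not the code from this PR: the signatures, the `theta` default of 10000, and the (batch, n_heads, seq_len, head_dim) layout are all assumptions.

```python
import torch


def precompute_torch_rope_freqs(head_dim: int, seq_len: int, theta: float = 10000.0) -> torch.Tensor:
    """Return complex64 rotation factors of shape (seq_len, head_dim // 2)."""
    # One inverse frequency per adjacent channel pair (assumed theta).
    exponents = torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim
    inv_freq = 1.0 / (theta ** exponents)
    positions = torch.arange(seq_len, dtype=torch.float32)
    angles = torch.outer(positions, inv_freq)  # (seq_len, head_dim // 2)
    # polar(1, angle) = cos(angle) + i*sin(angle); fp32 inputs give complex64.
    return torch.polar(torch.ones_like(angles), angles)


def torch_rope(x: torch.Tensor, freqs_cis: torch.Tensor) -> torch.Tensor:
    """Rotate x, assumed to be laid out as (batch, n_heads, seq_len, head_dim)."""
    # Upcast to fp32 and view adjacent channel pairs as complex numbers.
    x_c = torch.view_as_complex(x.float().reshape(*x.shape[:-1], -1, 2))
    # freqs_cis (seq_len, head_dim // 2) broadcasts over batch and heads.
    rotated = x_c * freqs_cis
    # Back to real pairs, flatten to head_dim, and restore the input dtype.
    return torch.view_as_real(rotated).flatten(-2).type_as(x)
```

Because each channel pair is multiplied by a unit-magnitude complex number, the rotation preserves the per-head vector norm and leaves position 0 unchanged. Both properties make convenient sanity checks.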

Q4 now ships a verified attention pipeline (QK-norm + RoPE + GQA attention)
with torch-only ops. ARK-native kernels deferred to Q6.
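
To make the pipeline concrete, here is a rough torch-only sketch of a prefill path with QK-norm, RoPE, and causal GQA attention. The function names, the (batch, heads, seq_len, head_dim) layout, the `eps` value, and applying the norm before RoPE are all assumptions; the PR's actual implementation may differ.

```python
import math

import torch


def rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Normalize over head_dim in fp32, then apply the learned scale.
    x32 = x.float()
    normed = x32 * torch.rsqrt(x32.pow(2).mean(dim=-1, keepdim=True) + eps)
    return (normed * weight.float()).type_as(x)


def rope(x: torch.Tensor, freqs_cis: torch.Tensor) -> torch.Tensor:
    # Complex-view rotation; freqs_cis has shape (seq_len, head_dim // 2).
    x_c = torch.view_as_complex(x.float().reshape(*x.shape[:-1], -1, 2))
    return torch.view_as_real(x_c * freqs_cis).flatten(-2).type_as(x)


def gqa_attention_prefill(q, k, v, q_weight, k_weight, freqs_cis):
    """q: (B, n_q_heads, S, D); k and v: (B, n_kv_heads, S, D)."""
    q = rope(rms_norm(q, q_weight), freqs_cis)
    k = rope(rms_norm(k, k_weight), freqs_cis)
    # GQA: each KV head is shared by n_q_heads // n_kv_heads query heads.
    group = q.shape[1] // k.shape[1]
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    seq_len, head_dim = q.shape[-2], q.shape[-1]
    scores = (q @ k.transpose(-2, -1)) / math.sqrt(head_dim)
    # Causal mask: position i may attend only to positions <= i.
    causal = torch.tril(torch.ones(seq_len, seq_len, device=q.device)).bool()
    scores = scores.masked_fill(~causal, float("-inf"))
    return torch.softmax(scores.float(), dim=-1).type_as(q) @ v
```

One property that follows from the causal mask is that the first position attends only to itself, so its output equals the corresponding value vector. This gives a cheap correctness probe alongside a full comparison with a reference.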

ark-dev added 2 commits June 12, 2026 09:39
…ence-test helper, and microbench helper under examples/qwen3/
… with equivalence test and microbench against torch reference
codecov Bot commented Jun 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.70%. Comparing base (c257202) to head (1653cee).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #266   +/-   ##
=======================================
  Coverage   85.70%   85.70%           
=======================================
  Files         129      129           
  Lines        6457     6457           
=======================================
  Hits         5534     5534           
  Misses        923      923           

ark-dev added 8 commits June 12, 2026 11:28
… with equivalence test and microbench against torch reference
… with equivalence test and microbench against torch reference
… with equivalence test and microbench against torch reference
…ttn, base main): diagnose the failing test(s) from the CI log and fix the root cause in the attention component or test code
…en ark_rmsnorm input to 2D before the composed graph to avoid the 4D shape-dependent cudaErrorMisalignedAddress crash at (1,4,128,32); fall back to torch for QK-norm or RoPE if 2D also fails
…en ark_rmsnorm input to 2D before the composed graph to avoid the 4D shape-dependent cudaErrorMisalignedAddress crash at (1,4,128,32); fall back to torch for QK-norm or RoPE if 2D also fails
…main): replace composed ARK RMSNorm graph with torch-based RMSNorm for QK-norm to avoid the upstream executor crash; keep ARK only for ark.rope; run black to fix formatting
…ope produces wrong output at 4D shape (1,4,128,32) in CI — flatten x and freqs to 2D before calling ark.rope then reshape back, or replace with a pure-torch RoPE fallback; update test_rope and test_attention_prefill accordingly
chhwang changed the title from "ark-dev: Add ARK GQA attention component (QK-norm, RoPE, causal mask) with equivalence test and microbench against torch reference" to "examples/qwen3: replace ark.rope with pure-torch RoPE fallback" on Jun 13, 2026