fix(ai-cache): preserve non-text content in exact key and L2 bypass by shreemaan-abhishek · Pull Request #13654 · apache/apisix

shreemaan-abhishek · 2026-07-03T07:11:02Z

Description

ai-protocols' get_messages() flattens structured message content (images, tool calls) down to plain text. That silently broke ai-cache fidelity at both cache layers:

L1 exact key — key.fingerprint hashes the get_messages output. After flattening, a [{text}, {image_url}] prompt and a plain-text prompt carrying the same text collapse to the identical string, so they collide on one L1 key and the multimodal request is wrongly served the text-only response.
L2 semantic bypass — window_has_nontext only tripped on type(content) == "table", which no longer exists after flattening, so multimodal prompts stopped bypassing the semantic layer and could hit a same-text vector.
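
The collision described above can be reproduced in a few lines. This is a minimal sketch, not the ai-protocols code: `flatten_content` is a hypothetical stand-in for what get_messages() does to one message's content, and the example URL is made up.

```lua
-- Hypothetical stand-in for the flattening step: keep the text blocks,
-- silently drop every other block type (images, audio, ...).
local function flatten_content(content)
    if type(content) == "string" then
        return content
    end
    local parts = {}
    for _, block in ipairs(content) do
        if type(block) == "table" and block.type == "text" then
            parts[#parts + 1] = block.text
        end
    end
    return table.concat(parts, " ")
end

local text_only = "describe this"
local multimodal = {
    { type = "text", text = "describe this" },
    { type = "image_url", image_url = { url = "https://example.com/cat.png" } },
}

-- Both prompts flatten to the same string, so a hash of the flattened form
-- cannot tell them apart (the L1 collision).
print(flatten_content(text_only) == flatten_content(multimodal)) -- true

-- The flattened content is a string, so a type(content) == "table" check
-- never fires on it (the lost L2 bypass).
print(type(flatten_content(multimodal))) -- string
```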

Fix

key.lua — fold the raw body.messages into the L1 fingerprint so a text+image prompt stays distinct from a text-only one. (params already preserves every other body field verbatim, so input-based protocols were never affected; only messages lost fidelity.)
semantic.lua — replace window_has_nontext with body_has_nontext, which scans the raw body and bypasses L2 whenever any block is non-text. It also guards non-table message items and content shaped as a single block object.
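
As an illustration of the key.lua change, the sketch below builds the fingerprint input with and without the raw messages. It rests on assumptions: `canonical` is a hypothetical pure-Lua stand-in for a canonical JSON encoder (sorted keys, deterministic output), and `fingerprint_input` is an illustrative helper, not a function from the patch. The real code hashes the encoded string afterwards; the sketch compares the strings directly.

```lua
-- Hypothetical deterministic serializer: keys are sorted so that equal
-- tables always produce equal strings, whatever order pairs() visits them in.
local function canonical(value)
    if type(value) ~= "table" then
        return type(value) .. ":" .. string.format("%q", tostring(value))
    end
    local keys = {}
    for k in pairs(value) do
        keys[#keys + 1] = k
    end
    table.sort(keys, function(x, y) return tostring(x) < tostring(y) end)
    local out = {}
    for _, k in ipairs(keys) do
        out[#out + 1] = tostring(k) .. "=" .. canonical(value[k])
    end
    return "{" .. table.concat(out, ",") .. "}"
end

-- Illustrative helper: the string that would be fed to the digest.
local function fingerprint_input(flattened, raw_messages, fold_raw)
    local repr = { messages = flattened }
    if fold_raw then
        repr.raw_messages = raw_messages
    end
    return canonical(repr)
end

-- Both requests flatten to the same messages...
local flattened = { { role = "user", content = "describe this" } }
-- ...but their raw bodies differ.
local text_raw = { { role = "user", content = "describe this" } }
local image_raw = { { role = "user", content = {
    { type = "text", text = "describe this" },
    { type = "image_url", image_url = { url = "https://example.com/cat.png" } },
} } }

-- Without the raw messages the two inputs are identical (collision).
print(fingerprint_input(flattened, text_raw, false)
      == fingerprint_input(flattened, image_raw, false)) -- true
-- With the raw messages folded in they differ, so the digests differ too.
print(fingerprint_input(flattened, text_raw, true)
      == fingerprint_input(flattened, image_raw, true)) -- false
```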

Why it looked flaky

t/plugin/ai-cache-semantic.t TEST 64 asserts a multimodal prompt is a MISS. Whether the collision surfaces depends on TEST 63's async log-phase L2 write-back racing TEST 64's read (--- wait: 0.5), so it reproduces intermittently. It reproduces deterministically against a local RediSearch.

Tests

t/plugin/ai-cache-semantic.t — TEST 64 (end-to-end MISS) now passes; added TEST 66 unit-testing the malformed-input handling. Full file green (215 tests), luacheck clean.

Checklist:

I have explained the need for this PR and the problem it solves
I have explained the changes or the new features added to this PR
I have added tests corresponding to this change
I have updated the documentation to reflect this change (N/A: internal bugfix, no behavior/schema change)
I have verified that this change is backward compatible

get_messages() flattens structured message content (images, tool calls) to plain text, which silently broke ai-cache fidelity at both layers:

- L1 exact key: key.fingerprint hashes the flattened messages, so a text+image prompt collides with a text-only one carrying the same text and is wrongly served the text-only response.
- L2 semantic bypass: window_has_nontext only tripped on table content, which no longer exists after flattening, so multimodal prompts stopped bypassing the semantic layer and could hit a same-text vector.

Fold the raw messages into the L1 fingerprint, and scan the raw body for non-text blocks to drive the L2 bypass. body_has_nontext also guards non-table message items and content shaped as a single block object.

t/plugin/ai-cache-semantic.t TEST 64 covers the end-to-end MISS; TEST 66 unit-tests the malformed-input handling.

membphis

Reviewed the patch and did not find merge-blocking issues.

get_messages() also drops assistant tool_calls (and legacy function_call), so two prompts with identical text but different tool calls could collide on one L2 cell and return a wrong HIT. Treat non-empty tool_calls / function_call as non-text in body_has_nontext, same as media blocks. Extend TEST 66 to cover them.
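
A rough sketch of what such a check could look like follows. It is not the code from semantic.lua: the function body, the helper `block_is_nontext`, and the handling of edge cases (for example, ignoring scalar items inside a content array) are assumptions made for illustration.

```lua
-- Assumed rule: a block counts as non-text when it is a table whose
-- type is anything other than "text".
local function block_is_nontext(block)
    return type(block) == "table" and block.type ~= "text"
end

-- Returns true when the raw request body carries anything the flattened
-- text would lose: media blocks, tool_calls, or a legacy function_call.
local function body_has_nontext(body)
    if type(body) ~= "table" or type(body.messages) ~= "table" then
        return false
    end
    for _, msg in ipairs(body.messages) do
        if type(msg) == "table" then -- skip malformed, non-table items
            if type(msg.tool_calls) == "table" and next(msg.tool_calls) ~= nil then
                return true
            end
            if type(msg.function_call) == "table" and next(msg.function_call) ~= nil then
                return true
            end
            local content = msg.content
            if type(content) == "table" then
                if content.type ~= nil then
                    -- content shaped as a single block object
                    if block_is_nontext(content) then
                        return true
                    end
                else
                    for _, block in ipairs(content) do
                        if block_is_nontext(block) then
                            return true
                        end
                    end
                end
            end
        end
    end
    return false
end

local with_tool_call = { messages = {
    { role = "assistant", content = "", tool_calls = {
        { id = "call_1", type = "function",
          ["function"] = { name = "get_weather", arguments = "{}" } },
    } },
} }
print(body_has_nontext(with_tool_call)) -- true
print(body_has_nontext({ messages = { { role = "user", content = "hi" } } })) -- false
```

Under these assumptions an empty tool_calls array does not trigger the bypass, which matches the "non-empty" wording above.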

Copilot

Pull request overview

This PR fixes ai-cache correctness for multimodal and tool-call prompts after ai-protocols began flattening structured messages content to plain text, which could cause L1 key collisions and allow unsafe L2 semantic hits across distinct prompts.

Changes:

Extend the L1 exact-cache fingerprinting to incorporate raw body.messages, preventing text-only vs text+media collisions.
Replace the L2 “non-text” gate with body_has_nontext() that scans the raw request body for non-text blocks and tool/function call state before allowing semantic lookup.
Add a focused unit-style test to validate body_has_nontext() behavior, including malformed input tolerance.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `apisix/plugins/ai-cache/key.lua` | Incorporates raw `messages` into the L1 fingerprint to preserve multimodal/tool-call fidelity. |
| `apisix/plugins/ai-cache/semantic.lua` | Adds `body_has_nontext()` and uses it to bypass L2 when prompts include non-text/tool-call state. |
| `t/plugin/ai-cache-semantic.t` | Adds coverage for `body_has_nontext()` detection and malformed input handling. |

shreemaan-abhishek · 2026-07-03T09:36:33Z

    local repr = build_repr(ctx, body, client_messages(ctx, body))
+    -- client_messages() (get_messages) flattens structured content to plain text,
+    -- so an exact fingerprint of it alone would let a text+image prompt collide
+    -- with a text-only one. Fold the raw messages in to keep them distinct.
+    repr.raw_messages = body.messages
    return hex_digest(core.json.canonical_encode(repr))


This can be handled in a later PR; we need to merge this PR soon to fix the failing CI.

dosubot Bot added the size:M (this PR changes 30-99 lines, ignoring generated files) and bug (something isn't working) labels Jul 3, 2026

membphis previously approved these changes Jul 3, 2026

shreemaan-abhishek dismissed membphis’s stale review via 1bf079c July 3, 2026 08:23

dosubot Bot added the size:L (this PR changes 100-499 lines, ignoring generated files) label and removed the size:M label Jul 3, 2026

style(ai-cache): localize next to satisfy lj-releng lint

ba9aff6

shreemaan-abhishek requested a review from Copilot July 3, 2026 09:16

AlinsRan approved these changes Jul 3, 2026

Copilot started reviewing on behalf of shreemaan-abhishek July 3, 2026 09:18

nic-6443 approved these changes Jul 3, 2026

Copilot AI reviewed Jul 3, 2026

membphis approved these changes Jul 3, 2026

shreemaan-abhishek merged commit b83f323 into apache:master Jul 3, 2026
22 checks passed

shreemaan-abhishek merged 3 commits into apache:master from shreemaan-abhishek:fix/ai-cache-multimodal-key
