fix(ai-cache): preserve non-text content in exact key and L2 bypass #13654
Merged
shreemaan-abhishek merged 3 commits on Jul 3, 2026
get_messages() flattens structured message content (images, tool calls) to plain text, which silently broke ai-cache fidelity at both layers:

- L1 exact key: key.fingerprint hashes the flattened messages, so a text+image prompt collides with a text-only one carrying the same text and is wrongly served the text-only response.
- L2 semantic bypass: window_has_nontext only tripped on table content, which no longer exists after flattening, so multimodal prompts stopped bypassing the semantic layer and could hit a same-text vector.

Fold the raw messages into the L1 fingerprint, and scan the raw body for non-text blocks to drive the L2 bypass. body_has_nontext also guards non-table message items and content shaped as a single block object.

t/plugin/ai-cache-semantic.t TEST 64 covers the end-to-end MISS; TEST 66 unit-tests the malformed-input handling.
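The collision described above can be reproduced with a minimal sketch in plain Lua. `flatten_content` is a hypothetical stand-in for the flattening step, not the real `get_messages()` from ai-protocols; it assumes that only blocks with `type == "text"` survive flattening, and the example URL is made up.

```lua
-- Hypothetical stand-in for the flattening step (NOT the real get_messages()).
-- Assumption: only text blocks survive; every other block type is dropped.
local function flatten_content(content)
    if type(content) == "string" then
        return content
    end
    if type(content) ~= "table" then
        return ""
    end
    local parts = {}
    for _, block in ipairs(content) do
        if type(block) == "table" and block.type == "text"
                and type(block.text) == "string" then
            parts[#parts + 1] = block.text
        end
    end
    return table.concat(parts, " ")
end

local text_only = { role = "user", content = "Describe this." }
local multimodal = {
    role = "user",
    content = {
        { type = "text", text = "Describe this." },
        { type = "image_url", image_url = { url = "https://example.com/cat.png" } },
    },
}

local flat_text = flatten_content(text_only.content)
local flat_multi = flatten_content(multimodal.content)

-- Both prompts collapse to the same string, so any key derived only from
-- the flattened text cannot tell them apart.
print(flat_text == flat_multi)
```

Under these assumptions the two prompts are indistinguishable once flattened, which is the condition that lets a text-only response be served for a text+image request.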
membphis
previously approved these changes
Jul 3, 2026
membphis
left a comment
Member
Reviewed the patch and did not find merge-blocking issues.
get_messages() also drops assistant tool_calls (and legacy function_call), so two prompts with identical text but different tool calls could collide on one L2 cell and return a wrong HIT. Treat non-empty tool_calls / function_call as non-text in body_has_nontext, same as media blocks. Extend TEST 66 to cover them.
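A minimal sketch of the check described here, in plain Lua. `has_call_state` is a hypothetical helper name, not the function in `semantic.lua`; it assumes "non-empty" means a table with at least one entry, and that anything else (nil, a JSON null placeholder, an empty table) counts as absent.

```lua
-- Hypothetical helper: true when a message carries tool/function call state.
-- Assumption: only a table with at least one entry counts as "non-empty".
local function has_call_state(message)
    if type(message) ~= "table" then
        return false
    end
    local tool_calls = message.tool_calls
    if type(tool_calls) == "table" and next(tool_calls) ~= nil then
        return true
    end
    local function_call = message.function_call
    if type(function_call) == "table" and next(function_call) ~= nil then
        return true
    end
    return false
end

-- Two assistant turns with identical (empty) text but different call state.
local plain_reply = { role = "assistant", content = "" }
local tool_reply = {
    role = "assistant",
    content = "",
    tool_calls = {
        { id = "call_1", type = "function",
          ["function"] = { name = "get_weather", arguments = "{}" } },
    },
}
local legacy_reply = {
    role = "assistant",
    content = "",
    function_call = { name = "get_weather", arguments = "{}" },
}

print(has_call_state(plain_reply), has_call_state(tool_reply),
      has_call_state(legacy_reply))
```

Treating such messages as non-text means the request skips the semantic layer instead of matching a vector built from the same flattened text.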
AlinsRan
approved these changes
Jul 3, 2026
nic-6443
approved these changes
Jul 3, 2026
Pull request overview
This PR fixes ai-cache correctness for multimodal and tool-call prompts after ai-protocols began flattening structured messages content to plain text, which could cause L1 key collisions and allow unsafe L2 semantic hits across distinct prompts.
Changes:
- Extend the L1 exact-cache fingerprinting to incorporate raw `body.messages`, preventing text-only vs text+media collisions.
- Replace the L2 "non-text" gate with `body_has_nontext()`, which scans the raw request body for non-text blocks and tool/function call state before allowing semantic lookup.
- Add a focused unit-style test to validate `body_has_nontext()` behavior, including malformed input tolerance.
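The raw-body scan listed above might look roughly like the following plain-Lua sketch. It is not the code in `semantic.lua`: `sketch_body_has_nontext` and `content_has_nontext` are hypothetical names, and the handling of malformed input (skipping non-table message items and non-table blocks rather than flagging them) is an assumption. Tool/function call state is left out to keep the sketch short.

```lua
-- Hypothetical sketch of a raw-body scan for non-text content blocks.
-- Assumption: malformed pieces are skipped rather than treated as non-text.
local function content_has_nontext(content)
    if type(content) ~= "table" then
        return false            -- plain string or missing content
    end
    if content.type ~= nil then
        -- content shaped as a single block object instead of a list
        return content.type ~= "text"
    end
    for _, block in ipairs(content) do
        if type(block) == "table" and block.type ~= "text" then
            return true
        end
    end
    return false
end

local function sketch_body_has_nontext(body)
    if type(body) ~= "table" or type(body.messages) ~= "table" then
        return false
    end
    for _, message in ipairs(body.messages) do
        -- guard: a non-table message item must not be indexed
        if type(message) == "table"
                and content_has_nontext(message.content) then
            return true
        end
    end
    return false
end

local image_body = {
    messages = {
        { role = "user", content = {
            { type = "text", text = "Describe this." },
            { type = "image_url", image_url = { url = "https://example.com/cat.png" } },
        } },
    },
}
print(sketch_body_has_nontext(image_body))
```

The point of scanning the raw body is that the decision no longer depends on what survives flattening, so a multimodal prompt is recognized even though its flattened form is plain text.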
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| apisix/plugins/ai-cache/key.lua | Incorporates raw messages into the L1 fingerprint to preserve multimodal/tool-call fidelity. |
| apisix/plugins/ai-cache/semantic.lua | Adds body_has_nontext() and uses it to bypass L2 when prompts include non-text/tool-call state. |
| t/plugin/ai-cache-semantic.t | Adds coverage for body_has_nontext() detection and malformed input handling. |
Comment on lines 105 to 110

    local repr = build_repr(ctx, body, client_messages(ctx, body))
    -- client_messages() (get_messages) flattens structured content to plain text,
    -- so an exact fingerprint of it alone would let a text+image prompt collide
    -- with a text-only one. Fold the raw messages in to keep them distinct.
    repr.raw_messages = body.messages
    return hex_digest(core.json.canonical_encode(repr))
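To illustrate why folding the raw messages into the representation separates the two prompts, here is a self-contained sketch in plain Lua. `canonical` is a hypothetical, simplified stand-in for a canonical JSON encoder (sorted keys, string keys assumed), and `fingerprint_input` is a hypothetical wrapper; the hashing step (`hex_digest` in the excerpt) is omitted, since equal inputs give equal digests and distinct inputs are what matters here.

```lua
-- Hypothetical, simplified canonical encoder: object keys are sorted so the
-- output does not depend on table iteration order. Not a full JSON encoder.
local function canonical(value)
    local kind = type(value)
    if kind == "string" then
        return string.format("%q", value)
    end
    if kind ~= "table" then
        return tostring(value)
    end
    local parts = {}
    if #value > 0 then
        for i = 1, #value do
            parts[i] = canonical(value[i])
        end
        return "[" .. table.concat(parts, ",") .. "]"
    end
    local keys = {}
    for key in pairs(value) do
        keys[#keys + 1] = key
    end
    table.sort(keys, function(a, b) return tostring(a) < tostring(b) end)
    for i, key in ipairs(keys) do
        parts[i] = string.format("%q", tostring(key)) .. ":"
            .. canonical(value[key])
    end
    return "{" .. table.concat(parts, ",") .. "}"
end

-- Hypothetical wrapper: build the string that would be hashed.
local function fingerprint_input(flattened_messages, raw_messages)
    local repr = { messages = flattened_messages }
    if raw_messages ~= nil then
        repr.raw_messages = raw_messages
    end
    return canonical(repr)
end

-- Both prompts flatten to the same messages...
local flattened = { { role = "user", content = "Describe this." } }
-- ...but their raw forms differ.
local raw_text = { { role = "user", content = "Describe this." } }
local raw_multi = { { role = "user", content = {
    { type = "text", text = "Describe this." },
    { type = "image_url", image_url = { url = "https://example.com/cat.png" } },
} } }

local with_text = fingerprint_input(flattened, raw_text)
local with_multi = fingerprint_input(flattened, raw_multi)
print(with_text ~= with_multi)
```

Without the second argument both calls produce the same string, which mirrors the pre-fix collision; with the raw messages folded in, the inputs to the digest differ.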
Contributor
Author
This can be handled in a later PR; we need to merge this PR soon to fix the failing CI.
Description

ai-protocols' `get_messages()` flattens structured message content (images, tool calls) down to plain text. That silently broke `ai-cache` fidelity at both cache layers:

- `key.fingerprint` hashes the `get_messages` output. After flattening, a `[{text}, {image_url}]` prompt and a plain-text prompt carrying the same text collapse to the identical string, so they collide on one L1 key and the multimodal request is wrongly served the text-only response.
- `window_has_nontext` only tripped on `type(content) == "table"`, which no longer exists after flattening, so multimodal prompts stopped bypassing the semantic layer and could hit a same-text vector.

Fix

- `key.lua`: fold the raw `body.messages` into the L1 fingerprint so a text+image prompt stays distinct from a text-only one. (`params` already preserves every other body field verbatim, so `input`-based protocols were never affected; only `messages` lost fidelity.)
- `semantic.lua`: replace `window_has_nontext` with `body_has_nontext`, which scans the raw body and bypasses L2 whenever any block is non-text. It also guards non-table message items and `content` shaped as a single block object.

Why it looked flaky

`t/plugin/ai-cache-semantic.t` TEST 64 asserts a multimodal prompt is a MISS. Whether the collision surfaces depends on TEST 63's async log-phase L2 write-back racing TEST 64's read (`--- wait: 0.5`), so it reproduces intermittently. It reproduces deterministically against a local RediSearch.

Tests

`t/plugin/ai-cache-semantic.t`: TEST 64 (end-to-end MISS) now passes; added TEST 66 unit-testing the malformed-input handling. Full file green (215 tests), luacheck clean.

Checklist: