fix(ai-protocols): flatten structured message content in the protocol layer #13634
Merged
nic-6443 merged 7 commits on Jul 2, 2026
OpenAI Chat allows messages[].content to be either a plain string or an
array of typed parts (e.g. [{type="text", text="..."}]). The plugin
collected msg.content as-is and then called table.concat, raising
"invalid value (table) ... for 'concat'" and returning 500 whenever an
inspected message carried array content.
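The failure can be reproduced in plain Lua, outside APISIX: `table.concat` accepts only strings and numbers, so one table-valued element is enough to make it raise. A minimal sketch, where the `collected` table is illustrative and not the plugin's actual code:

```lua
-- Illustrative only: mimics collecting msg.content values as-is.
local collected = {
    "plain string content",
    { { type = "text", text = "structured content" } },  -- array of typed parts
}

-- table.concat rejects the table-valued element, so the call raises.
-- pcall captures the error instead of aborting the script.
local ok, err = pcall(table.concat, collected, " ")
print(ok)   -- false
print(err)  -- error message naming 'concat'
```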
Flatten each message's content (string, or the text parts of an array)
before concatenation, matching how the protocol adapters already extract
text. Add regression tests for array content in conversation history, in
the latest user message, and mixed text/non-text parts.
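The flattening step described above can be sketched as follows. This is a hedged illustration, not the merged code: the helper name `flatten_content` and the single-space separator are assumptions, and only parts with `type == "text"` are kept.

```lua
local type = type
local ipairs = ipairs
local table_concat = table.concat

-- Hypothetical helper: returns a message's content as plain text.
-- Strings pass through; arrays contribute only their text parts.
local function flatten_content(content)
    if type(content) == "string" then
        return content
    end
    if type(content) ~= "table" then
        return ""
    end
    local texts = {}
    for _, part in ipairs(content) do
        if type(part) == "table" and part.type == "text"
           and type(part.text) == "string" then
            texts[#texts + 1] = part.text
        end
    end
    -- Separator is an assumption made for this sketch.
    return table_concat(texts, " ")
end

local messages = {
    { role = "user", content = "hello" },
    { role = "user", content = {
        { type = "text", text = "describe" },
        { type = "image_url", image_url = { url = "https://example.com/badword.png" } },
        { type = "text", text = "this" },
    } },
}

local parts = {}
for _, msg in ipairs(messages) do
    parts[#parts + 1] = flatten_content(msg.content)
end
local prompt = table_concat(parts, " ")
print(prompt)  -- hello describe this
```

Because the `image_url` part is skipped, a deny word that appears only in the image URL never reaches the text being inspected.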
Pull request overview
This PR fixes a Lua runtime crash in the ai-prompt-guard plugin when inspecting OpenAI Chat-style messages that use structured (array) messages[].content, by flattening text parts before concatenation.
Changes:
- Add a helper to flatten message `content` (string or typed-parts array) into plain text prior to `table.concat`.
- Update prompt aggregation to use the new flattening helper.
- Add regression tests covering structured `content` in history and the latest user message (including mixed text + non-text parts).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `apisix/plugins/ai-prompt-guard.lua` | Flattens structured message content into text before concatenation to prevent `table.concat` runtime errors. |
| `t/plugin/ai-prompt-guard.t` | Adds tests ensuring structured content is scanned/denied correctly without crashing under different match modes. |
Add a case where the deny word lives only in a non-text part (image_url) to lock in that only text parts are inspected. Also declare `local type = type` so the content-flatten helper passes the localize-globals style check.
…_messages
Move the OpenAI Chat structured-content handling out of ai-prompt-guard and
into openai-chat.get_messages, so it returns canonical string content like
every other adapter (anthropic, bedrock, responses, embeddings) already does.
get_messages previously returned body.messages verbatim, so a message whose
content is an array of typed parts (e.g. [{type="text", text="..."}]) leaked a
table to consumers. ai-prompt-guard then hit "invalid value (table) ... for
'concat'" (500). Flattening in the protocol layer keeps protocol details out of
the plugins: ai-prompt-guard, ai-lakera-guard and ai-cache all get flattened
text without duplicating the logic.
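A rough sketch of what flattening in the protocol layer could look like. The source names `append_message_text` and `get_messages`, but the signatures, the separator, and the body shape used here are assumptions made for illustration.

```lua
local type = type
local ipairs = ipairs
local table_concat = table.concat

-- Assumed signature: appends the text carried by `content` to `parts`.
local function append_message_text(parts, content)
    if type(content) == "string" then
        parts[#parts + 1] = content
    elseif type(content) == "table" then
        for _, part in ipairs(content) do
            if type(part) == "table" and part.type == "text"
               and type(part.text) == "string" then
                parts[#parts + 1] = part.text
            end
        end
    end
end

-- Sketch of an adapter-level get_messages: it builds fresh
-- {role, content} tables whose content is always a plain string.
local function get_messages(body)
    local out = {}
    if type(body) ~= "table" or type(body.messages) ~= "table" then
        return out
    end
    for _, msg in ipairs(body.messages) do
        if type(msg) == "table" then
            local parts = {}
            append_message_text(parts, msg.content)
            out[#out + 1] = {
                role = msg.role,
                content = table_concat(parts, " "),
            }
        end
    end
    return out
end

local body = { messages = {
    { role = "system", content = "be brief" },
    { role = "user", content = { { type = "text", text = "hello" } } },
} }
local msgs = get_messages(body)
print(msgs[2].content)  -- hello
```

With this shape, a consumer can concatenate `content` values directly without knowing which wire format the client used.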
openai-chat.get_messages now flattens content like the other adapters, so no adapter returns body.messages verbatim anymore. normalize_messages stays as idempotent defense-in-depth; reword the comment to match.
…lize_messages
openai-chat.get_messages now flattens content to a string like every other
adapter, so normalize_messages no longer needs its own text-part extraction
(it only ever runs on get_messages output). Reduce it to the Lakera-specific
filtering: keep role-tagged messages with non-empty string content.
…oss adapters
Make the protocol layer the single place that flattens structured message
content into plain strings, removing the duplicated flatten loops that the
get_messages fix would otherwise leave scattered across adapters:
- openai-chat / anthropic-messages: get_messages reuses the existing append_message_text helper instead of re-implementing the loop
- bedrock-converse: add append_block_texts, replacing six copies of the content-block text extraction
- openai-responses: add append_item_text shared by extract_request_content, extract_user_content and get_messages; this also fixes get_messages silently dropping structured (array) input content parts
Add an ai-prompt-guard test for Responses API structured content.
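To illustrate how one shared helper can replace several copies of the same loop, here is a hedged sketch in the spirit of `append_block_texts`. The helper's signature is an assumption, and the two callers (`request_text`, `response_text`) are hypothetical stand-ins rather than the adapter's real functions; the request and response shapes follow the Bedrock Converse convention of content blocks that carry a `text` field.

```lua
local type = type
local ipairs = ipairs
local table_concat = table.concat

-- Assumed signature: appends the text of each content block to `parts`.
-- Blocks without a string `text` field (images, tool use) are skipped.
local function append_block_texts(parts, blocks)
    if type(blocks) ~= "table" then
        return
    end
    for _, block in ipairs(blocks) do
        if type(block) == "table" and type(block.text) == "string" then
            parts[#parts + 1] = block.text
        end
    end
end

-- Hypothetical caller 1: all text in a request (system + messages).
local function request_text(body)
    local parts = {}
    append_block_texts(parts, body.system)
    for _, msg in ipairs(body.messages or {}) do
        append_block_texts(parts, msg.content)
    end
    return table_concat(parts, " ")
end

-- Hypothetical caller 2: text of the assistant message in a response.
local function response_text(res)
    local parts = {}
    local message = res.output and res.output.message
    append_block_texts(parts, message and message.content)
    return table_concat(parts, " ")
end

local req = {
    system = { { text = "be brief" } },
    messages = {
        { role = "user", content = { { text = "hi" }, { image = { format = "png" } } } },
    },
}
local res = { output = { message = { role = "assistant", content = { { text = "hello" } } } } }

print(request_text(req))   -- be brief hi
print(response_text(res))  -- hello
```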
… contract
get_messages always returns a table of {role, content} tables it constructs
itself, with content already flattened to a string, so normalize_messages
keeps only the two live filters: turns without a role (adapters pass the
client role through verbatim) and empty content that Lakera /v2/guard has
nothing to scan. The re-copy into a fresh table is also unnecessary.
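The two remaining filters can be sketched as below. This is an illustration under the contract stated above (each message is a `{role, content}` table with string content), not the merged implementation.

```lua
local type = type
local ipairs = ipairs

-- Sketch: keep only turns that have a role and non-empty string content.
-- Messages are passed through as-is rather than copied into new tables.
local function normalize_messages(messages)
    local out = {}
    for _, msg in ipairs(messages) do
        if type(msg.role) == "string"
           and type(msg.content) == "string" and msg.content ~= "" then
            out[#out + 1] = msg
        end
    end
    return out
end

local input = {
    { role = "user", content = "hi" },
    { content = "turn without a role" },
    { role = "assistant", content = "" },
}
local kept = normalize_messages(input)
print(#kept)  -- 1
```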
membphis
approved these changes
Jul 2, 2026
AlinsRan
approved these changes
Jul 2, 2026
shreemaan-abhishek
approved these changes
Jul 2, 2026
Description
`ai-prompt-guard` returns 500 when a chat message's `content` is a structured array instead of a plain string.

OpenAI Chat Completions allows `messages[].content` to be either a string or an array of typed parts, e.g.:

`{ "role": "user", "content": [ { "type": "text", "text": "hello" } ] }`

The root cause is in the protocol layer: `openai-chat.get_messages()` returned `body.messages` verbatim, so a table-valued `content` leaked to consumers. `ai-prompt-guard` then concatenates message content with `table.concat`, which raises `invalid value (table) ... for 'concat'`.

The canonical `{role, content}` contract is that `get_messages` returns content already flattened to a plain string. This PR makes every adapter honor that contract and keeps the flattening in exactly one place per adapter:

- openai-chat / anthropic-messages: `get_messages` now reuses the adapter's existing `append_message_text` helper instead of re-implementing the flatten loop.
- bedrock-converse: adds `append_block_texts`, replacing six copies of the content-block text extraction across `extract_response_text`, `extract_request_content`, `extract_user_content` and `get_messages`.
- openai-responses: adds `append_item_text`, shared by `extract_request_content`, `extract_user_content` and `get_messages`. This also fixes `get_messages` silently dropping structured (array) input content parts, which is the same class of bug, previously masked because it dropped the content instead of crashing.

Non-text parts (e.g. `image_url`) are dropped, consistent across all adapters. Every `get_messages` consumer (`ai-prompt-guard`, `ai-lakera-guard`, `ai-cache`) then receives flattened text without re-implementing the extraction; `ai-lakera-guard.normalize_messages` is reduced to filtering empty turns.

Regression tests cover structured content in conversation history, the latest user message, mixed text/non-text parts (openai-chat), and structured input parts (Responses API); they fail before the fix and pass after. The `ai-prompt-guard`, `ai-lakera-guard`, and `ai-cache` suites are green locally.

Which issue(s) this PR fixes:
Fixes #
Checklist