fix: prevent panic on multi-byte UTF-8 in SQL obfuscation#11792

Open
AruneshDwivedi wants to merge 1 commit into
deepflowio:mainfrom
AruneshDwivedi:fix/sql-obfuscate-utf8-char-boundary

@AruneshDwivedi

Problem

When SQL obfuscation (obfuscate_protocols) is enabled for MySQL or PostgreSQL,
and the captured SQL contains multi-byte UTF-8 characters (Chinese, Japanese, etc.),
the deepflow-agent panics with:

panicked at src/flow_generator/protocol_logs/sql/sql_obfuscate.rs:97:37:
byte index 1 is not a char boundary; it is inside '?' (bytes 0..1)

This causes the agent to enter CrashLoopBackOff, breaking all observability
data collection on that node.

Root Cause

When the SQL tokenizer returns an error location, the code computes a
byte_offset from the line/column position. This offset can land in the
middle of a multi-byte UTF-8 character. Rust checks string slice indices
at run time, so slicing a str at an index that is not a char boundary
panics (it is a defined panic, not undefined behavior).
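
To make the failure mode concrete, here is a minimal sketch of how a byte offset can fall inside a multi-byte character. The helper name `checked_prefix` and the sample SQL string are illustrative assumptions, not code from sql_obfuscate.rs. It relies only on the standard `str::get` and `str::is_char_boundary` methods, which return `None` / `false` instead of panicking.

```rust
/// Hypothetical helper: returns the prefix of `s` up to `byte_offset`,
/// or `None` if the offset is out of range or falls inside a
/// multi-byte character. `str::get` never panics, unlike `&s[..n]`.
fn checked_prefix(s: &str, byte_offset: usize) -> Option<&str> {
    s.get(..byte_offset)
}

// Byte layout of the sample string "SELECT '中文'":
//   "SELECT '" occupies bytes 0..8 (ASCII, one byte each)
//   '中'       occupies bytes 8..11 (three bytes in UTF-8)
//   '文'       occupies bytes 11..14
//   "'"        occupies byte 14
//
// checked_prefix(sql, 8)  is Some("SELECT '")
// checked_prefix(sql, 9)  is None, because byte 9 is inside '中'
// &sql[..9] would panic with "byte index 9 is not a char boundary"
```

Any offset derived from a column count rather than from the string's own char boundaries can hit the `None` case above, which is the situation the panic message describes.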

Fix

Use str::floor_char_boundary() (stable since Rust 1.91) to round the
byte offset down to the closest valid character boundary at or before it,
then slice at that clamped offset.
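
The clamping step can be sketched as follows. This is not the patch itself: `floor_to_char_boundary` and `safe_prefix` are hypothetical names, and the loop is a hand-written stand-in for what `str::floor_char_boundary` is documented to do, written this way so it also compiles on toolchains where that method is not available.

```rust
/// Hypothetical stand-in for `str::floor_char_boundary`: returns the
/// largest char boundary in `s` that is less than or equal to `index`.
/// Indices past the end of the string clamp to `s.len()`.
fn floor_to_char_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Index 0 is always a char boundary, so this loop terminates.
    // A UTF-8 character is at most 4 bytes, so it runs at most 3 times.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Returns the part of `sql` before `byte_offset`, rounding the offset
/// down to a char boundary first so the slice cannot panic.
fn safe_prefix(sql: &str, byte_offset: usize) -> &str {
    &sql[..floor_to_char_boundary(sql, byte_offset)]
}
```

Rounding down rather than up means the prefix never includes a partially covered character; for an offset that lands inside a character, that whole character is left out of the prefix.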

Testing

Reproduction: enable obfuscate_protocols for MySQL/PostgreSQL and run
queries containing multi-byte UTF-8 characters. Before the fix, the agent
panics. After the fix, SQL is correctly obfuscated.
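
The manual reproduction could be complemented by an automated check. The sketch below is an assumption about what such a check might look like, not a test from the repository: `clamp_offset` and `all_offsets_slice_safely` are hypothetical names, and the check exercises only the offset clamping, not the obfuscator itself. It tries every byte offset of a string (plus a few past the end) and confirms that the clamped offset is always safe to slice at.

```rust
/// Hypothetical helper: rounds `index` down to a char boundary of `s`,
/// clamping indices past the end to `s.len()`.
fn clamp_offset(s: &str, index: usize) -> usize {
    let mut i = index.min(s.len());
    // `s.len()` and 0 are always char boundaries, so this terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Returns true if, for every candidate byte offset into `sql`
/// (including a few beyond its end), the clamped offset never exceeds
/// the requested one and is a valid position to slice at.
fn all_offsets_slice_safely(sql: &str) -> bool {
    (0..=sql.len() + 4).all(|offset| {
        let end = clamp_offset(sql, offset);
        end <= offset && sql.is_char_boundary(end) && sql.get(..end).is_some()
    })
}
```

Calling `all_offsets_slice_safely` on inputs that mix ASCII with Chinese or Japanese text covers the mid-character offsets that triggered the original panic.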

Fixes #11791

When the SQL tokenizer returns an error location that falls in the
middle of a multi-byte UTF-8 character (e.g., Chinese, Japanese),
byte_offset would not be at a char boundary, causing a panic at
&sql[..byte_offset].

Use str::floor_char_boundary() to round the offset down to the closest
valid char boundary before slicing.

Fixes deepflowio#11791
@CLAassistant

CLAassistant commented Jun 6, 2026

CLA assistant check
All committers have signed the CLA.


Development

Successfully merging this pull request may close these issues.

[BUG] Agent crashes (CrashLoopBackOff) when the SQL obfuscation module processes multi-byte UTF-8 characters

2 participants