Describe the bug
python-markdown2 can spend excessive CPU time in its inline HTML tokenizer when rendering attacker-controlled Markdown that contains repeated malformed tag fragments. At the pinned commit, the public `markdown2.markdown()` path can pass that text to `_sorta_html_tokenize_re.split()` in `lib/markdown2.py`, and a roughly 60 KB input deterministically exceeds the one-second timeout used as the local test oracle. Applications that render untrusted Markdown synchronously can therefore have a request worker tied up by a single document.
To Reproduce
INT-regex-markdown2-html-tokenizer-redos.zip
See attached file.
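
The attached archive contains the actual reproducer. As a rough illustration of how the slowdown could be measured independently, the sketch below times a render callable on inputs of growing size. `time_render` and `scaling_profile` are hypothetical helper names, not part of markdown2, and the malformed fragment itself must be taken from the archive; it is not reproduced here.

```python
import time
from typing import Callable, Iterable, List, Tuple


def time_render(render: Callable[[str], str], text: str) -> float:
    """Return wall-clock seconds spent in a single render(text) call."""
    start = time.perf_counter()
    render(text)
    return time.perf_counter() - start


def scaling_profile(
    render: Callable[[str], str],
    fragment: str,
    repeats: Iterable[int],
) -> List[Tuple[int, float]]:
    """Time render() on fragment * n for each n.

    Returns a list of (input_length, seconds) pairs. Start with small
    repeat counts: this helper does not enforce a timeout, so a
    pathological input will block until the render call returns.
    """
    results = []
    for n in repeats:
        text = fragment * n
        results.append((len(text), time_render(render, text)))
    return results
```

With markdown2 installed, passing `markdown2.markdown` as `render` and the fragment from the archive as `fragment`, for example with repeat counts `[100, 200, 400, 800]`, should show whether the time grows faster than the input length. If doubling the input much more than doubles the time, that is consistent with super-linear backtracking in the tokenizer.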
Expected behavior
The tag branch of the tokenizer regex should reject the malformed fragment in time linear in the input length.
Debug info
For more details, see README.md of attached file.