An error in HTMLPurifier_URI->validate() allowed
an attacker to craft a specially formed URI that, once processed by HTML
Purifier, became an active JavaScript URI. If a user clicked on the malicious
link, or used a browser that automatically evaluates JavaScript URIs in
image tags, an attacker could execute arbitrary JavaScript in the context
of the website the HTML was served on.
This vulnerability was reported via full disclosure by Gareth Heyes, and brought to the attention of the vendor by CrYpTiC_MauleR. No active exploits are currently known.
Fix
This vulnerability was fixed in HTML Purifier 3.1.0 and 2.1.4. No hot-patch is currently available.
Details
According to RFC 3986,
a relative URI with the same scheme name as the
base URI is discouraged, but allowed for backwards compatibility.
As HTML Purifier aims for standards compliance in all aspects
of its output, it converts such URIs to their
correct form by removing the scheme. Thus, http:dir/dir2
becomes dir/dir2.
Doing this bypasses HTML Purifier's safeguards against JavaScript
URIs. Normally, a URI is parsed once
and its scheme is extracted from the original string. Thus, a normal
javascript:xss() is identified as having a javascript
scheme and is removed. The common bypasses to this, such as
java\nscript, fail because the resulting scheme
is not in HTML Purifier's list of allowed schemes. However, once
parsing and this initial scheme check have been performed, the URI is not
parsed again.
Removal of the scheme causes a URI like http:javascript:xss()
to become javascript:xss(), and now javascript is
the new scheme, although in the original, javascript:xss() was
the path.
| Key | Original | New | Intended |
|---|---|---|---|
| Scheme | http | javascript | |
| Path | javascript:xss() | xss() | javascript:xss() |
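The table can be reproduced with a short Python sketch. This is not HTML Purifier's code (which is PHP). The function names, the `ALLOWED_SCHEMES` list, and the rule that the scheme is only stripped when no authority is present are assumptions made for illustration. The parsing regular expression is the generic one from RFC 3986, Appendix B.

```python
import re

# Generic URI-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

# Illustrative whitelist; an assumption, not HTML Purifier's actual list.
ALLOWED_SCHEMES = {"http", "https", "ftp", "mailto"}


def parse(uri):
    """Split a URI into the components relevant here (query/fragment omitted)."""
    m = URI_RE.match(uri)
    return {"scheme": m.group(2), "authority": m.group(4), "path": m.group(5)}


def naive_clean(uri, base_scheme="http"):
    """Hypothetical model of the flawed flow: parse once, check the scheme,
    strip a redundant scheme, then reassemble by simple concatenation."""
    parts = parse(uri)
    scheme, authority, path = parts["scheme"], parts["authority"], parts["path"]

    # Initial scheme check against the whitelist.
    if scheme is not None and scheme.lower() not in ALLOWED_SCHEMES:
        return None  # URI rejected

    # Normalization: drop a scheme that merely repeats the base scheme.
    if scheme is not None and authority is None and scheme.lower() == base_scheme:
        scheme = None

    # Reassembly without re-parsing or encoding: this is where the bug lives.
    out = ""
    if scheme is not None:
        out += scheme + ":"
    if authority is not None:
        out += "//" + authority
    return out + path


cleaned = naive_clean("http:javascript:xss()")
print(cleaned)                   # the "New" column as a string
print(parse(cleaned)["scheme"])  # scheme seen by whoever parses the output
```

A direct `javascript:xss()` is rejected by the scheme check in this model, while `http:javascript:xss()` passes it and is then rewritten into a string that a browser parses as a JavaScript URI.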
The appropriate fix can be determined by figuring out how to convert the
last column into a URI that will be parsed into the same form.
Obviously, simple concatenation doesn't work; the key is percent encoding
the path. Instead of javascript:xss(), javascript%3Axss()
should be used.
HTML Purifier's fix also percent-encodes any other reserved character in each segment of a URI. This was actually a previously identified area where standards compliance had been relaxed, and strictly enforcing the rules eliminated the vulnerability.
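The effect of percent-encoding the path can be sketched in Python as follows. This is an illustration of the idea, not HTML Purifier's implementation. The names `encode_path`, `scheme_of` and `SEGMENT_SAFE` are hypothetical, the exact set of characters left unencoded is an assumption, and the sketch assumes the incoming path has not already been percent-encoded (an existing `%` would be encoded again).

```python
import re
from urllib.parse import quote

# Generic URI-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

# Characters left as-is inside a path segment, in addition to the unreserved
# set that quote() never encodes. ":" is deliberately absent, so it becomes
# %3A. This set is an assumption chosen for illustration.
SEGMENT_SAFE = "!$&'()*+,;=@"


def encode_path(path):
    """Percent-encode each segment of a path, keeping "/" as the separator."""
    return "/".join(quote(segment, safe=SEGMENT_SAFE) for segment in path.split("/"))


def scheme_of(uri):
    """Return the scheme a parser would see, or None for a relative URI."""
    return URI_RE.match(uri).group(2)


safe_path = encode_path("javascript:xss()")
print(safe_path)             # javascript%3Axss()
print(scheme_of(safe_path))  # None
```

With the colon encoded, the reassembled URI no longer contains anything a parser can mistake for a scheme, so it round-trips to the "Intended" column of the table: no scheme, and a path that decodes to javascript:xss().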
History
The vulnerability was reported on March 25, 2008, although not directly to the vendor. A patch was committed to the public repository on May 13, 2008, ostensibly as a “revamp [of] URI handling of percent encoding and validation.” HTML Purifier 3.1.0 was released on May 18, 2008. This was the first security vulnerability in HTML Purifier's core, and the second in all of HTML Purifier's history.
We would have strongly preferred that Gareth Heyes contact us through
private channels before publicly disclosing the vulnerability. We actually
did not realize that the post was illustrating vulnerabilities in
HTML Purifier until CrYpTiC_MauleR asked why the exploit worked on
May 13, 2008 (http:javascript: doesn't actually work by itself; HTML Purifier
must munge off the http scheme to activate the attack). This accounts in
part for the long delay between the first disclosure
and the committing of a fix. Still, we greatly appreciate Gareth Heyes' report
and sincerely hope that he will continue to help weed out bugs in HTML Purifier.
We apologize for not crediting him immediately in the changelog.
Full disclosure is generally a good idea, just not before the vendor has had a chance to release a fix (please don't be afraid to use it to light a fire under our butts and get a security bug fixed). We have therefore released this document along with the next point release of HTML Purifier, hopefully having given projects and end users enough time to upgrade their installations. We hope to do this for all future vulnerabilities in HTML Purifier, especially the two that were fixed in the most recent point release.