
Bug in 'A' tag correction.

Posted by Joel 
February 28, 2013 10:40PM

Hello, I believe I've discovered a bug (in 4.5.0) in how improperly formatted URLs are corrected.

Given the following code:

<a href="http://www.example.com>example</a>

HTML Purifier will correct it to:

<a href="http://www.example.com"></a>

It properly adds the missing quotation mark, but then, strangely, removes the text enclosed in the 'A' tag.

Thanks for making a fine product.

Re: Bug in 'A' tag correction.
March 01, 2013 12:46AM

Try setting %Core.LexerImpl to DirectLex
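
In case it helps, here is a minimal sketch of that setting. The include path is an assumption about where the library sits in your install, so adjust it to match. I have left out the expected output, since you should check it against your own version.

```php
<?php
// Assumed path to the HTML Purifier autoloader; change it for your setup.
require_once 'HTMLPurifier.auto.php';

// Start from the default configuration and switch the lexer
// from the DOM-based default to the pure PHP DirectLex.
$config = HTMLPurifier_Config::createDefault();
$config->set('Core.LexerImpl', 'DirectLex');

$purifier = new HTMLPurifier($config);

// The input from the original post, with the missing closing quote.
$dirty = '<a href="http://www.example.com>example</a>';
echo $purifier->purify($dirty);
```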

Re: Bug in 'A' tag correction.
March 02, 2013 09:00AM

So is this the desired default behavior?

The on-site demo shows the same behavior.

Re: Bug in 'A' tag correction.
March 02, 2013 02:55PM

It's not really HTML Purifier's fault; it's the fault of the PHP DOM extension, which we use for parsing. DirectLex is a pure PHP parser, so it's slower, but it lets us handle more edge cases like this.
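
If you want to see what the DOM extension does with this input by itself, something like the sketch below should work. It calls PHP's DOMDocument directly and is not HTML Purifier's own lexer code, which may do extra processing around the parse, so treat the result as a rough illustration only.

```php
<?php
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();

// The same malformed markup from the original post.
$doc->loadHTML('<a href="http://www.example.com>example</a>');

// Print the document as the DOM extension understood it.
echo $doc->saveHTML();

// Show what the parser complained about, if anything.
foreach (libxml_get_errors() as $error) {
    echo trim($error->message), "\n";
}
```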
