Bug in 'A' tag correction.

Posted by Joel 
Bug in 'A' tag correction.
February 28, 2013 10:40PM

Hello, I believe I've discovered a bug (in 4.5.0) in the correction of improperly formatted URLs.

Given the following code:

<a href="http://www.example.com>example</a>

HTML Purifier will correct it to:

<a href="http://www.example.com"></a>

It properly adds the missing quotation mark, but then, strangely, removes the text enclosed in the 'A' tag.

Thanks for making a fine product.

Re: Bug in 'A' tag correction.
March 01, 2013 12:46AM

Try setting %Core.LexerImpl to DirectLex
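
For anyone unsure how to apply that setting, here is a minimal sketch of a configuration fragment. It assumes HTML Purifier is installed and that the include path below matches your setup (the path is a placeholder, not something from this thread). The variable names are illustrative only.

```php
<?php
// Assumed install location; adjust to wherever HTML Purifier lives on your system.
require_once 'HTMLPurifier.auto.php';

// Start from the default configuration and switch the lexer implementation.
$config = HTMLPurifier_Config::createDefault();
$config->set('Core.LexerImpl', 'DirectLex');

$purifier = new HTMLPurifier($config);

// The malformed input from the original post (missing closing quote on href).
$dirty = '<a href="http://www.example.com>example</a>';
$clean = $purifier->purify($dirty);

echo $clean, "\n";
```

Comparing the output with and without the `Core.LexerImpl` line should show whether the lexer choice is what drops the link text in your version.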

Re: Bug in 'A' tag correction.
March 02, 2013 09:00AM

So is this the desired default behavior?

The same behavior is displayed with the on-site demo.

Re: Bug in 'A' tag correction.
March 02, 2013 02:55PM

It's not really HTML Purifier's fault; it's the fault of the PHP DOM extension, which we use to do parsing. DirectLex is a pure PHP parser, so it's slower, but it means we can handle more edge cases like this.

