Linkify bug

Posted by nAS 
May 20, 2013 10:32AM

The Linkify directive doesn't work correctly when a URL is followed by a non-breaking space or a comma.

Example: http://bit.ly/11RAUaU

Proposed fix:

In Linkify.php, line 24, change from:

$bits = preg_split(&#039;#((?:https?|ftp)://[^\s\&#039;"<>()]+)#S&#039;, $token->data, -1, PREG_SPLIT_DELIM_CAPTURE);

to:

$bits = preg_split(&#039;#((?:https?|ftp)://[^\s\&#039;",<>()]+)#Su&#039;, $token->data, -1, PREG_SPLIT_DELIM_CAPTURE);
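
To see what the two patterns do differently, here is a minimal sketch that runs both outside HTML Purifier. The helper captured_urls() and the sample strings are made up for illustration; only the two patterns come from the post. It relies on PHP's u modifier making \s Unicode-aware, so that a UTF-8 non-breaking space ends the URL.

```php
<?php
// Both patterns are copied from the post above.
$old = '#((?:https?|ftp)://[^\s\'"<>()]+)#S';
$new = '#((?:https?|ftp)://[^\s\'",<>()]+)#Su';

// Hypothetical helper (not part of HTML Purifier): returns the URLs that
// preg_split() captures as delimiters.
function captured_urls($pattern, $text)
{
    $bits = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    $urls = array();
    // With one capture group, the odd indexes of $bits hold the matched URLs.
    for ($i = 1; $i < count($bits); $i += 2) {
        $urls[] = $bits[$i];
    }
    return $urls;
}

// Made-up sample inputs; "\xC2\xA0" is a UTF-8 non-breaking space.
$commaText = 'See http://example.com/a, http://example.com/b for details';
$nbspText  = "Visit http://example.com/page\xC2\xA0now";

print_r(captured_urls($old, $commaText)); // first URL keeps the trailing comma
print_r(captured_urls($new, $commaText)); // comma is left outside both URLs
print_r(captured_urls($new, $nbspText));  // URL stops before the non-breaking space
```
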
Re: Linkify bug
May 21, 2013 04:29PM

A comma is a valid character in URLs; the non-breaking space change seems reasonable, though. Perhaps a compromise would be to break on a comma only if it is immediately followed by a space.
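
The compromise could look roughly like the sketch below. It is only an illustration of the idea, not code from HTML Purifier: the $lenient pattern, the lenient_urls() helper and the sample text are all assumptions. A comma is accepted inside the URL unless whitespace comes right after it, which is checked with a negative lookahead.

```php
<?php
// Assumed variant of the pattern: a comma is allowed inside the URL unless
// whitespace follows it. A sketch of the idea only.
$lenient = '#((?:https?|ftp)://(?:[^\s\'",<>()]|,(?!\s))+)#Su';

// Hypothetical helper: returns the URLs that preg_split() captures.
function lenient_urls($pattern, $text)
{
    $bits = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    $urls = array();
    // With one capture group, the odd indexes of $bits hold the matched URLs.
    for ($i = 1; $i < count($bits); $i += 2) {
        $urls[] = $bits[$i];
    }
    return $urls;
}

// Made-up input: the first URL contains a comma and is followed by one.
$text = 'Compare http://example.com/a,b, http://example.com/c and more';

// The inner comma stays in the first URL; the comma before the space does not.
print_r(lenient_urls($lenient, $text));
```

One limitation of this version: a comma at the very end of the text is still treated as part of the URL, because nothing follows it.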

Re: Linkify bug
May 21, 2013 04:59PM

Hmm, on further thought, I think I don't care. Committed.

nAS
Re: Linkify bug
May 21, 2013 06:25PM

That's exactly what I was thinking. I know that a comma is valid in a URL, but all browsers encode it as %2C. And I guess the more common use case is a user sharing multiple URLs separated by commas rather than a URL with an unescaped comma.
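
As a small illustration, a URL that really needs a comma can carry it percent-encoded and still be linkified whole by the new pattern. The URL and surrounding text below are made up; rawurlencode() is the standard PHP function and the pattern is the one from the proposed fix.

```php
<?php
// Pattern from the proposed fix.
$pattern = '#((?:https?|ftp)://[^\s\'",<>()]+)#Su';

// rawurlencode() percent-encodes the comma as %2C.
$url  = 'http://example.com/tags/' . rawurlencode('php,regex');
$text = "Try $url, thanks";

// The encoded URL is captured intact; the literal comma after it is not.
$bits = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

echo $url, "\n";
print_r($bits);
```
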

A better heuristic is possible, but I think this approach is good enough.

Thank you for committing this. HTML Purifier is by far the best HTML filtering tool!
