|
[Feature] linkifying URLs April 02, 2007 10:09PM |
Registered: 6 years ago Posts: 2 |
Hi! I was just checking out the roadmap [hp.jpsband.org] and it said you were planning to linkify URLs in the 2.0 release. I just want to say that would be an EXTREMELY helpful feature, because it is one of the biggest nightmares I have had to deal with. I spent over a week on it and still couldn't figure it out. It's easy to linkify text with a simple regex, but tons of problems arise if the text it detects already happens to be part of a link, and fixing that properly is complicated.
I don't know if these will come in handy, since you may have a better way of parsing it, but when you get to it, here are some links to code that does this:
(one of the better ones) [code.iamcal.com]
(other ones) [www.zend.com] [www.truerwords.net] [www.coffee2code.com]
Oh yeah, these options would be nice for it: If you set it to detect Everything, it will also pick up bare domains with major domain extensions: google.com, coolsite.us, blah.net. If you set it to detect Strict, it will pick up only full URLs: http ://www.google.com, http ://coolsite.us, or http ://blah.net
And options that let you truncate the displayed URL, show just the domain, or leave the displayed URL unmangled.
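
To make the requested options concrete, here is a rough sketch in Python (not PHP, and not anything from HTML Purifier itself). The mode names, the `find_urls` and `display_text` helpers, and the short TLD list are all made up for illustration:

```python
import re
from urllib.parse import urlsplit

# "Strict": only text that starts with an explicit http(s) scheme.
STRICT_RE = re.compile(r"https?://[^\s<>\"']+")

# "Everything": also bare hosts ending in a few well-known TLDs.
# The TLD list here is an assumption, kept short on purpose.
LOOSE_RE = re.compile(
    r"(?:https?://)?(?:[a-z0-9-]+\.)+(?:com|net|org|us)\b(?:/[^\s<>\"']*)?",
    re.IGNORECASE,
)


def find_urls(text, mode="strict"):
    """Return URL-like substrings of text for the chosen detection mode."""
    pattern = STRICT_RE if mode == "strict" else LOOSE_RE
    return [m.group(0) for m in pattern.finditer(text)]


def display_text(url, style="full", max_len=30):
    """Return the visible link text: full URL, domain only, or truncated."""
    if style == "domain":
        # urlsplit() needs a scheme to recognise the host part.
        return urlsplit(url if "://" in url else "http://" + url).netloc
    if style == "truncate" and len(url) > max_len:
        return url[: max_len - 3] + "..."
    return url
```

With this sketch, `find_urls("see google.com", "everything")` picks up `google.com`, while the strict mode ignores it because there is no scheme.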
Edited 3 time(s). Last edit at 04/04/2007 12:05PM by Ambush Commander.
|
Re: linkifying URLs April 02, 2007 10:20PM |
Admin Registered: 6 years ago Posts: 2,632 |
Yeah, it's tough (there's a reason why it's marked COMPLEX). What I was planning on doing was hooking in a token filter, which would scan Text tokens for URI-like constructs, and linkify them. It's simpler than auto-paragraphing, to be sure, but still a bear.
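
A minimal sketch of the token-filter idea, written in Python for illustration. The `Token` class, its field names, and `linkify_tokens` are hypothetical stand-ins, not HTML Purifier's actual classes; the URL pattern is deliberately simplistic:

```python
import re
from dataclasses import dataclass, field


@dataclass
class Token:
    kind: str                       # "start", "end", or "text"
    value: str                      # tag name, or character data for text
    attrs: dict = field(default_factory=dict)


URL_RE = re.compile(r"https?://[^\s<>\"']+")


def linkify_tokens(tokens):
    """Wrap URLs found in text tokens in <a> tokens, skipping existing links."""
    out = []
    depth = 0  # greater than zero while we are inside an <a> element
    for tok in tokens:
        if tok.kind == "start" and tok.value == "a":
            depth += 1
        elif tok.kind == "end" and tok.value == "a":
            depth = max(0, depth - 1)
        if tok.kind != "text" or depth > 0:
            out.append(tok)  # tags and already-linked text pass through
            continue
        pos = 0
        for m in URL_RE.finditer(tok.value):
            if m.start() > pos:
                out.append(Token("text", tok.value[pos:m.start()]))
            out.append(Token("start", "a", {"href": m.group(0)}))
            out.append(Token("text", m.group(0)))
            out.append(Token("end", "a"))
            pos = m.end()
        if pos < len(tok.value):
            out.append(Token("text", tok.value[pos:]))
    return out
```

Because the filter only ever looks at text tokens, URLs inside attributes never come up, and the depth counter handles text that is already inside a link.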
I suppose this could be implemented with regexps. What you'll need to do is create a regular expression that globs up things that look like URIs, as well as things around them that might indicate that they are inside a tag (quotes and gt/lt signs come to mind). Then, do a preg_replace_callback(), where the callback function analyzes the matches and determines whether or not to do the replacement, or simply return $matches[0]. Complicated, to be sure. I'll bump up the legit approach on my priority list.
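
Roughly, the regexp route could look like the sketch below. It is in Python, where `re.sub()` with a function argument plays the role of preg_replace_callback(); the pattern and the `linkify` name are illustrative assumptions, and the pattern is nowhere near a complete URI grammar:

```python
import re

# Alternation order matters: existing links and raw tags are consumed first,
# so a URL inside them never reaches the "url" group.
PATTERN = re.compile(
    r"(?P<anchor><a\b[^>]*>.*?</a>)"      # an existing link, text included
    r"|(?P<tag><[^>]*>)"                  # any other tag (attributes may hold URLs)
    r"|(?P<url>https?://[^\s<>\"']+)",    # a bare http(s) URL in plain text
    re.IGNORECASE | re.DOTALL,
)


def _callback(match):
    url = match.group("url")
    if url is None:
        # Matched a tag or an existing link: return it untouched,
        # the equivalent of returning $matches[0].
        return match.group(0)
    # Keep trailing sentence punctuation outside the link.
    trimmed = url.rstrip(".,;:!?)")
    tail = url[len(trimmed):]
    # The input is assumed to be HTML already, so the URL is not re-escaped.
    return '<a href="{0}">{0}</a>{1}'.format(trimmed, tail)


def linkify(html):
    """Linkify bare URLs in an HTML string, leaving tags and links alone."""
    return PATTERN.sub(_callback, html)
```

This still trips on malformed markup (an unclosed tag, a `>` inside an attribute value), which is part of why the token-based approach is the more robust one.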
HTML Purifier, Standards Compliant HTML Filtering
|
Re: linkifying URLs April 03, 2007 02:02PM |
Registered: 6 years ago Posts: 43 |
I wonder if 'linkifying' of URLs, when the feature arrives, could be optionally turned off (or on).
Some may say that HTMLPurifier should do only what its name suggests - purify HTML, correcting input data only to the extent that is needed - and that parsing text to linkify it, which might interfere with the functionality of the larger software that HTMLPurifier is embedded in, should not be its goal.
|
Re: linkifying URLs April 03, 2007 02:05PM |
Admin Registered: 6 years ago Posts: 2,632 |
Ah yes, it wouldn't be enabled by default. Definitely not. This is part of the reason why it's so late in the changelog: it has nothing to do with filtering per se. "Beyond HTML" sums it up nicely: there are certain features that are not part of HTML but are very convenient, and are easier to implement within HTML Purifier.