Posted by Vaibhav Kaushal 
May 30, 2012 05:05AM


I was wondering if there is a way to extract elements from the purified text. Something like:


$strPurified = $purifier->purify($DirtyHtml);
$arrElements = HTMLPurifier::Extract($strPurified, 'a,img,b');


and then use something like this:


$strFirstLinkInText = $arrElements['a'][0];


Wouldn't that be a great addition? Since HTMLPurifier can already tear HTML apart completely and put it back together, this would make it easier to implement functionality on the server side that we would normally not want done on the client side.

Regards, Vaibhav

May 30, 2012 09:41AM

Unfortunately not; you could just use DOM.
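
A minimal sketch of the DOM route, assuming PHP's bundled DOM extension is available. Note that `extractElements` is a hypothetical helper written for this post, not part of HTML Purifier, and the sample string stands in for real purifier output:

```php
<?php
// Hypothetical helper (not an HTML Purifier API): collects the serialized
// markup of every requested element found in already-purified HTML.
function extractElements(string $html, array $tagNames): array
{
    $doc = new DOMDocument();
    // Silence libxml warnings about fragments; restore the old setting afterwards.
    $previous = libxml_use_internal_errors(true);
    // The XML encoding hint makes libxml treat the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $result = [];
    foreach ($tagNames as $tagName) {
        $result[$tagName] = [];
        foreach ($doc->getElementsByTagName($tagName) as $node) {
            // saveHTML($node) serializes just this element, tags included.
            $result[$tagName][] = $doc->saveHTML($node);
        }
    }
    return $result;
}

// Stand-in for the output of $purifier->purify($DirtyHtml).
$strPurified = '<p>See <a href="http://example.com/">this</a> and <b>that</b>.</p>';
$arrElements = extractElements($strPurified, ['a', 'img', 'b']);
$strFirstLinkInText = $arrElements['a'][0] ?? null;
```

The result has the same shape as the array proposed in the question: one list per tag name, with an empty list for tags that do not occur.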

July 30, 2012 08:56AM

You can already manipulate HTML in fairly powerful ways with HTML Purifier if you customise it (example: removing empty <a> tags). See whether that general approach helps you. You can probably emulate most of what you'd like by customising HTML Purifier, which you can do without patching the library, might I add: it's easy to inject new classes into it.

