
using HTMLPurifier on all unsecured data

Posted by songoku 
using HTMLPurifier on all unsecured data
July 17, 2011 09:19AM

Hi, I have finally come up with a plan to take care of XSS. Of course, 99% of that plan is to use HTMLPurifier, but I also wanted to let users input any character they want while never displaying any HTML, so I came up with this:

$cleanData = htmlspecialchars($purifier->purify($dirtyData), ENT_QUOTES, 'UTF-8');

It all seems to work well: I'm getting XSS-free clean data, all characters from the user's input, and no HTML being displayed. But are there any objections to this way of dealing with data using HTMLPurifier? I would love to hear them, as I am not really a developer and this solution seems a little too simple to me.

Re: using HTMLPurifier on all unsecured data
July 17, 2011 09:25AM

There's no good reason to run HTML Purifier in this case.

Re: using HTMLPurifier on all unsecured data
July 17, 2011 06:17PM

Hi,

So what's your view on how to display XSS-free data, while letting users input any character, without using HTMLPurifier?

I thought HTMLPurifier was just that, an XSS cleaner.

Re: using HTMLPurifier on all unsecured data
July 17, 2011 07:05PM

Yes, but only if you want to treat the user input as HTML. If it's not HTML (just plain text), keeping it secured is pretty easy.
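As a minimal sketch of what "pretty easy" can look like for plain text: escape the string once, at output time, with htmlspecialchars. The helper name escape_text and the sample input are assumptions made for illustration; they do not come from the thread.

```php
<?php
// Hypothetical helper (the name is an assumption); htmlspecialchars() is the
// only library call involved.
function escape_text(string $text): string
{
    // ENT_QUOTES escapes both single and double quotes, so the result is also
    // usable inside a quoted attribute value. ENT_SUBSTITUTE replaces invalid
    // UTF-8 sequences instead of returning an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// The user may type any character; markup is displayed literally, never run.
$dirtyData = '<script>alert("xss")</script> & 5 < 6';
echo '<p>' . escape_text($dirtyData) . '</p>', "\n";
// prints: <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt; &amp; 5 &lt; 6</p>
```

The browser shows the original characters exactly as typed, and nothing in the input is interpreted as a tag.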

Re: using HTMLPurifier on all unsecured data
July 17, 2011 08:42PM

Yes, but only if you want to treat the user input as HTML. If it's not HTML (just plain text), keeping it secured is pretty easy.

So how do I keep data secure without something like HTMLPurifier, while at the same time letting users input any character?

I thought XSS attacks were as simple as calling something like xss(); in a link.

Regards

Re: using HTMLPurifier on all unsecured data
July 24, 2011 10:37PM

I suggest reading http://mit.edu/~ezyang/Public/iap/intro-to-was.html

Essentially, you need to know what kind of data you're handling, so that you know how to properly validate it, as well as escape it properly when you put it into its final context. If you are handling URLs, an HTML filter does you no good.
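To illustrate "escape it properly for its final context", here is a rough sketch of the same user string being prepared for three different destinations. The variable names and the example.com URL are assumptions made for this illustration.

```php
<?php
$userInput = 'Tom & "Jerry" <3';

// Context 1: HTML text or a quoted HTML attribute value.
$forHtml = htmlspecialchars($userInput, ENT_QUOTES, 'UTF-8');

// Context 2: a single component of a URL query string.
// (The finished URL still needs htmlspecialchars when placed in an href.)
$forUrl = 'https://example.com/search?q=' . rawurlencode($userInput);

// Context 3: a JavaScript string literal inside a <script> block.
// The JSON_HEX_* flags keep <, >, &, ' and " out of the emitted literal.
$forJs = json_encode(
    $userInput,
    JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
);

echo $forHtml, "\n";
echo $forUrl, "\n";
echo $forJs, "\n";
```

Each encoder is only correct for its own destination, which is why a single "clean everything" pass cannot cover all of them.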

Re: using HTMLPurifier on all unsecured data
August 03, 2011 09:31PM
Essentially, you need to know what kind of data you're handling, so that you know how to properly validate it, as well as escape it properly when you put it into its final context.

Hi, I've been reading up on XSS attacks and I came back to this solution: passing all data through HTMLPurifier, in order to allow any character to be printed as plain text.

Please let me know why this is not a good solution to avoid XSS attacks:

Use HTML Purifier on all outputted data, and afterwards use htmlspecialchars. Now I don't need to validate anything, I can allow any character to be printed as I intend it to be, and I'm free of XSS patterns, which get cleaned up by HTML Purifier.

Please let me know why this is not a good solution, because I cannot see the downside.

Thanks.

Re: using HTMLPurifier on all unsecured data
August 03, 2011 11:28PM

Suppose you adopt this strategy. Furthermore, suppose you allow users to submit link URLs. A user submits the URL 'javascript:alert("Boom!")'. HTML Purifier doesn't change this at all, since this is perfectly valid HTML. You get XSSed.

You must must must think about what data your strings represent, and also what contexts they will be put in. Otherwise you will get it wrong.
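For the link URL case described in this reply, one possible approach is to validate the URL's scheme against an allowlist before it is ever written into an href. This is a sketch under assumptions: the helper name safe_link_url and the choice of http and https as the allowed schemes are illustrative, not something prescribed in the thread.

```php
<?php
// Hypothetical helper: returns the trimmed URL when its scheme is on the
// allowlist, or null otherwise. Relative and scheme-less URLs are rejected
// too, which is a deliberately strict assumption of this sketch.
function safe_link_url(string $url, array $allowedSchemes = ['http', 'https']): ?string
{
    $url = trim($url);
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (!is_string($scheme) || !in_array(strtolower($scheme), $allowedSchemes, true)) {
        return null;
    }
    return $url;
}

$submitted = 'javascript:alert("Boom!")';
$url = safe_link_url($submitted);

if ($url === null) {
    echo "rejected\n";
} else {
    // Validation decides whether the URL is acceptable at all; escaping is
    // still needed because the value is going into an HTML attribute.
    echo '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">link</a>', "\n";
}
```

Escaping alone would not help here: htmlspecialchars turns the double quotes into &quot;, but the browser decodes the attribute value again before following the link, so the javascript: URL would still run when clicked.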

Re: using HTMLPurifier on all unsecured data
August 05, 2011 09:21PM

Suppose you allow users to submit link URLs. A user submits the URL 'javascript:alert("Boom!")'. HTML Purifier doesn't change this at all, since this is perfectly valid HTML. You get XSSed.

Hi, thanks. I see now that there is a better way to handle XSS than passing everything through HTMLPurifier.

But I just tried <a href='javascript:alert("Boom!")'>click</a> in your demo, and HTMLPurifier caught it. What's up with that?

I'm now thinking of passing everything that is active HTML (URLs included) through HTMLPurifier before inserting the data into the database. Would you say that this is a better way to handle XSS in URLs with HTMLPurifier?

Re: using HTMLPurifier on all unsecured data
August 05, 2011 09:32PM

Rene Magritte has a rather famous painting, which is subtitled “Ceci n'est pas une pipe.”

http://en.wikipedia.org/wiki/File:MagrittePipe.jpg

At first glance, it seems rather odd. Of course it’s a pipe! But it’s not; rather, it’s a *representation* of a pipe. Physically it is canvas and paint, or, in the case of your laptop screen, pixels of red, green and blue. The actual pipe is a different thing. Once you separate representation from the actual object, the distinction is clear, and it is indeed true: a painting of a pipe is not a pipe.

So, what does this have to do with HTML Purifier? Like the painting and the physical pipe, there is a difference between the actual URL and the representation of a URL inside an HTML document. HTML Purifier works fine on HTML documents, doing the right thing to URLs represented in the document. It works on the painting. But if you give HTML Purifier the actual object, just a URL string, it will treat that string as HTML text, which is nonsensical for a URL.

Re: using HTMLPurifier on all unsecured data
August 05, 2011 10:28PM

Okay, I got that. Thanks.

Re: using HTMLPurifier on all unsecured data
August 06, 2011 05:44PM

Although, why do I need HTMLPurifier to clean the URL that appears on the HTML page, when the URL itself (the object) still contains XSS? If I use an escape function when printing the URL to the page, I don't need HTMLPurifier, is that correct?

<a href="url/with/xss"> escaped(url) </a>

Am I correct to think that HTMLPurifier is not needed here?

Re: using HTMLPurifier on all unsecured data
August 06, 2011 05:45PM

Can you give an example? I don't understand the question.

Re: using HTMLPurifier on all unsecured data
August 06, 2011 05:51PM

Yes, I edited the post above to include an example.

Re: using HTMLPurifier on all unsecured data
August 06, 2011 05:53PM

Yeah, I can do that in HTML Purifier fine. It's not a problem. javascript:alert('Bang!')

Re: using HTMLPurifier on all unsecured data
August 06, 2011 06:25PM
I assume you did $purifier->purify("javascript:alert('Bang!')"), right?

Like the following?

<a href='http://google.com'><?php echo $purifier->purify("javascript:alert('Bang!')"); ?></a>

Wouldn't it be the same thing using htmlspecialchars instead?

<a href='http://google.com'><?php echo htmlspecialchars("javascript:alert('Bang!')"); ?></a>

Re: using HTMLPurifier on all unsecured data
August 06, 2011 06:29PM

Yes. HTML Purifier just uses htmlspecialchars on text. That bit is simple. The tricky bit is handling HTML tags.
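One consequence worth sketching: if text has already been escaped once, running htmlspecialchars over it again encodes the ampersands a second time. The snippet below does not call HTML Purifier; it uses a first htmlspecialchars pass as a stand-in for "something already escaped this text", which is an assumption made so the example stays self-contained.

```php
<?php
$text = 'Tom & Jerry';

// First pass: stand-in for whatever has already escaped the text.
$once = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// Second pass: the extra htmlspecialchars wrapped around the result.
$twice = htmlspecialchars($once, ENT_QUOTES, 'UTF-8');

echo $once, "\n";  // prints: Tom &amp; Jerry
echo $twice, "\n"; // prints: Tom &amp;amp; Jerry
```

A browser displays the second string as the literal text "Tom &amp; Jerry", so the user no longer sees what they typed.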

Re: using HTMLPurifier on all unsecured data
August 06, 2011 06:38PM

Okay, I did not know that HTMLPurifier uses htmlspecialchars.

About the tricky part you mention: is it handling HTML tags inside the HREF attribute that's tricky?

Is that what you mean?

Re: using HTMLPurifier on all unsecured data
August 06, 2011 07:14PM

Yes, and other things too. (note that what's inside HREF is not HTML)

Re: using HTMLPurifier on all unsecured data
August 07, 2011 05:11PM

Yes, I guess I am going to need a validation function instead of HTMLPurifier then. Thanks for bringing that to my attention and for all the help you provided me. Regards
