
htmLawed comparison

Posted by patnaik 
htmLawed comparison
March 02, 2008 03:33AM

Thank you very much Ambush Commander for your review. You have indeed spent a lot of time delving into htmLawed. I hope you will keep the review updated. In the comparison table, htmLawed can be 'XSS-safe (user defined)' with 'almost' proper nesting, etc.

I hope those who read the review will not be swayed by the serious words it so casually uses, and will carefully examine the points you raise. I also hope one doesn't write off the largely negative review as uninformed, supercilious, self-righteous, biased or grudgeful!

The '<table><td>' test case: I am looking into it. If you are aware of other test cases that fail, let me know.

Whitelist for attributes: Attributes, besides being black-listed through 'deny_attribute', can be white-listed using the $spec argument.
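
To make the black-list versus white-list distinction concrete, here is a minimal Python sketch. It is not htmLawed code and does not use htmLawed's $spec syntax; the function names and the sample attribute sets are hypothetical, chosen only for illustration.

```python
# Illustrative only: hypothetical helpers, not part of htmLawed or HTML Purifier.

def filter_by_denylist(attrs, denied):
    """Keep every attribute except those explicitly denied."""
    return {name: value for name, value in attrs.items()
            if name.lower() not in denied}


def filter_by_allowlist(attrs, allowed):
    """Keep only attributes that are explicitly allowed."""
    return {name: value for name, value in attrs.items()
            if name.lower() in allowed}


attrs = {"href": "/docs", "title": "Docs",
         "onclick": "steal()", "onmouseover": "steal()"}

# A black-list that forgets 'onmouseover' lets it through.
denylisted = filter_by_denylist(attrs, denied={"onclick", "style"})

# A white-list drops anything not named, including handlers nobody thought of.
allowlisted = filter_by_allowlist(attrs, allowed={"href", "title"})

print(sorted(denylisted))
print(sorted(allowlisted))
```

The point of the sketch is only that a black-list is as safe as it is complete, while a white-list fails closed for anything it does not name.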

XSS-safe 'out of tin': The htmLawed documentation is very clear and elaborate. Web application developers/administrators have to configure htmLawed only once, and it is so easy to disallow 'unsafe' tags and attributes; see the docs.

javascript: protocol: Like above.

HTML5 support: I am correcting that mistake in the documentation. If there is anything else about HTMLPurifier in the htmLawed documentation that is incorrect, do let me know.

Re: htmLawed comparison
March 03, 2008 02:10AM

Version 1.0.3 of htmLawed has been released.

* The table-td bug is fixed

* A new $config magic parameter that auto-adjusts other parameters has been added for easily configuring htmLawed for 'safe' HTML

* Documentation changes

Thanks to A.C. for the pointers!

Re: htmLawed comparison
March 03, 2008 11:58PM

Re: your exhortations to read the specs: An application like htmLawed cannot be written without reading the specs, AC! htmLawed currently doesn't check for empty content or element ordering and clearly states this in the documentation, which is recommended reading for all before posting :)

One can also rightly infer from the section on htmLawed's limitations that htmLawed's main aim is to provide developers/web-site administrators a highly configurable and fast utility to limit HTML markup, for maintaining a page layout or style and/or reducing security vulnerabilities.

Checking for full standard-compliance is not the priority for this little utility. Some aspects of the HTML standards are simply less important (like tfoot before tbody): not sticking to them does not affect page security, or rendering by user-agents. Then there is the added overhead of dealing with such non-compliant code: should both tfoot and tbody be removed (what then of their rows), should the script just swap them, and so on.
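
The "just swap them" option can be sketched in a few lines. This is a hypothetical Python illustration, not htmLawed behaviour: it assumes the table's direct children have already been parsed into (tag, content) pairs, and it uses the HTML 4 ordering of caption, col/colgroup, thead, tfoot, tbody.

```python
# Hypothetical sketch: reorder a table's direct children into HTML 4 order.
# Not htmLawed code; the (tag, content) pair representation is an assumption.

HTML4_TABLE_ORDER = {"caption": 0, "col": 1, "colgroup": 1,
                     "thead": 2, "tfoot": 3, "tbody": 4}


def reorder_table_children(children):
    """Stable-sort children so that tfoot precedes tbody, keeping rows intact.

    Tags not covered by the mapping (for example a bare tr) sort last,
    alongside tbody, and keep their relative order.
    """
    return sorted(children,
                  key=lambda child: HTML4_TABLE_ORDER.get(child[0].lower(), 4))


children = [("tbody", "<tr><td>data</td></tr>"),
            ("tfoot", "<tr><td>total</td></tr>"),
            ("thead", "<tr><th>name</th></tr>")]

reordered = reorder_table_children(children)
print([tag for tag, _ in reordered])
```

Even this simple version leaves the harder questions open, such as what to do with a second tfoot or with stray content between sections, which is the overhead referred to above.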

For standard-sticklers, htmLawed will never be perfect. For them, there is Tidy and clones. For others, htmLawed may be perfect.

Thanks again for reviewing htmLawed. Prospective users will definitely be influenced by it.

PS: In the comparison table, what differentiates 'almost' from 'no'? Also, may be worth mentioning supported element sets. One big reason forcing me to develop htmLawed was missing support for form, script, etc., in HTMLPurifier (speed and memory usage was secondary).

Re: htmLawed comparison
March 04, 2008 12:12AM

Since you've closed the topic on your forum, I'll respond here.

I didn't understand the fact that htmLawed doesn't claim to achieve full standards-compliance. I suppose I was misled by statements like, "The lawing in of input text is needed to ensure that HTML code in the text is standard-compliant." If I read carefully, I can see that htmLawed only claims to make HTML "more secure and standard-compliant."

I'll agree that standards-compliance isn't the top priority. However, the XSS issues are.

"In the comparison table, what differentiates 'almost' from 'no'?"

'Almost' is used when there are a few minor bugs that could easily be fixed to make it 'yes.' If there are numerous bugs(?), and the author expresses that they have no intention of fixing them, I put down 'no.'

"Also, may be worth mentioning supported element sets. One big reason forcing me to develop htmLawed was missing support for form, script, etc., in HTMLPurifier (speed and memory usage was secondary)."

HTML Purifier supports every element and attribute that can be safely put in a document. No more, no less. (Weeeell, except for maybe a few really obscure attributes that no browsers support). While HTML Purifier does have some degree of "Tidy" style functionality using %HTML.Trusted, its primary focus has always been, and always will be, security. This also means no forms, for phishing reasons, and no script, for obvious reasons.

But even then, HTML Purifier's architecture makes it really easy to add support for these elements (indeed, when trusted mode is on, script tags are allowed). But it's on the back-burner for now, in favor of better PHP5 support and better documentation.
