CSS?

Posted by Brett 
Brett
CSS?
August 02, 2007 09:38PM

Hello,

Are there any means, or plans to add the means, for purifying CSS when it is injected into a stylesheet, i.e., when removing the <style> tags is not sufficient (e.g., removing the javascript: protocol from url(), etc.)?

thanks, Brett

Edited 1 time(s). Last edit at 08/02/2007 11:13PM by Ambush Commander.

Re: CSS?
August 02, 2007 11:14PM

Plans, yes. Means, no. Legitimate token-based CSS parsing and, by extension, parsing of style sheets is scheduled for version 3, although work on it will probably start earlier.

Brett
Re: CSS?
August 03, 2007 06:17AM

Hi,

Thanks.

I recently came across a project that may differ from what you need to develop, but it does the trick for me for CSS parsing. (I recently joined its developers to make a few changes of my own, and the project manager is looking for someone to take over the project.) Maybe it could be of some help to you in getting started?

http://csstidy.sourceforge.net/

By the way, thanks for developing/working on something which is so essential...

best wishes, Brett

Re: CSS?
August 03, 2007 08:44AM

CSS Tidy suffers from "Not Invented Here" syndrome; also, its GPL license makes it difficult to integrate with HTML Purifier, which is LGPL. The general approach it takes, however, is basically what we need.

Brett
Re: CSS?
August 03, 2007 04:35PM

Well, I can't cure the "Not Invented Here" syndrome (assuming it ought to be cured), but I just changed the license of CSS Tidy to LGPL with the approval of Flo (owner of the project). He still has to change the SF display of the license type, but you'll see that all the files use LGPL. :) So feel free to borrow away...

Re: CSS?
August 03, 2007 04:39PM

O.o That's pretty awesome. We might as well use it then. :-)

ericmn
Re: CSS?
November 09, 2007 07:36PM

So what's the progress on merging it with CSS Tidy?

Re: CSS?
November 10, 2007 01:32AM

HTML Purifier's development is in maintenance while PHP 4 is about to be deprecated. Once 2008 rolls around, development of new features will begin to pick up again.

Re: CSS?
December 12, 2007 05:56PM

For those following this thread, experimental support for extracting <style> blocks and cleaning them has been added to trunk (PHP5-only). Thanks to Chris for paying for this development! Check out a copy of the HTML Purifier repository and use this to test (note that CSSTidy must be available on your system):

<?php
// change these two paths as necessary
// (assumes HTML Purifier itself has already been loaded)
require_once 'class.csstidy.php';
require_once 'class.csstidy_print.php';
require_once 'HTMLPurifier/Filter/ExtractStyleBlocks.php';

$purifier = new HTMLPurifier();
$purifier->addFilter(new HTMLPurifier_Filter_ExtractStyleBlocks());
$text = $purifier->purify(
  '<style>.foo{text-align:left;bogus:foo;}</style>'.
  '<span class="foo">a</span>'
);
print_r($text);
// '<span class="foo">a</span>'
print_r($purifier->context->get('StyleBlocks'));
/*
array
(
    0 => '.foo {
text-align:left;
}'
)
*/

Re: CSS?
December 13, 2007 08:14AM

Wonderful news!

Within CSSTidy, the file data.inc.php has some groupings of allowable CSS which, I think, could easily be manipulated by functions (if not themselves grouped into modules) to remove groups of allowable values, e.g., if people didn't want to allow certain colors or certain properties (such as display, which could hide elements), as I saw discussed on one of your ideas pages.

Oh, also had the idea while looking at the units to have a function optionally auto-convert inches to cm, etc. :)

take care, Brett

Re: CSS?
December 13, 2007 09:35AM

Also, do you have plans to add support for the style attribute too?

Re: CSS?
December 13, 2007 12:59PM

> Within CSSTidy, the file data.inc.php has some groupings of allowable CSS which, I think, could easily be manipulated by functions (if not themselves grouped into modules) to remove groups of allowable values, e.g., if people didn't want to allow certain colors or certain properties (such as display, which could hide elements), as I saw discussed on one of your ideas pages.

I noticed this. However, as of right now HTML Purifier is completely bypassing CSSTidy's validation/optimization logic, so all these validations are done on HTML Purifier's side (there's no way of guaranteeing the security of CSSTidy's validation either). I am using CSSTidy almost solely for its parsing capabilities. It'll be interesting to see what from this file we can reuse, but its non-OOP design makes it more difficult to jibe with HTML Purifier.

> Oh, also had the idea while looking at the units to have a function optionally auto-convert inches to cm, etc. :)

An interesting idea, although it sounds a bit strange to me. I could easily hack it into HTMLPurifier_AttrDef_CSS_Length, but I can't see any practical use for it.

> Also, do you have plans to add support for the style attribute too?

If I do this, CSSTidy will need to be distributed with HTML Purifier, since style attribute validation is integral to HTML Purifier. I still need to analyze whether or not the parser can be used to parse style strings, and not actual style sheets.

Re: CSS?
February 18, 2008 01:42PM

Just FYI, when using get('StyleBlocks'), if the CSS is wrapped in the usual <!-- blah -->, the StyleBlock output has "\3C !--" tacked on the front, with no corresponding close comment.

-- hugh

Re: CSS?
February 18, 2008 07:58PM

Indeed. This is a bug; I'll have a fix soon.

Re: CSS?
February 18, 2008 08:29PM

Realized I hadn't responded to this way back...

>> Within CSSTidy, the file data.inc.php has some groupings of allowable CSS which, I think, could easily be manipulated by functions (if not themselves grouped into modules) to remove groups of allowable values, e.g., if people didn't want to allow certain colors or certain properties (such as display, which could hide elements), as I saw discussed on one of your ideas pages.

> I noticed this. However, as of right now HTML Purifier is completely bypassing CSSTidy's validation/optimization logic, so all these validations are done on HTML Purifier's side (there's no way of guaranteeing the security of CSSTidy's validation either).

You mean CSSTidy doesn't guarantee security either, right? I take it you're referring to bogus image insertions, etc.? Can we do something about that?

> I am using CSSTidy almost solely for its parsing capabilities. It'll be interesting to see what from this file we can reuse, but its non-OOP design makes it more difficult to jibe with HTML Purifier.

Not sure what you mean by non-OOP. It does use classes and objects and the like... Do you mean it doesn't fully take advantage of OOP features? If so, which ones?

>> Oh, also had the idea while looking at the units to have a function optionally auto-convert inches to cm, etc. :)

> An interesting idea, although it sounds a bit strange to me. I could easily hack it into HTMLPurifier_AttrDef_CSS_Length, but I can't see any practical use for it.

No, this is not a pressing need. But some might prefer to work with a particular standard (and the rest of us in the world, i.e., the US, Liberia, and Myanmar, should already be using metric!).

>> Also, do you have plans to add support for the style attribute too?

> If I do this, CSSTidy will need to be distributed with HTML Purifier, since style attribute validation is integral to HTML Purifier. I still need to analyze whether or not the parser can be used to parse style strings, and not actual style sheets.

To parse a style string, couldn't you just encapsulate the whole string in a dummy class selector or the like and then strip it?

Re: CSS?
February 18, 2008 10:58PM

> You mean CSSTidy doesn't guarantee security either, right? I take it you're referring to bogus image insertions, etc.? Can we do something about that?

That, and more. CSSTidy doesn't guarantee standards-compliance with the CSS spec; HTML Purifier does. :-)

> Not sure what you mean by non-OOP. It does use classes and objects and the like... Do you mean it doesn't fully take advantage of OOP features? If so, which ones?

OOP is not simply using classes and objects; it's a design philosophy. Consequently, the question of which "features" aren't being used is a little nonsensical. Polymorphism is a biggie, though.

> No, this is not a pressing need. But some might prefer to work with a particular standard (and the rest of us in the world, i.e., the US, Liberia, and Myanmar, should already be using metric!).

True. But practically speaking, HTML Purifier's output is meant for output-only (and not re-editing by the person), so a conversion would introduce errors in scaling without any tangible benefit.

> To parse a style string, couldn't you just encapsulate the whole string in a dummy class selector or the like and then strip it?

The problem with this approach arises if the user passes something like text-align:left;}more-css:properties;. In theory, this is harmless in terms of security, but the user will end up losing the rest of the CSS string even though we should be able to keep it.

We've encountered this problem previously with HTMLPurifier_Lexer_DOMLex; this lexer operates by wrapping everything in HTML/BODY/DIV tags, so if a user has a stray closing div tag, the rest of the document mysteriously disappears.
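
To make the stray-brace problem with the dummy-selector idea concrete, here is a minimal sketch. It assumes a naive parser that ends the rule body at the first closing brace; the function parse_style_via_dummy_selector is hypothetical and is not part of HTML Purifier or CSSTidy, whose real parsers may behave differently.

```php
<?php
// Hypothetical helper (not HTML Purifier or CSSTidy API): wraps a style
// string in a dummy selector and reads declarations up to the first "}".
function parse_style_via_dummy_selector(string $style): array
{
    $sheet = '.dummy{' . $style . '}';

    // Naive stand-in for a stylesheet parser: the rule body is everything
    // between the first "{" and the first "}" that follows it.
    $start = strpos($sheet, '{') + 1;
    $end = strpos($sheet, '}', $start);
    $body = substr($sheet, $start, $end - $start);

    $declarations = [];
    foreach (explode(';', $body) as $declaration) {
        $parts = explode(':', $declaration, 2);
        if (count($parts) === 2) {
            $declarations[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $declarations;
}

// A well-formed style string keeps both declarations.
$clean = parse_style_via_dummy_selector('text-align:left;color:red;');

// A stray "}" closes the dummy rule early, so everything after it is dropped.
$stray = parse_style_via_dummy_selector('text-align:left;}more-css:properties;');

print_r($clean);
print_r($stray);
```

In this sketch the second call returns only the text-align declaration: nothing unsafe gets through, but the rest of the user's string is silently discarded, which is the same kind of loss as the stray closing div in DOMLex.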
