Posted by Brett 
August 03, 2007 06:36AM


I came across your page about Purifier's "Tidy" features, which looks great.

In case it might help you, I recently compiled a list of features needing cleaning that were missing from HTMLTidy (see this page). In so doing, I also did something which might still enhance your own Tidy features: I listed potential "ultra-cleaned" attributes and elements which, while not deprecated even in XHTML 1.0 Strict or XHTML 1.1, are still presentational. I also:

  1. Proposed XHTML Module support (sound familiar?)
  2. Referred to proprietary elements and attributes needing dropping or conversion (though your whitelist approach may not need this)
  3. Requested transparency in documentation of what is cleaned and how
  4. Added to a feature request for the kind of filtering you apparently have already addressed
  5. Requested that the API allow one to easily extract the CSS output when done in clean mode

What I like about your program's Tidy feature (though I haven't had a chance to really get into it yet) is that I can just use your PHP API to make the desired changes, and transparently so. Very nice. It is so logical to combine filtering and tidying (though I presume your program won't do everything Tidy does, like removing duplicate ids?).

take care, Brett

Edited 2 time(s). Last edit at 08/03/2007 08:31AM by Ambush Commander.

Re: Tidy
August 03, 2007 08:39AM
  1. Already implemented, see HTMLModuleManager and HTMLModule/
  2. Already implemented, see HTMLModule_Legacy
  3. Already implemented, see printDefinition. Note that cleaning rules are configuration-dependent
  4. That's the whole point of HTML Purifier! :-)
  5. I've had someone else request this feature; it's on the roadmap (I think, if not, I'll add it)

If you'd like to read more about the Tidy feature, please consult the Tidy documentation. Our program produces 100% standards-compliant HTML even with Tidy turned off (so yes, duplicate ids are caught); Tidy simply makes sure that some deprecated elements get converted into standards-compliant alternatives instead of being dropped.
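
For readers who want to try this, here is a minimal sketch of turning the Tidy feature up through the PHP API. It assumes HTML Purifier is installed with its autoloader at the path shown, and it uses the dotted directive names (`HTML.TidyLevel`, `Attr.EnableID`) from current releases; older releases may expect a different `set()` call form. The input string is made up for illustration.

```php
<?php
// Assumption: HTML Purifier is installed and this path resolves to its autoloader.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// id attributes are stripped by default; enable them so duplicate handling is visible.
$config->set('Attr.EnableID', true);

// Tidy aggressiveness: 'none', 'light', 'medium' or 'heavy'.
$config->set('HTML.TidyLevel', 'heavy');

$purifier = new HTMLPurifier($config);

// Hypothetical input: a deprecated <center> element and a repeated id.
$dirty = '<center id="a">Hello</center><p id="a">World</p>';

echo $purifier->purify($dirty);
```

With these settings the deprecated element should come back as a standards-compliant equivalent and the repeated id should be dropped from the second element. Setting `HTML.TidyLevel` to `none` switches off the conversions only; the output is still filtered against the whitelist.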

Re: Tidy
August 03, 2007 01:37PM

Great... I appreciate your pointing me in the right direction... Actually, I made those posts before I had looked carefully at HTML Purifier to know it did any Tidying, so I was mostly just sharing FYI (and in case you had feedback on those points)...

But besides getting carried away with sharing this with you (and being happy to find that you did in fact already implement many of these things), my main point was actually to give you a listing which might be used toward implementing the "ultra-clean" concept, whereby one could restrict/convert presentational attributes even further than XHTML Strict/1.1 does. (I know, as one person responded, there are arguments for keeping in <small>, etc., but firstly, most use of these tags was probably presentational, and secondly, I think one could use XML dialects, alternate choices like <fine> or whatever--anyways, the suggestion was to be able to ultra-clean as an option, not as a requirement or default.)

Again, wonderful work... Nice to see work on things which all web developers really need...

best wishes, Brett

Re: Tidy
August 05, 2007 07:12PM

If you do end up going with a CSS purifier, might you also consider this approach in your equivalent of Tidy's 'clean' mode?


thanks, Brett
