It's never happened before; HTML Purifier is now having its first ever release candidate! Not since the beta-days has such a momentous event occurred!
All joking aside, there have been some serious changes in the way HTML Purifier is loaded on the developer side; needless to say, I will not be surprised if you see a fat Fatal error when you drop in this release candidate. Some of these errors are intentional, some are not (well, I hope not!). I need your help to iron out bugs stemming from different system configurations before I do the official release.
If you are a new user, you can treat this version as stable and use it normally. The main uncertainty is with regard to upgrade paths.
So, if you are willing and ready, grab a copy, and then read on...
Download
Autoloading
Autoloading is the single largest architectural change in HTML Purifier, and under certain circumstances it can give you a hefty performance boost too (not using the autoloader, but hold onto that thought for a moment). Previously, HTML Purifier loaded everything it needed from HTMLPurifier.php. Things have changed a little. I've investigated this thoroughly, and the following cases will require some user intervention:
You're a PEAR user
Previously, I told you to use this code:
require_once 'HTMLPurifier.php';
This will no longer be sufficient, because it doesn't register HTML Purifier's autoloader. Replace the line with:
require_once 'HTMLPurifier.auto.php';
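For context, a minimal end-to-end sketch of the new entry point might look like the following. It assumes HTML Purifier's library directory is on your include path, and the $dirty_html value is just a made-up input for illustration.

```php
<?php
// Assumes HTML Purifier's library/ directory is on the include path.
// Loads the bootstrap code and registers HTML Purifier's autoloader.
require_once 'HTMLPurifier.auto.php';

// Made-up input for illustration.
$dirty_html = '<b>Hello</b><script>alert(1)</script>';

// Classes are now loaded on demand by the autoloader.
$purifier = new HTMLPurifier();
$clean_html = $purifier->purify($dirty_html);

echo $clean_html;
```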
You included HTMLPurifier.php directly
Follow the same instructions as a PEAR user.
You are already using autoloading, and are on a version of PHP earlier than 5.1.2
In early versions of PHP 5, there was no way to register multiple autoload
handlers (with spl_autoload_register). You will need to
manually modify your autoloader to get HTML Purifier to play nice with it.
Suppose your autoload function looks like this:
function __autoload($class) {
    require str_replace('_', '/', $class) . '.php';
    return true;
}
A modified version with HTML Purifier would look like this:
function __autoload($class) {
    if (HTMLPurifier_Bootstrap::autoload($class)) return true;
    require str_replace('_', '/', $class) . '.php';
    return true;
}
Make sure you call HTMLPurifier_Bootstrap::autoload() first,
because it will ignore class names that aren't prefixed with HTMLPurifier.
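To see why the ordering works, here is a rough, self-contained sketch of the control flow. Demo_Bootstrap and demo_autoload are hypothetical stand-ins (not the real HTMLPurifier_Bootstrap), and the prefix check is only an approximation of the behavior described above; the point is just that a handler which declines unknown prefixes can safely go first.

```php
<?php
// Hypothetical stand-in for HTMLPurifier_Bootstrap, used only so this
// sketch runs without the library installed.
class Demo_Bootstrap
{
    // Claims only classes whose names start with "HTMLPurifier".
    public static function autoload($class)
    {
        if (strncmp($class, 'HTMLPurifier', 12) !== 0) {
            return false; // not ours; let the caller's own logic handle it
        }
        // The real method would require the class file here.
        return true;
    }
}

// Mirrors the control flow of the modified __autoload() above, but
// reports which branch handled the class instead of requiring a file.
function demo_autoload($class)
{
    if (Demo_Bootstrap::autoload($class)) {
        return 'purifier';
    }
    return 'application';
}

echo demo_autoload('HTMLPurifier_Config'), "\n";
echo demo_autoload('My_Model_User'), "\n";
```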
You are already using autoloading, and are on PHP 5.1.2+
Congratulations; you probably won't need to make any modifications.
However, it's worth taking a look whether or not you are using
__autoload or spl_autoload_register. If it's the
former, you may want to consider adding this line of code to your
application:
spl_autoload_register('__autoload');
This is a good idea because spl_autoload_register overrides
any __autoload function, so if a misbehaving library (not HTML Purifier,
of course!) registers its
own autoloader function, yours will mysteriously stop working. You are
required to do this if your autoloader is defined after
HTML Purifier's autoloader is registered.
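A slightly more defensive variant of that registration is sketched below. The app_autoload function is a hypothetical application autoloader, and the use of stream_resolve_include_path() assumes a newer PHP than 5.1.2; treat it as one possible arrangement rather than a required setup.

```php
<?php
// Hypothetical application autoloader: maps Foo_Bar to Foo/Bar.php.
function app_autoload($class)
{
    $file = str_replace('_', '/', $class) . '.php';
    // Only require files that actually exist on the include path, so
    // other registered autoloaders still get a chance to run.
    // (Assumption: stream_resolve_include_path() is available.)
    if (stream_resolve_include_path($file) !== false) {
        require $file;
    }
}

// Put the application autoloader on the SPL stack.
spl_autoload_register('app_autoload');

// If a legacy __autoload() is defined, register it as well so that it
// keeps working once other code calls spl_autoload_register().
if (function_exists('__autoload')) {
    spl_autoload_register('__autoload');
}
```

Because each handler only loads files it can find, several autoloaders can sit on the same stack without stepping on each other.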
Some extra notes
With those modifications, your HTML Purifier installation should not be fatally error'ing out. If it is, please post in the Support forums and I'll try to help and figure it out.
If you've got things working, and would like to try some of the newest features out, check out the following files:
- HTMLPurifier.includes.php
- This is the performance-friendly file I was talking about earlier. If you use this, you don't need the autoloader at all—just swap 'auto' with 'includes'. The downside is that if you are using any non-standard classes, you'll need to include them manually.
- HTMLPurifier.kses.php
- On the prompting of Lukasz Pilorz, I wrote a little wrapper for HTML Purifier using the kses interface. It's pretty neat and works with kses's configuration parameters, so check it out if you've got some legacy code you want to migrate.
- HTMLPurifier.safe-includes.php
- This is the not-so-performance-friendly counterpart of HTMLPurifier.includes.php. On the plus side, however, it doesn't need autoload, and it can be included from anywhere with impunity.
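As a rough sketch of the includes-based setup: the snippet below assumes the library directory is on your include path and that the YouTube filter is one of the non-standard classes you would have to include by hand.

```php
<?php
// Assumes HTML Purifier's library/ directory is on the include path.
// Loads the standard classes up front; no autoloader is registered.
require_once 'HTMLPurifier.includes.php';

// Non-standard classes are not covered, so include them manually.
// Assumption: the YouTube filter is one such class.
require_once 'HTMLPurifier/Filter/YouTube.php';

$purifier = new HTMLPurifier();
```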
Filters
The interface for registering filters changed slightly. You may have noticed
some E_USER_WARNINGs emitting from code that looks like:
$purifier = new HTMLPurifier();
require_once 'HTMLPurifier/Filter/YouTube.php';
$purifier->addFilter(new HTMLPurifier_Filter_YouTube());
We've replaced addFilter() with some new configuration directives.
Combined with autoloading, the above code turns into:
$config = HTMLPurifier_Config::createDefault();
$config->set('Filter', 'YouTube', true);
$purifier = new HTMLPurifier($config);
If you're using a custom filter, you'll need some slightly different code:
$config = HTMLPurifier_Config::createDefault();
$config->set('Filter', 'Custom', array(
    new YourCustomFilter()
));
$purifier = new HTMLPurifier($config);
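If you have not written a custom filter before, a bare-bones one might look roughly like this. The class body is hypothetical (the [[greeting]] marker is made up), and it assumes the HTMLPurifier_Filter base class with its $name property and preFilter()/postFilter() hooks.

```php
<?php
// Assumes HTML Purifier's library/ directory is on the include path.
require_once 'HTMLPurifier.auto.php';

// Hypothetical filter; the marker text is made up for illustration.
class YourCustomFilter extends HTMLPurifier_Filter
{
    public $name = 'YourCustom';

    // Runs on the raw HTML before purification.
    public function preFilter($html, $config, $context)
    {
        return str_replace('[[greeting]]', 'Hello', $html);
    }

    // Runs on the purified HTML afterwards; here it is a no-op.
    public function postFilter($html, $config, $context)
    {
        return $html;
    }
}

$config = HTMLPurifier_Config::createDefault();
$config->set('Filter', 'Custom', array(new YourCustomFilter()));
$purifier = new HTMLPurifier($config);
```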
Everything else...
There may be a few miscellaneous warnings left. If your error-reporting level includes notices, you might see HTML Purifier complaining about the usage of deprecated aliases. Don't worry: I'm not going to remove those aliases, but from a performance standpoint it's a good idea to convert the old directive to the new directive.
From there, it gets highly internal. If you've been making custom modules
for yourself, please note that the signature of
HTMLPurifier_HTMLModule->addElement() has changed; there is
no more $safe parameter. However, there was no
$safe parameter to begin with in
HTMLPurifier_HTMLDefinition->addElement(), so users of that
method don't have to worry about this change. For the curious, this change
is indicative of the shift from element-based safety to module-based
safety. Once I implement more elements and attributes for trusted mode,
there will be more documentation for this.
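For module authors, a call with the new signature might look something like the sketch below. The module name, element name, and content model are all made up, and the argument order (element, type, contents, attribute collections) is an assumption about the current signature, so check it against the source before relying on it.

```php
<?php
// Assumes HTML Purifier's library/ directory is on the include path.
require_once 'HTMLPurifier.auto.php';

// Hypothetical module and element names.
$module = new HTMLPurifier_HTMLModule();
$module->name = 'MyApp_Note';

// Note that there is no $safe argument any more.
// Assumed order: element name, type, allowed contents, attribute collections.
$module->addElement('note', 'Block', 'Inline', 'Common');
```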
Finally, the static methods in HTMLPurifier_ConfigSchema
were deprecated. They probably still work, although they're not being
actively tested now. If you need to add custom configuration to HTML
Purifier, retrieve a copy of the schema using
HTMLPurifier_ConfigSchema::instance() and then operate
on it using the add*() methods. Some of the method
signatures have changed; most notably, there's an extra
$allowsNull parameter after $type in
add(). Extensible configuration
is somewhat of an unknown, so if you have definitive use cases you'd like to
share with me to influence the architecture of this, please say so.
Please do not add your own files to the schema/
directory unless you plan on submitting your changes for incorporation
with the core. For information on how this subsystem works, check out
the documentation
on Config Schema.
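As a tentative sketch of the instance-based approach described above: the MyApp namespace and MaxLinks directive are hypothetical, and the exact argument lists of addNamespace() and add() are assumptions inferred from the description (default, then $type, then $allowsNull), so verify them against the Config Schema documentation.

```php
<?php
// Assumes HTML Purifier's library/ directory is on the include path.
require_once 'HTMLPurifier.auto.php';

// Work on the schema instance instead of calling static methods.
$schema = HTMLPurifier_ConfigSchema::instance();

// Hypothetical namespace and directive names.
$schema->addNamespace('MyApp');

// Assumed argument order: namespace, name, default, $type, $allowsNull.
$schema->add('MyApp', 'MaxLinks', 10, 'int', false);
```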
New features!
Thanks for putting up with all that backwards-compatibility documentation! Now we get to the fun stuff: new features. The new features are mostly configuration directives:
- %HTML.Proprietary - Enables some proprietary HTML elements like marquee.
- %CSS.AllowImportant - Enables the !important modifier in CSS code, most useful in conjunction with the ExtractStyleBlocks filter.
- %CSS.AllowTricky - Enables some possibly mischievous CSS properties, namely display and visibility
- %CSS.AllowedProperties - Allows you to control which CSS properties you would like to allow
- %HTML.ForbiddenAttributes - Allows you to blacklist certain HTML attributes
- %HTML.ForbiddenElements - Allows you to blacklist certain HTML elements
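Setting a few of these directives could look roughly like this. The specific values are made-up examples, and the tag@attr form used for %HTML.ForbiddenAttributes is an assumption about that directive's syntax.

```php
<?php
// Assumes HTML Purifier's library/ directory is on the include path.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// Permit display and visibility, then narrow the allowed CSS properties.
$config->set('CSS', 'AllowTricky', true);
$config->set('CSS', 'AllowedProperties', array('color', 'display'));

// Blacklist an element and an attribute (example values only).
$config->set('HTML', 'ForbiddenElements', array('img'));
$config->set('HTML', 'ForbiddenAttributes', array('a@title'));

$purifier = new HTMLPurifier($config);
```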
As usual, see the NEWS for a full list of enhancements and bugfixes.