Documentation

HTML Purifier's documentation is organized by topic. New users should start with the INSTALL file that comes with the HTML Purifier download. Any questions about HTML Purifier can be asked at the support forums (no registration required!).

For First-Time Users

The basic code for getting HTML Purifier set up is very simple:

require_once '/path/to/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
$purifier = new HTMLPurifier($config);
$clean_html = $purifier->purify($dirty_html);

Replace $dirty_html with the HTML you want to purify, and use $clean_html in its place wherever you output it. While HTML Purifier has a lot of configuration knobs, its default configuration is quite safe and should work for many users.

It's highly recommended to take a look at the full install documentation, as it gives advice on how to make sure HTML Purifier's output matches your page's character encoding.

For Advanced Users

P.S. HTML Purifier's source code is well documented and very readable. If a question of yours isn't answered by any of the above resources, go to the source! (Or ask in the forums.)

For Contributors

As with any open source project, HTML Purifier is always looking for developers, writers and other folks willing to lend a hand. There are any number of things to work on! Please take a moment to find out how you can help out this project.

Frequently Asked Questions

What does %HTML.Allowed mean?

The percent-dot format is a shorthand for HTML Purifier's configuration directives. It takes the form of %Namespace.Directive. For practical purposes, %HTML.Allowed translates into the following PHP code:

$config->set('HTML', 'Allowed', $value);
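As a small illustration of the shorthand, the sketch below sets %HTML.Allowed to a whitelist and purifies a sample string. It assumes HTML Purifier 4.0 or later, where the directive name is passed as a single dotted string and the three-argument form shown above is deprecated. The element list and the sample input are examples only, not recommendations.

```php
<?php
require_once '/path/to/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
// %HTML.Allowed: permit only these elements and attributes (example whitelist).
$config->set('HTML.Allowed', 'p,b,i,a[href]');

$purifier = new HTMLPurifier($config);
// Elements outside the whitelist, such as <script>, are removed.
$clean_html = $purifier->purify('<p>Hi <b>there</b><script>alert(1)</script></p>');
```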

My attributes are mysteriously disappearing!

You've probably got magic quotes turned on, which interferes with the single and double quotes in HTML attributes. The usual way to fix this is with some runtime code or an ini tweak. Be sure not to introduce any SQL injection vulnerabilities!
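The runtime fix could look roughly like the sketch below. The form field name 'html' is hypothetical, and the version check reflects the fact that magic quotes no longer exist as of PHP 5.4. On the old versions that still have them, the ini alternative is to set magic_quotes_gpc to Off.

```php
<?php
// 'html' is a hypothetical form field name used only for this example.
$dirty_html = isset($_POST['html']) ? $_POST['html'] : '';

// Magic quotes were removed in PHP 5.4, so only check for them on older versions.
if (version_compare(PHP_VERSION, '5.4.0', '<') && get_magic_quotes_gpc()) {
    // Undo the backslashes that magic quotes added in front of quote characters.
    $dirty_html = stripslashes($dirty_html);
}
```

Once the slashes are stripped, the input is no longer escaped for SQL, so escape it properly (or use prepared statements) before it reaches a database query.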

How do I prevent foreign characters like ä and non-breaking spaces from turning into garbage like Ã¤?

This usually means that HTML Purifier is parsing your code as UTF-8, but your output encoding is something else. Read this document on UTF-8 to learn how to fix this. (Short answer: use %Core.Encoding or switch to UTF-8.)
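For the %Core.Encoding route, a minimal sketch might look like this. It assumes HTML Purifier 4.0 or later and a page served as ISO-8859-1; substitute whatever encoding your page actually uses. The sample input is made up for the example.

```php
<?php
require_once '/path/to/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
// Assumption: the page is served as ISO-8859-1; use your page's real encoding.
$config->set('Core.Encoding', 'ISO-8859-1');

$purifier = new HTMLPurifier($config);

// Sample input: "café" with the é encoded as the single ISO-8859-1 byte 0xE9.
$dirty_html = "<p>caf\xE9</p>";
$clean_html = $purifier->purify($dirty_html);
```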

I can't use the target or name attribute in my a tags!

The target attribute has been deprecated for a long time, so I highly recommend you look at other ways of, say, opening new windows when you click a link (my favorites are “Don't do it!” or JavaScript). But if you must, the %Attr.AllowedFrameTargets directive is what you are looking for.
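If you do decide to allow target, the directive can be set as in the sketch below. It assumes HTML Purifier 4.0 or later and the default doctype; allowing only _blank and the sample link are choices made for this example.

```php
<?php
require_once '/path/to/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
// %Attr.AllowedFrameTargets: list the target values that may survive purification.
$config->set('Attr.AllowedFrameTargets', array('_blank'));

$purifier = new HTMLPurifier($config);
$clean_html = $purifier->purify('<a href="http://example.com/" target="_blank">link</a>');
```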

The name attribute is dependent on IDs being enabled. See this document on enabling user IDs for more information.
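As a rough sketch, enabling IDs could look like the following. It assumes HTML Purifier 4.0 or later and that %Attr.EnableID is the directive the linked document describes. The %Attr.IDPrefix line is optional, and the prefix value 'user_' is only an example.

```php
<?php
require_once '/path/to/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
// %Attr.EnableID: let user-supplied id (and dependent name) attributes through.
$config->set('Attr.EnableID', true);
// Optional: prefix user IDs so they cannot collide with IDs your own page uses.
$config->set('Attr.IDPrefix', 'user_');

$purifier = new HTMLPurifier($config);
$clean_html = $purifier->purify('<a name="top" href="#top">Top</a>');
```

IDs are disabled by default because user-chosen IDs can clash with the ones your page or scripts rely on, so a prefix is worth considering when you turn them on.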

Is HTML Purifier slow?

HTML Purifier isn't exactly light or speedy; this is a tradeoff for the power and security the library affords. You can combat this by reading Speeding up HTML Purifier or using the standalone version.

Miscellaneous