HTMLPurifier 4.4.0
HTMLPurifier Documentation

HTML Purifier is an HTML filter that takes an arbitrary snippet of HTML and rigorously tests, validates and filters it into a version that is safe to output on web pages. It achieves this by:

  1. Lexing (parsing into tokens) the document,
  2. Executing various strategies on the tokens:
    1. Removing all elements not in the whitelist,
    2. Making the tokens well-formed,
    3. Fixing the nesting of the nodes, and
    4. Validating attributes of the nodes; and
  3. Generating HTML from the purified tokens.

However, most users will only need to interface with the HTMLPurifier and HTMLPurifier_Config classes.
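
As a rough illustration of how those two classes are typically used together, the following minimal sketch builds a default configuration and purifies one snippet. It assumes the library is on PHP's include path so that HTMLPurifier.auto.php can be loaded; the variable names and the sample input are made up for this example.

```php
<?php
// Assumption: HTML Purifier is on the include path; adjust the path
// to match your installation.
require_once 'HTMLPurifier.auto.php';

// Start from the library's default configuration.
$config = HTMLPurifier_Config::createDefault();

// The purifier object runs the lex / filter / generate pipeline.
$purifier = new HTMLPurifier($config);

// Hypothetical untrusted input with an event handler and a script element.
$dirty = '<p onclick="alert(1)">Hello <b>world</b><script>alert(2)</script></p>';

// purify() returns the filtered HTML as a string.
$clean = $purifier->purify($dirty);

echo $clean, PHP_EOL;
```

With the default configuration, the output should no longer contain the script element or the onclick attribute, while the paragraph and bold markup are kept. Settings can be changed on the configuration object before it is passed to the purifier.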