Allowing name attribute for anchor tags
December 12, 2007 08:52AM

I'm trying to allow named anchors but can't get it working. Here's how I set things up:

<pre>
$this->purify_config = array
(
    'HTML' => array
    (
        'Doctype' => 'XHTML 1.0 Strict',
        'TidyLevel' => 'light',
        'AllowedAttributes' => array('a.name' => true),
        'DefinitionID' => 'my definition set',
        'DefinitionRev' => 1,
    ),
    /* do not cache definitions while testing */
    'Core' => array
    (
        'DefinitionCache' => null,
    ),
);

$purifier_config = HTMLPurifier_Config::createDefault();

foreach ($this->purify_config as $domain => $config) {
    foreach ($config as $key => $val) {
        $purifier_config->set($domain, $key, $val);
    }
}

$def =& $purifier_config->getHTMLDefinition(true);
$def->addAttribute('a', 'name', 'Text');

$purifier = new HTMLPurifier($purifier_config);
</pre>

and the output I get:

Value before
<pre>
<p> foo, bar </p>
<p> p</p>

<a name="foo"></a>named anchor
</pre>

Value after
<pre>
<p> foo, bar </p>
<p> p</p>
<a></a>named anchor
</pre>

If I omit the addAttribute() call, I get a warning that the attribute is not supported.

Re: Allowing name attribute for anchor tags
December 12, 2007 03:09PM

Please read this document on the risks of allowing IDs.

There are a number of inefficiencies in your code; try this instead:

$this->purify_config = array
(
    'HTML' => array
    (
        'Doctype' => 'XHTML 1.0 Strict',
        // insert your ID config here, see the above doc
    ),
);
$purifier = new HTMLPurifier($this->purify_config);

Re: Allowing name attribute for anchor tags
December 13, 2007 06:46AM

Ah, I can just pass the same multidimensional array directly to purifier, how handy...

I noticed that if I use the Transitional doctype, I can get this working with an approach similar to the one outlined in my first post.

(The anchors must have the name attribute set; this way I can have namespaced ids while keeping names as they were input by the editor.)

For future reference, the following code:

<pre><![CDATA[
<?php
error_reporting(E_ALL);
require_once('HTMLPurifier.php');

$purify_config = array
(
    'HTML' => array
    (
        'DefinitionID' => 'myset',
        'DefinitionRev' => 1,
        'Doctype' => 'XHTML 1.0 Transitional',
        'TidyLevel' => 'light',
        'EnableAttrID' => true,
    ),
    'Attr' => array
    (
        'IDPrefix' => 'user_',
        'AllowedFrameTargets' => array
        (
            '_blank',
            '_self',
            '_top',
        ),
    ),
    'Cache' => array
    (
        'SerializerPath' => '/tmp/htmlpurifier',
    ),
    /* do not cache definitions while testing */
    'Core' => array
    (
        'DefinitionCache' => null,
    ),
);

$purifier = new HTMLPurifier($purify_config);

$def =& $purifier->config->getHTMLDefinition(true);
$def->addAttribute('a', 'name', 'Text');

$data = <<<EOD
<p id="floz"> foo, bar </p>
<p>named p <a href="http://www.fi" target="_blank" id="baz">www.fi _blank</a> </p>
<p><a href="http://www.fi" target="hiddenframe">www.fi hiddenframe</a> </p>

<a name="bar"></a>named anchor
EOD;

echo "before\n===\n{$data}\n===\n";

$purified = $purifier->purify($data);

echo "after\n===\n{$purified}\n===\n";

?>
]]></pre>

Outputs

<pre><![CDATA[
before
===
<p id="floz"> foo, bar </p>
<p>named p <a href="http://www.fi" target="_blank" id="baz">www.fi _blank</a> </p>
<p><a href="http://www.fi" target="hiddenframe">www.fi hiddenframe</a> </p>

<a name="bar"></a>named anchor
===
after
===
<p id="user_floz"> foo, bar </p>
<p>named p <a href="http://www.fi" target="_blank" id="user_baz">www.fi _blank</a> </p>
<p><a href="http://www.fi">www.fi hiddenframe</a> </p>

<a name="bar"></a>named anchor
===
]]></pre>

Re: Allowing name attribute for anchor tags
December 13, 2007 01:12PM

Please replace

$def->addAttribute('a', 'name', 'Text');

with

$def->addAttribute('a', 'name', 'ID');

so that it uses the same namespace as id=""
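
Putting the pieces of this thread together, the setup might look roughly like the sketch below. It reuses the configuration keys from the earlier post with only the attribute type swapped from 'Text' to 'ID', and it assumes the HTML Purifier release discussed here with HTMLPurifier.php on the include path. It has not been run against other versions, and no output is shown because how the name value is rewritten depends on the ID settings in effect.

```php
<?php
// Sketch only: assumes the HTML Purifier version used in this thread
// and that HTMLPurifier.php can be found on the include path.
require_once('HTMLPurifier.php');

$purify_config = array
(
    'HTML' => array
    (
        'DefinitionID' => 'myset',
        'DefinitionRev' => 1,
        // name="" on anchors is not part of the Strict doctype
        'Doctype' => 'XHTML 1.0 Transitional',
        'EnableAttrID' => true,
    ),
    'Attr' => array
    (
        'IDPrefix' => 'user_',
    ),
    /* do not cache definitions while testing */
    'Core' => array
    (
        'DefinitionCache' => null,
    ),
);

// The nested array can be passed straight to the constructor
$purifier = new HTMLPurifier($purify_config);

// Validate name="" with the ID rules so it shares a namespace with id=""
$def =& $purifier->config->getHTMLDefinition(true);
$def->addAttribute('a', 'name', 'ID');

echo $purifier->purify('<a name="bar"></a>named anchor');
```

Since 'ID' applies the same validation as id="", the document on the risks of allowing IDs mentioned earlier applies to name="" as well.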
