<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel>
        <title>Forums - Internals</title>
        <description>Discussion about development and new features for HTML Purifier.</description>
        <link>http://htmlpurifier.org/phorum/list.php?5</link>
        <lastBuildDate>Thu, 09 Sep 2010 04:41:10 -0700</lastBuildDate>
        <generator>Phorum 5.2.11</generator>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,4820,4820#msg-4820</guid>
            <title>Extending a few classes (5 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,4820,4820#msg-4820</link>
            <description><![CDATA[<p>Hello everyone,</p>

<p>Moodle uses a modified version of HTML Purifier; you can see a diff between the Moodle and vanilla versions below.
This is a bit problematic for me: I'm a Debian maintainer, and it forces me to use the version bundled with Moodle, whereas I'd prefer to use your version, which is already packaged for Debian.
The adjustments that Moodle made are necessary; to re-implement them I would need to extend a few classes: 
* HTMLPurifier_AttrDef_Lang
* HTMLPurifier_HTMLModule_Text
* HTMLPurifier_HTMLModule_XMLCommonAttributes</p>

<p>Could you suggest a way to make HTML Purifier use the extended classes (that is, how could I inject them)? Of course, I would like to do this without modifying or patching the HTML Purifier code.</p>
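
<p>One direction I am considering, as an untested sketch: the extra elements might be registered through the raw HTML definition instead of a patched HTMLPurifier_HTMLModule_Text. The include path and the DefinitionID value are placeholders of mine, and this would not cover the Lang and xml:lang changes.</p>

<pre>
require_once 'HTMLPurifier.auto.php'; // placeholder path to the packaged library

$config = HTMLPurifier_Config::createDefault();
$config-&gt;set('HTML.DefinitionID', 'moodle-custom-elements'); // placeholder ID
$config-&gt;set('HTML.DefinitionRev', 1);

// Raw definition: elements added here go into an anonymous module
$def = $config-&gt;getHTMLDefinition(true);
$def-&gt;addElement('nolink',  'Inline', 'Flow', array());
$def-&gt;addElement('tex',     'Inline', 'Flow', array());
$def-&gt;addElement('algebra', 'Inline', 'Flow', array());
$def-&gt;addElement('lang',    'Inline', 'Flow', 'I18N');

$purifier = new HTMLPurifier($config);
</pre>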

<p>cheers,
Tomek</p>

<pre>
diff -ru vanilla/HTMLPurifier/AttrDef/Lang.php moodle/HTMLPurifier/AttrDef/Lang.php
--- vanilla/HTMLPurifier/AttrDef/Lang.php	2010-06-01 04:22:39.000000000 +0100
+++ moodle/HTMLPurifier/AttrDef/Lang.php	2010-05-22 01:04:23.000000000 +0100
@@ -9,6 +9,10 @@
 
     public function validate($string, $config, $context) {
 
+// moodle change - we use special lang strings unfortunatelly
+        return preg_replace('/[^0-9a-zA-Z_-]/', '', $string);
+// moodle change end
+
         $string = trim($string);
         if (!$string) return false;
 
diff -ru vanilla/HTMLPurifier/HTMLModule/Text.php moodle/HTMLPurifier/HTMLModule/Text.php
--- vanilla/HTMLPurifier/HTMLModule/Text.php	2010-06-01 04:22:39.000000000 +0100
+++ moodle/HTMLPurifier/HTMLModule/Text.php	2010-05-22 01:04:23.000000000 +0100
@@ -45,6 +45,13 @@
         $this-&gt;addElement('span', 'Inline', 'Inline', 'Common');
         $this-&gt;addElement('br',   'Inline', 'Empty',  'Core');
 
+        // Moodle specific elements - start
+        $this-&gt;addElement('nolink',  'Inline', 'Flow');
+        $this-&gt;addElement('tex',     'Inline', 'Flow');
+        $this-&gt;addElement('algebra', 'Inline', 'Flow');
+        $this-&gt;addElement('lang',    'Inline', 'Flow', 'I18N');
+        // Moodle specific elements - end
+        
         // Block Phrasal --------------------------------------------------
         $this-&gt;addElement('address',     'Block', 'Inline', 'Common');
         $this-&gt;addElement('blockquote',  'Block', 'Optional: Heading | Block | List', 'Common', array('cite' =&gt; 'URI') );
diff -ru vanilla/HTMLPurifier/HTMLModule/XMLCommonAttributes.php moodle/HTMLPurifier/HTMLModule/XMLCommonAttributes.php
--- vanilla/HTMLPurifier/HTMLModule/XMLCommonAttributes.php	2010-06-01 04:22:39.000000000 +0100
+++ moodle/HTMLPurifier/HTMLModule/XMLCommonAttributes.php	2010-05-22 01:04:23.000000000 +0100
@@ -5,9 +5,11 @@
     public $name = 'XMLCommonAttributes';
 
     public $attr_collections = array(
+/* moodle comment - xml:lang breaks our multilang
         'Lang' =&gt; array(
             'xml:lang' =&gt; 'LanguageCode',
         )
+*/
     );
 }
 
diff -ru vanilla/HTMLPurifier/Lexer.php moodle/HTMLPurifier/Lexer.php
--- vanilla/HTMLPurifier/Lexer.php	2010-06-01 04:22:39.000000000 +0100
+++ moodle/HTMLPurifier/Lexer.php	2010-07-06 01:04:03.000000000 +0100
@@ -252,8 +252,10 @@
     public function normalize($html, $config, $context) {
 
         // normalize newlines to \n
-        $html = str_replace("\r\n", "\n", $html);
-        $html = str_replace("\r", "\n", $html);
+        if ($config-&gt;get('Output.Newline')!=="\n") {
+            $html = str_replace("\r\n", "\n", $html);
+            $html = str_replace("\r", "\n", $html);
+        }
 
         if ($config-&gt;get('HTML.Trusted')) {
             // escape convoluted CDATA
</pre>]]></description>
            <dc:creator>Tomasz Muras</dc:creator>
            <category>Internals</category>
            <pubDate>Tue, 24 Aug 2010 10:12:16 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,4692,4692#msg-4692</guid>
            <title>Stop stripping tags that are outside of the body.. (1 reply)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,4692,4692#msg-4692</link>
            <description><![CDATA[<p>We are trying to figure out how to make sure that html, head, meta, style, and title aren't stripped.</p>

<p>In an effort to stop them from being removed by HTMLPurifier we used the addElement method.  We seem to be able to get them all to work EXCEPT HEAD.</p>

<pre>
                        $oDef = $oConfig-&gt;getHTMLDefinition(true);

                        $oDef-&gt;addElement(
                                'style', // name
                                false, // content set
                                'Optional: #PCDATA', // allowed children
                                'Common', // attribute collection
                                array( // attributes
                                        'type' =&gt; 'CDATA',
                                ));

                        $oDef-&gt;addElement(
                                'title', // name
                                false, // content set
                                'Optional: #PCDATA', // allowed children
                                'I18N', // attribute collection
                                array( // attributes
                                ));

                        $oDef-&gt;addElement(
                                'meta', // name
                                false, // content set
                                'Empty', // allowed children
                                'I18N', // attribute collection
                                array( // attributes
                                        'http-equiv' =&gt; 'CDATA',
                                        'name' =&gt; 'CDATA',
                                        'content' =&gt; 'CDATA',
                                        'scheme' =&gt; 'CDATA',
                                ));

                        $oDef-&gt;addElement(
                                'head', // name
                                false, // content set
                                'Optional: Flow | #PCDATA | title | style | meta', // allowed children
                                'Common', // attribute collection
                                array( // attributes
                                ));

                        $oDef-&gt;addElement(
                                'body', // name
                                false, // content set
                                'Optional: Flow | #PCDATA | Inline', // allowed children
                                'Common', // attribute collection
                                array( // attributes
                                ));

                        $html = $oDef-&gt;addElement(
                                'html',  // name
                                false, // content set
                                'Optional: Flow | #PCDATA | head | body | title | style | meta', // allowed children
                                'Common', // attribute collection
                                array( // attributes
#                                       'action*' =&gt; 'URI',
#                                       'method' =&gt; 'Enum#get|post',
#                                       'name' =&gt; 'ID'
                                ));
                        $html-&gt;excludes = array('html'=&gt;true);

</pre>

<p>You may ask why we have title, style, meta inside the html.  It's because the HEAD isn't working yet.  So in the meantime we put them there since they seem to render even though they are children of HEAD.</p>]]></description>
            <dc:creator>Chris Altman</dc:creator>
            <category>Internals</category>
            <pubDate>Fri, 18 Jun 2010 05:52:35 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,4672,4672#msg-4672</guid>
            <title>Ruleset validation. (3 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,4672,4672#msg-4672</link>
            <description><![CDATA[<p>HTMLPurifier does not do a good job of validating its configuration and handling unexpected values gracefully. In some cases, HTMLPurifier can terminate abruptly if its configuration is not set properly.</p>

<p>Would it be possible to provide an API to validate added rules? I didn't see a bug tracker to log the request, so I apologize if this is the wrong place.</p>

<p>Drak</p>]]></description>
            <dc:creator>Drak</dc:creator>
            <category>Internals</category>
            <pubDate>Sat, 19 Jun 2010 18:27:07 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,4640,4640#msg-4640</guid>
            <title>Non Latin Domains (4 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,4640,4640#msg-4640</link>
            <description><![CDATA[<p>Hello,
I searched the docs and the forum, but I didn't find any information about the non-latin domains. I tested with several domains and they all were removed by the htmlpurifier (tested with 4.0.0, 4.1.0, 4.1.1).</p>

<p><a href="http://non-latin-domain">Test</a> results in <a href="http://">Test</a></p>

<p>Does the library support non-Latin domains, or is this feature on your to-do list?
Any ideas for a quick fix for the issue?</p>

<p>Thanks!</p>

<p>Georgi</p>]]></description>
            <dc:creator>Georgi</dc:creator>
            <category>Internals</category>
            <pubDate>Fri, 13 Aug 2010 07:47:44 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,4616,4616#msg-4616</guid>
            <title>Should Purifier do a double run? (3 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,4616,4616#msg-4616</link>
            <description><![CDATA[<p>With HTML Purifier set to remove empty elements and to remove spans without attributes,</p>

<pre>

&lt;p&gt;&lt;span style="font-family: "&gt;
&lt;p align="left"&gt;Installation and Testing of the Electrical &amp; Instrumentation&lt;/p&gt;
&lt;p align="left"&gt;Works&lt;/p&gt;
&lt;p align="left"&gt;&amp;bull; Installation of Primary &amp; Secondary Containment&lt;/p&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p&gt; &lt;/p&gt;
&lt;p&gt; &lt;/p&gt;

</pre>

<p>Produces the following purified output:</p>

<pre>

&lt;p&gt;
&lt;/p&gt;&lt;p align="left"&gt;Installation and Testing of the Electrical &amp;amp; Instrumentation&lt;/p&gt;
&lt;p align="left"&gt;Works&lt;/p&gt;
&lt;p align="left"&gt;• Installation of Primary &amp;amp; Secondary Containment&lt;/p&gt;


</pre>

<p>The purified output is valid, of course, but it still contains an empty element.</p>

<p>If you run that through again, it's further purified, to remove the empty paragraph:</p>

<pre>

&lt;p align="left"&gt;Installation and Testing of the Electrical &amp;amp; Instrumentation&lt;/p&gt;
&lt;p align="left"&gt;Works&lt;/p&gt;
&lt;p align="left"&gt;• Installation of Primary &amp;amp; Secondary Containment&lt;/p&gt;

</pre>

<p>Should Purifier run on a loop, repeatedly purifying until no changes are made to the HTML string?</p>
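
<p>For illustration, such a loop could be sketched as below. The function name purify_until_stable and the pass limit are my own inventions, not part of HTML Purifier; the callable stands in for a call to purify().</p>

<pre>
// Hypothetical helper: re-run $purify until the output stops changing,
// or until $maxPasses passes have been made.
function purify_until_stable($purify, $html, $maxPasses = 5) {
    for ($i = 0; $i &lt; $maxPasses; $i++) {
        $clean = call_user_func($purify, $html);
        if ($clean === $html) {
            break; // fixed point reached, nothing left to change
        }
        $html = $clean;
    }
    return $html;
}

// Possible usage with a real purifier instance:
// $clean = purify_until_stable(array($purifier, 'purify'), $dirty);
</pre>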

<p>TRiG.</p>]]></description>
            <dc:creator>TRiG</dc:creator>
            <category>Internals</category>
            <pubDate>Wed, 26 May 2010 10:39:14 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,4591,4591#msg-4591</guid>
            <title>Custom filters on DOM level (5 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,4591,4591#msg-4591</link>
            <description><![CDATA[<p>If I understand correctly (please say if I don't), HTML Purifier parses the input into a DOM tree with DOMDocument. Most of the manipulations operate on this DOM tree before it is turned back into an HTML string.</p>

<p>Custom filters (subclasses of HTMLPurifier_Filter), on the other hand, operate on HTML strings with regular expressions.</p>

<p>Would it be possible to add an interface for custom filters that operate directly on the DOM tree? I imagine this could be more powerful than regex filtering.</p>

<p>Parsing is quite an expensive operation (right?), so I guess it's a good idea to do it only once.</p>

<p>See also
<a href="http://drupal.org/node/808868">http://drupal.org/node/808868</a></p>]]></description>
            <dc:creator>donquixote</dc:creator>
            <category>Internals</category>
            <pubDate>Tue, 25 May 2010 13:46:30 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,4571,4571#msg-4571</guid>
            <title>Maximum execution time with certain input (3 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,4571,4571#msg-4571</link>
            <description><![CDATA[<p>Hi,</p>

<p>HTML Purifier 4.1.0 works perfectly here, except for a few special input cases where it enters a never-ending loop:</p>

<pre>
require_once 'htmlpurifier/library/HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault();
$purifier = new HTMLPurifier($config);
$t = '&lt;i&gt;&lt;ul&gt;&lt;/ul&gt;&lt;/i&gt;';
$p = $purifier-&gt;purify($t);
</pre>

<p>You can replace the I tag with B or STRONG and it will still run forever.</p>

<p>This is running on PHP 5.3.2.</p>

<p>Best regards
  Lars</p>]]></description>
            <dc:creator>Lars</dc:creator>
            <category>Internals</category>
            <pubDate>Mon, 17 May 2010 23:27:01 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,4542,4542#msg-4542</guid>
            <title>Possible background-position bug (w/ fix) (5 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,4542,4542#msg-4542</link>
            <description><![CDATA[<p>Hi,</p>

<p>If you put background-position: "center 0px", the purifier turns it into "0px center" (verified in the demo).</p>

<pre>
&lt;div style="height: 500px; width: 500px;background-position: center 0px; background-image: url(logo.jpg);"&gt;&lt;/div&gt;
</pre>

<p>From here:
<a href="http://www.w3schools.com/css/pr_background-position.asp">http://www.w3schools.com/css/pr_background-position.asp</a>
</p>

<pre>
The first value is the horizontal position and the second value is the vertical.
</pre>

<p>I think this is because the background-position code just classifies any value of center as being either a vertical or horizontal measurement, and does not take into account if the center was the first or second word it encountered.</p>

<p>I modified HTMLPurifier_AttrDef_CSS_BackgroundPosition to classify center values as either "ch" or "cv" instead of just "c".</p>

<p>This is my modified code:</p>

<pre>
class HTMLPurifier_AttrDef_CSS_BackgroundPosition extends HTMLPurifier_AttrDef
{

    protected $length;
    protected $percentage;

    public function __construct() {
        $this-&gt;length     = new HTMLPurifier_AttrDef_CSS_Length();
        $this-&gt;percentage = new HTMLPurifier_AttrDef_CSS_Percentage();
    }

    public function validate($string, $config, $context) {
        $string = $this-&gt;parseCDATA($string);
        $bits = explode(' ', $string);
		
        $keywords = array();
        $keywords['h'] = false; // left, right
        $keywords['v'] = false; // top, bottom
        $keywords['ch'] = false; // center (first word)
        $keywords['cv'] = false; // center (second word)
        $measures = array();

        $i = 0;

        $lookup = array(
            'top' =&gt; 'v',
            'bottom' =&gt; 'v',
            'left' =&gt; 'h',
            'right' =&gt; 'h',
            'center' =&gt; 'c'
        );

        foreach ($bits as $bit) {
            if ($bit === '') continue;

            // test for keyword
            $lbit = ctype_lower($bit) ? $bit : strtolower($bit);
            if (isset($lookup[$lbit])) {
                $status = $lookup[$lbit];
                if ($status == 'c') {
                    if (!$i)
                        $status = 'ch';
                    else
                        $status = 'cv';
                }
                $keywords[$status] = $lbit;
                $i++;
            }

            // test for length
            $r = $this-&gt;length-&gt;validate($bit, $config, $context);
            if ($r !== false) {
                $measures[] = $r;
                $i++;
            }

            // test for percentage
            $r = $this-&gt;percentage-&gt;validate($bit, $config, $context);
            if ($r !== false) {
                $measures[] = $r;
                $i++;
            }

        }
		
        if (!$i) return false; // no valid values were caught

        $ret = array();

        // first keyword
		// fix "center 0px" being turned into "0px center"
		/* Original code had a bug that changes "center 0px" to "0px center"
		   because it first takes from measures before c keywords
		*/
		/*
        if     ($keywords['h'])     $ret[] = $keywords['h'];
        elseif (count($measures))   $ret[] = array_shift($measures);
        elseif ($keywords['c']) {
            $ret[] = $keywords['c'];
            $keywords['c'] = false; // prevent re-use: center = center center
        }

        if     ($keywords['v'])     $ret[] = $keywords['v'];
        elseif (count($measures))   $ret[] = array_shift($measures);
        elseif ($keywords['c'])     $ret[] = $keywords['c'];
		*/
		
        if     ($keywords['h'])     $ret[] = $keywords['h'];
        elseif ($keywords['ch']) { //get ch before measurements
            $ret[] = $keywords['ch'];
            $keywords['cv'] = false; // prevent re-use: center = center center
        }
        elseif (count($measures))   $ret[] = array_shift($measures);

        if     ($keywords['v'])     $ret[] = $keywords['v'];
        elseif ($keywords['cv'])     $ret[] = $keywords['cv'];
        elseif (count($measures))   $ret[] = array_shift($measures);

        if (empty($ret)) return false;
        return implode(' ', $ret);

    }

}
</pre>

<p>
Thanks for all your support!</p>]]></description>
            <dc:creator>tethers</dc:creator>
            <category>Internals</category>
            <pubDate>Wed, 05 May 2010 16:01:10 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,3804,3804#msg-3804</guid>
            <title>Trying to create an injector to interpret custom tags (13 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,3804,3804#msg-3804</link>
            <description><![CDATA[<p>So I'm making a site for my friend, who is computer literate but only knows the absolute basics of web design (to be fair, I don't have THAT much knowledge either, but I've been able to figure things out so far).</p>

<p>Anyways, right now I want to set it up so that she can do something like </p>

<pre>&lt;flash src="file.swf" height="250" width="300"&gt;</pre><p> on her update page and it will get replaced with a SWFObject call (or something along those lines).</p>

<p>There isn't exactly a lot of documentation I have found on screwing with the guts of HTMLpurifier instead of working with what's already there, so I'm working a lot on guesswork and trying to interpret the code as written. I've already made one successful hack, making AutoParagraph fire on single newlines instead of double newlines (not a functionality everyone needs, but important for this particular site) but this is certainly MUCH more involved :X</p>

<p>What I have so far (not much yet &lt;.&lt;;;;) is
</p>

<pre>
&lt;?php
class HTMLPurifier_Injector_AutoFlashTag extends HTMLPurifier_Injector
{

    public $name = 'AutoFlashTag';
    public $needed = array('flash' =&gt; array('src', 'height', 'width'));

    public function handleElement(&amp;$token) {
        // A start tag carries its attributes directly on the token
        $src = $token-&gt;attr['src'];
        $height = $token-&gt;attr['height'];
        $width = $token-&gt;attr['width'];
        // TODO: replace $token with the SWFObject markup
    }
}
</pre>

<p>but to move on I need to know some things that I can't seem to just figure out.</p>

<p>First, since I'm using a non-existent tag, I'm sure I need to create a definition for it somewhere, or else it will just get deleted or escaped before I even get this far, right? How do I do this and ensure that the tag is allowed? What needs to change in my $config declarations? Since the page will ALSO have non-trusted user input (for example, a comment system), I need to be able to either have this tag (and then get rid of it) or not have this tag, depending on the $config.</p>

<p>As for actually continuing the process, the way I see it there are a couple of approaches I can take. If there's a way to delete the tag entirely and insert a string where it was, that would be the easiest; if not, there are a couple of things that can be done, but each is substantially more complex.</p>

<p>Finally, once the injector is made, how do I actually hook it in so that it can be used? Without being able to use it I can't experiment and try to solve the problem myself.</p>

<p>Thanks for your help, I realize I'm asking kind of a lot here!</p>]]></description>
            <dc:creator>Pip</dc:creator>
            <category>Internals</category>
            <pubDate>Thu, 16 Jul 2009 12:51:08 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,3719,3719#msg-3719</guid>
            <title>why aren't single quotes turned into &amp;#39;? (3 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,3719,3719#msg-3719</link>
            <description><![CDATA[<p>Wouldn't you agree that most of what is purified through HTML Purifier gets put in a database? SQL queries can easily be injected if single quotes aren't escaped. So why isn't this something HTML Purifier offers? It even converts &amp;#39; back into '. Why? It even does so with the doctype changed to XHTML 1.1. I have to call str_replace('\'', '&amp;#39;', $text) after $text goes through the purifier.</p>]]></description>
            <dc:creator>Roly</dc:creator>
            <category>Internals</category>
            <pubDate>Tue, 05 Jan 2010 13:06:43 -0800</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,3711,3711#msg-3711</guid>
            <title>Return an entire HTML document from a fragment (4 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,3711,3711#msg-3711</link>
            <description><![CDATA[<p>Hi there.</p>

<p>I don't think this already exists, so I'm putting it in as a suggestion for now. I'd like to index HTML fragments with Zend_Search_Lucene, and from what I gather, it requires a whole HTML document (with metadata). I thought it would be nice to have a method that passes back a fragment of HTML as a valid full document, with title and metadata included. I feel that would be a useful feature, but I'm not sure of the other applications for this method.</p>
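
<p>To make the idea concrete, here is a rough sketch of the kind of helper I have in mind. wrap_fragment and its parameters are hypothetical, not an existing HTML Purifier method, and the fragment is assumed to be purified already.</p>

<pre>
// Hypothetical helper: wrap an already-purified fragment in a minimal
// HTML document so that an indexer sees a title and meta tags.
function wrap_fragment($fragment, $title, array $meta = array()) {
    $head = '&lt;title&gt;' . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '&lt;/title&gt;';
    foreach ($meta as $name =&gt; $content) {
        $head .= '&lt;meta name="' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8')
               . '" content="' . htmlspecialchars($content, ENT_QUOTES, 'UTF-8') . '" /&gt;';
    }
    return '&lt;html&gt;&lt;head&gt;' . $head . '&lt;/head&gt;&lt;body&gt;' . $fragment . '&lt;/body&gt;&lt;/html&gt;';
}
</pre>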

<p>All the best.</p>]]></description>
            <dc:creator>TheFuzzy0ne</dc:creator>
            <category>Internals</category>
            <pubDate>Sun, 14 Feb 2010 13:29:08 -0800</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,3402,3402#msg-3402</guid>
            <title>HTML Purifier Version 4.0 (3 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,3402,3402#msg-3402</link>
            <description><![CDATA[<p>Hi,</p>

<p>Can you name a date at which version 4.0 will become stable?</p>]]></description>
            <dc:creator>A. Klein</dc:creator>
            <category>Internals</category>
            <pubDate>Fri, 24 Apr 2009 11:35:32 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,3021,3021#msg-3021</guid>
            <title>Phalanger (6 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,3021,3021#msg-3021</link>
            <description><![CDATA[<p>Hello,</p>

<p>Is there any version of HTMLPurifier that can be compiled with Phalanger?</p>]]></description>
            <dc:creator>DragosP</dc:creator>
            <category>Internals</category>
            <pubDate>Fri, 13 Mar 2009 10:34:49 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2773,2773#msg-2773</guid>
            <title>Altering the Strategy (1 reply)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2773,2773#msg-2773</link>
            <description><![CDATA[<p>I see that HTMLPurifier uses the strategy pattern for the main strategy sequence, and that the base strategy is a composite strategy, "Core", that hardcodes the following strategies:
RemoveForeignElements, MakeWellFormed, FixNesting, ValidateAttributes.</p>

<p>I have use cases where I only need to run MakeWellFormed and ValidateAttributes, and I have already checked that changing the "Core" strategy fixes my issue.</p>

<p>I looked into the code, but there seems to be no way to alter the strategy, because it is set during initialization and stored in a private property.</p>

<p>Am I missing anything?</p>

<p>Is it possible to add some way to alter the strategy object without altering the core htmlpurifier code?</p>]]></description>
            <dc:creator>bago</dc:creator>
            <category>Internals</category>
            <pubDate>Mon, 15 Dec 2008 13:02:47 -0800</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2663,2663#msg-2663</guid>
            <title>Adding &lt;b&gt; and &lt;i&gt; to Tidy (3 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2663,2663#msg-2663</link>
            <description><![CDATA[<p>I wanted users in a forum of mine to be able to use &lt;b&gt; and &lt;i&gt;, yet they should be converted to &lt;strong&gt; and &lt;em&gt; and I therefore did not want to allow &lt;b&gt; and &lt;i&gt;, they should only be converted.</p>

<p>I added the lines</p>

<pre>
$r['b'] = new HTMLPurifier_TagTransform_Simple('strong');
$r['i'] = new HTMLPurifier_TagTransform_Simple('em');
</pre>

<p>into XHTMLAndHTML4.php in the Tidy HTMLModule, and in my script through</p>

<pre>
$config-&gt;set('HTML', 'TidyAdd', 'b');
$config-&gt;set('HTML', 'TidyAdd', 'i');
</pre>

<p>these additional Tidy rules are enforced - which works fine.</p>

<p>I wanted to suggest to maybe add these rules as optional (only gets used if specifically chosen through TidyAdd) to Tidy, also a few other HTML tags like &lt;big&gt; or &lt;small&gt; could be added to help replace tags which should better not be used nowadays, although &lt;b&gt; and &lt;i&gt; are by far the most commonly still used (especially by users). These tags are not deprecated, that's why it should only be optional, or it could also be an own module independent of Tidy.</p>

<p>If something like this already exists then I apologize for bringing it up (and would be glad to hear where to find it). I just thought it would be great to have this feature included without having to edit the library itself :-) I also did not know of any other way to easily convert one tag to another with HTMLPurifier (which is probably not a common use case).</p>]]></description>
            <dc:creator>Iquito</dc:creator>
            <category>Internals</category>
            <pubDate>Wed, 26 May 2010 11:07:05 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2585,2585#msg-2585</guid>
            <title>Attr.IDAllowPrefix (no replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2585,2585#msg-2585</link>
            <description><![CDATA[<p>I just thought of a simpler way to allow some id tags to be reinjected by the injectors... how about a new configuration option, Attr.IDAllowPrefix which would specify a prefix that would be left in the output when found.  I suspect that would be fairly easy to add.</p>]]></description>
            <dc:creator>notromda</dc:creator>
            <category>Internals</category>
            <pubDate>Sat, 25 Oct 2008 09:48:15 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2527,2527#msg-2527</guid>
            <title>Test failures on Mac (20 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2527,2527#msg-2527</link>
            <description><![CDATA[<p>Here you go... 
</p>

<pre>
All HTML Purifier tests on PHP 5.2.6
1) Equal expectation fails at character 0 with [Invalid encoding utf8] and [] -&gt; Expected error not caught
        in test_convertToUTF8_spuriousEncoding
        in HTMLPurifier_EncoderTest
2) Identical expectation [Integer: 1] fails with [Integer: -1] because [Integer: 1] differs from [Integer: -1] by 2 at [/Users/dgm/workspace/htmlpurifier/tests/HTMLPurifier/LengthTest.php line 68]
        in testCompareTo
        in HTMLPurifier_LengthTest
FAILURES!!!
Test cases run: 195/195, Passes: 2426, Failures: 2, Exceptions: 0
</pre>

<p> I'm testing on a mac, fwiw.</p>]]></description>
            <dc:creator>notromda</dc:creator>
            <category>Internals</category>
            <pubDate>Fri, 24 Oct 2008 22:33:51 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2519,2519#msg-2519</guid>
            <title>display URL source (35 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2519,2519#msg-2519</link>
            <description><![CDATA[<p>I'd like to change A tags to display both the href URL and the link text somehow, whether by showing it all at once or by calling a jQuery plugin as a tooltip. Is there an easy way to set this up?</p>]]></description>
            <dc:creator>notromda</dc:creator>
            <category>Internals</category>
            <pubDate>Thu, 13 Nov 2008 08:55:03 -0800</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2507,2507#msg-2507</guid>
            <title>Remove tags without attribute (29 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2507,2507#msg-2507</link>
            <description><![CDATA[<p>Hello,</p>

<p>I would like to know if there is a way to remove tags that have no attributes.</p>

<p>For example:
I want to keep </p>

<pre>&lt;span class="myClass"&gt;</pre><p> but I don't want </p>

<pre>&lt;span&gt;</pre><p> to avoid span tags in my content with no utility.</p>

<p>Is there a way to do that?</p>

<p>Best regards.</p>]]></description>
            <dc:creator>xkobal</dc:creator>
            <category>Internals</category>
            <pubDate>Thu, 27 Aug 2009 17:43:12 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2406,2406#msg-2406</guid>
            <title>playlist.com filter (18 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2406,2406#msg-2406</link>
            <description><![CDATA[<p>Playlist.com allows users to look up music and add it to a music player playlist.</p>

<p>Is there a filter for this website out for your Html purifier?</p>

<p>It would not be very hard it is similar to the youtube filter.
I am just not experienced enough to go about creating it.</p>

<p>All it must do is look for the playlist id number under the embed line.
I can supply sample code if needed.</p>

<p>I have looked through YouTube.php and it is rather simple, sort of.
It does one or two things that I haven't seen in PHP code before,
such as the \1 notation.</p>

<p>Second question.
How do I go about adding support for style?
I tried a config line, but I wasn't sure how to set it up. The documentation doesn't really make it clear.</p>

<p>Edit: I just noticed this might be the wrong forum for this topic.</p>

<p>Not so much for the first question, but for the second one, yes. Sorry about that.</p>]]></description>
            <dc:creator>quicksnail</dc:creator>
            <category>Internals</category>
            <pubDate>Wed, 15 Oct 2008 08:41:57 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2274,2274#msg-2274</guid>
            <title>Current state of embedded content (13 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2274,2274#msg-2274</link>
            <description><![CDATA[<p>Hi, I am interested in working on a feature that allows users to:</p>

<p>1.) embed swf files</p>

<p>and more specifically</p>

<p>2.) embed the swf file of the JW Flv player (to show flv movies hosted on my server).</p>

<p>
I have been searching through the forums, trying to find out how much of this is already in place (so as not to reinvent the wheel). I have found various classes already made, but I don't know which, if any, is suited to my purpose:</p>

<p>1.) SafeEmbed and SafeObject</p>

<p>2.) The second post here: <a href="http://htmlpurifier.org/phorum/read.php?2,1102,1102">http://htmlpurifier.org/phorum/read.php?2,1102,1102</a>
 If this one works, do I need to apply the injector patch seen here:</p>

<p><a href="http://htmlpurifier.org/phorum/read.php?3,921,946#msg-946">http://htmlpurifier.org/phorum/read.php?3,921,946#msg-946</a></p>

<p>3.) Maybe a modified version of the youtube class you made?</p>

<p>
Thanks for getting me started in the right direction.</p>]]></description>
            <dc:creator>Tethers</dc:creator>
            <category>Internals</category>
            <pubDate>Mon, 15 Mar 2010 11:33:07 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2270,2270#msg-2270</guid>
            <title>Trying to improve movie filter configuration (5 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2270,2270#msg-2270</link>
            <description><![CDATA[<p>Hi Edward, I'm currently trying to improve on the movie filter in HTML Purifier.</p>

<p>So far I have created a new filter file called Movie.php (based on YouTube.php):</p>

<pre>
class HTMLPurifier_Filter_Movie extends HTMLPurifier_Filter
{
    
    public $name = 'Movie';
    
    public function preFilter($html, $config, $context, $movieurl, $height, $width) {
        // Escape the URL so that characters such as . and ? match literally
        $pre_regex = '#&lt;object[^&gt;]+&gt;.+?'.preg_quote($movieurl, '#').'([A-Za-z0-9\-_]+).+?&lt;/object&gt;#s';
        $pre_replace = '&lt;span class="movie-embed"&gt;\1&lt;/span&gt;';
        return preg_replace($pre_regex, $pre_replace, $html);
    }
    
    public function postFilter($html, $config, $context, $movieurl, $height, $width) {
        $post_regex = '#&lt;span class="movie-embed"&gt;([A-Za-z0-9\-_]+)&lt;/span&gt;#';
        $post_replace = '&lt;object width="'.$width.'" height="'.$height.'" '.
            'data="'.$movieurl.'\1"&gt;'.
            '&lt;param name="movie" value="'.$movieurl.'\1"&gt;&lt;/param&gt;'.
            '&lt;param name="wmode" value="transparent"&gt;&lt;/param&gt;'.
            ''.
            '&lt;/object&gt;';
        return preg_replace($post_regex, $post_replace, $html);
    }
    
}
</pre>

<p>As you can see, I've added $movieurl, $height, and $width parameters to the functions.</p>

<p>I'm thinking I could create new config options in HTML Purifier, so that the URL, height, and width can be set from the config script and then passed to the filter.</p>

<p>Something like extra options in the config:</p>

<pre>
$Config-&gt;set('HTML', 'MovieURL', 'url to movie');
$Config-&gt;set('HTML', 'MovieHeight', 425);
$Config-&gt;set('HTML', 'MovieWidth', 380);
</pre>

<p>These options could then be passed to any filter that requires them, for example when
$Config-&gt;set('Filter', 'Movie', true); is set.</p>

<p>I'm not sure how to add extra config options to HTML Purifier, though. I was wondering if you could provide some help in that regard.</p>

<p>Thanks,</p>

<p>Vaughan</p>]]></description>
            <dc:creator>vaughan</dc:creator>
            <category>Internals</category>
            <pubDate>Sun, 07 Sep 2008 21:32:34 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2268,2268#msg-2268</guid>
            <title>Error collecting/validation (no replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2268,2268#msg-2268</link>
            <description><![CDATA[<p>The main feature we're planning to ship with HTML Purifier 3.2 is improved error collection and reporting. The plan is to make things more line/column oriented, so every error will be bound to a specific line and column in the document. In theory, you should then be able to click an error and have your textarea jump to that position automatically.</p>

<p>This is a request for mockups for two different types of interfaces: a non-JavaScript interface and a JavaScript interface.</p>

<p>The non-JavaScript interface would be a normal listing of errors. If we look at the W3C validator, we can immediately rip and steal some ideas:</p>

<blockquote><div><em>Line 4, Column 5</em>: error text</div>
<pre>&lt;body<strong><u>&gt;</u></strong>Context HTML&lt;/body&gt;</pre>
<div>Explanation text</div></blockquote>

<p>I must admit, however, that I haven't used the validator very much and, when I have, found its interface clunky for fixing errors in my documents. Also, with a little more context-sensitivity we can give much more helpful messages than the W3C validator does; HTML Purifier knows the most common remedies for problems and can suggest them to users accordingly.</p>

<p>The JavaScript interface might be a regular text editor, but with icons and highlighted lines in places where there were errors. A user could click the icon to bring up the error console, and then make the appropriate changes and dismiss it. Admittedly, I don't know how feasible this is.</p>

<p>Think big, think grand. We'll pare things down with technical considerations later.</p>]]></description>
            <dc:creator>Ambush Commander</dc:creator>
            <category>Internals</category>
            <pubDate>Tue, 02 Sep 2008 11:23:38 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2267,2267#msg-2267</guid>
            <title>HTML5 (7 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2267,2267#msg-2267</link>
            <description><![CDATA[<p>As HTML5 continues its march to completion and tutorials for the language start popping up on the internet, we may want to consider its role in the HTML Purifier scheme of things. There are several ways we can treat it:</p>

<ul><li>HTML5 as a full-fledged document language. This means we treat it as if people were actually serving their pages as HTML5. This is not immediately useful, but will be once browsers start adding support for HTML5.</li>
<li>HTML5 as a meta-language that can be converted to HTML4. The premise behind this is we can turn HTML5 into HTML4 + CSS which will look the same as it would look in a browser that supports HTML5. So a user can write <code>&lt;l&gt;Line&lt;/l&gt;</code> and get back the valid HTML 4.01 code <code>&lt;div class="html5-l"&gt;Line&lt;/div&gt;</code>; the former is much shorter.</li>
</ul>

<p>The downside of doing either of these early-bird implementations is that we will have to be very careful to keep them in sync with the specification.</p>

<p>Are there any comments?</p>]]></description>
            <dc:creator>Ambush Commander</dc:creator>
            <category>Internals</category>
            <pubDate>Sat, 12 Jun 2010 20:57:38 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2244,2244#msg-2244</guid>
            <title>Limit possible attribute values (5 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2244,2244#msg-2244</link>
            <description><![CDATA[<p>Hello.</p>

<p>I was hoping it would be possible to restrict the values allowed for a given element's attribute.</p>

<p>For example, how would I limit div.class to only the values "code", "quote", or "plain"? Omitting the class attribute altogether should also be valid.</p>

<p>Thanks,
Eric</p>]]></description>
            <dc:creator>erisco</dc:creator>
            <category>Internals</category>
            <pubDate>Thu, 28 Aug 2008 09:38:18 -0700</pubDate>
        </item>
        <item>
            <guid>http://htmlpurifier.org/phorum/read.php?5,2239,2239#msg-2239</guid>
            <title>Trying to allow param 'flashvars' in object/embed (72 replies)</title>
            <link>http://htmlpurifier.org/phorum/read.php?5,2239,2239#msg-2239</link>
            <description><![CDATA[<p>Hi, I am new to HTML Purifier. I've been playing with it a lot this week and have started to feel overwhelmed. I am using 3.1.1 standalone. I tried things from the site and tried to add the links here, but the forum's validator said it was spam... lol.</p>

<p>Initial objective: I need to allow flashvars to pass through the purifier.</p>

<p>Current configuration:
</p>

<pre>
$config = HTMLPurifier_Config::createDefault();	

$config-&gt;set('HTML', 'DefinitionID', 'allow flash movies');
$config-&gt;set('HTML', 'DefinitionRev', 1);
$config-&gt;set('HTML', 'SafeEmbed', true); 
$config-&gt;set('HTML', 'SafeObject', true);

$config-&gt;set('Core', 'DefinitionCache', null);

$purifier = new HTMLPurifier($config);
$clean_html = $purifier-&gt;purify($dirty_html);
return $clean_html;

</pre>

<p>I'm just looking for some direction. The ultimate goal is to allow all object/embed/param elements to pass. HTML Purifier works on a whitelist philosophy, but I'm looking for more of a blacklist approach when it comes to Flash object/embed/param elements. However, it would seem that my config above is good enough, as long as I can get the flashvars param/attribute to pass.</p>]]></description>
            <dc:creator>Doohic</dc:creator>
            <category>Internals</category>
            <pubDate>Mon, 26 Apr 2010 08:46:02 -0700</pubDate>
        </item>
    </channel>
</rss>
