[Customization] Enable basic Javascript

Posted by counterpoint 
[Customization] Enable basic Javascript
April 11, 2007 01:39PM

In some circumstances I need to be able to allow basic Javascript within HTML. Can this be done by: $config->setAllowedHTML('script[type,src]');

Edited 1 time(s). Last edit at 04/17/2007 05:37PM by Ambush Commander.

Re: Enable basic Javascript
April 11, 2007 02:59PM

No, you'll have to define a script element in HTMLDefinition and give it the proper attribute definitions. I don't know why you're using HTML Purifier, however, if you allow scripting... HTML Purifier's all about removing that stuff.

HTML Purifier, Standards Compliant HTML Filtering

Re: Enable basic Javascript
April 11, 2007 05:00PM

Pity, that doesn't sound too easy.

I don't really see the logic of your comment, though. One of HTML Purifier's stated objectives is to "make sure your documents are standards compliant", which is still a requirement for users who can be trusted not to be malicious but cannot be trusted to write standards-compliant markup. This arises in the construction of a CMS, where administrators have to be able to enter HTML that includes scripts, but are not necessarily skilled in XHTML. For that reason, I'd like to be able to selectively permit JavaScript while still cleaning XHTML.

Re: Enable basic Javascript
April 11, 2007 05:39PM

I understand your point now. Inline scripting is not very good for making semantic webpages, but lessee...

<?php

/**
 * Implements required attribute stipulation for <script>
 */
class HTMLPurifier_AttrTransform_ScriptRequired extends HTMLPurifier_AttrTransform
{
    function transform($attr, $config, &$context) {
        if (!isset($attr['type'])) {
            $attr['type'] = 'text/javascript';
        }
        return $attr;
    }
}

/**
 * XHTML 1.1 Scripting module, defines elements that are used to contain
 * information pertaining to executable scripts or the lack of support
 * for executable scripts.
 * @note This module does not contain inline scripting elements
 */
class HTMLPurifier_HTMLModule_Scripting extends HTMLPurifier_HTMLModule
{
    var $name = 'Scripting';
    var $elements = array('script', 'noscript');
    var $content_sets = array('Block' => 'script | noscript', 'Inline' => 'script | noscript');
    
    function HTMLPurifier_HTMLModule_Scripting() {
        // TODO: create custom child-definition for noscript that
        // auto-wraps stray #PCDATA in a similar manner to 
        // blockquote's custom definition (we would use it but
        // blockquote's contents are optional while noscript's contents
        // are required)
        foreach ($this->elements as $element) {
            $this->info[$element] = new HTMLPurifier_ElementDef();
        }
        $this->info['noscript']->attr = array( 0 => array('Common') );
        $this->info['noscript']->content_model = 'Heading | List | Block';
        $this->info['noscript']->content_model_type = 'required';
        $this->info['script']->attr = array(
            'defer' => new HTMLPurifier_AttrDef_Enum(array('defer')),
            'src'   => new HTMLPurifier_AttrDef_URI(true),
            'type'  => new HTMLPurifier_AttrDef_Enum(array('text/javascript'))
        );
        $this->info['script']->content_model = '#PCDATA';
        $this->info['script']->content_model_type = 'optional';
        $this->info['script']->attr_transform_post['type'] =
            new HTMLPurifier_AttrTransform_ScriptRequired();
    }
}

?>

Use it like:

$def =& $config->getHTMLDefinition(true); // get the raw version
$def->manager->addModule('Scripting');

The class probably could have been shorter but when I do things I like to do them right. :-)

Oh, by the way, make sure you enclose all your scripts with <![CDATA[ ]]> tags. End browsers won't see them, but otherwise HTML Purifier will mangle any characters that have special meaning in HTML. Although then the browsers may mangle it. ARGH!!!

You may run into mysterious bugs when running the above code. Be warned!

Edited 4 time(s). Last edit at 04/11/2007 02:51PM by Ambush Commander.

Re: Enable basic Javascript
April 12, 2007 12:58PM

That is excellent! Thank you very much for providing it. I take your points, and agree, but there is no alternative to using scripts to include things like Google adverts. Maybe you don't like them either :) They are a fact of life, though!

Re: Enable basic Javascript
April 12, 2007 05:22PM

Actually, it is quite simple to implement Google Ads without any inline scripting. Inside an external JavaScript file, assign a "Google Ads linker" function to the onload event. This function would search for elements that are marked in some special way, say, with the class "google-ad", and insert the appropriate JavaScript code at runtime.
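
To make the idea concrete, here is a minimal sketch of such a linker. It is untested against the real ad script, and whether a particular ad script tolerates being injected after the page has loaded is something you would have to check yourself. The function name linkAds, the settings object and its scriptUrl and vars fields are all made up for illustration; only the "google-ad" class comes from the suggestion above.

```javascript
// Hypothetical "ads linker": fills every element with class "google-ad"
// with a <script> element pointing at the ad script.
// `doc` is a browser document (or anything offering the same two methods);
// `globals` is the object that receives the ad variables (window in a browser).
function linkAds(doc, globals, settings) {
  // Publish the ad variables where the ad script expects to find them.
  Object.assign(globals, settings.vars);

  let filled = 0;
  for (const slot of doc.querySelectorAll('.google-ad')) {
    const script = doc.createElement('script');
    script.type = 'text/javascript';
    script.src = settings.scriptUrl;
    slot.appendChild(script);
    filled++;
  }
  return filled; // number of placeholders that were filled
}

// In a browser, run the linker once the page has loaded.
if (typeof window !== 'undefined') {
  window.addEventListener('load', function () {
    linkAds(document, window, {
      scriptUrl: 'http://pagead2.googlesyndication.com/pagead/show_ads.js',
      vars: { google_ad_width: 120, google_ad_height: 600 }
    });
  });
}
```

The HTML that goes through HTML Purifier then only needs an empty div with the agreed class, which passes without any scripting module.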

Re: Enable basic Javascript
April 12, 2007 05:55PM

Although that is a very neat suggestion, it may fall short of solving the CMS problem. Since someone has to be able to enter the Google code through a browser at some point (since a significant element of the concept of a CMS is user management via browser) the problem of purification will still arise. Having a special mechanism for entry of pure scripts and nothing else would be unduly inflexible, and also hard to police. (I'm still concerned about policing quality of code here, not malevolence).

Re: Enable basic Javascript
April 12, 2007 09:06PM

It would be inflexible, but by narrowing the scope of things allowed you can maintain better quality code. Here's what I'm suggesting:

Standard AdSense code looks something like:

<script type="text/javascript"><!--
 google_ad_client = "pub-4086838842346968";
 google_alternate_ad_url = "http://wikia.com/skins/monobook/google_adsense_script.html";
 google_ad_width = 120;
 google_ad_height = 600;
 google_ad_format = "120x600_as";
 google_ad_channel ="";
 google_color_border = "FFFFFF";
 //google_color_bg = "FFFFFF";
 google_color_link = ["0000FF","000000"];
 google_color_url = "002BB8";
 google_color_text = "000000";
 google_hints = "";
 
//--></script>
<script type="text/javascript"
  src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
</script>

All of those variables are customizable, but there's no reason they couldn't be set independently of the HTML field, with the script include then represented by a <div class="google-ad" /> placeholder. If you wanted to, this could be performed transparently by preg_match'ing and then preg_replace'ing the relevant HTML sections out, or simply inserting them back in afterwards. Similar processes have been implemented for Flash videos.
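
For anyone wanting this spelled out, here is a rough sketch of the swap, written in JavaScript so it can be tried on its own; in PHP the same substitutions would presumably be done with preg_replace or preg_replace_callback. The function names (collapseAdCode, buildAdCode, expandAdSlots) and the shape of the settings object are invented for this example, the regexes are deliberately crude, and the variable names and script URL are assumed to come from trusted configuration rather than from the HTML field.

```javascript
// Matches the placeholder in both forms:
// <div class="google-ad" /> and <div class="google-ad"></div>
const AD_SLOT = /<div class="google-ad"\s*(?:\/>|>\s*<\/div>)/g;

// Matches a run of one or more adjacent <script>...</script> blocks.
const SCRIPT_RUN = /(?:<script\b[^>]*>[\s\S]*?<\/script>\s*)+/gi;

// Step 1 (before purifying): swap pasted ad code for a harmless placeholder.
function collapseAdCode(html) {
  return html.replace(SCRIPT_RUN, '<div class="google-ad"></div>');
}

// Build the ad markup from settings stored outside the HTML field.
function buildAdCode(settings) {
  const assignments = Object.entries(settings.vars)
    // JSON.stringify quotes strings; "<" is escaped so that a value can
    // never close the surrounding <script> element early.
    .map(([name, value]) =>
      `${name} = ${JSON.stringify(value).replace(/</g, '\\u003c')};`)
    .join('\n');
  return '<script type="text/javascript">\n' + assignments + '\n</script>\n' +
    '<script type="text/javascript" src="' + settings.scriptUrl + '"></script>';
}

// Step 2 (after purifying): expand each placeholder into the real ad code.
function expandAdSlots(html, settings) {
  const adCode = buildAdCode(settings);
  // A replacer function avoids "$" in the ad code being read as a pattern.
  return html.replace(AD_SLOT, () => adCode);
}
```

The intended order is collapseAdCode, then HTML Purifier, then expandAdSlots, so the purifier never sees a script element and the only script that reaches the page is the one generated from the stored settings.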

The only reason I'm being so insistent on this issue is that if the only thing you're allowing <script> for is Google ads, then there are other ways to implement them. However, if you want to give your users a blank check, then by all means, keep the script tags.

Felix
Re: Enable basic Javascript
January 21, 2018 07:49AM

Hi, I love your suggestion about adding the JavaScript using a div, with preg matching and replacing. I'm on a project where this is an issue. Could you please help by explaining your approach more clearly? Thanks.
