|
Tomasz Muras
Extending few classes August 18, 2010 04:48PM |
Hello everyone,
Moodle uses a modified version of HTML Purifier; you can see a diff between the Moodle and vanilla versions below. This is a bit problematic for me, because it forces me to use the version bundled with Moodle (I'm a maintainer in Debian), and I'd prefer to use your version, which is already packaged for Debian. The adjustments Moodle made are necessary, so to re-implement them I would need to extend a few classes:
* HTMLPurifier_AttrDef_Lang
* HTMLPurifier_HTMLModule_Text
* HTMLPurifier_HTMLModule_XMLCommonAttributes
Could you suggest a way to make HTML Purifier use the extended classes (that is, how could I inject them)? Of course, I would like to do it without modifying or patching the HTML Purifier code.
cheers, Tomek
diff -ru vanilla/HTMLPurifier/AttrDef/Lang.php moodle/HTMLPurifier/AttrDef/Lang.php
--- vanilla/HTMLPurifier/AttrDef/Lang.php 2010-06-01 04:22:39.000000000 +0100
+++ moodle/HTMLPurifier/AttrDef/Lang.php 2010-05-22 01:04:23.000000000 +0100
@@ -9,6 +9,10 @@
public function validate($string, $config, $context) {
+// moodle change - we use special lang strings unfortunatelly
+ return preg_replace('/[^0-9a-zA-Z_-]/', '', $string);
+// moodle change end
+
$string = trim($string);
if (!$string) return false;
diff -ru vanilla/HTMLPurifier/HTMLModule/Text.php moodle/HTMLPurifier/HTMLModule/Text.php
--- vanilla/HTMLPurifier/HTMLModule/Text.php 2010-06-01 04:22:39.000000000 +0100
+++ moodle/HTMLPurifier/HTMLModule/Text.php 2010-05-22 01:04:23.000000000 +0100
@@ -45,6 +45,13 @@
$this->addElement('span', 'Inline', 'Inline', 'Common');
$this->addElement('br', 'Inline', 'Empty', 'Core');
+ // Moodle specific elements - start
+ $this->addElement('nolink', 'Inline', 'Flow');
+ $this->addElement('tex', 'Inline', 'Flow');
+ $this->addElement('algebra', 'Inline', 'Flow');
+ $this->addElement('lang', 'Inline', 'Flow', 'I18N');
+ // Moodle specific elements - end
+
// Block Phrasal --------------------------------------------------
$this->addElement('address', 'Block', 'Inline', 'Common');
$this->addElement('blockquote', 'Block', 'Optional: Heading | Block | List', 'Common', array('cite' => 'URI') );
diff -ru vanilla/HTMLPurifier/HTMLModule/XMLCommonAttributes.php moodle/HTMLPurifier/HTMLModule/XMLCommonAttributes.php
--- vanilla/HTMLPurifier/HTMLModule/XMLCommonAttributes.php 2010-06-01 04:22:39.000000000 +0100
+++ moodle/HTMLPurifier/HTMLModule/XMLCommonAttributes.php 2010-05-22 01:04:23.000000000 +0100
@@ -5,9 +5,11 @@
public $name = 'XMLCommonAttributes';
public $attr_collections = array(
+/* moodle comment - xml:lang breaks our multilang
'Lang' => array(
'xml:lang' => 'LanguageCode',
)
+*/
);
}
diff -ru vanilla/HTMLPurifier/Lexer.php moodle/HTMLPurifier/Lexer.php
--- vanilla/HTMLPurifier/Lexer.php 2010-06-01 04:22:39.000000000 +0100
+++ moodle/HTMLPurifier/Lexer.php 2010-07-06 01:04:03.000000000 +0100
@@ -252,8 +252,10 @@
public function normalize($html, $config, $context) {
// normalize newlines to \n
- $html = str_replace("\r\n", "\n", $html);
- $html = str_replace("\r", "\n", $html);
+ if ($config->get('Output.Newline')!=="\n") {
+ $html = str_replace("\r\n", "\n", $html);
+ $html = str_replace("\r", "\n", $html);
+ }
if ($config->get('HTML.Trusted')) {
// escape convoluted CDATA
|
Re: Extending few classes August 18, 2010 10:43PM |
Admin Registered: 6 years ago Posts: 2,634 |
A few of those patches can be handled purely through configuration changes, à la http://htmlpurifier.org/docs/enduser-customize.html , but some of them are trickier.
HTMLPurifier/AttrDef/Lang.php: it should theoretically be possible to take the raw configuration, find the Lang module, and manually put in your new implementation of Lang.
HTMLPurifier/HTMLModule/Text.php: this is fully supported by the customization interface.
HTMLPurifier/HTMLModule/XMLCommonAttributes.php: you can just disallow xml:lang at the configuration level.
HTMLPurifier/Lexer.php: this one is confusing. Why do you need to preserve carriage returns if you've told HTML Purifier that the input uses UNIX-style line endings?
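For the two configuration-level cases (Text.php and XMLCommonAttributes.php), a minimal sketch of what this could look like follows. It is not Moodle's actual code. It assumes HTML Purifier is installed with `HTMLPurifier.auto.php` on the include path, and a release that provides `maybeGetRawHTMLDefinition()` (older releases documented `getHTMLDefinition(true)` instead). The definition ID `moodle-custom` is a made-up name, and whether forbidding `xml:lang` this way fully covers the multilang problem is untested.

```php
<?php
// Assumption: HTML Purifier's autoloader is reachable on the include path.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// A customized definition needs an ID and a revision so it can be cached.
// 'moodle-custom' is a hypothetical name chosen for this sketch.
$config->set('HTML.DefinitionID', 'moodle-custom');
$config->set('HTML.DefinitionRev', 1);

// Forbid xml:lang everywhere instead of patching XMLCommonAttributes.php.
$config->set('HTML.ForbiddenAttributes', 'xml:lang');

// Add the extra elements from the Text.php hunk of the diff.
// maybeGetRawHTMLDefinition() returns null when a cached definition is reused.
if ($def = $config->maybeGetRawHTMLDefinition()) {
    $def->addElement('nolink', 'Inline', 'Flow', array());
    $def->addElement('tex', 'Inline', 'Flow', array());
    $def->addElement('algebra', 'Inline', 'Flow', array());
    $def->addElement('lang', 'Inline', 'Flow', 'I18N');
}

$purifier = new HTMLPurifier($config);
echo $purifier->purify('<p>Keep <nolink>this</nolink> unlinked.</p>');
```

All `set()` calls come before the raw definition is fetched, since the configuration should not be changed once the definition has been built from it.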
|
Tomasz Muras
Re: Extending few classes August 19, 2010 04:23PM |
The code in Lexer.php was put there to disable line endings normalization altogether, see http://tracker.moodle.org/browse/MDL-22654 . Do you think it would make sense to make line endings normalization configurable? It would definitely be helpful in this case.
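To make the idea concrete, here is a stand-alone sketch of newline normalization gated by a boolean switch. The function name `normalizeNewlines` and its `$normalize` parameter are hypothetical stand-ins for the newline step of Lexer::normalize and a configuration directive; this is not the library's code.

```php
<?php
// Hypothetical helper mirroring the newline step of Lexer::normalize.
// $normalize stands in for a boolean configuration directive.
function normalizeNewlines($html, $normalize = true)
{
    if (!$normalize) {
        // Normalization disabled: hand the input back untouched.
        return $html;
    }
    // Same two replacements as the vanilla code in the diff above.
    $html = str_replace("\r\n", "\n", $html);
    $html = str_replace("\r", "\n", $html);
    return $html;
}

var_dump(normalizeNewlines("a\r\nb\rc") === "a\nb\nc");          // bool(true)
var_dump(normalizeNewlines("a\r\nb\rc", false) === "a\r\nb\rc"); // bool(true)
```

Unlike the current Moodle patch, which keys off the value of Output.Newline, an explicit switch keeps input normalization separate from the choice of output line ending.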
cheers, Tomek
|
Re: Extending few classes August 24, 2010 12:49AM |
Admin Registered: 6 years ago Posts: 2,634 |
|
Tomasz Muras
Re: Extending few classes August 24, 2010 04:09AM |
|
Re: Extending few classes August 24, 2010 01:12PM |
Admin Registered: 6 years ago Posts: 2,634 |
Sure. See http://htmlpurifier.org/docs/dev-config-schema.html for how to add a configuration directive; please check up on some existing directives to get a feel for style.
|
Tomasz Muras
Re: Extending few classes September 09, 2010 04:43PM |
Hello!
The patch to add a new configuration option that disables newline normalization (in library/HTMLPurifier/Lexer.php) is trivial, but I've run into a problem when adding a test:
$this->config->set('HTML.NewlineNormalization', false);
$input = "plain text\r\n";
$expect = array(
new HTMLPurifier_Token_Text("plain text\r\n")
);
$this->assertTokenization($input, $expect);
This works OK for DirectLex and DOMLex but fails for PH5P. This is because the HTML5 class rewrites line endings itself in its constructor:
$data = str_replace("\r\n", "\n", $data);
$data = str_replace("\r", null, $data);
I think these lines should be removed from the HTML5 class, since this is already done in Lexer::normalize anyway. Would you agree?
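A small stand-alone sketch of why the test fails for PH5P: it compares the two replacement strategies on the test input. The function names are hypothetical, and `''` is used in place of `null` as the replacement, which has the same effect in `str_replace`.

```php
<?php
// Newline handling as in Lexer::normalize: CRLF and lone CR both become LF.
function lexerStyleNewlines($data)
{
    $data = str_replace("\r\n", "\n", $data);
    return str_replace("\r", "\n", $data);
}

// Newline handling as in the two HTML5 constructor lines quoted above:
// CRLF becomes LF, and any remaining lone CR is dropped entirely.
function ph5pStyleNewlines($data)
{
    $data = str_replace("\r\n", "\n", $data);
    return str_replace("\r", '', $data);
}

// The test input loses its CR before tokenization, whatever the option says.
var_dump(ph5pStyleNewlines("plain text\r\n") === "plain text\n"); // bool(true)

// The two strategies also disagree on a lone carriage return.
var_dump(lexerStyleNewlines("a\rb") === "a\nb"); // bool(true)
var_dump(ph5pStyleNewlines("a\rb") === "ab");    // bool(true)
```

So even with normalization switched off in the lexer, the PH5P path would still hand back "plain text\n", which does not match the expected token.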
cheers, Tomek Muras
|
Re: Extending few classes September 11, 2010 02:56AM |
Admin Registered: 6 years ago Posts: 2,634 |