|
Pound signs removed December 19, 2011 07:46AM |
Registered: 1 year ago Posts: 4 |
$desc = html_entity_decode($_POST['desc']);
require_once './HTMLPurifier.standalone.php';
$config = HTMLPurifier_Config::createDefault();
$config->set('Core.Encoding', 'utf-8');
$config->set('Core.EscapeNonASCIICharacters', true);
$config->set('HTML.Allowed', 'p,span,em,ul,ol,li');
$config->set('AutoFormat.RemoveEmpty', 'true');
$purifier = new HTMLPurifier($config);
$desc = $purifier->purify($desc);
Using this code, pound signs are removed. Where am I going wrong?
|
Re: Pound signs removed December 19, 2011 11:03AM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Pound signs removed December 20, 2011 04:12AM |
Registered: 1 year ago Posts: 4 |
|
Re: Pound signs removed December 20, 2011 12:52PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Pound signs removed February 17, 2012 05:14AM |
Registered: 3 years ago Posts: 61 |
In my experience, html_entity_decode() and htmlspecialchars_decode() are signs that the code is doing something it should not. I hope you don't mind me explaining why in your topic, DavidIanWaters. I can't guarantee it applies to your case, but hear me out:
When you use a JavaScript editor to edit HTML, and you want to load pre-existing HTML into said editor, you should be doing it like this:
<textarea id="editor"><?php echo htmlspecialchars($htmlToEdit, ...); ?></textarea>
Reason: Even ignoring that you obviously don't want anyone breaking out of your editor textarea by supplying </textarea>, what you want between your <textarea>-tags is plaintext. Imagine the editor isn't being loaded. You want to see the HTML source, right? So you want to treat your data as plain text - and you're outputting it into HTML, so you need to escape it like you would any other plain text.
The editor will take this plaintext and interpret it as HTML once more (which is where things get confusing for a lot of developers). For this, it doesn't need to decode the text any more than you would need to turn a &gt; into > by hand. It sees what you would.
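To make the escaping step concrete, here is a minimal sketch. The sample value of $htmlToEdit, the ENT_QUOTES flag and the 'UTF-8' charset are assumptions for illustration, and htmlspecialchars_decode() is used only to imitate what the browser does when it parses the page:

```php
<?php
// Hypothetical stored HTML that should be loaded into the editor.
$htmlToEdit = '<p>5 &gt; 4</p></textarea><script>alert(1)</script>';

// Escape the HTML source as plain text before putting it into the page.
// ENT_QUOTES and 'UTF-8' are assumed here; use whatever matches your site.
$escaped = htmlspecialchars($htmlToEdit, ENT_QUOTES, 'UTF-8');

// The markup that would be sent to the browser.
$field = '<textarea id="editor">' . $escaped . '</textarea>';

// Stand-in for the browser: parsing the page decodes the entities exactly
// once, so the editor sees the original HTML source again.
$seenByEditor = htmlspecialchars_decode($escaped, ENT_QUOTES);

echo $field, "\n";
```

The only real </textarea> in $field is the one you wrote yourself, and $seenByEditor is identical to $htmlToEdit, so nothing on the server needs to decode anything.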
Now, when a browser triggers a form send, it will send the plaintext. This is more obvious if you consider a normal input field:
<input name="foo" type="text" value="5 &gt; 4" />
After the form is sent, this will arrive server-side as $_REQUEST: array('foo' => '5 > 4') without you needing to decode it first. The browser sending the form has already decoded it for you.
The exact same behaviour is true for:
<textarea name="foo">5 &gt; 4</textarea>
This, too, will arrive in your script as $_REQUEST: array('foo' => '5 > 4') without you needing to touch it with a decode.
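A rough way to see this without a browser: for a normal urlencoded form post, the request body carries the already-decoded text, only URL-encoded. The body string below is an assumption about what a browser would send for either field above, and parse_str() is a stand-in for what PHP does when it fills $_POST:

```php
<?php
// Assumed request body for the fields above: the browser has already
// resolved the entity to > and then URL-encoded it (space as +, > as %3E).
$body = 'foo=5+%3E+4';

// parse_str() splits and URL-decodes the body, much like PHP does for $_POST.
parse_str($body, $request);

var_dump($request['foo']); // string(5) "5 > 4"
```

There is no entity anywhere in the request, so there is nothing left for html_entity_decode() to do.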
So... if you are decoding after you've gotten anything from the browser, please carefully analyse what you are doing.
If you sanitise your HTML after you erroneously decode it, of course you're still safe from XSS and other awful things when you output it again :) but chances are that you're breaking the document structure in some way.
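A small sketch of how an extra decode can break the structure. The sample input is made up, and the flags and charset passed to html_entity_decode() are assumptions:

```php
<?php
// Hypothetical submission: a paragraph that mentions the <b> tag as text,
// so the editor has correctly sent it escaped.
$submitted = '<p>Use &lt;b&gt; for bold text.</p>';

// The superfluous decode turns the escaped text into a real, unclosed tag.
$decoded = html_entity_decode($submitted, ENT_QUOTES, 'UTF-8');

echo $decoded, "\n"; // <p>Use <b> for bold text.</p>
```

A sanitiser run on $decoded now sees a stray <b> element rather than the text the user wrote, so whatever it does with it, the stored document is no longer what was typed.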
Please reconsider that call:
$desc = html_entity_decode($_POST['desc']);
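For what it's worth, one possible explanation for the missing pound signs (an assumption, not something confirmed in this thread): before PHP 5.4, html_entity_decode() defaulted to ISO-8859-1, so &pound; would become the single byte 0xA3, which is not valid UTF-8, while Core.Encoding is set to utf-8. The sketch below passes the charsets explicitly to show the difference. The sample input is made up, and preg_match() with the u modifier is only used as a cheap UTF-8 validity check:

```php
<?php
// Hypothetical submitted value containing a pound sign entity.
$posted = '&pound;5';

// Decoding to ISO-8859-1 (the pre-5.4 default) yields the lone byte 0xA3.
$latin1 = html_entity_decode($posted, ENT_QUOTES, 'ISO-8859-1');

// Decoding to UTF-8 yields the two-byte sequence 0xC2 0xA3.
$utf8 = html_entity_decode($posted, ENT_QUOTES, 'UTF-8');

// preg_match() with /u fails on malformed UTF-8, so it doubles as a check.
$latin1IsValidUtf8 = preg_match('//u', $latin1) === 1;
$utf8IsValidUtf8   = preg_match('//u', $utf8) === 1;

echo bin2hex($latin1), ' valid UTF-8: ', var_export($latin1IsValidUtf8, true), "\n";
echo bin2hex($utf8), ' valid UTF-8: ', var_export($utf8IsValidUtf8, true), "\n";
```

If the decode is dropped altogether, $_POST['desc'] goes straight into $purifier->purify() and the charset of html_entity_decode() never comes into play.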
(Edit: Fixed formatting after an HTML escaping issue ravaged the forum.)
Edited 1 time(s). Last edit at 07/30/2012 01:55PM by pinkgothic.