|
Jörg Ludwig
Bug: cuts off html after 8 kbyte with special chars June 29, 2011 11:14AM |
We use HTML Purifier to clean up HTML mails from customers before displaying them. Under certain circumstances an ISO-8859-1 HTML string is cut off in the middle. The following script reproduces the problem:
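<?php
require_once "HTMLPurifier.auto.php";
// Presumably the entity &#8364; (7 bytes, which matches "in: 50007" below);
// the forum rendered it as a literal Euro sign.
$in = "&#8364;".str_repeat(".", 50000);
$cfg = HTMLPurifier_Config::createDefault();
$cfg->set("Core.Encoding", "iso-8859-1");
$purifier = new HTMLPurifier($cfg);
$out = $purifier->purify($in);
// Print input and output lengths, then the purified string itself.
echo "in: ".strlen($in)." ";
echo "out: ".strlen($out)." ";
echo $out;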
Output:
in: 50007 out: 8159 ................... [...]
Expected Output:
in: 50007 out: 50007 [Euro symbol]............ [...]
The problem does not occur with the encoding set to UTF-8. Unfortunately, we cannot simply convert the input to UTF-8, because the encoding is also declared in the HTML header of the input string.
|
Re: Bug: cuts off html after 8 kbyte with special chars June 29, 2011 10:36PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Jörg Ludwig
Re: Bug: cuts off html after 8 kbyte with special chars July 04, 2011 10:31AM |
|
Re: Bug: cuts off html after 8 kbyte with special chars July 04, 2011 10:48AM |
Admin Registered: 6 years ago Posts: 2,632 |
What happens if you set %Core.EscapeNonASCIICharacters to true?
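For anyone wanting to try that suggestion, here is a minimal sketch of the reproduction script with the directive switched on. As far as I understand it, %Core.EscapeNonASCIICharacters turns non-ASCII characters into numeric entities before the output is converted back to the target encoding. The input string assumes the original test used the entity &#8364; (7 bytes, which would match "in: 50007"); whether this avoids the truncation on a given system is not something this sketch claims.

```php
<?php
require_once "HTMLPurifier.auto.php";

$cfg = HTMLPurifier_Config::createDefault();
$cfg->set("Core.Encoding", "iso-8859-1");
// Escape non-ASCII characters to numeric entities before the
// purified output is converted back to ISO-8859-1.
$cfg->set("Core.EscapeNonASCIICharacters", true);

$purifier = new HTMLPurifier($cfg);

// Assumed test input: Euro entity followed by 50000 dots.
$in = "&#8364;" . str_repeat(".", 50000);
$out = $purifier->purify($in);

// Compare the lengths to see whether the output was cut off.
echo "in: " . strlen($in) . " out: " . strlen($out) . "\n";
```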
|
Jörg Ludwig
Re: Bug: cuts off html after 8 kbyte with special chars July 08, 2011 07:03AM |
|
Re: Bug: cuts off html after 8 kbyte with special chars July 08, 2011 08:05AM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Bug: cuts off html after 8 kbyte with special chars December 18, 2011 01:47PM |
Admin Registered: 6 years ago Posts: 2,632 |
Looks like this bug: https://bugs.php.net/bug.php?id=48147
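If the truncation really comes from iconv's //IGNORE handling, one possible way around it is to make sure nothing unrepresentable reaches iconv in the first place. The sketch below is only an illustration under that assumption: `utf8_to_latin1_escaped` is a hypothetical helper (not part of HTML Purifier), and it needs the mbstring extension. It replaces every code point above U+00FF with a decimal numeric entity and then converts without //IGNORE.

```php
<?php
// Hypothetical helper, not part of HTML Purifier: convert UTF-8 text to
// ISO-8859-1 without relying on iconv's //IGNORE flag.
function utf8_to_latin1_escaped(string $utf8): string
{
    // Turn every code point above U+00FF into a decimal numeric entity,
    // so nothing unrepresentable is left for iconv to skip.
    $map = [0x100, 0x10FFFF, 0, 0x1FFFFF];
    $escaped = mb_encode_numericentity($utf8, $map, 'UTF-8');

    // Plain conversion; iconv returns false if the input is not valid UTF-8.
    $latin1 = iconv('UTF-8', 'ISO-8859-1', $escaped);
    if ($latin1 === false) {
        throw new RuntimeException('Input is not valid UTF-8');
    }
    return $latin1;
}

// Euro sign (U+20AC) followed by 50000 dots, as UTF-8.
$in = "\u{20AC}" . str_repeat('.', 50000);
$out = utf8_to_latin1_escaped($in);

// Print the length of the converted string.
echo strlen($out), "\n";
```

The Euro sign ends up as &#8364; in the output, which is the same form the expected output in the first post has. Characters that do exist in ISO-8859-1 (such as é) are converted normally and not escaped.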
|
Re: Bug: cuts off html after 8 kbyte with special chars December 18, 2011 05:36PM |
Admin Registered: 6 years ago Posts: 2,632 |
I just submitted two upstream bugs on this issue:
|
Re: Bug: cuts off html after 8 kbyte with special chars December 25, 2011 09:58AM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Bug: cuts off html after 8 kbyte with special chars February 17, 2012 05:42AM |
Registered: 3 years ago Posts: 61 |
I just submitted two upstream bugs on this issue:
http:// sources.redhat <dot> com/bugzilla/show_bug.cgi?id=13518
http:// sources.redhat <dot> com/bugzilla/show_bug.cgi?id=13517
Just want to add to this list, since I was reading through the related bugs and ended up searching for it myself:
http:// sources.redhat <dot> com/bugzilla/show_bug.cgi?id=13541
That's the follow-up to Bug 13518. Maybe this'll save someone the search! :)
(spaces and <dot>s so Akismet doesn't spam-file me.)
(Edit: Fixed formatting after an HTML encoding bug ravaged the forum ^-^)
Edited 1 time(s). Last edit at 07/30/2012 01:51PM by pinkgothic.