|
Tim Koop
Nested List Items March 21, 2011 01:37PM |
I'm trying to purify this HTML:
<ol>
<li>First Main</li>
<ol>
<li>first indented</li>
<li>second indented</li>
</ol>
<li>Second Main</li>
</ol>
It comes out like this:
<ol>
<li>First Main</li>
<li><ol>
<li>first indented</li>
<li>second indented</li>
</ol></li>
<li>Second Main</li>
</ol>
When I look at the HTML it produces, I think it is wrong: the nested list ends up in its own list item instead of belonging to "First Main".
Is there some configuration I can set to get it working the way I want it to?
Thanks.
-- Tim
|
Re: Nested List Items March 21, 2011 01:43PM |
Admin Registered: 6 years ago Posts: 2,640 |
|
BugSlayer
Re: Nested List Items September 07, 2011 05:00PM |
Same problem, but the problem runs much deeper than it looks. Using contenteditable in Firefox produces invalid nested list markup (exactly like what the OP showed). The easiest way to observe this would be to play with cleditor at http://premiumsoftware.net/cleditor/ (this is just a thin UI layer around contenteditable). Click the list button, press enter, press the indent button, view source.
At the end of the day this is a major use case in the latest version of a major browser, but the purified markup does not correctly reflect the intent of the input markup.
The ideal result (best reflects the most likely intent of whoever made the markup) would be this:
<ol>
<li>First Main
<ol>
<li>first indented</li>
<li>second indented</li>
</ol>
</li>
<li>Second Main</li>
</ol>
I looked at the code but implementing the above seemed like it would be very tricky, given how the algorithm is currently constructed.
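To make the intended transformation concrete, here is a minimal sketch of it using PHP's DOM extension instead of HTML Purifier's token-based algorithm. It is not HTML Purifier code: foldNestedLists() is a made-up helper name, and the input is loaded with loadXML() only because this particular example happens to be well-formed XML.

```php
<?php
// Sketch only: fold a list that is a direct child of another list into the
// preceding <li>. foldNestedLists() is a hypothetical helper, not library API.
function foldNestedLists(DOMDocument $doc): void
{
    $xpath = new DOMXPath($doc);
    // Collect every ol/ul whose parent is itself an ol/ul (document order).
    $nested = iterator_to_array(
        $xpath->query('//ol/ol | //ol/ul | //ul/ol | //ul/ul')
    );

    foreach ($nested as $list) {
        // Find the closest preceding element sibling, skipping text nodes.
        $prev = $list->previousSibling;
        while ($prev !== null && $prev->nodeType !== XML_ELEMENT_NODE) {
            $prev = $prev->previousSibling;
        }

        if ($prev !== null && $prev->nodeName === 'li') {
            // Usual case: the misplaced list belongs to the item before it.
            $prev->appendChild($list);
        } else {
            // No preceding item: fall back to wrapping the list in a new <li>.
            $li = $doc->createElement('li');
            $list->parentNode->insertBefore($li, $list);
            $li->appendChild($list);
        }
    }
}

$doc = new DOMDocument();
$doc->loadXML(
    '<ol><li>First Main</li>'
    . '<ol><li>first indented</li><li>second indented</li></ol>'
    . '<li>Second Main</li></ol>'
);
foldNestedLists($doc);
echo $doc->saveXML($doc->documentElement), "\n";
```

Run against the markup from the first post (with whitespace removed), this moves the inner list inside the "First Main" item. The fallback branch produces the same wrapping that the purifier currently applies everywhere.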
There is another possible solution/workaround/hack that is VERY easy to code. This is to add an option to explicitly allow this very wrong, but still common, list structure. This requires only the following tweak in HTMLPurifier_HTMLModule_List::setup():
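// HTML.AllowInvalidListNesting is the proposed directive; it does not exist yet.
if ($config->get('HTML.AllowInvalidListNesting')) {
    // Permissive: allow ol/ul as direct children of a list.
    $ol = $this->addElement('ol', 'List', 'Required: ol | ul | li', 'Common');
    $ul = $this->addElement('ul', 'List', 'Required: ol | ul | li', 'Common');
} else {
    // Strict: only li children; a misplaced list gets wrapped in an li.
    $ol = $this->addElement('ol', 'List', 'Required: li', 'Common');
    $ol->wrap = "li";
    $ul = $this->addElement('ul', 'List', 'Required: li', 'Common');
    $ul->wrap = "li";
}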
This doesn't give the same benefit as fully valid markup, but the invalid markup renders in the browser the way the author intended, and this minor deviation shouldn't have any security implications, so it is "better" than the current output.
|
Re: Nested List Items September 22, 2011 09:51AM |
Registered: 3 years ago Posts: 61 |
@BugSlayer: You've just saved me hours worth of headaches. Have a lifetime supply of gold stars! :D
@Ambush Commander: Any chance that directive (HTML.AllowInvalidListNesting) and BugSlayer's patch might make it into the official code?
Edited 1 time(s). Last edit at 07/30/2012 01:59PM by pinkgothic.
|
flipergebet
Re: Nested List Items October 06, 2011 08:08AM |
|
Re: Nested List Items October 07, 2011 12:43PM |
Admin Registered: 6 years ago Posts: 2,640 |
|
Re: Nested List Items December 26, 2011 01:00AM |
Admin Registered: 6 years ago Posts: 2,640 |
|
Re: Nested List Items February 10, 2012 08:32AM |
Registered: 3 years ago Posts: 61 |
As a heads-up, it seems some(?) versions of Internet Explorer also construct awful things like:
<ol><ol>value</ol></ol>
I imagine you don't want to allow for that, but I figured it's worth mentioning.
(Edit: Fixed formatting after an HTML escaping bug ravaged the forum.)
Edited 1 time(s). Last edit at 07/30/2012 02:00PM by pinkgothic.
|
Re: Nested List Items February 10, 2012 09:19AM |
Admin Registered: 6 years ago Posts: 2,640 |
The fix is in HTML Purifier 4.4.0. I think the behavior here is now pretty reasonable:
|
Re: Nested List Items February 10, 2012 10:35AM |
Registered: 3 years ago Posts: 61 |