Nested List Items

Posted by Tim Koop 
March 21, 2011 01:37PM

I'm trying to purify this HTML:

<ol>
    <li>First Main</li>
    <ol>
        <li>first indented</li>
        <li>second indented</li>
    </ol>
    <li>Second Main</li>
</ol>

It comes out like this:

<ol>
    <li>First Main</li>
    <li><ol>
        <li>first indented</li>
        <li>second indented</li>
    </ol></li>
    <li>Second Main</li>
</ol>

The HTML it produces looks wrong to me: the nested list ends up in a list item of its own instead of under "First Main".

Is there some configuration I can set to get it working the way I want it to?

Thanks.

-- Tim

Re: Nested List Items
March 21, 2011 01:43PM

ol is not a valid child of ol, so the former HTML is invalid. There's no configuration knob, unfortunately. I recommend looking into changing your CSS styling so the latter renders like the former. Another classic trick is to place the ol inside the preceding li.

BugSlayer
Re: Nested List Items
September 07, 2011 05:00PM

Same problem, but the problem runs much deeper than it looks. Using contenteditable in Firefox produces invalid nested list markup (exactly like what the OP showed). The easiest way to observe this would be to play with cleditor at http://premiumsoftware.net/cleditor/ (this is just a thin UI layer around contenteditable). Click the list button, press enter, press the indent button, view source.

At the end of the day this is a major use case in the latest version of a major browser, but the purified markup does not correctly reflect the intent of the input markup.

The ideal result (best reflects the most likely intent of whoever made the markup) would be this:

<ol>
    <li>First Main
        <ol>
            <li>first indented</li>
            <li>second indented</li>
        </ol>
    </li>
    <li>Second Main</li>
</ol>

I looked at the code but implementing the above seemed like it would be very tricky, given how the algorithm is currently constructed.
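For anyone who wants to see the restructuring step on its own, here is a minimal sketch using PHP's DOM extension. It is not how HTML Purifier's algorithm works, and `nestListsInPrecedingItem` is a hypothetical helper name, not a library function. It assumes the markup is already parsed into a DOMDocument, and it only moves nodes around; it does no sanitizing, so the result still needs to go through the purifier.

```php
<?php
/**
 * Hypothetical helper (not part of HTML Purifier): moves each <ol>/<ul> that
 * sits directly inside another list into the <li> that precedes it.
 */
function nestListsInPrecedingItem(DOMDocument $doc): void
{
    // Snapshot all lists up front; getElementsByTagName() returns a live list.
    $lists = array_merge(
        iterator_to_array($doc->getElementsByTagName('ol'), false),
        iterator_to_array($doc->getElementsByTagName('ul'), false)
    );

    foreach ($lists as $list) {
        $lastItem = null;
        // Snapshot the children too, because nodes are moved during the loop.
        foreach (iterator_to_array($list->childNodes, false) as $child) {
            if (!$child instanceof DOMElement) {
                continue; // skip whitespace text nodes and comments
            }
            if ($child->tagName === 'li') {
                $lastItem = $child;
            } elseif ($child->tagName === 'ol' || $child->tagName === 'ul') {
                if ($lastItem === null) {
                    // Assumption of this sketch: with no preceding item,
                    // wrap the stray list in a new, empty <li>.
                    $lastItem = $list->insertBefore($doc->createElement('li'), $child);
                }
                $lastItem->appendChild($child); // appendChild() moves the node
            }
        }
    }
}

// Usage with the markup from the original post.
$html = '<ol><li>First Main</li><ol><li>first indented</li>'
      . '<li>second indented</li></ol><li>Second Main</li></ol>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

nestListsInPrecedingItem($doc);
echo $doc->saveHTML();
```

How the parser itself treats the invalid input can vary between libxml versions, so check the serialized output in your own environment before relying on it.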

There is another possible solution/workaround/hack that is VERY easy to code. This is to add an option to explicitly allow this very wrong, but still common, list structure. This requires only the following tweak in HTMLPurifier_HTMLModule_List::setup():

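        // HTML.AllowInvalidListNesting is a proposed directive, not a stock one;
        // it must be defined in the config schema before $config->get() accepts it.
        if ($config->get('HTML.AllowInvalidListNesting')) {
            // Permissive mode: a list may contain nested lists directly.
            $ol = $this->addElement('ol', 'List', 'Required: ol | ul | li', 'Common');
            $ul = $this->addElement('ul', 'List', 'Required: ol | ul | li', 'Common');
        } else {
            // Standard mode: only li children; a stray nested list gets wrapped in a new li.
            $ol = $this->addElement('ol', 'List', 'Required: li', 'Common');
            $ol->wrap = 'li';
            $ul = $this->addElement('ul', 'List', 'Required: li', 'Common');
            $ul->wrap = 'li';
        }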

This doesn't give the same benefit as fully valid markup, but the behavior of the invalid markup in the browser matches the original intent, and this minor deviation shouldn't have any security implications so it is "better".

Re: Nested List Items
September 22, 2011 09:51AM

@BugSlayer: You've just saved me hours' worth of headaches. Have a lifetime supply of gold stars! :D

@Ambush Commander: Any chance that directive (HTML.AllowInvalidListNesting) and BugSlayer's patch might make it into the official code?

Edited 1 time(s). Last edit at 07/30/2012 01:59PM by pinkgothic.

flipergebet
Re: Nested List Items
October 06, 2011 08:08AM

I found this fix very useful and have applied it in my own project. I would second the call for it to make it into core.

Re: Nested List Items
October 07, 2011 12:43PM

Yeah, we can probably add a "violate standards compliance mode for usability" mode.

Re: Nested List Items
December 26, 2011 01:00AM

So, I was hacking up the horrible hack when I realized it actually wouldn't be too difficult to do this properly. So I did. Expect it in 4.3.1.

Re: Nested List Items
February 10, 2012 08:32AM

As a heads-up, it seems some(?) versions of Internet Explorer also construct awful things like:

<ol><ol>value</ol></ol>

I imagine you don't want to allow for that, though, but I figured it's worth mentioning.

(Edit: Fixed formatting after an HTML escaping bug ravaged the forum.)

Edited 1 time(s). Last edit at 07/30/2012 02:00PM by pinkgothic.

Re: Nested List Items
February 10, 2012 10:35AM

Aah, fantastic! I didn't realise it'd be able to deal with that construct, too. Should have tried. Thanks for the heads-up! Very nice result.
