
HTMLPurifier + Markdown -> A lot of empty p tags

Posted by AzizLight 
January 15, 2010 10:52AM

Hello everybody, I am creating my own minimalist blog engine using CodeIgniter. I added HTMLPurifier and it works fine. I also use Markdown. Here is the plugin I use and some code to illustrate my problem:

The plugin: htmlpurifier_pi.php:

function purify($html)
{
	// Only trim strings: casting an array to string raises a notice and yields "Array"
	if (empty($html) || (is_string($html) && trim($html) === ''))
	{
		log_message('error','htmlpurifier_pi::purify : The html you sent to the HTML Purifier is empty...I wonder how is that possible...');
		return FALSE;
	}
	
	if (is_array($html))
	{
		foreach ($html as $key => $value)
		{
			$html[$key] = purify($value);
		}
		
		return $html;
	}
	else
	{
		require_once(APPPATH . 'plugins/htmlpurifier/HTMLPurifier.standalone.php'); 
		
		$allowed_tags = 'p,em,i,strong,b,a[href],ul,ol,li,code,pre,blockquote';
		
		$config = HTMLPurifier_Config::createDefault();
		$config->set('HTML.Doctype', 'XHTML 1.0 Strict');
		$config->set('HTML.Allowed', $allowed_tags);
		$config->set('HTML.TidyLevel', 'heavy');
		$config->set('AutoFormat.Linkify', 'true');
		$config->set('AutoFormat.AutoParagraph', 'true');
		$htmlpurifier = new HTMLPurifier($config);
		
		log_message('debug','HTML Purified!');
		return $htmlpurifier->purify($html);
	}
} // End of purify

The example:

$this->load->plugin('htmlpurifier_pi.php');
$body = purify(markdown($this->input->post('body')));

Now if I post the following comment:

This is a list:

* This is a list item
* This is another list item

1. This is the first item of an ordered list
2. This is the second item of an ordered list

Quote:

> Only idiots
> never change their minds

Here is the code that I get:

<p>This is a list:</p>

<p></p>

<ul><li>This is a list item</li>
<li>This is another list item</li>
</ul>

<ol><li>This is the first item of an ordered list</li>
<li>This is the second item of an ordered list</li>
</ol>

<p>Quote:</p>

<p></p>

<blockquote>
  <p>Only idiots
  never change their minds</p>
</blockquote>

See? I get a bunch of empty p tags. How could I get rid of those p tags please?

Re: HTMLPurifier + Markdown -> A lot of empty p tags
January 17, 2010 04:00AM
Re: HTMLPurifier + Markdown -> A lot of empty p tags
January 17, 2010 01:06PM

Thanks for your answer. I actually found out what the problem was.

$config->set('AutoFormat.AutoParagraph', 'true');

The paragraphs end up being wrapped twice, since Markdown already wraps them in p tags. My guess is that Markdown leaves some whitespace between paragraphs, and HTMLPurifier then wraps that whitespace in p tags of its own. Removing the AutoParagraph setting solves the problem.
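
For anyone landing here later, this is roughly what the configuration looks like with that line dropped. It is only a sketch: the function name build_markdown_purifier_config is made up for illustration, and it assumes HTMLPurifier.standalone.php has already been loaded the way the plugin above does it. The remaining settings are copied from the original plugin, except that Linkify gets a real boolean instead of the string 'true'.

```php
<?php
// Hypothetical helper (not part of the original plugin).
// Assumes the HTML Purifier library has already been required.
function build_markdown_purifier_config()
{
    $allowed_tags = 'p,em,i,strong,b,a[href],ul,ol,li,code,pre,blockquote';

    $config = HTMLPurifier_Config::createDefault();
    $config->set('HTML.Doctype', 'XHTML 1.0 Strict');
    $config->set('HTML.Allowed', $allowed_tags);
    $config->set('HTML.TidyLevel', 'heavy');
    $config->set('AutoFormat.Linkify', true);

    // AutoFormat.AutoParagraph is intentionally not set here:
    // Markdown has already wrapped the paragraphs in <p> tags.

    return $config;
}
```

Usage would then be new HTMLPurifier(build_markdown_purifier_config()) in place of the inline configuration.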

I'm finding that AutoFormat.RemoveEmpty works perfectly with

<p></p>

tagsets, but fails with the following examples:

<p>&nbsp; </p>
<p>\n</p>
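
One possible workaround for those two cases is to clean up the purified output yourself with a regular expression. The sketch below is plain PHP and does not touch HTML Purifier at all; strip_blank_paragraphs is a made-up name, and it assumes the input is the already purified HTML, so the markup is well formed. It treats a paragraph as blank when it contains nothing but whitespace, &nbsp; entities (named or numeric), or raw non-breaking space characters. HTML Purifier also documents an AutoFormat.RemoveEmpty.RemoveNbsp directive that may cover the &nbsp; case, depending on your version.

```php
<?php
// Hypothetical helper: removes <p> elements that contain only whitespace
// and/or non-breaking spaces, plus any whitespace that follows them.
function strip_blank_paragraphs($html)
{
    // <p\b[^>]*>  matches <p> with optional attributes, but not <pre>
    // The group matches whitespace, &nbsp;, &#160;, &#xA0; or a raw U+00A0
    $pattern = '~<p\b[^>]*>(?:\s|&nbsp;|&#160;|&#xA0;|\x{00A0})*</p>\s*~iu';

    $result = preg_replace($pattern, '', $html);

    // preg_replace returns null on error (for example invalid UTF-8 input);
    // in that case hand back the input unchanged.
    return $result === null ? $html : $result;
}
```

You would call it on the return value of purify(), for example strip_blank_paragraphs(purify(markdown($text))).
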
Re: HTMLPurifier + Markdown -> A lot of empty p tags
July 11, 2012 10:33AM