
YouTube iFrames are stripped out of valid HTML

Posted by mrfr33z3 
August 06, 2014 02:25PM

I have content that is being stored in a DB, where the iframe tags are properly escaped with HTML entities. I'm then decoding this content and passing it to HTML Purifier with PHP in the form of:

html_entity_decode("Some HTML");

This results in a usable bit of HTML which, if I write it directly into a page, works fine. However, once it is passed through Purify, the iFrames are stripped out.

purify(html_entity_decode("Some HTML"));

I need to be able to support both YouTube and Vimeo in our application. I have added the appropriate configuration options like so:

$config->set('HTML.SafeIframe', true);
$config->set('URI.SafeIframeRegexp', '%^(\/\/www\.youtube(?:-nocookie)?\.com\/embed\/|\/\/player\.vimeo\.com\/)%');

And have tested the regex supplied above like so:

$regex = '%^(\/\/www\.youtube(?:-nocookie)?\.com\/embed\/|\/\/player\.vimeo\.com\/)%';
$uri = "//www.youtube.com/embed/d9fg87f798f";
preg_match($regex, $uri, $matches);
print_r($matches);
// Outputs: Array ( [0] => //www.youtube.com/embed/ [1] => //www.youtube.com/embed/ )
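
The same check can be run against a few more URI shapes. This is only a sketch and not part of the original post: the helper name isAllowedIframeSrc and the sample URIs are made up for illustration. It suggests that the pattern, anchored with ^ to a leading //, accepts only protocol-relative URLs, so an embed URL written with an explicit https: scheme would not pass it as written.

```php
<?php
// Same pattern as in the configuration above.
$regex = '%^(\/\/www\.youtube(?:-nocookie)?\.com\/embed\/|\/\/player\.vimeo\.com\/)%';

// Hypothetical helper: true when the URI matches the whitelist pattern.
function isAllowedIframeSrc(string $regex, string $uri): bool
{
    return preg_match($regex, $uri) === 1;
}

// Sample URIs are illustrative only.
$samples = [
    '//www.youtube.com/embed/d9fg87f798f',
    '//www.youtube-nocookie.com/embed/d9fg87f798f',
    '//player.vimeo.com/video/12345',
    'https://www.youtube.com/embed/d9fg87f798f', // explicit scheme, so the ^ anchor fails
    '//evil.example.com/embed/',
];

foreach ($samples as $uri) {
    echo $uri, ' => ', isAllowedIframeSrc($regex, $uri) ? 'allowed' : 'rejected', PHP_EOL;
}
```

If embeds with an explicit scheme need to be accepted too, the pattern would have to be extended; whether that is wanted depends on what the editor actually emits.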

iFrames are also allowed by my current configuration:

$config->set('HTML.Allowed', 'p,a[href|title],abbr[title],acronym[title],b,strong,blockquote[cite],code,em,i,iframe[src|width|height],img[alt|title|class|src|height|width],h1,h2,h3,h3,ol,ul,li,table,tr,td,hr');

Any ideas? I'm grateful for any input on this. Thanks in advance.

Re: YouTube iFrames are stripped out of valid HTML
August 08, 2014 08:04AM

Can you post the input HTML (i.e. a var_dump() of it before it is passed to purify) and what you get out?

Re: YouTube iFrames are stripped out of valid HTML
August 08, 2014 08:38AM

Sure thing...

This is how the HTML is being stored in the DB, as it is posted by TinyMCE and then cleaned by CodeIgniter's XSS implementation:

<p>&lt;iframe src="//www.youtube.com/embed/Jrd5HVb_lZQ" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"&gt;&lt;/iframe></p>

When retrieved from the DB, it is the same, and when passed through html_entity_decode, it looks fine, like this:

<p><iframe src="//www.youtube.com/embed/OL8LREmbDi0" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
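
The decoding step can be reproduced in isolation. A minimal sketch, assuming the stored value is exactly the string shown for the DB row above; the variable names are illustrative and not from the application:

```php
<?php
// Value as stored in the DB (copied from the post above).
$stored = '<p>&lt;iframe src="//www.youtube.com/embed/Jrd5HVb_lZQ" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"&gt;&lt;/iframe></p>';

// html_entity_decode() turns &lt; and &gt; back into literal angle brackets.
$decoded = html_entity_decode($stored);

// Dump the exact string that would be handed to purify().
var_dump($decoded);

// Confirm that a real <iframe> element is present after decoding.
var_dump(strpos($decoded, '<iframe src="//www.youtube.com/embed/') !== false);
```

This only confirms that the input reaching the purifier contains a real iframe element, which points the investigation at the purifier configuration rather than at the decoding.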

I've tried debugging by taking the string from the DB and passing it through purify in the same request. When I do, I don't get the paragraph block with the iFrame in it at all. It's getting stripped out completely. If there is other HTML in the string, that comes through fine. It's just the block that includes the iFrame.

Re: YouTube iFrames are stripped out of valid HTML
August 08, 2014 08:56AM

What happens if you purify with no config except %HTML.SafeIframe and %URI.SafeIframeRegexp set to allow everything (e.g. empty string)?
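
A stripped-down test along those lines might look like the sketch below. Two things here are assumptions rather than part of the suggestion: the include path depends on how HTML Purifier is installed, and a catch-all pattern is swapped in for the empty string, since PHP's preg_match() rejects an empty pattern.

```php
<?php
// Assumption: HTML Purifier is installed and this path resolves to its autoloader.
require_once 'HTMLPurifier.auto.php';

// Start from the defaults and set only the two iframe directives.
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.SafeIframe', true);
// Assumption: a catch-all pattern for debugging only, not for production use.
$config->set('URI.SafeIframeRegexp', '%.*%');

$purifier = new HTMLPurifier($config);

// Decoded input as shown earlier in the thread.
$html = '<p><iframe src="//www.youtube.com/embed/OL8LREmbDi0" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>';

// If the iframe survives here, one of the removed directives is the culprit.
var_dump($purifier->purify($html));
```

Re-adding the original directives one at a time should then show which one removes the iframe.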

Re: YouTube iFrames are stripped out of valid HTML
August 08, 2014 09:26AM

Good point. I began taking the config options out of the equation and figured out the following was causing the issue:

$config->set('HTML.Doctype', 'XHTML 1.0 Strict');

@Ambush - Thanks again for your assistance. I realize that iFrames are *not* strict XHTML 1.0, and this is why the iFrame is being stripped out with that configuration option present.
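
One way to see this without purifying anything is to ask the configuration which elements its HTML definition contains. This is a hedged sketch: the include path is an assumption, and the loop and output format are illustrative. It relies on the definition object exposing its element list as the info array.

```php
<?php
// Assumption: HTML Purifier is installed and this path resolves to its autoloader.
require_once 'HTMLPurifier.auto.php';

foreach (['XHTML 1.0 Strict', 'HTML 4.01 Transitional'] as $doctype) {
    $config = HTMLPurifier_Config::createDefault();
    $config->set('HTML.Doctype', $doctype);
    $config->set('HTML.SafeIframe', true);
    $config->set('URI.SafeIframeRegexp', '%^//www\.youtube(?:-nocookie)?\.com/embed/%');

    // The HTML definition holds the elements the purifier knows for this doctype.
    $definition = $config->getHTMLDefinition();
    $status = isset($definition->info['iframe']) ? 'supported' : 'not supported';
    printf("%s: iframe %s\n", $doctype, $status);
}
```

Going by the behaviour described in this thread, the two doctypes would be expected to report different results.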

Edited 1 time(s). Last edit at 08/11/2014 10:26AM by mrfr33z3.

Re: YouTube iFrames are stripped out of valid HTML
August 11, 2014 10:29AM

For those interested, I ended up going with "HTML 4.01 Transitional" as the %HTML.Doctype. This cleared up the issue I was having with iFrames.
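
Put together, the working setup would look roughly like this. It is a sketch assembled from the directives quoted in this thread, not the exact application code: the include path and variable names are assumptions, and HTML.Allowed is left out for brevity.

```php
<?php
// Assumption: HTML Purifier is installed and this path resolves to its autoloader.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
// A transitional doctype, so that iframe is part of the element set.
$config->set('HTML.Doctype', 'HTML 4.01 Transitional');
$config->set('HTML.SafeIframe', true);
$config->set('URI.SafeIframeRegexp', '%^(\/\/www\.youtube(?:-nocookie)?\.com\/embed\/|\/\/player\.vimeo\.com\/)%');

$purifier = new HTMLPurifier($config);

// Entity-escaped markup as stored in the DB (from the earlier post).
$stored = '<p>&lt;iframe src="//www.youtube.com/embed/Jrd5HVb_lZQ" width="560" height="315"&gt;&lt;/iframe></p>';

// Decode first, then purify, as in the original flow.
echo $purifier->purify(html_entity_decode($stored));
```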

Re: YouTube iFrames are stripped out of valid HTML
August 11, 2014 10:35AM

Haha, man, we really should give a warning when that happens.
