Non Latin Domains

Posted by Georgi 
Georgi
Non Latin Domains
June 02, 2010 11:47AM

Hello, I searched the docs and the forum but couldn't find any information about non-Latin domains. I tested several such domains, and HTML Purifier removed all of them (tested with 4.0.0, 4.1.0, and 4.1.1).

Test results in Test

Does the library support non-Latin domains, or is this feature on your to-do list? Any ideas for a quick fix?

Thanks!

Georgi

Re: Non Latin Domains
June 02, 2010 12:08PM

Correct, HTML Purifier doesn't currently support them. I was thinking of putting in a punycode converter to HTML Purifier to handle such cases; I haven't done enough research into IDNs to know what can safely be put in them.
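
To illustrate what such a conversion involves, here is a minimal sketch in Python rather than PHP. It is not HTML Purifier code. It assumes Python's built-in "idna" codec, which follows the older IDNA 2003 rules, and the helper name hostname_to_ascii is hypothetical. It only converts a hostname to its ASCII form and does not answer the question of which characters are safe to allow.

```python
def hostname_to_ascii(hostname: str) -> str:
    """Convert a Unicode hostname to its ASCII (Punycode) form.

    Hypothetical helper for illustration only. It relies on Python's
    built-in "idna" codec (IDNA 2003) and raises UnicodeError for
    hostnames the codec rejects, such as ones with empty labels.
    """
    return hostname.encode("idna").decode("ascii")


if __name__ == "__main__":
    # Non-ASCII labels get an "xn--" prefix; ASCII labels pass through.
    print(hostname_to_ascii("bücher.example"))  # xn--bcher-kva.example
    print(hostname_to_ascii("example.com"))     # example.com
```

A filter could accept the converted ASCII hostname using its existing rules, but that alone does nothing about lookalike characters from different scripts.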

Georgi
Re: Non Latin Domains
June 02, 2010 03:57PM

Thank you for the quick reply!

The support tickets from "innovative" users are killing me, but I will keep HTML Purifier because I really like it.

Several of my colleagues who are working on different projects are running into the same problems with the new idiotic non-Latin domain names, so a working solution should be available soon from one of us. I will post it here too.

Best regards, Georgi

Henrique Vicente
Re: Non Latin Domains
August 13, 2010 10:01AM

Hello, do you have a solution already? Thanks.

Re: Non Latin Domains
August 13, 2010 10:47AM

Not yet, sorry.

Georgi
Re: Non Latin Domains
August 09, 2011 04:59AM

All my attempts to make it work well with non-Latin domains were a complete failure. Nobody has the time to read all the docs needed to write a proper filter.

So if anybody has any sort of solution, please share it!

Best Regards!
