Internationalized domain names

Posted by Colin Snover 
Colin Snover
Internationalized domain names
October 01, 2010 10:25PM

From HTMLPurifier/AttrDef/URI/Host.php:

        // This breaks I18N domain names, but we don't have proper IRI support,
        // so force users to insert Punycode. If there's complaining we'll
        // try to fix things into an international friendly form.

Consider this an official complaint. :)

Re: Internationalized domain names
October 02, 2010 08:09PM

Grumble. Now I have to find a Punycode encoding algorithm...

Colin Snover
Re: Internationalized domain names
October 08, 2010 05:12PM

You may blame tinyarro.ws for this. :)

Anyway, it looks like the PEAR package is the go-to implementation, but there is also code in Zend Framework’s Zend_Validate_Hostname and Zend_Uri that might be useful.

Re: Internationalized domain names
November 12, 2010 01:51PM

Upon closer consideration, maybe I'll just make my domain name regex sufficiently clever to deal with IDNs and not bother attempting to convert everything to Punycode.

Re: Internationalized domain names
November 12, 2010 04:44PM

No... that's bad because we've made a commitment to URIs, not IRIs, and mumble legacy clients mumble. But there are no convincing PHP Punycode implementations (half of the PEAR one's test suite fails... I refuse to use that). So this will have to block on a convincing Punycode implementation dropping into my lap... one that is coded in PHP5 style... is LGPL... and has a real test suite...

Re: Internationalized domain names
July 04, 2011 02:04PM

Unfortunately, I am indefinitely tabling this bug. Patches or sponsorship accepted, but I currently have zero motivation to work on this feature request. Sorry :-( Users needing a workaround can paste Punycode-encoded links directly.
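
For anyone who needs the workaround, here is a rough sketch of how a host name could be converted to its Punycode form before pasting the link. It is not part of HTML Purifier and is written in Python rather than PHP, purely because Python's standard library ships an "idna" codec (IDNA 2003 rules). The helper name host_to_punycode and the sample host are made up for illustration.

```python
def host_to_punycode(host: str) -> str:
    """Return the ASCII (Punycode) form of an internationalized host name.

    Hypothetical helper for illustration only. It relies on Python's
    built-in "idna" codec, which follows IDNA 2003 (RFC 3490): each
    non-ASCII label is normalized and encoded with an "xn--" prefix,
    while plain ASCII labels pass through unchanged.
    """
    return host.encode("idna").decode("ascii")


# Example: convert only the host part, then build the link by hand.
ascii_host = host_to_punycode("bücher.example")
print("http://" + ascii_host + "/")  # http://xn--bcher-kva.example/
```

Only the host portion of the URL goes through the conversion; the scheme, path, and query are left alone. Newer IDNA 2008 rules differ for some characters, so treat the output as a starting point and check it against what your browser shows for the same domain.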
