
[PHP] UTF8 characters...

Posted by Misha 
April 05, 2007 05:53PM

Commander,

I am sorry to bother you with a question not directly related to HTML Purifier, but I think you are the only one who can help me.

I have decided to convert my DB to UTF-8 following your guidelines. I have now restored the database, and when I view it in phpMyAdmin, everything looks fine: I can see and search for special characters.

However, when I output the SAME fields in a PHP script (even just echoing them right after retrieving them from the database), some of the special characters are replaced by question marks.

So while I can read this just fine in phpmyadmin "it is like a 360

Re: [PHP] UTF8 characters...
April 05, 2007 05:57PM

Have you set the connection encoding? Try executing this query before pulling data from the database:

SET NAMES utf8

I don't know if MySQL 5.0 needs this, but MySQL 4.1 definitely does.
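
For anyone wondering where the question marks come from: as far as I understand it, the server converts the stored UTF-8 text into the connection character set before sending it, and any character that set cannot represent comes back as "?". Here is a minimal sketch of that effect in Python. It is not MySQL's actual code; the function name simulate_connection and the sample string are made up for illustration, and cp1252 is only a stand-in for a non-UTF-8 default connection charset.

```python
def simulate_connection(stored_text: str, connection_charset: str) -> str:
    """Mimic text passing through a connection that uses the given charset.

    Characters the charset cannot represent are replaced with '?',
    which is the symptom described in this thread.
    """
    # Encode to the connection charset, substituting '?' for anything
    # that has no representation there.
    wire_bytes = stored_text.encode(connection_charset, errors="replace")
    # The client then reads those bytes back using the same charset.
    return wire_bytes.decode(connection_charset)


sample = "café Łódź"  # hypothetical column value

# Non-UTF-8 connection: 'é' and 'ó' survive, 'Ł' and 'ź' do not.
print(simulate_connection(sample, "cp1252"))  # café ?ód?

# UTF-8 connection (the effect of SET NAMES utf8): nothing is lost.
print(simulate_connection(sample, "utf-8"))   # café Łódź
```

This would also explain why only some of the special characters are affected: the ones that exist in the old charset pass through untouched.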

HTML Purifier, Standards Compliant HTML Filtering

Re: [PHP] UTF8 characters...
April 05, 2007 06:14PM

Wooooooah! You are the best, I knew you could help me.

I have been asking around in PHP channels, and nobody knew what could be causing this!

Thanks, sir!

By the way, since I have just converted to UTF8, is there some way you can suggest to verify the database does not contain invalid characters (if this is possible at all!)?

Well, anyhow, thanks again! I have been fighting with this all day. It will be so much better to have a system where I do not have to use entities for everything!

Re: [PHP] UTF8 characters...
April 05, 2007 06:56PM

Happy to help.

Quote:
By the way, since I have just converted to UTF8, is there some way you can suggest to verify the database does not contain invalid characters (if this is possible at all!)?

Hmm... I really don't know. Try asking some MySQL gurus about it. I, personally, wouldn't worry about it.
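
If you do want to check, one rough approach is to export the raw bytes of each text column and test whether they decode as strict UTF-8. A minimal sketch in Python, assuming you already have the rows as (id, bytes) pairs; how you export them is up to you, and the names is_valid_utf8, find_invalid_rows and sample_rows are hypothetical.

```python
from typing import Iterable, List, Tuple


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        raw.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


def find_invalid_rows(rows: Iterable[Tuple[int, bytes]]) -> List[int]:
    """Return the ids of rows whose bytes are not valid UTF-8."""
    return [row_id for row_id, raw in rows if not is_valid_utf8(raw)]


# Made-up sample data standing in for an exported column.
sample_rows = [
    (1, "café".encode("utf-8")),  # proper UTF-8
    (2, b"caf\xe9"),              # Latin-1 byte for 'é', not valid UTF-8
    (3, b"plain ascii"),          # ASCII is always valid UTF-8
]

print(find_invalid_rows(sample_rows))  # [2]
```

Keep in mind this only catches malformed byte sequences. Text that was converted twice can still be perfectly valid UTF-8 and look wrong on screen, so a check like this cannot prove the conversion was correct.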


Re: [PHP] UTF8 characters...
April 06, 2007 01:18PM

It is also worth noting that if the computer does not have fonts with glyphs for certain characters, those characters can appear as question marks on screen, but they will still 'function' normally (e.g., as part of some script).
