This is a post born from spending hours trying to squash bugs and zap gremlins.
In an attempt to streamline content re-versioning in different languages, I created a workflow that went like this:
- Create Google spreadsheet for easy collaborative editing
- Pull down a Microsoft Excel version of the Google spreadsheet (alas, not CSV as the Google-generated CSV wasn’t playing ball with MySQL)
- Import this into MySQL
- Generate static HTML with translations inserted where appropriate for each language
The process was fine, but somewhere within all these steps something was going awry. Accented Latin characters weren’t showing up properly and apostrophes were rendering in all sorts of ways (�, `â, ?), anything except what I needed. Furthermore, Cyrillic, Chinese and Arabic weren’t displaying at all.
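Symptoms like these are typical of UTF-8 bytes being read as a single-byte encoding such as Windows-1252. Here is a minimal sketch of that mechanism, assuming PHP 7 or later (for the `\u{...}` escape) and the mbstring extension. It illustrates the general effect, not necessarily the exact route the text took through my workflow:

```php
<?php
// A right single quotation mark (U+2019), the "curly" apostrophe,
// is three bytes in UTF-8: E2 80 99.
$apostrophe = "\u{2019}";

// Treat those bytes as Windows-1252 and convert them to UTF-8,
// which is effectively what happens when one layer mislabels the encoding.
$misread = mb_convert_encoding($apostrophe, 'UTF-8', 'Windows-1252');

echo bin2hex($apostrophe), "\n"; // the original bytes in hex
echo $misread, "\n";             // three characters where there should be one
```

Each of the three bytes becomes its own character (â, €, ™), which is why a single apostrophe can turn into a short run of junk starting with â.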
Finally, I spotted a snippet on the MySQL site from 2006, written by Lorenz Pressler:
after mysql_connect() , and mysql_select_db() add this lines:
mysql_query("SET NAMES utf8");
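…and that was all I needed. In fact, I didn’t even need to convert anything into UTF-8 in PHP. Once MySQL was outputting UTF-8 correctly, everything was fine. The database was encoded in UTF-8, and I had assumed too much in thinking that meant it would automatically send the data back that way. The character set of the connection is a separate setting from the one the data is stored in, and SET NAMES tells MySQL which character set the client sends and expects to receive.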
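The mysql_* functions quoted above were removed in PHP 7.0, so here is a rough sketch of the same fix using mysqli. The host, credentials and database name are hypothetical placeholders, and I have used utf8mb4 rather than utf8 on the assumption that the server supports it (MySQL’s utf8 only covers characters up to three bytes):

```php
<?php
// Hypothetical connection details; replace with your own.
$mysqli = new mysqli('localhost', 'db_user', 'db_password', 'translations');

// On PHP 8.1+ mysqli throws exceptions by default, so these manual
// checks mainly matter on older versions.
if ($mysqli->connect_errno) {
    die('Connection failed: ' . $mysqli->connect_error);
}

// Set the connection character set, the mysqli counterpart of SET NAMES.
if (!$mysqli->set_charset('utf8mb4')) {
    die('Could not set charset: ' . $mysqli->error);
}

// Report the character set the connection is now using.
echo $mysqli->character_set_name(), "\n";
```

Calling set_charset() rather than sending SET NAMES as a query also lets the client library know the encoding, which matters for functions such as real_escape_string().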
So, if you ever have problems with MySQL and UTF-8 characters not displaying, try SET NAMES and hopefully that’ll fix the issue.