Posted by & filed under General, MySQL, PHP.

Many content management systems and forums were originally created using the latin1 character set and the latin1_swedish_ci collation in MySQL. The problem many of these systems are facing today is the growing demand for multi-language content using special characters that cannot be accurately represented in the latin1 character set. This is where utf8 comes in.

The problem with simply converting a database from latin1 to utf8 is in the data itself. When you convert a database or table to a different character set or collation, it does not convert the content held within the tables. What happens is users end up with ‘strange’ characters in their data.

strange_a_e latin1_convert_to_utf8

You can prevent this by taking steps to protect the data before you convert the database or tables.

An overview of the steps:

  1. Backup the database. No, really, do it.
  2. Alter the string field types to their binary equivalents in all tables. This will protect the data during conversion.
    • VARCHAR to VARBINARY
    • CHAR to BINARY
    • LONGTEXT to LONGBLOB
    • MEDIUMTEXT to MEDIUMBLOB
    • TEXT to BLOB
    • TINYTEXT to TINYBLOB
  3. Convert the database and tables from latin1 to utf8 character set and latin1_swedish_ci to utf8_general_ci collation.
  4. Convert the binary field types back to their original string field types.
  5. Backup the resulting database.
  6. Finished

Ex. Generic convert column to BLOB from TEXT

ALTER TABLE tbl_name MODIFY column_name BLOB

Ex. Generic convert table to new character set

ALTER TABLE tbl_name CHARACTER SET charset_name COLLATE collation_name

Ex. Generic convert database to new character set utf8

ALTER DATABASE database_name CHARACTER SET utf8;
(5.00 out of 5)

Tags:

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>