Server - System - Manager - CentOS, Operation System, VBB, HACKING AND SECURITY

Posted 29-07-2008 by Lovelinux (Super Moderator)

Converting character sets from latin1 to UTF-8 and from ISO to UTF-8 in MySQL

1. latin1 to UTF-8 in MySQL:
MySQL dump
First of all, we need to dump the old data into a file.


Code: Create a MySQL dump
$ mysqldump -h host.com --user=frog -p --default-character-set=latin1 -c \
    --insert-ignore --skip-set-charset dbname > dump.sql


Note that you have to replace the user (frog), the host (host.com) and the database name (dbname) with your own values; otherwise the command will fail.
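As a small sketch of that substitution, the connection details can be kept in shell variables so they only need changing in one place. The variable names and the print_dump_cmd helper are made up for this illustration; the function only prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own host, user and database.
DB_HOST="host.com"
DB_USER="frog"
DB_NAME="dbname"

# Print the mysqldump command from the step above without executing it.
print_dump_cmd() {
    printf 'mysqldump -h %s --user=%s -p --default-character-set=latin1 -c --insert-ignore --skip-set-charset %s > dump.sql\n' \
        "$DB_HOST" "$DB_USER" "$DB_NAME"
}

print_dump_cmd
```

Once the printed line looks right, it can be pasted into the shell as-is.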

Convert dump
The next step is to convert the characters in the MySQL dump from latin1 to UTF-8 and to rewrite the CHARSET declarations in the dump to match.


Code: Convert dump
$ iconv -f ISO-8859-1 -t UTF-8 dump.sql > dump_utf8.sql
$ perl -pi -w -e 's/CHARSET=latin1/CHARSET=utf8/g;' dump_utf8.sql


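If your data is in a different source character set, replace the value of the -f option accordingly. MySQL's latin1 is closer to Windows-1252 than to strict ISO-8859-1, so if the dump contains characters such as curly quotes or the euro sign, -f CP1252 may be the better choice.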
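Before touching the real dump, both steps can be tried on throwaway input. The sketch below is only an illustration: the helper names are invented, and it uses sed rather than the perl one-liner above so that it needs nothing beyond standard tools. It feeds the single latin1 byte 0xE9 ("é") through iconv and applies the same CHARSET substitution to one sample line.

```shell
#!/bin/sh
# Print the bytes read from stdin as one hex string, without offsets or spaces.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# 0xE9 (octal 351) is "é" in ISO-8859-1; UTF-8 encodes it as the bytes c3 a9.
latin1_e_acute_as_utf8() {
    printf '\351' | iconv -f ISO-8859-1 -t UTF-8 | hex_bytes
}

# Same substitution as the perl one-liner, written with sed (an assumption
# made for portability, not the command used in the post).
rewrite_charset() {
    sed 's/CHARSET=latin1/CHARSET=utf8/g'
}

latin1_e_acute_as_utf8; echo    # prints c3a9
echo ') ENGINE=MyISAM DEFAULT CHARSET=latin1;' | rewrite_charset
```

If the first line printed is anything other than c3a9, the local iconv is not doing what the conversion step expects.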

Drop and create
Now it's time to drop the old database and create a new one with UTF-8 support.


Code: Drop and Create
$ mysql --user=frog -p --execute="DROP DATABASE dbname;
CREATE DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;"

(MySQL seems to recommend utf8_unicode_ci over utf8_general_ci for 5.1+; see "MySQL 5.1 Referenzhandbuch", section 10.9.1 "Unicode-Zeichensätze", in the German edition of the reference manual.)
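Dropping the database is the one step that cannot be undone, so it may be worth checking the converted file first. The following is a minimal sketch, with an invented helper name, that only confirms the file is non-empty and decodes cleanly as UTF-8. A passing check does not prove the conversion was correct (double-encoded text is still valid UTF-8), only that iconv did not leave stray latin1 bytes behind.

```shell
#!/bin/sh
# Succeed only if the file exists, is non-empty, and decodes cleanly as UTF-8.
dump_is_valid_utf8() {
    [ -s "$1" ] && iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Example use before the DROP DATABASE step:
#   dump_is_valid_utf8 dump_utf8.sql || { echo "check dump_utf8.sql first"; exit 1; }
```

Keeping the original dump.sql around until the import has been verified is a sensible extra precaution.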
Import dump to database
Last but not least, we need to import the converted data back to the new database.


Code: Import dump
$ mysql --user=frog --max_allowed_packet=16M -p --default-character-set=utf8 dbname < dump_utf8.sql


The --max_allowed_packet option is sometimes important. If your import fails with an error such as "ERROR 1153 at line 42: Got a packet bigger than 'max_allowed_packet'", you need to increase the packet size. Note that the server has its own limit, so you also need to set max_allowed_packet=16M in the [mysqld] section of /etc/mysql/my.cnf and restart MySQL so the change takes effect.
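To see what the server-side configuration file currently says, something like the sketch below can help. The function name is invented, and the parsing is deliberately simple: it only looks at plain option=value lines inside the [mysqld] section of the one file it is given and ignores included files.

```shell
#!/bin/sh
# Print the max_allowed_packet value set under [mysqld] in a my.cnf-style file.
# Prints nothing if the option is not set in that section. If it is set more
# than once, the last value is printed.
mysqld_max_packet() {
    awk '
        /^[ \t]*\[/ { in_mysqld = ($0 ~ /^[ \t]*\[mysqld\]/); next }
        in_mysqld && /^[ \t]*max[_-]allowed[_-]packet[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")   # drop the option name and "="
            sub(/[ \t]*$/, "")         # drop trailing whitespace
            val = $0
        }
        END { if (val != "") print val }
    ' "$1"
}

# Example: mysqld_max_packet /etc/mysql/my.cnf
```

An empty result means the file does not set the option in [mysqld], so the server is using its built-in default.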

2. iso-8859-1 to UTF-8 in MySQL:
My server adds a default header (iso-8859-1) to the pages it serves. When I switched to WordPress, I was careful to save all my import files as UTF-8, and I honestly thought that everything I had imported into the database was UTF-8. Somewhere in the process, it got switched back to iso-8859-1 (latin-1). The solution to make sure the pages served are indeed UTF-8, as specified in the meta tags of my HTML pages, is to add the following line to .htaccess:
AddDefaultCharset OFF
(If one wanted to force UTF-8, AddDefaultCharset UTF-8 would do it, but it is arguably better to leave open the possibility of serving pages with different encodings.)
Now, when I did that, of course, all the accented characters on my site went berserk, proof if any was needed that my database content was not UTF-8. Here is the story of what I went through to convert my database content from ISO-8859-1 to UTF-8. It took many days to find the solution, believe me, although it takes only 2 minutes to do once everything is ready. Thanks a lot to all those who helped me through this, and they are many!
First thing, dump the database. I always forget the command for dumps, so here it is:
mysqldump --opt -u root -p wordpress > wordpress.sql
As we're going to be changing things, it might be wise to make a copy of the working wordpress database. I did that by creating a new database in phpMyAdmin, and importing my freshly dumped database into it:
mysql -u root -p wordpress-backup < wordpress.sql
Then, conversion. I tried a PHP script, I tried BBEdit, and they seemed to mess up. (Though as I had other issues elsewhere, they may well have worked and I mistakenly thought the problem was coming from there.) Anyway, command-line conversion with iconv is much easier to do:
iconv -f iso-8859-15 -t utf8 wordpress.sql > wordpress-iconv.sql
Then, import into the database. I first imported it into another database, edited wp-config.php to point to the new database, and checked that everything was ok:
mysql -u root -p wordpress-utf8 < wordpress-iconv.sql
Once I was happy that it was working, I imported my converted dump into the WordPress production database:
mysql -u root -p wordpress < wordpress-iconv.sql
On the way there, I had some trouble with MySQL. The MySQL dump more or less put the content of all my weblog posts on one line. For some reason, it didn't cause any problems when importing the dump before conversion, to create the backup database, but it didn't play nice after conversion.
I got this error when trying to import:
ERROR 1153 at line 378: Got a packet bigger than 'max_allowed_packet'
Line 378 contained half my weblog posts and was obviously bigger than the 1 MB limit for max_allowed_packet (the whole dump is around 2 MB).
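A quick way to see whether a dump is likely to hit this limit is to measure its longest line, since each INSERT statement sits on one line. The helper below is a sketch with an invented name; it counts bytes rather than characters, which is what matters for the packet size. As a side note, mysqldump --opt turns on --extended-insert, which is what packs many rows into one statement; dumping with --skip-extended-insert writes one INSERT per row instead, at the cost of a larger and slower dump.

```shell
#!/bin/sh
# Print the length in bytes of the longest line in the given file.
# LC_ALL=C makes awk count bytes instead of multi-byte characters.
longest_line_bytes() {
    LC_ALL=C awk '{ n = length($0); if (n > max) max = n } END { print max + 0 }' "$1"
}

# Example: longest_line_bytes wordpress-iconv.sql
```

If the number printed is larger than the server's max_allowed_packet in bytes, the import can be expected to fail with ERROR 1153.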
I had to edit my.cnf (/etc/mysql/my.cnf on my system) and change the value for max_allowed_packet in the section titled [mysqld]. I set it to 8 MB. Then I had to stop MySQL and start it again (as root):
mysqladmin -u root -p shutdown
mysqld_safe &
This is not necessarily the best way to do it, and it might not work like that on your system, but it's what I did and the site is now back up again. Comments welcome, and I hope this can be useful to others!
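One detail worth checking before reusing these commands: this section is titled iso-8859-1, but the iconv command converts from iso-8859-15. The two character sets differ in a handful of positions, so the choice of -f matters for some characters. The sketch below (helper names invented for the illustration) converts the single byte 0xA4 from each charset and prints the resulting UTF-8 bytes in hex.

```shell
#!/bin/sh
# Print the bytes read from stdin as one hex string, without offsets or spaces.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# Convert the single byte 0xA4 (octal 244) from the given charset to UTF-8.
a4_as_utf8() {
    printf '\244' | iconv -f "$1" -t UTF-8 | hex_bytes
}

a4_as_utf8 ISO-8859-1;  echo    # prints c2a4   (currency sign, U+00A4)
a4_as_utf8 ISO-8859-15; echo    # prints e282ac (euro sign, U+20AC)
```

If the site's text contains euro signs or characters such as "œ", picking the wrong source charset will silently turn them into something else, so it is worth spot-checking a few such posts after the import.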


TIP Convert latin1 to UTF-8 in MySQL - Gentoo Linux Wiki
Climb to the Stars (Stephanie Booth) » Converting MySQL Database Contents to UTF-8
WordPress › Support » change from ISO-8859-15 to UTF-8