UTF-8: Reasons, Issues & Conversion

Conversion to UTF-8: My database is in x encoding. How do I change it to UTF-8 encoding?

When you set up Dragonfly CMS™ on a fresh database, there is nothing you need to do for UTF-8 regarding the database. But if you are upgrading and your database contains non-English characters (special characters such as é, á, à, è, ¿), there are a few ways to ease the change.

* WARNING: Attempt these only on a backup of your database first and browse the results locally, as they may be irreversible. What you see in phpMyAdmin may not match what appears on the page with proper encoding.

As of MySQL version 4.1.x (the latest stable release at the time of writing), you can use mysql or phpMyAdmin to change the encoding of the text fields to binary and then to UTF-8 with your preferred language set's collation.


Pseudo example (t is the table, utf8_col the column, x the column length):
ALTER TABLE t MODIFY utf8_col BINARY(x);
ALTER TABLE t MODIFY utf8_col CHAR(x) CHARACTER SET utf8 COLLATE utf8_general_ci;
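
As far as the MySQL documentation describes it, converting a column to or from a binary type performs no character conversion, so the two statements above relabel the stored bytes rather than transcode them. That suits a column that already holds UTF-8 bytes but is declared with another character set. The Python sketch below is an illustration only (it is not part of Dragonfly CMS, and the variable names are made up); it shows the difference between relabeling and transcoding, and why misdeclared text looks garbled in tools such as phpMyAdmin:

```python
# Illustration only: relabeling bytes versus transcoding them.
text = "é"

# The same character is stored as different bytes in each encoding.
utf8_bytes = text.encode("utf-8")          # b'\xc3\xa9'
latin1_bytes = text.encode("iso-8859-1")   # b'\xe9'

# UTF-8 bytes declared as Latin-1 show up as two wrong characters.
garbled = utf8_bytes.decode("iso-8859-1")
print(garbled)  # Ã©

# Relabeling: keep the bytes untouched and read them as UTF-8.
relabeled = utf8_bytes.decode("utf-8")

# Transcoding: decode from the real source encoding, then re-encode.
transcoded = latin1_bytes.decode("iso-8859-1").encode("utf-8")

print(relabeled == text, transcoded == utf8_bytes)  # True True
```

If the column really holds bytes in the old encoding (not UTF-8), relabeling is the wrong tool and a transcoding route, such as the dump and iconv procedure, is likely the safer choice.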


UltraEdit can also easily convert a whole database dump to UTF-8 if you don't have access to the mysql prompt, but the conversion is even easier from the command line:

First, make a dump (or backup) of the database:

mysqldump --opt -u root -p database_name > database_name.sql

Since we are about to modify data, it is wise to make a copy of the working database_name database. I did that by creating a new database in phpMyAdmin and importing the fresh dump into it:

mysql -u root -p database_name-backup < database_name.sql

Then use the system tool iconv to change the encoding of the dump:

iconv -f iso-8859-15 -t utf8 database_name.sql > database_name-iconv.sql
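
If iconv is not available, the same re-encoding can be sketched in Python. This is a minimal sketch under assumptions, not a replacement for the command above: the helper name convert_dump is made up, the source encoding is assumed to be ISO-8859-15 as in the iconv example, and the demonstration writes a tiny throwaway file instead of a real dump.

```python
import os
import tempfile

def convert_dump(src_path, dst_path, src_encoding="iso-8859-15"):
    """Re-encode a text file to UTF-8, streaming line by line."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)

# Small self-contained demonstration with a throwaway file.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "database_name.sql")
dst = os.path.join(workdir, "database_name-iconv.sql")

with open(src, "wb") as f:
    f.write("INSERT INTO t VALUES ('café 5€');\n".encode("iso-8859-15"))

convert_dump(src, dst)

with open(dst, "rb") as f:
    converted = f.read()
print(converted.decode("utf-8"))
```

Reading line by line keeps memory use low for large dumps. As with iconv, the result is only correct if the source encoding you name is the one the dump was really written in.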

Then import the converted dump into a test database:

mysql -u root -p database_name-utf8 < database_name-iconv.sql

Once I was happy that it was working, I imported my converted dump into the production database:

mysql -u root -p database_name < database_name-iconv.sql

If you get this error when trying to import:

ERROR 1153 at line x: Got a packet bigger than 'max_allowed_packet'

You need to edit my.cnf (/etc/mysql/my.cnf on my system) and change the value for max_allowed_packet in the section titled [mysqld].
Then stop MySQL and restart it (as root):

mysqladmin -u root -p shutdown
mysqld_safe &

You can also break up the database dump into manageable parts using UltraEdit.
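
Splitting can also be scripted. The following Python sketch is one hypothetical way to do it (split_dump and max_bytes are made-up names, not an existing tool): it groups the lines of a dump into parts of roughly a given size and only closes a part after a line that ends in a semicolon.

```python
def split_dump(lines, max_bytes):
    """Group dump lines into parts of roughly max_bytes each.

    A part is only closed after a line ending in ';', so a statement
    is not cut in half. This is a heuristic: a ';' at the end of a
    line inside a quoted string would fool it.
    """
    parts, current, size = [], [], 0
    for line in lines:
        current.append(line)
        size += len(line.encode("utf-8"))
        if size >= max_bytes and line.rstrip().endswith(";"):
            parts.append("".join(current))
            current, size = [], 0
    if current:
        parts.append("".join(current))
    return parts

# Tiny in-memory dump used only for demonstration.
dump = [
    "CREATE TABLE t (\n",
    "  name VARCHAR(20)\n",
    ");\n",
    "INSERT INTO t VALUES ('a');\n",
    "INSERT INTO t VALUES ('b');\n",
]
parts = split_dump(dump, max_bytes=30)
for number, part in enumerate(parts, start=1):
    print(f"-- part {number}: {len(part)} characters")
```

For a real dump you would pass an open file object instead of the list and write each part to its own file. Keep each part below max_allowed_packet, and check the parts on a backup first.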


Why Use UTF-8?

One big reason is that it allows all languages to use the same character set and be displayed on the same page:

Arabic: ما هي الشفرة الموحدة "يونِكود" ؟

Bulgarian: Какво е Unicode?
Unicode – това е уникален код за всеки знак,
независимо от компютърната платформа,
независимо от програмата,
независимо от езика.

Traditional Chinese: 什麽是Unicode(統一碼/標準萬國碼)?
Unicode給每個字元提供了一個唯一的數位,
不論是什麽平臺,不論是什麽程式,不論是什麽語言。

Simplified Chinese: 什么是Unicode(统一码)?
Unicode给每个字符提供了一个唯一的数字,
不论是什么平台,不论是什么程序,不论是什么语言。

Croatian: Što je Unicode?
Unicode koristi jedinstven broj za svaki znak,
bez obzira na platformu, bez obzira na program,bez obzira na jezik.

Czech: Co je Unicode?
Unicode přiřazuje každému znaku jedinečné číslo,
nezávisle na platformě,nezávisle na programu,nezávisle na jazyce.

Danish: Hvad er Unicode?
Unicode tildeler hvert enkelt skrifttegn et unikt tal, uanset hvilken platform, uanset hvilket program, uanset hvilket sprog.

Dutch: Wat is Unicode? Unicode biedt een uniek getal voor elk teken, ongeacht het gebruikte platform, ongeacht het gebruikte programma, ongeacht de gebruikte taal.

Esperanto: Kio estas Unikodo? Unikodo provizas unikan numeron por ĉiu signo, sendepende de platformo, sendepende de programo, sendepende de lingvo.

Finnish: Mikä on Unicode?
Unicode antaa jokaiselle merkille oman numeron,
alustasta, ohjelmasta, ja kielestä riippumattoman.

French: Qu'est-ce qu'Unicode? Unicode spécifie un numéro unique pour chaque caractère, quelle que soit la plate-forme, quel que soit le logiciel, quelle que soit la langue.

Georgian: რა არის უნიკოდი?
უნიკოდის ყოველ სიმბოლოს შეესაბამება უნიკალური რიცხვი, არა აქვს მნიშვნელობა რომელი პლატფორმაა, არა აქვს მნიშვნელობა რომელი პროგრამაა, არა აქვს მნიშვნელობა რომელი ენაა.

German: Was ist Unicode?
Unicode gibt jedem Zeichen seine eigene Nummer, systemunabhängig, programmunabhängig, sprachunabhängig.

Hebrew: מה זה יוניקוד (Unicode)?
יוניקוד מקצה מספר ייחודי לכל תו,
לא משנה על איזו פלטפורמה,
לא משנה באיזו תוכנית,
ולא משנה באיזו שפה.

Hindi: यूनिकोड क्या है?

यूनिकोड प्रत्येक अक्षर के लिए एक विशेष नम्बर प्रदान करता है,
चाहे कोई भी प्लैटफॉर्म हो, चाहे कोई भी प्रोग्राम हो, चाहे कोई भी भाषा हो।

Icelandic: Hvað er Unicode?
Unicode staðallinn úthlutar hverju skriftákni tölu,
sem er óháð tölvugerð, sem er óháð forriti, sem er óháð tungumáli.

Interlingua: Que es Unicode?
Unicode assigna un numero unic a cata character, independentemente de platteforma, independentemente de programma, independentemente de lingua.

Italian: Unicode assegna un numero univoco a ogni carattere,
indipendentemente dalla piattaforma,
indipendentemente dall'applicazione,
indipendentemente dalla lingua.

Japanese: ユニコードとは何か?

ユニコードは、すべての文字に固有の番号を付与します
プラットフォームには依存しません
プログラムにも依存しません
言語にも依存しません

Korean: 유니코드에 대해?
어떤 플랫폼,
어떤 프로그램, 어떤 언어에도 상관없이 유니코드는 모든 문자에 대해 고유 번호를 제공합니다.

Lithuanian: Kas tai yra Unikodas?
Unikodas priskiria unikalų skaičių kiekvienam simboliui,
nepriklausomai nuo platformos, nepriklausomai nuo programos, nepriklausomai nuo kalbos.

Macedonian: Што е Unicode?
Unicode нуди единствен број за секој симбол, без оглед на платформата, без оглед на програмата, и без оглед на јазикот.

Maltese: X'inhu l-Unicode?
L-Unicode jipprovdi numru wieħed għal kull karattru,
għal kull magna, għal kull programm, għal kull lingwa.

Persian: يونی‌کُد چيست؟
يونی‌کد به همه‌ی نويسه‌ها اعداد يکتايی اختصاص می‌دهد،
مستقل از محيط،
مستقل از برنامه،
مستقل از زبان.

Polish: Czym jest Unikod ?
Unikod przypisuje unikalny numer każdemu znakowi, niezależny od używanej platformy, programu czy języka.

Portuguese: O que é Unicode?
Unicode fornece um número único para cada caracter,
não importa a plataforma,não importa o programa,não importa a língua.

Romanian: Ce este Unicode?
Unicode asignează un număr unic pentru fiecare caracter, independent de platformă, independent de aplicaţie, independent de limbă

Russian: Что такое Unicode?
Unicode - это уникальный код для любого символа, независимо от платформы, независимо от программы, независимо от языка.

Slovenian: Kaj je Unicode?
Unicode uporablja edinstvena števila za vsak znak, ne glede na platformo, ne glede na program, ne glede na jezik.

Spanish: ¿Qué es Unicode?
Unicode proporciona un número único para cada carácter, sin importar la plataforma, sin importar el programa, sin importar el idioma.

Swedish: Vad är Unicode?
Unicode ger varje tecken ett unikt nummer, oavsett plattform, oavsett program, oavsett språk.

Turkish: Evrensel Kod Nedir?
Evrensel Kod her yazı karakteri için bir ve yalnız bir sayı şart koşar, hangi altyapı, hangi yazılım, hangi dil olursa olsun.

Uyghur: ﻳﯘﻧﯩﻜﻮﺩ ﺩﯦﮕﻪﻥ ﻧﯧﻤﻪ؟
ﺋﻮﻣﯘﻣﻪﻥ ﺋﯧﻴﺘﻘﺎﻧﺪﺍ، ﻛﻮﻣﭙﻴﯘﺗﯧﺮﻻﺭ ﭘﻪﻗﻪﺕ ﺭﻩﻗﻪﻣﻠﻪﺭﻧﯩﻼ ﺑﯩﺮ ﺗﻪﺭﻩﭖ ﻗﯩﻠﯩﺪﯗ. ﺋﯘﻻﺭ ﻫﻪﺭ ﺑﯩﺮ ﻫﻪﺭﭖ ﯞﻩ ﺑﻪﻟﮕﯩﮕﻪ ﺑﯩﺮﺩﯨﻦ ﻧﻮﻣﯘﺭ ﺑﯧﺮﯨﺶ ﺋﺎﺭﻗﯩﻠﯩﻚ ﺋﯘﻻﺭﻏﺎ ﻧﯩﺴﺒﻪﺗﻪﻥ ﺳﺎﻗﻼﺵ ﺋﯧﻠﯩﭗ ﺑﺎﺭﯨﺪﯗ. ﻳﯘﻧﯩﻜﻮﺩ ﺋﯩﺠﺎﺩ ﻗﯩﻠﯩﻨﯩﺸﺘﯩﻦ

Vietnamese: Unicode là gì?
Unicode cung cấp một con số duy nhất cho mỗi ký tự, cho mọi hệ máy tính, cho mọi chương trình, cho mọi ngôn ngữ.

Welsh: Beth yw Unicode?
Mae Unicode yn darparu rhif unigryw ar gyfer pob nod, heb ddibynnu ar y platfform, heb ddibynnu ar y rhaglen, heb ddibynnu ar yr iaith.

as well as
mathematical operators:∏∑−√∞∟∩∫≈≠≡≤≥≥
fractions: ¾⅓⅔⅛⅜⅝⅞¼½
currency: $¢£¤¥₣₤₧₪₫€
punctuation: §©«®µ¶»¿‰‼
letter like symbols: ℅ℓ№™Ω℮


Fundamentally, computers just deal with numbers. They store letters and other characters by assigning a number for each one. Before Unicode was invented, there were hundreds of different encoding systems for assigning these numbers. No single encoding could contain enough characters: for example, the European Union alone requires several different encodings to cover all its languages. Even for a single language like English no single encoding was adequate for all the letters, punctuation, and technical symbols in common use.

These encoding systems also conflict with one another. That is, two encodings can use the same number for two different characters, or use different numbers for the same character. Any given computer (especially servers) needs to support many different encodings; yet whenever data is passed between different encodings or platforms, that data always runs the risk of corruption.
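
The conflict described above is easy to reproduce. The short Python sketch below is an illustration added here, not part of the Unicode Consortium text; it uses three legacy encodings that ship with Python and shows one byte value mapping to two different characters, and one character mapping to two different byte values:

```python
# One byte value, two different characters.
byte = b"\xa4"
print(byte.decode("iso-8859-1"))   # ¤ (currency sign)
print(byte.decode("iso-8859-15"))  # € (euro sign)

# One character, two different byte values.
print("€".encode("iso-8859-15"))   # b'\xa4'
print("€".encode("cp1252"))        # b'\x80'

# In UTF-8 the euro sign is always the same three bytes.
print("€".encode("utf-8"))         # b'\xe2\x82\xac'
```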
Unicode is changing all that!

Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. The Unicode Standard has been adopted by such industry leaders as Apple, HP, IBM, JustSystem, Microsoft, Oracle, SAP, Sun, Sybase, Unisys and many others. Unicode is required by modern standards such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc., and is the official way to implement ISO/IEC 10646. It is supported in many operating systems, all modern browsers, and many other products. The emergence of the Unicode Standard, and the availability of tools supporting it, are among the most significant recent global software technology trends.
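
As a small illustration of "a unique number for every character" (again a sketch added here, using only Python's standard library), the snippet below lists the Unicode code point, the official character name, and the UTF-8 bytes for three sample characters:

```python
import unicodedata

# Collect (code point, name, UTF-8 bytes) for a few sample characters.
rows = [(ord(c), unicodedata.name(c), c.encode("utf-8")) for c in "é€中"]

for code_point, name, utf8_bytes in rows:
    # Code points are conventionally written as U+ followed by hex digits.
    print(f"U+{code_point:04X}", name, utf8_bytes)
```

The code point is the number Unicode assigns to the character; UTF-8 is one way of storing that number as bytes, which is why the byte sequences differ in length.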

Incorporating Unicode into client-server or multi-tiered applications and websites offers significant cost savings over the use of legacy character sets. Unicode enables a single software product or a single website to be targeted across multiple platforms, languages and countries without re-engineering. It allows data to be transported through many different systems without corruption.


http://www.unicode.org/
Unicode Consortium

Your site does not show my language with the right encoding in my browser. Can we change to ISO-8859-x?

We now use UTF-8 encoding so that people from all over the world can view your site under the proper encoding. All browsers are able to accept UTF-8 encoding.

Here's a debug checklist:
  • Internet Explorer has an option under [View] which shows the language code with which you are viewing the page
    • When it says "Unicode (UTF)", your browser works properly.
    • If it says something else, you probably have a proxy/firewall issue. Consult with the vendor of your firewall to solve these problems
  • Firefox auto-detects and is always set to UTF-8
  • Mozilla has UTF-8 active by default
  • Check if your browser saves the cookies that are sent through the header. If you never receive cookies, it could be due to your proxy, if you are using one
    • Deactivate the use of a proxy server
    • Ask your internet service provider how to turn off their proxy, if one is used
  • As a last resort, check other languages, as well as this site. Are they all garbled, or just one? If only one, there may be an issue with that individual file. If your issue is still not solved after reading this, please post in our forums or projects modules.

We have tested UTF-8 on all browsers in Windows 98, 2000 and XP. Some systems were set up multilingually and others used a native language (Dutch / English) without any additional packages.

We will not help you to modify your CPG-Nuke to use another charset, but we will help you with any problems you may encounter using UTF-8 that are not answered here.