I18n:Charset Aliases


This is a list of charset aliases used by Mozilla, correct as of Gecko 1.9.1 (Firefox 3.5).

Charset aliases sorted by alias

Alias Canonical name
5601 EUC-KR
646 us-ascii
850 IBM850
852 IBM852
855 IBM855
857 IBM857
862 IBM862
864 IBM864
864i IBM864i
866 IBM866
ansi-1251 windows-1251
ansi_x3.4-1968 us-ascii
arabic ISO-8859-6
ascii us-ascii
asmo-708 ISO-8859-6
chinese GB2312
cns11643 x-euc-tw
cp1250 windows-1250
cp1251 windows-1251
cp1252 windows-1252
cp1253 windows-1253
cp1254 windows-1254
cp1255 windows-1255
cp1256 windows-1256
cp1257 windows-1257
cp1258 windows-1258
cp819 ISO-8859-1
cp850 IBM850
cp852 IBM852
cp855 IBM855
cp857 IBM857
cp862 IBM862
cp864 IBM864
cp864i IBM864i
cp-866 IBM866
cp866 IBM866
csbig5 Big5
cseucjpkdfmtjapanese EUC-JP
csgb2312 GB2312
csIBM850 IBM850
csIBM852 IBM852
csIBM855 IBM855
csIBM857 IBM857
csIBM862 IBM862
csIBM864 IBM864
csibm864i IBM864i
csIBM866 IBM866
csiso103t618bit T.61-8bit
csiso111ecmacyrillic ISO-IR-111
csiso2022jp2 ISO-2022-JP
csiso2022jp ISO-2022-JP
csiso2022kr ISO-2022-KR
csiso58gb231280 GB2312
csiso88596e ISO-8859-6-E
csiso88596i ISO-8859-6-I
csiso88598e ISO-8859-8-E
csiso88598i ISO-8859-8-I
csisolatin1 ISO-8859-1
csisolatin2 ISO-8859-2
csisolatin3 ISO-8859-3
csisolatin4 ISO-8859-4
csisolatin5 ISO-8859-9
csisolatin6 ISO-8859-10
csisolatinarabic ISO-8859-6
csisolatincyrillic ISO-8859-5
csisolatingreek ISO-8859-7
csisolatinhebrew ISO-8859-8
csksc56011987 EUC-KR
csMacintosh x-mac-roman
csshiftjis Shift_JIS
csueckr EUC-KR
csunicode11 UTF-16BE
csunicode11utf7 UTF-7
csunicodeascii UTF-16BE
csunicodelatin1 UTF-16BE
csunicode UTF-16BE
csviqr VIQR
csviscii VISCII
cyrillic ISO-8859-5
ecma-114 ISO-8859-6
ecma-118 ISO-8859-7
ecma-cyrillic ISO-IR-111
elot_928 ISO-8859-7
gb_2312-80 GB2312
gbk x-gbk
greek8 ISO-8859-7
greek ISO-8859-7
hebrew ISO-8859-8
ibm819 ISO-8859-1
ibm874 windows-874
iso-10646-j-1 UTF-16BE
iso-10646-ucs-2 UTF-16BE
iso-10646-ucs-4 UTF-32BE
iso-10646-ucs-basic UTF-16BE
iso-10646-unicode-latin1 UTF-16BE
iso-10646 UTF-16BE
iso-2022-cn-ext ISO-2022-CN
iso-2022-jp-2 ISO-2022-JP
iso-ir-100 ISO-8859-1
iso-ir-101 ISO-8859-2
iso-ir-103 T.61-8bit
iso-ir-109 ISO-8859-3
iso-ir-110 ISO-8859-4
iso-ir-126 ISO-8859-7
iso-ir-127 ISO-8859-6
iso-ir-138 ISO-8859-8
iso-ir-144 ISO-8859-5
iso-ir-148 ISO-8859-9
iso-ir-149 EUC-KR
iso-ir-157 ISO-8859-10
iso-ir-58 GB2312
korean EUC-KR
ks_c_5601-1987 x-windows-949
ks_c_5601-1989 EUC-KR
ksc_5601 EUC-KR
ksc5601 EUC-KR
l1 ISO-8859-1
l2 ISO-8859-2
l3 ISO-8859-3
l4 ISO-8859-4
l5 ISO-8859-9
l6 ISO-8859-10
latin1 ISO-8859-1
latin2 ISO-8859-2
latin3 ISO-8859-3
latin4 ISO-8859-4
latin5 ISO-8859-9
latin6 ISO-8859-10
macintosh x-mac-roman
mac x-mac-roman
ms_kanji Shift_JIS
sun_eu_greek ISO-8859-7
t.61 T.61-8bit
unicode-1-1-utf-7 UTF-7
unicode-1-1-utf-8 UTF-8
unicode-2-0-utf-7 UTF-7
visual ISO-8859-8
windows-31j Shift_JIS
x-cp1250 windows-1250
x-cp1251 windows-1251
x-cp1252 windows-1252
x-cp1253 windows-1253
x-cp1254 windows-1254
x-cp1255 windows-1255
x-cp1256 windows-1256
x-cp1257 windows-1257
x-cp1258 windows-1258
x-euc-jp EUC-JP
x-iso-10646-ucs-2-be UTF-16BE
x-iso-10646-ucs-2-le UTF-16LE
x-iso-10646-ucs-4-be UTF-32BE
x-iso-10646-ucs-4-le UTF-32LE
x-sjis Shift_JIS
x-unicode-2-0-utf-7 UTF-7
x-x-big5 Big5
zh_cn.euc GB2312
zh_tw-big5 Big5
zh_tw-euc x-euc-tw

Charset aliases sorted by canonical name

Alias Canonical name
csbig5 Big5
x-x-big5 Big5
zh_tw-big5 Big5
cseucjpkdfmtjapanese EUC-JP
x-euc-jp EUC-JP
5601 EUC-KR
csksc56011987 EUC-KR
csueckr EUC-KR
iso-ir-149 EUC-KR
korean EUC-KR
ks_c_5601-1989 EUC-KR
ksc_5601 EUC-KR
ksc5601 EUC-KR
chinese GB2312
csgb2312 GB2312
csiso58gb231280 GB2312
gb_2312-80 GB2312
iso-ir-58 GB2312
zh_cn.euc GB2312
850 IBM850
cp850 IBM850
csIBM850 IBM850
852 IBM852
cp852 IBM852
csIBM852 IBM852
855 IBM855
cp855 IBM855
csIBM855 IBM855
857 IBM857
cp857 IBM857
csIBM857 IBM857
862 IBM862
cp862 IBM862
csIBM862 IBM862
864 IBM864
cp864 IBM864
csIBM864 IBM864
864i IBM864i
cp864i IBM864i
csibm864i IBM864i
866 IBM866
cp-866 IBM866
cp866 IBM866
csIBM866 IBM866
iso-2022-cn-ext ISO-2022-CN
csiso2022jp2 ISO-2022-JP
csiso2022jp ISO-2022-JP
iso-2022-jp-2 ISO-2022-JP
csiso2022kr ISO-2022-KR
cp819 ISO-8859-1
csisolatin1 ISO-8859-1
ibm819 ISO-8859-1
iso-ir-100 ISO-8859-1
l1 ISO-8859-1
latin1 ISO-8859-1
csisolatin6 ISO-8859-10
iso-ir-157 ISO-8859-10
l6 ISO-8859-10
latin6 ISO-8859-10
csisolatin2 ISO-8859-2
iso-ir-101 ISO-8859-2
l2 ISO-8859-2
latin2 ISO-8859-2
csisolatin3 ISO-8859-3
iso-ir-109 ISO-8859-3
l3 ISO-8859-3
latin3 ISO-8859-3
csisolatin4 ISO-8859-4
iso-ir-110 ISO-8859-4
l4 ISO-8859-4
latin4 ISO-8859-4
csisolatincyrillic ISO-8859-5
cyrillic ISO-8859-5
iso-ir-144 ISO-8859-5
arabic ISO-8859-6
asmo-708 ISO-8859-6
csisolatinarabic ISO-8859-6
ecma-114 ISO-8859-6
iso-ir-127 ISO-8859-6
csiso88596e ISO-8859-6-E
csiso88596i ISO-8859-6-I
csisolatingreek ISO-8859-7
ecma-118 ISO-8859-7
elot_928 ISO-8859-7
greek8 ISO-8859-7
greek ISO-8859-7
iso-ir-126 ISO-8859-7
sun_eu_greek ISO-8859-7
csisolatinhebrew ISO-8859-8
hebrew ISO-8859-8
iso-ir-138 ISO-8859-8
visual ISO-8859-8
csiso88598e ISO-8859-8-E
csiso88598i ISO-8859-8-I
csisolatin5 ISO-8859-9
iso-ir-148 ISO-8859-9
l5 ISO-8859-9
latin5 ISO-8859-9
csiso111ecmacyrillic ISO-IR-111
ecma-cyrillic ISO-IR-111
csshiftjis Shift_JIS
ms_kanji Shift_JIS
windows-31j Shift_JIS
x-sjis Shift_JIS
csiso103t618bit T.61-8bit
iso-ir-103 T.61-8bit
t.61 T.61-8bit
646 us-ascii
ansi_x3.4-1968 us-ascii
ascii us-ascii
csunicode11 UTF-16BE
csunicodeascii UTF-16BE
csunicodelatin1 UTF-16BE
csunicode UTF-16BE
iso-10646-j-1 UTF-16BE
iso-10646-ucs-2 UTF-16BE
iso-10646-ucs-basic UTF-16BE
iso-10646-unicode-latin1 UTF-16BE
iso-10646 UTF-16BE
x-iso-10646-ucs-2-be UTF-16BE
x-iso-10646-ucs-2-le UTF-16LE
iso-10646-ucs-4 UTF-32BE
x-iso-10646-ucs-4-be UTF-32BE
x-iso-10646-ucs-4-le UTF-32LE
csunicode11utf7 UTF-7
unicode-1-1-utf-7 UTF-7
unicode-2-0-utf-7 UTF-7
x-unicode-2-0-utf-7 UTF-7
unicode-1-1-utf-8 UTF-8
csviqr VIQR
csviscii VISCII
cp1250 windows-1250
x-cp1250 windows-1250
ansi-1251 windows-1251
cp1251 windows-1251
x-cp1251 windows-1251
cp1252 windows-1252
x-cp1252 windows-1252
cp1253 windows-1253
x-cp1253 windows-1253
cp1254 windows-1254
x-cp1254 windows-1254
cp1255 windows-1255
x-cp1255 windows-1255
cp1256 windows-1256
x-cp1256 windows-1256
cp1257 windows-1257
x-cp1257 windows-1257
cp1258 windows-1258
x-cp1258 windows-1258
ibm874 windows-874
cns11643 x-euc-tw
zh_tw-euc x-euc-tw
gbk x-gbk
csMacintosh x-mac-roman
macintosh x-mac-roman
mac x-mac-roman
ks_c_5601-1987 x-windows-949
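
The second table is the first one regrouped by canonical name. A minimal Python sketch of that regrouping, using a handful of rows copied from the tables above; the names `ALIASES` and `group_by_canonical` are illustrative only and do not come from Mozilla's code:

```python
from collections import defaultdict

# A few rows copied from the alias table above (alias -> canonical name).
ALIASES = {
    "csbig5": "Big5",
    "x-x-big5": "Big5",
    "zh_tw-big5": "Big5",
    "x-euc-jp": "EUC-JP",
    "gbk": "x-gbk",
}


def group_by_canonical(aliases):
    """Return a dict mapping each canonical name to a sorted list of its aliases."""
    groups = defaultdict(list)
    for alias, canonical in aliases.items():
        groups[canonical].append(alias)
    return {canonical: sorted(names) for canonical, names in groups.items()}


if __name__ == "__main__":
    # Print one line per canonical name, followed by its aliases.
    for canonical, names in sorted(group_by_canonical(ALIASES).items()):
        print(canonical, names)
```

Keeping the alias-to-canonical mapping as the single source and deriving the grouped view from it avoids the two tables drifting apart.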

Input-only substitutions

In the following cases, the specified encoding is interpreted as another encoding (usually a superset) when decoding source documents, but not when outputting text in the specified encoding.

Encoding Interpreted as
GB2312 x-gbk
ISO-8859-11 windows-874
TIS-620 windows-874
us-ascii windows-1252
ISO-8859-1 windows-1252
EUC-KR x-windows-949
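
The one-way nature of these substitutions can be sketched in Python as follows. This is an illustration under stated assumptions, not Mozilla's implementation: `INPUT_ONLY_SUBSTITUTIONS` and `effective_charset` are hypothetical names, the lookup is an exact string match, and Python's own codecs stand in for Gecko's converters.

```python
# Rows copied from the substitution table above.
INPUT_ONLY_SUBSTITUTIONS = {
    "GB2312": "x-gbk",
    "ISO-8859-11": "windows-874",
    "TIS-620": "windows-874",
    "us-ascii": "windows-1252",
    "ISO-8859-1": "windows-1252",
    "EUC-KR": "x-windows-949",
}


def effective_charset(charset, decoding):
    """Return the charset actually used.

    The substitution applies only when decoding input; when encoding
    output, the charset is returned unchanged.
    """
    if decoding:
        return INPUT_ONLY_SUBSTITUTIONS.get(charset, charset)
    return charset


if __name__ == "__main__":
    # Byte 0x80 is a C1 control in ISO-8859-1 but the euro sign in windows-1252.
    raw = b"\x80"
    text = raw.decode(effective_charset("ISO-8859-1", decoding=True))
    print(repr(text))
    # On output, the label is left as specified.
    print(effective_charset("ISO-8859-1", decoding=False))
```

The example uses ISO-8859-1 because both that name and windows-1252 are codec names Python recognizes; names such as x-gbk and x-windows-949 are Mozilla-specific and would need to be mapped to a local codec before calling `decode`.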

Sources

This information is taken from the source files in http://mxr.mozilla.org/mozilla1.9.1/source/intl/uconv/, chiefly from http://mxr.mozilla.org/mozilla1.9.1/source/intl/uconv/src/charsetalias.properties, processed to remove all lines where the alias and canonical name are identical in a case-insensitive comparison ignoring ".", "-" and "_".
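
The filtering step described above could look roughly like the following Python sketch. The function names and the sample pairs are illustrative assumptions; the script actually used to produce this page is not shown in the source.

```python
def normalize(name):
    """Lower-case the name and drop '.', '-' and '_' characters."""
    return "".join(ch for ch in name.lower() if ch not in ".-_")


def is_trivial_alias(alias, canonical):
    """True when alias and canonical name differ only in case or in '.', '-', '_'."""
    return normalize(alias) == normalize(canonical)


def filter_aliases(pairs):
    """Keep only the (alias, canonical) pairs that are not trivial variants."""
    return [(a, c) for a, c in pairs if not is_trivial_alias(a, c)]


if __name__ == "__main__":
    # Hypothetical input rows: the first two would be dropped, the last kept.
    sample = [
        ("iso_8859-1", "ISO-8859-1"),
        ("shift_jis", "Shift_JIS"),
        ("latin1", "ISO-8859-1"),
    ]
    print(filter_aliases(sample))
```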