RE: Simplified Chinese (GB2312) in Manila
Posted 2/22/2002; 11:03 PM by Nobumi Iyanaga
RE: Simplified Chinese (GB2312) in Manila (#16162)
Hello Samuel and Brian,
>Read on the web at http://community.scriptmeridian.org/16162
>----------------------------------
>
>On 2/22/2002 9:45 AM, Samuel Reynolds <sam@spinwardstars.com> wrote:
>
> >UTF-8 is the 8-bit subset of unicode that corresponds to
> >the ISO-8859 (Latin-1) "extended ASCII" character set that
> >is the Windows default set. Character 0xnn in UTF-8 is
> >always character 0x00nn in unicode.
>
>UTF-8 only preserves the ASCII character set. You may be thinking of
>ISO-8859-1, in that Unicode chars U+0000 to U+00FF are exactly the
>characters in ISO-8859-1. Win1252 is very similar to ISO-8859-1 but not
>identical.
>
>http://www.ietf.org/rfc/rfc2279.txt
>http://www.microsoft.com/globaldev/reference/sbcs/1252.htm
>http://www.unicode.org/Public/MAPPINGS/ISO8859/8859-1.TXT
>http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
>
Yes, as far as ASCII characters are concerned, there is no problem. But
with accented characters or double-byte characters, all the text is garbled...
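
To make the difference concrete, here is a rough Python sketch of what I believe is happening. The sample strings and the helper name `show_bytes` are my own choices for illustration, not anything taken from Manila or Frontier, and I am only assuming that the garbling comes from bytes in one encoding being read as another.

```python
# Illustrative sketch only: the sample text and show_bytes() are hypothetical.

def show_bytes(text: str, encoding: str) -> str:
    """Return the bytes of `text` in `encoding` as space-separated hex."""
    return text.encode(encoding).hex(" ")

ascii_text = "Manila"
accented = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, U+00E9
chinese = "\u4e2d"    # CJK ideograph U+4E2D ("middle")

# ASCII characters have the same bytes in UTF-8 and ISO-8859-1.
same_for_ascii = ascii_text.encode("utf-8") == ascii_text.encode("iso-8859-1")

# U+00E9 is the single byte e9 in ISO-8859-1, but two bytes in UTF-8,
# so "character 0xnn in UTF-8" is not "character 0x00nn in Unicode".
latin1_hex = show_bytes(accented, "iso-8859-1")
utf8_hex = show_bytes(accented, "utf-8")

# Reading UTF-8 bytes as if they were ISO-8859-1 garbles the text.
garbled_accent = accented.encode("utf-8").decode("iso-8859-1")

# The same happens with a double-byte GB2312 character read as ISO-8859-1.
gb_hex = show_bytes(chinese, "gb2312")
garbled_chinese = chinese.encode("gb2312").decode("iso-8859-1")

print(same_for_ascii, latin1_hex, utf8_hex, gb_hex)
print(garbled_accent, garbled_chinese)
```

If the text only contains ASCII, none of these mismatches can show up, which would explain why plain English pages look fine.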
Best regards,
Nobumi Iyanaga
Tokyo,
Japan