RE: Simplified Chinese (GB2312) in Manila

Posted: 2/23/2002; 12:03 AM by Nobumi Iyanaga
Last Modified: 2/23/2002; 12:03 AM by Nobumi Iyanaga
In Response To: RE: Simplified Chinese (GB2312) in Manila (#16162)

Hello Samuel and Brian,

>Read on the web at http://community.scriptmeridian.org/16162
>----------------------------------
>
>On 2/22/2002 9:45 AM, Samuel Reynolds <sam@spinwardstars.com> wrote:
>
> >UTF-8 is the 8-bit subset of unicode that corresponds to
> >the ISO-8859 (Latin-1) "extended ASCII" character set that
> >is the Windows default set. Character 0xnn in UTF-8 is
> >always character 0x00nn in unicode.
>
>UTF-8 only preserves the ASCII character set. You may be thinking of
>ISO-8859-1, in that Unicode chars U+0000 to U+00FF are exactly the
>characters in ISO-8859-1. Win1252 is very similar to ISO-8859-1 but not
>identical.
>
>http://www.ietf.org/rfc/rfc2279.txt
>http://www.microsoft.com/globaldev/reference/sbcs/1252.htm
>http://www.unicode.org/Public/MAPPINGS/ISO8859/8859-1.TXT
>http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
>

Yes, as far as ASCII characters are concerned, there is no problem. But
with accented characters or double-byte characters, all the text is garbled...
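
To make the point concrete, here is a small Python sketch. It is only an illustration: it does not use Frontier or Manila, and the sample strings and variable names are my own assumptions. It uses the standard Python codecs named "utf-8", "latin-1" (ISO-8859-1), "cp1252" and "gb2312" to show where the encodings agree and where text gets garbled.

```python
# Illustration only: sample strings and variable names are hypothetical.

# ASCII text has the same bytes in UTF-8, ISO-8859-1 and Windows-1252.
ascii_text = "Manila"
ascii_same = (
    ascii_text.encode("utf-8")
    == ascii_text.encode("latin-1")
    == ascii_text.encode("cp1252")
)

# An accented character (U+00E9) is one byte in ISO-8859-1
# but two bytes in UTF-8.
e_acute = "\u00e9"
latin1_bytes = e_acute.encode("latin-1")  # b'\xe9'
utf8_bytes = e_acute.encode("utf-8")      # b'\xc3\xa9'

# Reading the UTF-8 bytes as if they were ISO-8859-1 garbles the text:
# one accented letter turns into two unrelated characters.
garbled_accent = utf8_bytes.decode("latin-1")

# Windows-1252 and ISO-8859-1 disagree in the 0x80-0x9F range:
# byte 0x80 is the euro sign in cp1252, a control character in ISO-8859-1.
euro_cp1252 = b"\x80".decode("cp1252")
ctrl_latin1 = b"\x80".decode("latin-1")

# A double-byte GB2312 character (U+4E2D) read as ISO-8859-1
# also comes out as two wrong characters.
zhong = "\u4e2d"
gb_bytes = zhong.encode("gb2312")         # b'\xd6\xd0'
garbled_gb = gb_bytes.decode("latin-1")

print(ascii_same, latin1_bytes, utf8_bytes, repr(garbled_accent))
print(repr(euro_cp1252), repr(ctrl_latin1), gb_bytes, repr(garbled_gb))
```

So only the ASCII range survives a wrong guess about the encoding. Everything above 0x7F depends on the reader and the writer agreeing on the character set.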

Best regards,

Nobumi Iyanaga
Tokyo,
Japan
