RE: Simplified Chinese (GB2312) in Manila
Posted: 2/23/2002; 1:03 AM by Emmanuel. M. Decarie
Last Modified: 2/23/2002; 1:03 AM by Emmanuel. M. Decarie
In Response To: RE: Simplified Chinese (GB2312) in Manila (#16164)
Thinking more about this, and rereading this page:
<http://www.ascc.net/xml/en/utf-8/faq/zhl10n-faq-xsl2.html>, I think
I'll follow the advice from the author:
>I personally feel that the answer is neither dictionaries nor
>parsing but markup. When entering the data, there should either be
>markup of all proper nouns or markup of all word boundaries.
>
>The last is simplest: when typing in Chinese, put in a space between
>words! That system works well in the West, keyboards already have
>spacebars, and it is simple enough for people to do when they have
>the habit. The spaces should be ignored by the publication system:
>they should be treated as "zero-width spaces".
I could tell users that if they want their Chinese text to be
indexed, they need to separate Chinese words with spaces. I think that
if I start from such text to implement indexing and searching, it's
going to be much simpler.
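
To make the idea concrete, here is a rough sketch, in Python purely for
illustration. It is not Manila or Frontier code, and the function names
(build_index, search, for_display) are hypothetical. It assumes the text
has already been decoded from GB2312 to Unicode strings and that authors
typed a space between words. Indexing then reduces to splitting on
whitespace, and publishing swaps each typed space for a zero-width space
(U+200B), as the quoted author suggests.

```python
from collections import defaultdict


def build_index(docs):
    """Map each space-delimited word to the set of message ids containing it.

    `docs` is a dict of {message_id: text}; the text is assumed to have
    been typed with a space between Chinese words.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # split() with no argument splits on any run of whitespace,
        # so no dictionary or parser is needed to find word boundaries.
        for word in text.split():
            index[word].add(doc_id)
    return index


def search(index, word):
    """Return the sorted ids of the messages that contain `word` exactly."""
    return sorted(index.get(word, set()))


def for_display(text):
    """Replace each typed space with a zero-width space for publication.

    Simplification: this also affects spaces between Latin-script words,
    so mixed Chinese/English text would need extra handling.
    """
    return text.replace(" ", "\u200b")


docs = {
    1: "我 喜欢 北京",
    2: "北京 欢迎 你",
}
index = build_index(docs)
print(search(index, "北京"))   # ids of the messages containing the word
print(for_display(docs[1]))     # same text, spaces made zero-width
```

One consequence of this design is that a search only matches whole words
as the author split them: looking up a single character that appears
inside a longer word finds nothing.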
What do you think, Nobumi (or others interested in the discussion)?
Does it make sense to you?
Cheers
-Emmanuel
--
______________________________________________________________________
Emmanuel Décarie / Programmation pour le Web - Programming for the Web
Frontier - Perl - Javascript - XML <http://scriptdigital.com/>
Replies
RE: Simplified Chinese (GB2312) in Manila
2/23/2002 by jt
I'm just catching up, and haven't followed all the links. But it seems like you have a winnah, Emmanuel...! It would require