A community of Frontier
and Radio Users


RE: Simplified Chinese (GB2312) in Manila

Posted: 2/23/2002; 1:03 AM by Emmanuel. M. Decarie
Last Modified: 2/23/2002; 1:03 AM by Emmanuel. M. Decarie
In Response To: RE: Simplified Chinese (GB2312) in Manila (#16164)
Thinking more about this, and rereading this page:
<http://www.ascc.net/xml/en/utf-8/faq/zhl10n-faq-xsl2.html>, I think
I'll follow the advice from the author:

>I personally feel that the answer is neither dictionaries nor
>parsing but markup. When entering the data, there should either be
>markup of all proper nouns or markup of all word boundaries.
>
>The last is simplest: when typing in Chinese, put in a space between
>words! That system works well in the West, keyboards already have
>spacebars, and it is simple enough for people to do when they have
>the habit. The spaces should be ignored by the publication system:
>they should be treated as "zero-width spaces".

I could tell users that if they want their Chinese text to be
indexed, they need to separate Chinese words with spaces. I think that
if I start from such text to implement indexing and searching, it's
going to be much simpler.
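
To make the idea concrete, here is a rough sketch of what I have in
mind. It is written in Python rather than UserTalk, only as an
illustration, and it is not an actual Manila implementation. The
function names (tokenize, build_index, display_text) are made up for
this example. It assumes the GB2312 text has already been decoded to
Unicode, and that the range U+4E00 to U+9FFF is a good enough
approximation of "a Chinese character".

```python
import re
from collections import defaultdict

# Assumption: U+4E00..U+9FFF (CJK Unified Ideographs) stands in for
# "Chinese character". Spaces (ASCII or ideographic U+3000) that sit
# between two such characters are treated as word-boundary markup.
_CJK = "\u4e00-\u9fff"
_BOUNDARY = re.compile(f"(?<=[{_CJK}])[ \u3000]+(?=[{_CJK}])")


def tokenize(text):
    """Split user-entered text into words on whitespace.

    str.split() with no argument splits on any Unicode whitespace,
    which includes the ideographic space U+3000.
    """
    return text.split()


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def display_text(text):
    """Drop the boundary spaces between Chinese characters for
    publication, leaving spaces around non-Chinese words alone."""
    return _BOUNDARY.sub("", text)


# Hypothetical sample messages, typed with spaces between words.
docs = {
    "msg1": "我 喜欢 北京",
    "msg2": "北京 大学 很 大",
}
index = build_index(docs)
print(sorted(index["北京"]))
print(display_text(docs["msg1"]))
```

With text entered this way, the search side needs no dictionary and
no parsing: a query word is just a key lookup in the index. If real
zero-width spaces are preferred on output, display_text could
substitute U+200B instead of the empty string.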

What do you think, Nobumi (or others interested in the discussion)?
Does it make sense to you?

Cheers
-Emmanuel

--
______________________________________________________________________
Emmanuel Décarie / Programmation pour le Web - Programming for the Web
Frontier - Perl - Javascript - XML <http://scriptdigital.com/>


Replies

I'm just catching up, and haven't followed all the links. But it seems like you have a winnah, Emmanuel...! It would require