Talk:Japanese language and computers
- In relation to the Japanese language and computers, unique adaptation issues arise.
Most problems are not unique to Japanese but are common to other DBCS languages, although the specific solutions are unique to Japanese.
- Many problems relate to transliteration and romanization,
Romanization has little to do with the problem; it is just one input method.
- There are several standard methods to encode characters for use on a computer, including JIS, SJIS, EUC, and Unicode.
Strictly speaking, Unicode is not a character encoding; it is a coded character set.
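The distinction can be shown concretely. Below is a small Python sketch (the character 漢 is just an arbitrary example): the coded character set assigns a number (code point) to the character, while encodings such as UTF-8 and UTF-16 map that number to different byte sequences.

```python
# A coded character set assigns numbers (code points) to characters;
# a character encoding maps those numbers to bytes. Unicode assigns
# 漢 the code point U+6F22; UTF-8 and UTF-16 are encodings of it.
ch = "漢"

print(hex(ord(ch)))            # the Unicode code point: 0x6f22
print(ch.encode("utf-8"))      # UTF-8 bytes: E6 BC A2
print(ch.encode("utf-16-be"))  # UTF-16 (big-endian) bytes: 6F 22
```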
- While mapping the set of kana is a simple matter, kanji has proven more difficult. Because the Japanese kanji differ slightly or significantly from the corresponding characters in Chinese, it has proven both challenging and controversial to construct an encoding system which encompasses both Chinese and Japanese characters equitably.
Whether a kanji corresponds to a Chinese character is not the problem (unless it is in relation to Unicode).
- Unicode has been criticized in Japan (as well as in China and Korea) because it assigns the same code to similar characters from various East Asian languages, even though the characters may vary in terms of form and pronunciation [1].
Pronunciation has nothing to do with the problem.
- Unicode is also criticized for failing to allow for older and alternate forms of kanji.
This has nothing to do with the problem, since Unicode contains the entire JIS character set. The problem is that Unicode uses different criteria in its coding rules.
- Though Japanese computer users have almost no trouble handling contemporary text, ancient Japanese language research has been considerably handicapped by this limitation.
- This problem has led to the continued wide use of many encoding standards, despite increased Unicode use in other countries. For example, most Japanese e-mail and web pages are encoded in SJIS or JIS rather than Unicode. This has led to the problem of mojibake (misconverted characters) and much unreadable Japanese text on computers.
This paragraph doesn't make sense: the issue has nothing to do with ancient Japanese, but is rather a problem of supporting legacy data.
- Japanese text input is a complicated matter not only because of the encoding problems discussed above
Text input has little to do with encoding; it is a matter of selecting a character.
- but also because it is practically impossible to type all of the characters used in the Japanese writing system with the finite set of keys on a keyboard. On modern computers, Japanese is input on a standard keyboard
What does "standard" in "standard keyboard" mean? Perhaps a standard Roman-alphabet keyboard? The mobile phone keypad is another input method, by the way.
- via romanization
I think kana input is also popular.
- combined with an Input Method Editor which allows the user to choose the correct characters from a list. There is also another method, known as Oyayubi shift, developed by Fujitsu, which allows direct kana input, but this method is now obsolete.
I don't understand why Oyayubi shift has to be mentioned here, while kana input is not mentioned at all.
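The romanization-based input described above can be sketched in Python. This is only the first stage of what an IME does (romaji to kana); real IMEs then offer kanji candidates for the kana, and the tiny conversion table here is purely illustrative.

```python
# Minimal sketch of the first stage of romaji-based Japanese input:
# converting keystrokes to hiragana by greedy longest-match against
# a conversion table. A real IME would follow this with kana-to-kanji
# candidate selection. The table below covers only the demo words.
ROMAJI_TO_HIRAGANA = {
    "ni": "に", "ho": "ほ", "go": "ご", "n": "ん",
    "ka": "か", "na": "な",
}

def romaji_to_kana(text: str) -> str:
    out = []
    i = 0
    while i < len(text):
        # try the longest table entry first (3, then 2, then 1 chars)
        for length in (3, 2, 1):
            chunk = text[i:i + length]
            if chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                i += length
                break
        else:
            out.append(text[i])  # pass unknown input through unchanged
            i += 1
    return "".join(out)

print(romaji_to_kana("nihongo"))  # にほんご
print(romaji_to_kana("kana"))     # かな
```

Note that a bare "n" before a consonant correctly becomes ん here only because the greedy match falls through to the single-character entry; real IMEs handle this and many other cases (っ gemination, long vowels) with more elaborate rules.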
- Because a number of often-used characters are omitted in a standard character set such as JIS or even Unicode, gaiji (外字 external character) is sometimes used to supplement the character set.
Is gaiji really used alongside Unicode? I'm curious, since I'm not sure about this.
- However, with the spread of computer networking and the Internet, gaiji is no longer used as frequently. As a result, omitted characters are written with similar or simpler characters in their place.
"omitted characters are written with similar or simpler characters in their place" — is this correct? Shouldn't it be "As a result, those characters need to be replaced with similar or simpler characters"?
Fukumoto 18:00, 27 Feb 2004 (UTC)
The hastingsresearch unicode page [2] misrepresents the issues a lot. There's a rebuttal [3]. The article should be adjusted to remove the anti-Unicode bias which is wholly without basis.
130.233.18.89 03:41, 13 Mar 2004 (UTC)
- The rebuttal comes from a member of the Unicode standard committee. How can we expect fairness from such a person? Anyway, this page might help us.
- http://www-106.ibm.com/developerworks/unicode/library/u-secret.html
- Also, I am going to merge this confusion and controversy into the unicode article, which currently doesn't say much about the controversy over Unicode. I believe the Unicode issues are political and cultural, not technical. As a programmer, I don't think Unicode is worse than Shift-JIS. But no encoding scheme is good by nature anyway. -- Taku 04:47, Mar 13, 2004 (UTC)