But before we can look at the solution that the JSR-204 expert group came up with, we need to learn some terminology.
Supplementary characters are characters in the Unicode standard whose code points are above U+FFFF, and which therefore cannot be described as single 16-bit entities such as the char data type in the Java programming language. Such characters are generally rare, but some are used, for example, as part of Chinese and Japanese personal names, and so support for them is commonly required for government applications in East Asian countries.
The Java platform is being enhanced to enable processing of supplementary characters with minimal impact on existing applications. New low-level APIs enable operations on individual characters where necessary. Character sequences, such as String objects and char arrays, are now interpreted as UTF-16 sequences, and the implementations of the APIs that process them have been changed to correctly handle supplementary characters. The enhancements are part of version 5.0 of the Java 2 Platform, Standard Edition (J2SE). Besides explaining these enhancements in detail, this article also provides guidelines for application developers for determining and implementing necessary changes to enable use of the complete Unicode character set.
Background
Unicode was originally designed as a fixed-width 16-bit character encoding. The primitive data type char in the Java programming language was intended to take advantage of this design by providing a simple data type that could hold any character. However, it turned out that the 65,536 characters possible in a 16-bit encoding are not sufficient to represent all characters that are or have been used on planet Earth. The Unicode standard has therefore been extended to allow up to 1,112,064 characters. Those characters that go beyond the original 16-bit limit are called supplementary characters. Version 2.0 of the Unicode standard was the first to include a design to enable supplementary characters, but it was only in version 3.1 that the first supplementary characters were assigned.
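To make the 16-bit limit concrete, here is a minimal sketch using the code point methods that J2SE 5.0 added to java.lang.Character and java.lang.String. The class name SupplementaryDemo, its helper methods, and the choice of U+20000 as the sample character are assumptions made for illustration only; they do not come from the article. The sketch shows that one supplementary character occupies two char values in a String, and it reproduces the 1,112,064 figure as the full code point range minus the surrogate code points.

```java
// Illustrative sketch only: class and method names are hypothetical.
class SupplementaryDemo {

    // Number of 16-bit char values needed to store one code point in a String.
    static int charUnits(int codePoint) {
        return new String(Character.toChars(codePoint)).length();
    }

    // Number of actual characters (code points) in a String.
    static int characterCount(String s) {
        return s.codePointCount(0, s.length());
    }

    // All code points from U+0000 to U+10FFFF, minus the surrogate
    // code points U+D800..U+DFFF, which never stand for characters themselves.
    static int encodableCharacters() {
        int allCodePoints = Character.MAX_CODE_POINT + 1;
        int surrogates = Character.MAX_SURROGATE - Character.MIN_SURROGATE + 1;
        return allCodePoints - surrogates;
    }

    public static void main(String[] args) {
        int codePoint = 0x20000; // sample supplementary code point (assumed example)
        String s = new String(Character.toChars(codePoint));

        System.out.println(Character.isSupplementaryCodePoint(codePoint)); // true
        System.out.println(charUnits(codePoint));      // 2
        System.out.println(characterCount(s));         // 1
        System.out.println(Character.isHighSurrogate(s.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(s.charAt(1)));  // true
        System.out.println(encodableCharacters());     // 1112064
    }
}
```

Code that counts or indexes characters with length() and charAt() therefore sees two units for such a character, while codePointCount() and codePointAt() see one.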
Version 5.0 of the J2SE is required to support version 4.0 of the Unicode standard, so it has to support supplementary characters. Support for supplementary characters is likely to also become a common business requirement in East Asian markets. Government applications are going to require them in order to correctly represent names that include rare Chinese characters. The Chinese government requires support for GB18030, a character encoding that encodes the entire Unicode character set, and so includes supplementary characters if Unicode version 3.1 or later is assumed. The Taiwanese standard CNS-11643 includes numerous characters that have been included in Unicode 3.1 as supplementary characters. The Hong Kong government defined a collection of characters that are needed for Cantonese, and some of these characters are supplementary characters in Unicode. Finally, some vendors in Japan are planning to use the large private use area in the supplementary character space for more than 50,000 kanji character variants in order to migrate from their proprietary systems to solutions based on the Java platform.
The Java platform therefore not only has to support supplementary characters, but it also has to make it easy for applications to do the same. Since supplementary characters break a fundamental assumption of the Java programming language and might require a fundamental change in the programming model, an expert group was convened under the Java Community Process to choose the right solution for the problem. The group is called the JSR-204 expert group, using the number of the Java Specification Request for Unicode Supplementary Character Support.
Technically, the decisions of the expert group only apply to the J2SE platform, but since the Java 2 Platform, Enterprise Edition (J2EE) sits on top of the J2SE platform, it benefits directly, and we expect that the configurations of the Java 2 Platform, Micro Edition (J2ME) will adopt the same design approach.