
RECOMMENDATIONS

The encoding structure for Ethiopian script proposed by Unicode has evolved in two draft stages. The original draft, which appeared in "Appendix E: Proposed Scripts for Future Versions of Unicode" of the Unicode Standard: Version 1.0, suggests that Ethiopian syllabic letters be "represented as whole codes, rather than by composition, because the composites have truly become the units of the script. . . " (p. 635). Subsequently, however, another draft proposal (The Unicode Technical Report #1 Draft Proposal on Ethiopian Script) was made in which the "whole codes" approach was discarded in favor of the "composition" approach. Our recommendation is that the proposal made in the original draft (using "whole codes" rather than "composition") be made the Unicode standard for Ethiopian script.

To that end, we will attempt to answer some of the questions and problems that precipitated the second draft proposal, and present the encoding structure and keyboard layout based on the original draft proposal, which we feel is more appropriate.

We strongly urge the Unicode Consortium to consider our recommendation, which was made after careful study and examination of both draft proposals.


PROPOSED ENCODING STANDARD FOR ETHIOPIAN SCRIPT

ENCODING PRINCIPLE

1. The Ethiopian Script features Ethiopian punctuation marks, the syllabary, and numerals. We believe that a unique 16-bit representation must be used for each syllable of the Ethiopian script. We have attached a hard copy of the script with its 16-bit hexadecimal code assignments, which we feel is complete. (See Appendix A)

2. The Encoding Order Structure

U+1200 TO U+120F         Ethiopian Punctuation
U+1210 TO U+1330         Ethiopian Letters
U+1338 TO U+1347         Currently unassigned
U+1348 TO U+1367         Ethiopian Extended Letters
U+1368 TO U+137F         Ethiopian Numbers
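
As an illustration only, the proposed layout above can be written down as a small lookup table. The sketch below assumes nothing beyond the ranges listed in this section; the names PROPOSED_RANGES and classify are hypothetical, and any code point that falls outside the listed ranges is simply reported as not covered by the table.

```python
# Proposed block layout, copied from the table in this section.
# This reflects the proposal only, not any published allocation.
PROPOSED_RANGES = [
    (0x1200, 0x120F, "Ethiopian Punctuation"),
    (0x1210, 0x1330, "Ethiopian Letters"),
    (0x1338, 0x1347, "Currently unassigned"),
    (0x1348, 0x1367, "Ethiopian Extended Letters"),
    (0x1368, 0x137F, "Ethiopian Numbers"),
]


def classify(code_point: int) -> str:
    """Return the proposed category for a code point (hypothetical helper)."""
    for first, last, name in PROPOSED_RANGES:
        if first <= code_point <= last:
            return name
    # Anything the table does not list is reported as such, not guessed.
    return "Not covered by the proposed table"


if __name__ == "__main__":
    for cp in (0x1205, 0x1210, 0x1340, 0x1350, 0x1370):
        print(f"U+{cp:04X}: {classify(cp)}")
```

Because every syllable has a single code under this scheme, one range comparison is enough to classify a character.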

ETHIOPIAN SCRIPT SORTING ORDER

1. The sorting order is adopted from the traditional Ethiopian layout of the syllables. Each consonant with all of its consonant/vowel variations precedes the next consonant.

2. The sorting order, therefore, adheres to the encoding order of the Ethiopian script set. If the need arises to adopt a different sorting order, it can be done at the implementation level without breaching the encoding order.
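
The two points above can be sketched in a few lines. This is a minimal illustration, assuming each syllable is one code in the proposed letter range; the sample code points, the CUSTOM_WEIGHT table, and the function names are hypothetical and are not taken from Appendix A.

```python
# Hypothetical words, each a sequence of code points from the proposed
# letter range (U+1210 to U+1330). The values are illustrative only.
WORDS = [
    [0x1228, 0x1211],
    [0x1210, 0x1250],
    [0x1228, 0x1210],
]


def as_text(code_points):
    """Build a string with one character per syllable code."""
    return "".join(chr(cp) for cp in code_points)


def default_sort(texts):
    """Sort in encoding order: strings compare code point by code point."""
    return sorted(texts)


# Hypothetical override for an implementation that wants another order:
# here U+1228 is given a weight below every letter code.
CUSTOM_WEIGHT = {0x1228: 0x120F}


def custom_key(text):
    """Map each character to its sort weight, defaulting to its own code."""
    return [CUSTOM_WEIGHT.get(ord(ch), ord(ch)) for ch in text]


def custom_sort(texts):
    """Sort with the override; the stored codes themselves are unchanged."""
    return sorted(texts, key=custom_key)


if __name__ == "__main__":
    texts = [as_text(w) for w in WORDS]
    for label, result in (("encoding order", default_sort(texts)),
                          ("custom order", custom_sort(texts))):
        print(label, [[f"U+{ord(c):04X}" for c in t] for t in result])
```

The default sort needs no table at all, while the alternative order lives entirely in the sort key, so the encoded text is never altered.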

ETHIOPIAN SCRIPT FUTURE EXTENSION

If there is a need to further extend the Ethiopian script in the future, or in case the script is adopted by a new language, diacritical marks should be used to mark the changes instead of creating new elements of the alphabet.

ANSWERS TO QUESTIONS RAISED IN THE UNICODE TECHNICAL REPORT #1 DRAFT PROPOSAL ON ETHIOPIAN SCRIPT

Although we disagree with the encoding method suggested in this draft proposal, we feel we can answer the questions raised regarding the Ethiopian script.

1. IS THIS COLLECTION MISSING ANY IMPORTANT, WELL-ESTABLISHED "EXTENSION" LETTERS FOR WRITING LESS-COMMON LANGUAGES?

No. We believe the collection is complete.

2. ARE THE GLYPHS IN THE CHARTS APPROPRIATE?

Yes. In our estimation the glyphs are appropriate.

3. CAN YOU SUPPLY DOCUMENTATION TO SUPPORT THE SPECIFICATION OF THE FOLLOWING TWO CHARACTERS?

121D ETHIOPIAN CONSONANT GG
1237 ETHIOPIAN VOWEL PHONETIC AE

IN PARTICULAR, DOES U+1237 OCCUR (AS A VOWEL, NOT AS A MARK OF "w" ROUNDING) ON ANY CONSONANT OTHER THAN U+1211? SHOULD THE COMBINATION OF U+1237 WITH U+1211 SIMPLY BE ENCODED AS A DISTINCT CONSONANT (TO BE ADDED BETWEEN CURRENT U+1211 AND U+1212)?

We cannot find any documentation to support the specification of 121D. 121D may be derived from 121C in the same manner that 120B (a labiodental fricative that occurs only in foreign words) is derived from 120A. We believe it is appropriate to include 121D and to place it after 121C.

U+1237 does not occur on any consonant other than U+1211.

It should therefore be encoded as a distinct sound between U+1211 and U+1212.

4. ARE THE FOLLOWING CHARACTERS SPECIFIED CORRECTLY?

1256 ETHIOPIAN COMMA (MODERN USAGE LIKE COLON)

1257 ETHIOPIAN COLON (MODERN USAGE LIKE SEMICOLON)

1259 ETHIOPIAN NEW COMMA (MODERN USAGE)

Yes. The above specifications are correct.

5. DO SYLLABLE GLYPH VARIANTS EVER OCCUR DISTINCTIVELY WITHIN THE SAME TEXT, OR ARE THEY MERELY FONT DESIGN CHOICES like the glyph variants of Latin "a" or "g"?

No. Syllable glyph variants do not occur distinctively within the same text.

6. SHOULD WE DEFINE AN ETHIOPIAN WHITE SPACE CHARACTER WHICH CAN BE EASILY GUARANTEED TO HAVE THE SAME (MINIMUM) WIDTH AS U+1255 ETHIOPIAN WORDSPACE? CURRENTLY OPINION IS THAT THIS IS UNNECESSARY.

Yes. The Ethiopian White Space should be defined.


CONCERNS

Below, we have a list of concerns regarding the implications of using the second draft proposal (the "composition" method) for encoding the Ethiopian script.

1. The encoding principle used in the report requires that at least 4 octets be used per syllable glyph. This is because every syllable in the proposal is represented by two (or, in some cases, three) 16-bit characters: one for the consonant and the remainder for the vowel(s). The consequences of this arrangement are:

-- doubling, if not more, of required storage space
-- doubling, if not more, of required processing time

2. The report treats the encoding and sorting orders independently of each other and leaves the sorting order completely to the language that implements the script. We feel the two need not be treated separately.

3. We feel the approach leaves some questions unanswered regarding the implementation of "rendering."

a. Having a ligature table that maps an incoming stream of pair letters (consonant+vowel) into a corresponding glyph from a source (likely a font) that consists of all Ethiopian visible glyphs undermines the standardization of the script since the nature of that source is unknown or uncontrolled by the encoding principles. Therefore, this rendering requires another standardization at implementation level on how the full set should be represented so that compatibility can be guaranteed to a "unique universal ligature table."

b. Given the encoding chart, if the syllables are treated as (consonant+vowel) pairs at the level of the visible glyph, the chart doesn't adequately capture the complete Ethiopian Script.

4. Part of the historical/cultural heritage of our Ethiopian script is the deliberate transition the language made away from the alphabet system and towards a syllabic form. Although the technical merits and demerits of the syllabic form may be debatable, we do not feel it is appropriate to simply "dismantle" the script into its original components even on grounds of technical simplification, which we are not convinced is achieved via this method. The vowels/consonants scheme used in the draft proposal might be useful in the process of designing keyboards for the Ethiopian script, but we see no benefit in using the scheme at the level of internal representation.
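
The storage arithmetic behind concern 1 can be sketched as follows. This is a rough illustration rather than a measurement: it assumes 16-bit (2-octet) codes, one code per syllable under the "whole codes" approach, and two or three codes per syllable under the "composition" approach, as described above. The sample text size and the share of three-code syllables are hypothetical, and processing time is not modelled.

```python
OCTETS_PER_CODE = 2  # one 16-bit code occupies two octets


def whole_code_octets(syllable_count: int) -> int:
    """Storage when every syllable is a single 16-bit code."""
    return syllable_count * OCTETS_PER_CODE


def composition_octets(codes_per_syllable) -> int:
    """Storage when each syllable is a sequence of 2 or 3 16-bit codes."""
    total = 0
    for count in codes_per_syllable:
        if count not in (2, 3):
            raise ValueError("each syllable is assumed to need 2 or 3 codes")
        total += count * OCTETS_PER_CODE
    return total


if __name__ == "__main__":
    # Hypothetical text: 1000 syllables, 100 of which need a third code.
    layout = [2] * 900 + [3] * 100
    whole = whole_code_octets(len(layout))
    composed = composition_octets(layout)
    print(f"whole codes: {whole} octets")
    print(f"composition: {composed} octets")
    print(f"ratio:       {composed / whole:.2f}")
```

Under these assumptions the composed form can never take less than twice the space of the whole-code form, and each three-code syllable pushes the ratio above two.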