================================================================

Ethiopian Encoding in Unicode/10646

Joe Becker
February 26, 1995

This proposal constitutes a major revision of the Draft Proposal for
Ethiopian Encoding of October 1992 published in Unicode Technical
Report #1. Based on feedback received concerning the 1992 proposal,
the basic coding scheme has been changed from an "underlying
alphabetic" representation to a syllable-based representation.

The current proposal consists of a chart, a character names list, and
a block introduction. The content is based primarily on "View and
Recommendation on The Unicode Technical Report #1 Draft Proposal on
Ethiopian Script", by the Committee for Standardization of Ethiopian
Script (CSES), August 1993. The members of the CSES include:

    Abass Alamnehe      EthiO Systems and University of Houston
    Fesseha Atlaw       Hewlett Packard and Dashen Engineering Company
    Tekle Awdew         Informix, formerly IBM
    Tsehay Demeke       Rensselaer Polytechnic Institute, formerly
                        American Red Cross
    Yitna Firdyiwek     Northern Virginia Community College
    Grum Ketema         AT&T labs
    Theodros Kidane     Hewlett Packard
    Samuel Kinde        Virginia Polytechnic Institute
    Terrefe Ras-Work    International Telecommunication Union
    Teshager Tesfaye    Sun Microsystems
    Tadesse Tsegaye     Pacific Gas and Electric
    Matewos Worku       University of California Berkeley

The titles of these gentlemen are available on request; suffice it to
say that they represent the major body of native experience in
implementation of computer systems for the Ethiopian script, not only
within the US but also within their homelands.

================================================================

BLOCK INTRODUCTION

Ethiopian U+1200 -> U+137F

The Ethiopian script, which originally evolved for the archaic
language Ge'ez, is currently used to write several languages of
Eastern Africa, including Amharic, Tigre, and Oromo.
The script continues to be extended for writing languages that have
little tradition of printed typography; new characters to cover such
extensions may be added to the standard later as definitive
information about them becomes available.

Encoding Principles. The Ethiopian script is a syllabary that is
traditionally presented as a 2-dimensional matrix of consonant-vowel
combinations. The encoding follows this structure: in particular, the
codespace range U+1200 -> U+135F is interpreted as a matrix of 44
consonants crossed with 8 vowels. This arrangement is convenient for
computer indexing as well as natural to users of the script.

Variant Glyph Forms. Each cell in the codespace range U+1200 ->
U+135F represents a conceptual syllable, which may or may not actually
be realized in one or another language that uses the Ethiopian script.
Also, the same syllable in the same language may occasionally be
represented by slightly different glyph forms in one or another font,
analogous to the minor glyph variants of Latin lowercase "a" or "g"
which do not co-occur in the same font. Therefore, the particular
glyph shown in the code chart for each position in the matrix, and
indeed the presence or absence of a glyph and character name for a
given cell, are merely indicative of one common graphic interpretation
of that conceptual syllable. In implementations, the selection of
particular glyph forms to represent each conceptual syllable is the
prerogative of the font designer. Also, in the relatively rare cases
where ligatures are desired within the Ethiopian script, these may be
provided as features of particular fonts, but they are not represented
in the character encoding.

Labialized Subseries. A few Ethiopian consonants have labialized
("W") forms that are traditionally allotted their own consonant group
in the syllable matrix, though only a subset of the possible voweled
forms are realized.
These derivative syllables are encoded after the main alphabet,
currently in the range U+1320 -> U+1347.

Since the standard vowel series includes both "A" and "WA", there are
two conceptual syllable cells corresponding to labial consonants
followed by "WA", e.g.

    U+1247 ETHIOPIAN LETTER QWA       // Q + WA
    U+1323 [unnamed potential QWA]    // QW + A

In these cases where the two syllables are equivalent, the Unicode
encoding suggests that the "consonant + WA" entry in the main syllable
chart be used in preference to the entry in the labialized subseries.
The five specific cases that are currently encoded are:

    U+1247 ETHIOPIAN LETTER QWA     is preferred to U+1323
    U+124F ETHIOPIAN LETTER QHWA    is preferred to U+132B
    U+1277 ETHIOPIAN LETTER HXWA    is preferred to U+1333
    U+1297 ETHIOPIAN LETTER KWA     is preferred to U+133B
    U+12E7 ETHIOPIAN LETTER GWA     is preferred to U+1343

Also, *within* the labialized subseries groups, the 6th vowel ("-E")
forms are sometimes considered to be 2nd vowel ("-U") forms, e.g.

    U+1321 [unnamed potential QWU]
    U+1325 ETHIOPIAN LETTER QWE

In these cases where the two syllables are nearly equivalent, the
Unicode encoding suggests that the "-E" entry be used in preference to
the "-U" entry. The five specific cases that are currently encoded
are:

    U+1325 ETHIOPIAN LETTER QWE     is preferred to U+1321
    U+132D ETHIOPIAN LETTER QHWE    is preferred to U+1329
    U+1335 ETHIOPIAN LETTER HXWE    is preferred to U+1331
    U+133D ETHIOPIAN LETTER KWE     is preferred to U+1339
    U+1345 ETHIOPIAN LETTER GWE     is preferred to U+1341

Keyboard Input. Because there are over 300 characters in the
Ethiopian script, the units of keyboard input must constitute some
smaller set of entities, typically 44+8 codes interpreted as the
coordinates of the syllable matrix. Since these keyboard input codes
are expected to be transient entities that are resolved into syllabic
characters before they enter stored text, keyboard input codes are not
specified in this standard.

Letter Names.
The Ethiopian script often has multiple letters corresponding to the
same Latin letter, making it difficult to assign unique Latin names.
Therefore the names list makes use of certain devices (such as
doubling a Latin letter in the name) merely to create uniqueness; this
has no relation to the phonetics of the Ethiopian letters.

Encoding Order and Sorting. The order of the letters in the encoding
is based on the traditional alphabetical order. This order may differ
from the sort order used for one or another language, if only because
in many languages various pairs or triplets of letters are treated as
equivalent in the first sorting pass. For example, an Amharic
dictionary may start out with a section headed by *three* H-like
letters:

    U+1200 ETHIOPIAN LETTER HAE
    U+1210 ETHIOPIAN LETTER HHAE
    U+1270 ETHIOPIAN LETTER HXAE

Thus the encoding order cannot and does not implement a collation
procedure for any particular language using this script.

Space Characters. The traditional word separator is U+1361 ETHIOPIAN
WORDSPACE ( : ), but in modern usage a plain white wordspace is
becoming common. A separate character U+1360 ETHIOPIAN SPACE has been
provided for the latter usage, so that its (minimum) width may be set
equal to that of the traditional wordspace if desired.

Diacritical Marks. The Ethiopian script generally makes no use of
diacritical marks, but there are two circumstances in which such marks
have been applied. Marks are sometimes used for scholarly or didactic
purposes, in particular U+0308 COMBINING DIAERESIS or U+030E COMBINING
DOUBLE VERTICAL LINE ABOVE to indicate emphasis or gemination. In
such cases, marks from the General Diacritical Marks range U+0300 ->
U+036F should be used; as always, a diacritical mark follows the
letter to which it applies. The second case is that marks (often with
a bow-tie appearance) are sometimes used to create variant
letterforms, e.g. for using the Ethiopian script to spell other
languages.
This approach does not apply to the Unicode encoding, whose elements
are conceptual syllables rather than syllable glyphs. Extensions to
the script for additional syllables should be added at the end of the
syllable matrix.

Encoding Structure. The Unicode block for the Ethiopian script is
divided into the following ranges:

    U+1200 -> U+135F    Syllabic letters
    U+1360 -> U+1367    Punctuation
    U+1368 -> U+137B    Numbers
    U+137C -> U+137F    Currently unassigned

================================================================

CHARACTER NAMES LIST

@ Letters
1200 ETHIOPIAN LETTER HAE
1201 ETHIOPIAN LETTER HU
1202 ETHIOPIAN LETTER HI
1203 ETHIOPIAN LETTER HA
1204 ETHIOPIAN LETTER HY
1205 ETHIOPIAN LETTER HE
1206 ETHIOPIAN LETTER HO
1207
1208 ETHIOPIAN LETTER LAE
1209 ETHIOPIAN LETTER LU
120A ETHIOPIAN LETTER LI
120B ETHIOPIAN LETTER LA
120C ETHIOPIAN LETTER LY
120D ETHIOPIAN LETTER LE
120E ETHIOPIAN LETTER LO
120F ETHIOPIAN LETTER LWA
1210 ETHIOPIAN LETTER HHAE
1211 ETHIOPIAN LETTER HHU
1212 ETHIOPIAN LETTER HHI
1213 ETHIOPIAN LETTER HHA
1214 ETHIOPIAN LETTER HHY
1215 ETHIOPIAN LETTER HHE
1216 ETHIOPIAN LETTER HHO
1217 ETHIOPIAN LETTER HHWA
1218 ETHIOPIAN LETTER MAE
1219 ETHIOPIAN LETTER MU
121A ETHIOPIAN LETTER MI
121B ETHIOPIAN LETTER MA
121C ETHIOPIAN LETTER MY
121D ETHIOPIAN LETTER ME
121E ETHIOPIAN LETTER MO
121F ETHIOPIAN LETTER MWA
1220 ETHIOPIAN LETTER SZAE
1221 ETHIOPIAN LETTER SZU
1222 ETHIOPIAN LETTER SZI
1223 ETHIOPIAN LETTER SZA
1224 ETHIOPIAN LETTER SZY
1225 ETHIOPIAN LETTER SZE
1226 ETHIOPIAN LETTER SZO
1227 ETHIOPIAN LETTER SZWA
1228 ETHIOPIAN LETTER RAE
1229 ETHIOPIAN LETTER RU
122A ETHIOPIAN LETTER RI
122B ETHIOPIAN LETTER RA
122C ETHIOPIAN LETTER RY
122D ETHIOPIAN LETTER RE
122E ETHIOPIAN LETTER RO
122F ETHIOPIAN LETTER RWA
1230 ETHIOPIAN LETTER SAE
1231 ETHIOPIAN LETTER SU
1232 ETHIOPIAN LETTER SI
1233 ETHIOPIAN LETTER SA
1234 ETHIOPIAN LETTER SY
1235 ETHIOPIAN LETTER SE
1236 ETHIOPIAN LETTER SO
1237 ETHIOPIAN LETTER SWA
1238 ETHIOPIAN LETTER SHAE
1239 ETHIOPIAN LETTER SHU
123A ETHIOPIAN LETTER SHI
123B ETHIOPIAN LETTER SHA
123C ETHIOPIAN LETTER SHY
123D ETHIOPIAN LETTER SHE
123E ETHIOPIAN LETTER SHO
123F ETHIOPIAN LETTER SHWA
1240 ETHIOPIAN LETTER QAE
1241 ETHIOPIAN LETTER QU
1242 ETHIOPIAN LETTER QI
1243 ETHIOPIAN LETTER QA
1244 ETHIOPIAN LETTER QY
1245 ETHIOPIAN LETTER QE
1246 ETHIOPIAN LETTER QO
1247 ETHIOPIAN LETTER QWA
1248 ETHIOPIAN LETTER QHAE
        this series: Tigrigna
1249 ETHIOPIAN LETTER QHU
124A ETHIOPIAN LETTER QHI
124B ETHIOPIAN LETTER QHA
124C ETHIOPIAN LETTER QHY
124D ETHIOPIAN LETTER QHE
124E ETHIOPIAN LETTER QHO
124F ETHIOPIAN LETTER QHWA
1250 ETHIOPIAN LETTER BAE
1251 ETHIOPIAN LETTER BU
1252 ETHIOPIAN LETTER BI
1253 ETHIOPIAN LETTER BA
1254 ETHIOPIAN LETTER BY
1255 ETHIOPIAN LETTER BE
1256 ETHIOPIAN LETTER BO
1257 ETHIOPIAN LETTER BWA
1258 ETHIOPIAN LETTER VAE
1259 ETHIOPIAN LETTER VU
125A ETHIOPIAN LETTER VI
125B ETHIOPIAN LETTER VA
125C ETHIOPIAN LETTER VY
125D ETHIOPIAN LETTER VE
125E ETHIOPIAN LETTER VO
125F ETHIOPIAN LETTER VWA
1260 ETHIOPIAN LETTER TAE
1261 ETHIOPIAN LETTER TU
1262 ETHIOPIAN LETTER TI
1263 ETHIOPIAN LETTER TA
1264 ETHIOPIAN LETTER TY
1265 ETHIOPIAN LETTER TE
1266 ETHIOPIAN LETTER TO
1267 ETHIOPIAN LETTER TWA
1268 ETHIOPIAN LETTER CAE
1269 ETHIOPIAN LETTER CU
126A ETHIOPIAN LETTER CI
126B ETHIOPIAN LETTER CA
126C ETHIOPIAN LETTER CY
126D ETHIOPIAN LETTER CE
126E ETHIOPIAN LETTER CO
126F ETHIOPIAN LETTER CWA
1270 ETHIOPIAN LETTER HXAE
1271 ETHIOPIAN LETTER HXU
1272 ETHIOPIAN LETTER HXI
1273 ETHIOPIAN LETTER HXA
1274 ETHIOPIAN LETTER HXY
1275 ETHIOPIAN LETTER HXE
1276 ETHIOPIAN LETTER HXO
1277 ETHIOPIAN LETTER HXWA
1278 ETHIOPIAN LETTER NAE
1279 ETHIOPIAN LETTER NU
127A ETHIOPIAN LETTER NI
127B ETHIOPIAN LETTER NA
127C ETHIOPIAN LETTER NY
127D ETHIOPIAN LETTER NE
127E ETHIOPIAN LETTER NO
127F ETHIOPIAN LETTER NWA
1280 ETHIOPIAN LETTER NYAE
1281 ETHIOPIAN LETTER NYU
1282 ETHIOPIAN LETTER NYI
1283 ETHIOPIAN LETTER NYA
1284 ETHIOPIAN LETTER NYY
1285 ETHIOPIAN LETTER NYE
1286 ETHIOPIAN LETTER NYO
1287 ETHIOPIAN LETTER NYWA
1288 ETHIOPIAN LETTER AAE
1289 ETHIOPIAN LETTER AU
128A ETHIOPIAN LETTER AI
128B ETHIOPIAN LETTER AA
128C ETHIOPIAN LETTER AY
128D ETHIOPIAN LETTER AE
128E ETHIOPIAN LETTER AO
128F ETHIOPIAN LETTER AWA
1290 ETHIOPIAN LETTER KAE
1291 ETHIOPIAN LETTER KU
1292 ETHIOPIAN LETTER KI
1293 ETHIOPIAN LETTER KA
1294 ETHIOPIAN LETTER KY
1295 ETHIOPIAN LETTER KE
1296 ETHIOPIAN LETTER KO
1297 ETHIOPIAN LETTER KWA
1298 ETHIOPIAN LETTER KXAE
1299 ETHIOPIAN LETTER KXU
129A ETHIOPIAN LETTER KXI
129B ETHIOPIAN LETTER KXA
129C ETHIOPIAN LETTER KXY
129D ETHIOPIAN LETTER KXE
129E ETHIOPIAN LETTER KXO
129F ETHIOPIAN LETTER KXWA
12A0 ETHIOPIAN LETTER WAE
12A1 ETHIOPIAN LETTER WU
12A2 ETHIOPIAN LETTER WI
12A3 ETHIOPIAN LETTER WA
12A4 ETHIOPIAN LETTER WY
12A5 ETHIOPIAN LETTER WE
12A6 ETHIOPIAN LETTER WO
12A7
12A8 ETHIOPIAN LETTER OAE
12A9 ETHIOPIAN LETTER OU
12AA ETHIOPIAN LETTER OI
12AB ETHIOPIAN LETTER OA
12AC ETHIOPIAN LETTER OY
12AD ETHIOPIAN LETTER OE
12AE ETHIOPIAN LETTER OO
12AF
12B0 ETHIOPIAN LETTER ZAE
12B1 ETHIOPIAN LETTER ZU
12B2 ETHIOPIAN LETTER ZI
12B3 ETHIOPIAN LETTER ZA
12B4 ETHIOPIAN LETTER ZY
12B5 ETHIOPIAN LETTER ZE
12B6 ETHIOPIAN LETTER ZO
12B7 ETHIOPIAN LETTER ZWA
12B8 ETHIOPIAN LETTER ZHAE
12B9 ETHIOPIAN LETTER ZHU
12BA ETHIOPIAN LETTER ZHI
12BB ETHIOPIAN LETTER ZHA
12BC ETHIOPIAN LETTER ZHY
12BD ETHIOPIAN LETTER ZHE
12BE ETHIOPIAN LETTER ZHO
12BF ETHIOPIAN LETTER ZHWA
12C0 ETHIOPIAN LETTER YAE
12C1 ETHIOPIAN LETTER YU
12C2 ETHIOPIAN LETTER YI
12C3 ETHIOPIAN LETTER YA
12C4 ETHIOPIAN LETTER YY
12C5 ETHIOPIAN LETTER YE
12C6 ETHIOPIAN LETTER YO
12C7
12C8 ETHIOPIAN LETTER DAE
12C9 ETHIOPIAN LETTER DU
12CA ETHIOPIAN LETTER DI
12CB ETHIOPIAN LETTER DA
12CC ETHIOPIAN LETTER DY
12CD ETHIOPIAN LETTER DE
12CE ETHIOPIAN LETTER DO
12CF ETHIOPIAN LETTER DWA
12D0 ETHIOPIAN LETTER DDAE
        this series: Oromiffa -- note: variant glyphs exist for this series
12D1 ETHIOPIAN LETTER DDU
12D2 ETHIOPIAN LETTER DDI
12D3 ETHIOPIAN LETTER DDA
12D4 ETHIOPIAN LETTER DDY
12D5 ETHIOPIAN LETTER DDE
12D6 ETHIOPIAN LETTER DDO
12D7 ETHIOPIAN LETTER DDWA
12D8 ETHIOPIAN LETTER JAE
12D9 ETHIOPIAN LETTER JU
12DA ETHIOPIAN LETTER JI
12DB ETHIOPIAN LETTER JA
12DC ETHIOPIAN LETTER JY
12DD ETHIOPIAN LETTER JE
12DE ETHIOPIAN LETTER JO
12DF ETHIOPIAN LETTER JWA
12E0 ETHIOPIAN LETTER GAE
12E1 ETHIOPIAN LETTER GU
12E2 ETHIOPIAN LETTER GI
12E3 ETHIOPIAN LETTER GA
12E4 ETHIOPIAN LETTER GY
12E5 ETHIOPIAN LETTER GE
12E6 ETHIOPIAN LETTER GO
12E7 ETHIOPIAN LETTER GWA
12E8 ETHIOPIAN LETTER THAE
12E9 ETHIOPIAN LETTER THU
12EA ETHIOPIAN LETTER THI
12EB ETHIOPIAN LETTER THA
12EC ETHIOPIAN LETTER THY
12ED ETHIOPIAN LETTER THE
12EE ETHIOPIAN LETTER THO
12EF ETHIOPIAN LETTER THWA
12F0 ETHIOPIAN LETTER CHAE
12F1 ETHIOPIAN LETTER CHU
12F2 ETHIOPIAN LETTER CHI
12F3 ETHIOPIAN LETTER CHA
12F4 ETHIOPIAN LETTER CHY
12F5 ETHIOPIAN LETTER CHE
12F6 ETHIOPIAN LETTER CHO
12F7 ETHIOPIAN LETTER CHWA
12F8 ETHIOPIAN LETTER PHAE
12F9 ETHIOPIAN LETTER PHU
12FA ETHIOPIAN LETTER PHI
12FB ETHIOPIAN LETTER PHA
12FC ETHIOPIAN LETTER PHY
12FD ETHIOPIAN LETTER PHE
12FE ETHIOPIAN LETTER PHO
12FF ETHIOPIAN LETTER PHWA
1300 ETHIOPIAN LETTER TSAE
1301 ETHIOPIAN LETTER TSU
1302 ETHIOPIAN LETTER TSI
1303 ETHIOPIAN LETTER TSA
1304 ETHIOPIAN LETTER TSY
1305 ETHIOPIAN LETTER TSE
1306 ETHIOPIAN LETTER TSO
1307 ETHIOPIAN LETTER TSWA
1308 ETHIOPIAN LETTER TZAE
1309 ETHIOPIAN LETTER TZU
130A ETHIOPIAN LETTER TZI
130B ETHIOPIAN LETTER TZA
130C ETHIOPIAN LETTER TZY
130D ETHIOPIAN LETTER TZE
130E ETHIOPIAN LETTER TZO
130F
1310 ETHIOPIAN LETTER FAE
1311 ETHIOPIAN LETTER FU
1312 ETHIOPIAN LETTER FI
1313 ETHIOPIAN LETTER FA
1314 ETHIOPIAN LETTER FY
1315 ETHIOPIAN LETTER FE
1316 ETHIOPIAN LETTER FO
1317 ETHIOPIAN LETTER FWA
1318 ETHIOPIAN LETTER PAE
1319 ETHIOPIAN LETTER PU
131A ETHIOPIAN LETTER PI
131B ETHIOPIAN LETTER PA
131C ETHIOPIAN LETTER PY
131D ETHIOPIAN LETTER PE
131E ETHIOPIAN LETTER PO
131F ETHIOPIAN LETTER PWA
1320 ETHIOPIAN LETTER QWAE
1321
1322 ETHIOPIAN LETTER QWI
1323
1324 ETHIOPIAN LETTER QWY
1325 ETHIOPIAN LETTER QWE
1326
1327
1328 ETHIOPIAN LETTER QHWAE
        this series: Tigrigna
1329
132A ETHIOPIAN LETTER QHWI
132B
132C ETHIOPIAN LETTER QHWY
132D ETHIOPIAN LETTER QHWE
132E
132F
1330 ETHIOPIAN LETTER HXWAE
1331
1332 ETHIOPIAN LETTER HXWI
1333
1334 ETHIOPIAN LETTER HXWY
1335 ETHIOPIAN LETTER HXWE
1336
1337
1338 ETHIOPIAN LETTER KWAE
1339
133A ETHIOPIAN LETTER KWI
133B
133C ETHIOPIAN LETTER KWY
133D ETHIOPIAN LETTER KWE
133E
133F
1340 ETHIOPIAN LETTER GWAE
1341
1342 ETHIOPIAN LETTER GWI
1343
1344 ETHIOPIAN LETTER GWY
1345 ETHIOPIAN LETTER GWE
1346
1347
1348
1349
134A
134B
134C
134D
134E
134F
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
135A
135B
135C
135D
135E
135F

@ Punctuation
1360 ETHIOPIAN SPACE
1361 ETHIOPIAN WORDSPACE
1362 ETHIOPIAN PERIOD
1363 ETHIOPIAN COMMA
        modern usage
1364 ETHIOPIAN SEMICOLON
        pre-modern usage like colon
1365 ETHIOPIAN COLON
        pre-modern usage like comma
1366 ETHIOPIAN QUESTION MARK
        archaic
1367 ETHIOPIAN PARAGRAPH SEPARATOR
        archaic

@ Numbers
1368 ETHIOPIAN DIGIT ONE
1369 ETHIOPIAN DIGIT TWO
136A ETHIOPIAN DIGIT THREE
136B ETHIOPIAN DIGIT FOUR
136C ETHIOPIAN DIGIT FIVE
136D ETHIOPIAN DIGIT SIX
136E ETHIOPIAN DIGIT SEVEN
136F ETHIOPIAN DIGIT EIGHT
1370 ETHIOPIAN DIGIT NINE
1371 ETHIOPIAN NUMBER TEN
1372 ETHIOPIAN NUMBER TWENTY
1373 ETHIOPIAN NUMBER THIRTY
1374 ETHIOPIAN NUMBER FORTY
1375 ETHIOPIAN NUMBER FIFTY
1376 ETHIOPIAN NUMBER SIXTY
1377 ETHIOPIAN NUMBER SEVENTY
1378 ETHIOPIAN NUMBER EIGHTY
1379 ETHIOPIAN NUMBER NINETY
137A ETHIOPIAN NUMBER HUNDRED
137B ETHIOPIAN NUMBER TEN THOUSAND
137C
137D
137E
137F

================================================================
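
IMPLEMENTATION SKETCH

The block introduction describes the range U+1200 -> U+135F as a
matrix of 44 consonants crossed with 8 vowels, suggests that keyboard
input codes be treated as coordinates into that matrix, and lists ten
cells for which another cell is preferred. The Python sketch below
illustrates that arithmetic. It is not part of the proposal: the names
encode, decode, prefer, VOWELS and PREFERRED are hypothetical, and the
code point values follow the layout of this draft only, so they should
not be assumed to hold for any later revision.

```python
from typing import Tuple

# Hypothetical constants describing the syllable matrix of this draft.
BASE = 0x1200          # first cell of the matrix
NUM_CONSONANTS = 44    # rows
NUM_VOWELS = 8         # columns

# Vowel-order suffixes as they appear in the character names list.
VOWELS = ["AE", "U", "I", "A", "Y", "E", "O", "WA"]


def encode(consonant: int, vowel: int) -> int:
    """Resolve (consonant row, vowel column) coordinates to a code point."""
    if not (0 <= consonant < NUM_CONSONANTS and 0 <= vowel < NUM_VOWELS):
        raise ValueError("coordinates outside the 44 x 8 syllable matrix")
    return BASE + consonant * NUM_VOWELS + vowel


def decode(code_point: int) -> Tuple[int, int]:
    """Recover (consonant row, vowel column) from a syllable code point."""
    offset = code_point - BASE
    if not (0 <= offset < NUM_CONSONANTS * NUM_VOWELS):
        raise ValueError("code point outside U+1200 -> U+135F")
    return divmod(offset, NUM_VOWELS)


# The ten non-preferred cells named in the Labialized Subseries section,
# each mapped to the cell the proposal suggests using instead.
PREFERRED = {
    0x1323: 0x1247, 0x132B: 0x124F, 0x1333: 0x1277,
    0x133B: 0x1297, 0x1343: 0x12E7,   # "QW + A" -> "Q + WA" style cells
    0x1321: 0x1325, 0x1329: 0x132D, 0x1331: 0x1335,
    0x1339: 0x133D, 0x1341: 0x1345,   # "-U" -> "-E" within the subseries
}


def prefer(code_point: int) -> int:
    """Return the preferred cell, or the input unchanged if none is listed."""
    return PREFERRED.get(code_point, code_point)
```

For instance, encode(8, 7) gives 0x1247, which the names list calls
ETHIOPIAN LETTER QWA, and prefer(0x1323) returns that same value. A
keyboard driver along these lines would call encode on a pair of
transient key codes and store only the resulting syllable.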