The System for Ethiopic Representation in ASCII

-1995 Standard-


Changes for 1995

Changes to the System for Ethiopic Representation in ASCII, or SERA, are introduced on August 1, 1995 to resolve issues of ease of use, to accommodate alternate preferences, and to facilitate future applications.

By category, additions to the convention are given for: multilingualism, WWW hypertext, gemination, the previously unaddressed Ethiopic question mark found in historic documents, and a mode for punctuation that maintains the appearance of punctuation in its native form.

Revisions are also made to the rules of punctuation use for the purpose of simplification.

It is hoped that these changes for 1995 will make SERA more accessible for software implementation and easier to use by Fidel composers working in ASCII. A discussion of these additions and revisions follows. A FAQ is now available to discuss and answer questions about SERA comprehensively.


Multilingual Extensions

SERA was initially designed as a bilingual system for Ethiopic and one other script. This presents obvious limitations in the real world, so a mechanism and a convention are required to accommodate multilingual text. A minor layer of abstraction added to the existing escape sequence, already defined for graphic purposes, solves both requirements.

When the writer wants text to be transcribed later into another script, the target script must be identified to the transcriber in some way. Following the convention devised for HTML 3.0 for the same purpose, SERA parsers will use the ISO 2-character and 3-character language tags following the extended escape ``\~''.

It is assumed that a document will be written primarily in two languages, which may be written in one or two scripts. The regular script escape, ``\'', always serves the two primary languages in the document. After switching to a third language, ``\'' will indicate a return to the last of the two major modes.

For example:

 1) ..latin.. \ ..fidel.. \~ar ..arabic.. \~el ..greek.. \ ..fidel..
 2) ..latin.. \ ..fidel.. \~ar ..arabic.. \~ar ..fidel..
*1) Hence ``\'' will always mean ``return to last'' of the 2 defaulted scripts.
*2) If an escape is found that would change the current language to the current language, the 2nd escape should mark the end-of-zone and function as per ``\'' in (1).
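
The zone switching shown in examples (1) and (2) can be sketched in Python. This is an illustrative sketch only and not part of the standard: the function name `split_zones`, the zone labels `latin` and `fidel`, and the assumption that a bare ``\'' used as a script escape is followed by whitespace (so that punctuation escapes are left alone) are all hypothetical.

```python
import re

# Assumption: a language switch is "\~" plus a 2 or 3 letter tag, and a
# script escape is a bare "\" followed by whitespace or the end of the text.
_ESCAPE = re.compile(r"\\~([A-Za-z]{2,3})(?![A-Za-z])|\\(?=\s|$)")


def split_zones(text, primary=("latin", "fidel")):
    """Split text into (zone, content) pairs (hypothetical helper)."""
    current = primary[0]       # documents are assumed to start in the first primary script
    last_primary = primary[0]  # the primary script most recently in use
    zones = []
    pos = 0
    for m in _ESCAPE.finditer(text):
        chunk = text[pos:m.start()].strip()
        if chunk:
            zones.append((current, chunk))
        pos = m.end()
        tag = m.group(1)
        if tag is not None and tag != current:
            # "\~xx": enter a third-language zone.
            current = tag
        elif tag is None and current in primary:
            # "\" inside a primary zone: toggle between the two primary scripts.
            current = primary[1] if current == primary[0] else primary[0]
            last_primary = current
        else:
            # "\" inside a third-language zone, or a repeated "\~xx":
            # return to the last primary script, as in notes *1 and *2.
            current = last_primary
    tail = text[pos:].strip()
    if tail:
        zones.append((current, tail))
    return zones
```

Applied to example (2), the repeated ``\~ar'' closes the Arabic zone and the final segment is labelled `fidel`.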


Revision of Vowels

The multilingual capability added to SERA allows for the correction of an earlier inconsistency in the representation of the vowel forms of the first vowel group. The revision is as follows:

Independent Vowels:

e/a* u/U i a/A E I o/O e3
* The use of ``a'' for the first form independent vowel will only be applied when transcribing an Amharic document (``e'' remains valid as well). The alternative definition of ``A'' for ``the 4th form independent vowel'' will then be the only means in Amharic text to write the fourth form vowel.


World Wide Web and HTML

Experience gained with Mule and the W3 web browser in '94-'95 indicates that the parsing issues should be clarified to avoid malformed hypertext documents.

When a SERA->Fidel transliterator precedes the text processing of an HTML parser, let it be the convention that text between `<' and `>' remains in Latin script. Let the rule also apply between the HTML escape delimiters `&' and `;'. This requires that the SERA transliterator be aware that it is processing an HTML (likewise SGML, VRML, etc.) document; the implementation of this is left to the developer.
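
One possible way to honour this convention is sketched below in Python. It is a minimal sketch under stated assumptions, not a prescribed implementation: the name `transliterate_html` is hypothetical, the `transliterate` argument stands in for whatever SERA->Fidel routine the developer supplies, and the regular expression is a simplification that does not handle comments or `>' inside attribute values.

```python
import re

# Spans that must stay in Latin script: tags "<...>" and escapes "&...;".
_PROTECTED = re.compile(r"(<[^>]*>|&[^;\s]*;)")


def transliterate_html(text, transliterate):
    """Apply `transliterate` only outside HTML tags and escapes."""
    parts = _PROTECTED.split(text)
    # With one capturing group, re.split places the protected spans at the
    # odd indices and the ordinary text at the even indices.
    return "".join(
        part if index % 2 else transliterate(part)
        for index, part in enumerate(parts)
    )
```

For instance, passing `str.upper` as a stand-in transliterator leaves `<b>` and `&amp;` untouched while changing the surrounding text.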


Punctuation Revision

Comprehensive rules for punctuation and escapes are provided in questions 3 and 5 of the SERA FAQ.

In Ethiopic Zones


   ' is always ignored (current).

   ` is ignored unless a special vowel or consonant follows
         -- s,S,h, e,u/U,i,a/A,E,I,o/O

  \' and \` are the only way to give ' and ` in Ethiopic zones.


Drop | as a special character; its purpose may be accomplished
sufficiently with ' and `'.  The cost of using 2 characters instead
of one is minor considering the exceedingly rare use of | for vowel
emphasis.  The simplification benefits also justify this approach.

Examples:

   ysTlN   = y'sT'l'N
   tgrNa   = tg'r'Na
   alfelgm = alfel'g'm
   TrE     = T`'rE
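
The rules for ' and ` in Ethiopic zones can be checked against the examples with a short Python sketch. It is illustrative only: the name `drop_ignored_marks` is hypothetical, the sketch simply removes the marks that the rules call "ignored", and it does not treat '' as a gemination marker.

```python
import re

# Matches, in order of preference:
#   \' or \`   an escaped mark, which must be kept
#   '          always ignored
#   `          ignored unless one of s,S,h,e,u,U,i,a,A,E,I,o,O follows
_MARK = re.compile(r"\\['`]|'|`(?![sSheuUiaAEIoO])")


def drop_ignored_marks(text):
    """Remove ignored ' and ` marks, keeping the escapes \\' and \\`."""
    def keep_escapes(match):
        token = match.group(0)
        return token if token.startswith("\\") else ""
    return _MARK.sub(keep_escapes, text)
```

Under these assumptions each right-hand form in the examples reduces to its left-hand form, for instance `T`'rE` to `TrE`.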

Merits of Punctuation Revision

1) SERA now uses only ' and ` as special characters (in addition to \ ).
2) Rules for ` and ' , \` and \' are more consistent.
3) Fewer rules overall to remember.

Addition for Gemination

Let '' be the Ethiopic gemination marker for SERA. Humans may interpret the markers without complication. Software can use the '' markers to place the Ethiopic gemination mark (..) over the preceding character:

   English    SERA       Software Doubling
                             ..
   alle       ale''      (a)(le)
                              ..
   yellem     yele''m    (ye)(le)(m)
                              ..
   fellg      fel''g     (fe)(l)(g)
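
A first step for software is locating the '' markers. The Python sketch below is a hedged illustration, not a full transliterator: the name `extract_gemination` is hypothetical, and because splitting into Fidel syllables needs the complete SERA table, the sketch only reports the ASCII offset at which each marker applies.

```python
def extract_gemination(sera):
    """Return (text without '' markers, offsets where a marker applied).

    Each offset is the length of the marker-free text that precedes the
    marker, so the geminated character is the one ending at that offset.
    """
    plain = []
    marks = []
    i = 0
    while i < len(sera):
        if sera.startswith("''", i):
            marks.append(len(plain))  # marker applies to what came just before
            i += 2
        else:
            plain.append(sera[i])
            i += 1
    return "".join(plain), marks
```

For `yele''m` this yields the text `yelem` with a single mark at offset 4, that is, on the syllable ending in `le`.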

Archaic Question Mark

Previously untreated. Let \? denote the three-dot question mark found infrequently in some historical texts. The extension for the question mark is provided primarily to facilitate the republishing of those documents where it is found.

A Mode For ASCII-Glyph Based Composition of Ethiopic Punctuation

Ethiopic Punctuation Subzone \::

   wordspace        :  :
   halfcolon        :  :-
   colon            :  -:
   period           :  ::
   quotation        :  << >>
   paragraph break  :  :|:

Rules: [ :: is Latin, (::) is Ge'ez ]

   ::     is always read as period (::)
   :::    is read as (::)(:)     -period takes precedence
   ::::   is read as (::)(::)
   n:     = n/2 (::)                 if n even
          = (n-1)/2 (::) + (:)       if n odd
   :':    is read as (:)(:)      This utilizes the existing separator ', likewise for ` .
   <<<    is read as (<<)<       (follow n: rules)
   <'<    is read as <<

Note that the Ethiopic punctuation escapes (\< \_ \* etc.) are still recognized in the glyph based punctuation subzones.

Wordspace Buffering

In present day practice the Ge'ez wordspace is padded on each side with small amounts of white space for ease of reading:

[Image of Passage]

Were text transcribed literally into ASCII, the result would appear as a continuous string:

   yalewna:yeneberew:yemimeTawum:hulu:yemigeza:gEta: amlak:-alfana:omEga:InE:neN:ylal::

To maintain the original ease of reading, the ASCII colons, when used for wordspace, should be padded on each side with an ASCII space:

   yalewna : yeneberew : yemimeTawum : hulu : yemigeza : gEta: amlak:- alfana : omEga : InE : neN : ylal::

To return to the original form in Fidel, the spaces used as padding should then be removed. Hence it shall be the convention that the first ASCII space immediately following punctuation in glyph based punctuation subzones of Fidel script zones shall be removed. Further, the first space preceding a Ge'ez wordspace shall also be removed.
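
The n: rule and the padding convention can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than a reference implementation: the names `read_colon_run`, `read_punctuation` and `unpad` are hypothetical, `read_punctuation` only looks at plain runs of colons (it ignores :-, -:, :|: and the < rules), and `unpad` assumes the whole input lies inside a glyph based punctuation subzone.

```python
import re


def read_colon_run(n):
    """Read n consecutive colons: periods take precedence over wordspace."""
    return ["(::)"] * (n // 2) + (["(:)"] if n % 2 else [])


def read_punctuation(text):
    """Read each run of colons; a separator such as ' ends a run."""
    tokens = []
    for run in re.findall(r":+", text):
        tokens.extend(read_colon_run(len(run)))
    return tokens


# Glyph based punctuation, longest alternatives first.
_PUNCT = r":\|:|::|:-|-:|<<|>>|:"


def unpad(text):
    """Remove the padding spaces around glyph based punctuation."""
    # The first space immediately following any punctuation is removed.
    text = re.sub(r"(%s) " % _PUNCT, r"\1", text)
    # The first space preceding a lone colon (the wordspace) is removed.
    return re.sub(r" :(?![:|-])", ":", text)
```

With these definitions `read_punctuation(":::")` gives one period followed by one wordspace, and `unpad("InE : neN : ylal::")` gives `InE:neN:ylal::`.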