Section 1/4

Notes from the Editor

###Ethio Science & Technology### is a weekly column presented to the Cleo community. It covers various issues in science, in particular those more closely related to Ethiopia.

Last week, I contacted Glenn Adams, a technical director at the Unicode Consortium, and received the latest proposal on Ethiopic script. In his words:

"...Unicode has been working with a number of groups on this subject, such as the School of Oriental and African Studies at Univ. of London; however, I'm sure your group's input would be very valuable."

Today's column is dedicated to the Unicode draft proposal on Ethiopic script. It examines the proposal briefly, stressing its drawbacks. Hopefully, this will open a wide-ranging discussion here at Cleo. Those of you who are knowledgeable, and in particular those involved in software design, have a historical responsibility in shaping what is at stake.

Some important terms:

ISO: The International Organization for Standardization

ISO-10646: An official ISO standard for the written form of all languages.

Ethiopic: The Ge'ez alphabet used by Ethiopian languages such as Amharic, Oromigna, Tigrigna, and others.

symbol: A figure that represents a sound. Examples: "a", "b", and "c".

code: A unique number assigned to a symbol (a character), written in DECIMAL, OCTAL, or HEXADECIMAL form.

ASCII: American Standard Code for Information Interchange.

Unicode: The Unicode Consortium.
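
As a small illustration of the term "code" defined above, the sketch below prints the code of three symbols in decimal, octal, and hexadecimal. It uses the familiar ASCII codes of "a", "b", and "c" only as an example; the helper name code_forms is made up for this illustration.

```python
def code_forms(symbol):
    """Return the code of one symbol as decimal, octal, and hexadecimal text."""
    code = ord(symbol)  # the unique number assigned to the symbol
    return str(code), format(code, "o"), format(code, "X")


for symbol in "abc":
    decimal, octal, hexadecimal = code_forms(symbol)
    print(symbol, decimal, octal, hexadecimal)
```

The three forms are only different ways of writing the same number; the symbol itself has a single code.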

Section 2/4

Is Unicode Proposal on Ethiopic Script Acceptable?

Current trend

Yitna's Ethiopic Add On for WP, Amha's word processor, and my EthTeX use the same principle in encoding Ethiopic script. All designate a unique code for each letter or glyph, unlike the mechanism used for Ethiopian typewriters. The current de facto standard, ASCII with its 8-bit extensions, doesn't allow more than 256 characters in one set; therefore, one has to resort to different tricks to overcome the limit. Surprisingly, all the above systems implement the same methodology in addressing the problem.
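
To see why the 256-character limit matters, here is a rough count, offered only as a sketch. It assumes every underlying letter has exactly seven forms (ge'ez through sabe) and uses the counts of 33 and 37 underlying letters quoted later in this column. It leaves out numbers and punctuation, so the real requirement is larger. The function names are made up for this illustration.

```python
SLOTS_PER_SET = 256     # most codes a single 8-bit character set can hold
FORMS_PER_LETTER = 7    # assumption: one glyph per order, ge'ez through sabe


def glyphs_needed(underlying_letters):
    """Glyphs required when every form of every letter gets its own code."""
    return underlying_letters * FORMS_PER_LETTER


def sets_needed(glyph_count):
    """Number of 256-slot sets required to hold glyph_count glyphs."""
    return -(-glyph_count // SLOTS_PER_SET)  # ceiling division


for label, letters in (("33 underlying letters", 33), ("37 underlying letters", 37)):
    glyphs = glyphs_needed(letters)
    print(label, glyphs, sets_needed(glyphs))
```

Under these assumptions, 33 letters need 231 glyphs and barely fit in one set, while 37 letters need 259 glyphs and already spill into a second set.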

Ethiopian Typewriters

Typewriters are mechanical, meaning they depend on physically engraved ``type'' (symbols) and one-to-one impact. For example, when you press "b" on your typewriter, the key triggers the nearby mechanism so that the engraved "b" is pressed against the ribbon, forming the shape of "b" on the paper beneath the ribbon. Unlike computer dot-matrix or laser printers, typewriters are not capable of producing graphic elements other than horizontal and vertical lines. With the proper system, computer, and printer, by contrast, you can display a spaceship on screen, or print it, with a single keystroke.

Under such physical limitations, Ethiopic script had to be accommodated to the typewriter, and the quality leaves much to be desired. As you may remember, typing the Ethiopic letter "ca" requires the following procedure:

First, you type "b", producing:


    | |
    | |
    | |

Then you press "shift" and type the "bar sign", at which point the carriage moves back one letter, making the letter just entered the current position, and producing the modified symbol shown below.


    | |
    | |
    | |

This example demonstrates the basic principle of how the Ethiopian typewriters work.
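
The overstrike principle can be sketched in a few lines. This is only an illustration: the two shapes below are hypothetical stand-ins drawn with ASCII characters, not real typewriter type, and the function name overstrike is made up.

```python
def overstrike(first, second):
    """Overlay two strikes made at the same carriage position.

    Ink from either strike stays on the paper, so a cell is blank only
    if both glyphs leave it blank.
    """
    return [
        "".join(b if b != " " else a for a, b in zip(row_a, row_b))
        for row_a, row_b in zip(first, second)
    ]


# Hypothetical shapes for illustration only, not real typewriter type.
BASE = ["   ", "| |", "| |", "| |"]   # first key: the base letter
MARK = ["___", "   ", "   ", "   "]   # second key: a bar struck over it

for row in overstrike(BASE, MARK):
    print(row)
```

The point is that the final letter never exists as a single piece of type; it is assembled on the paper from two separate strikes.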

Drawback of the Unicode Proposal on Ethiopic Script

The latest Unicode proposal designates 128 positions for Ethiopic script, from 1200 to 127F (hexadecimal). It recognizes 37 of what it calls underlying letters; 13 Ethiopian vowels, which together with the underlying letters would constitute the entire Ethiopic script; 20 Ethiopic numbers; and 7 punctuation marks. That is it. The proposal presents this encoding principle on the assumption that a would-be software implements a mechanism like the one used by Ethiopian typewriters. Thus, to produce the Ethiopic letter "m", two keys must be pressed, one being the underlying letter and the other the vowel. In general, any Ethiopic letter is generated using two keys; as a result, all syllables excluding the ``Ge'ez bate'' are constructed on the fly from two symbols, as is done on Ethiopian typewriters. Wowwww!!
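
The figures in the proposal can be checked with a little arithmetic. The sketch below only restates the numbers quoted above; the variable names are made up for this illustration.

```python
FIRST, LAST = 0x1200, 0x127F      # range quoted from the proposal
slots = LAST - FIRST + 1          # positions available in the range

PROPOSED = {
    "underlying letters": 37,
    "Ethiopian vowels": 13,
    "Ethiopic numbers": 20,
    "punctuation marks": 7,
}
used = sum(PROPOSED.values())     # positions the proposal actually fills

print("positions available:", slots)
print("positions used:", used)
print("positions left over:", slots - used)
```

By this count the proposal fills 77 of the 128 positions, which is possible only because syllables are not given codes of their own.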

The idea would have been revolutionary, were it not for the discrepancies it contains and its disregard for the preservation of the beauty, form, position, and authenticity of Ethiopic script.

(1) In my view, the proposal is not complete because it doesn't fully demonstrate how its ``analytical'' or ``phonetic'' encoding would produce Ethiopic script that is acceptable in typography. The vowels presented by the proposal don't permit the construction of a great many letters in Ethiopic script.

(2) About 90% of the "sadis bate", "sabe bate", and "rabe bate", and 40% of the "hamis bate", along with some others, must lose their current appearance in order to adhere to the proposal. From my own experience in designing Ethiopic script with the font design software METAFONT, each letter must be designed independently to preserve its appearance. Yes, there are a number of letters that can be constructed from two symbols using ligatures, but attempting to construct the entire font with this method takes us back to the typewriter world, which we have been trying to get rid of.

Imagine the letters of "m bate":

    ge'ez   qaibe   salise   rabe   hamiss   sadiss   sabe
    -----   -----   ------   ----   ------   ------   ----
    m       mu      mi       ma     mea      me       mo

According to the proposal, "m" is the underlying letter, and the rest must come from combining the underlying letter with an Ethiopian vowel. In other words, "me" must be constructed from "m" and a particular "vowel", which I couldn't find on the given chart.

(3) What more is there? As indicated above, the proposal requires two keys to generate one Ethiopic letter. This applies not just to displaying Ethiopic on screen, but also to storing it in a file. In short, a letter is represented by two 16-bit codes (4 bytes). Compared with storing one 16-bit code per letter, this will increase the size of a file by almost 100 percent. Here is how to generate "h":

    you type                 key code
    ============             ========
    consonant "h"            1200  \
                                    |- 4 bytes
    Ethiopian vowel " "      1230  /
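
The storage cost can be sketched as follows. The two code values are the ones shown in the diagram above. The big-endian byte order, the function name encode, and the idea of a single code standing for the whole letter are assumptions made only for comparison.

```python
import struct


def encode(codes):
    """Pack a sequence of 16-bit codes, two bytes each, high byte first."""
    return struct.pack(">%dH" % len(codes), *codes)


# Codes from the diagram: the consonant "h" followed by the vowel.
composed = encode([0x1200, 0x1230])

# Hypothetical alternative: one 16-bit code standing for the whole letter.
single = encode([0x1200])

growth = 100 * (len(composed) - len(single)) // len(single)
print(len(composed), "bytes versus", len(single), "bytes:", growth, "percent larger")
```

A text made up entirely of such two-code letters would therefore be twice the size. Underlying letters stored alone, numbers, and punctuation still take one code each, which is why the overall increase is "almost" rather than exactly 100 percent.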

Remember, this is not the same as pressing "shift" and typing a character to generate an uppercase letter. When you press "shift", the computer switches to uppercase mode, leaving lowercase mode, and vice versa; the effect of the "shift" key therefore fades right after it is released, and the resulting letter is still stored as a single code.

(4) A tradition has evolved over centuries: whenever a language adopts Ethiopic for writing, new letters are invented to fit the phonetic behavior of that particular language. For instance, Ge'ez uses only 26 underlying letters, but Amharic uses 33. Now the number of Ethiopic underlying letters has reached 37(?), and God knows when this will end. This tradition must be curbed, and "accents" used instead to reflect various phonetic behaviors. What do you think, Yitna? Am I stepping over the line?

Section 3/4

What is Next?

I don't claim or pretend to be a linguist, because I am not. Nevertheless, I felt I must voice my view before the vote is counted. Unicode and ISO must be commended for their effort to bring Ethiopic into an international standard.

Forgive my bluntness, but we Ethiopians are not going to sit on the sidelines while the international community is doing what it can to promote our script. Even a tiny contribution in this regard counts for a "million." I hereby ask the Cleo community at large:

1. to accept Grum's suggestion to form a "Committee for Ethiopic Standard Code." Having said that:

a. If you would like to be a member of the committee, please send me a note for now, until some sort of structure is in place.

b. I also take the liberty of inviting Grum, Fesseha, Samuel, Teshager, Araya, and myself to become members of the committee.

c. I recommend Yitna as chairperson of the committee.

2. to help collect documents and if necessary to assist financially.

3. to help disseminate the information to all concerned Ethiopians who don't belong to Cleo.

Don't I sound like Yeltsin of Russia?



Abass Belay Alamnehe /