THE ETHIOPIAN MULTIMEDIA PC

Fisseha Mekuria, Ph.D.

Research Fellow, Sweden

Fisseham@isy.liu.se

Abstract:

This paper discusses a possible future personal computer, named the Ethiopian Multimedia PC (EMPC). The EMPC is envisaged to offer all the services of a future multimedia PC, such as speech recognition for Amharic, Oromigna, Tigrigna and other Ethiopian languages, video editing, advanced graphics, voice control and text-to-speech conversion (for Amharic and other languages). Some of these multimedia services are already a reality, to a limited extent, for English and some other languages, so why not extend this elegant, user-friendly multimedia PC technology to our Ethiopian environment?

1.0 Introduction:

Intelligent terminals or personal computers that combine all the facilities of modern multimedia services with network capabilities such as modem, fax and video conferencing are under development. The technology for such a terminal is available, but integrating all these facilities requires interdisciplinary research in the areas of computer hardware design and software programming, real-time signal processing, human perception, linguistics and networking. The aim here is to open the door to a research effort on a future multimedia PC (MPC) which can incorporate, with some modification, available multimedia services and networking capabilities in the Ethiopian context. The paper is organized as follows. First we discuss the available multimedia technology; then we describe the services that can be derived from such an MPC. It is very hard to cover all the technology and services of a good MPC in this short article, so the references given at the end of the article may be helpful for further reading.

2.0 Available Technology

The technology required for an MPC can be divided mainly into software and hardware requirements. The software requirements can be further divided into:

a) Control (operating systems and communication between the various PC parts and interfaces: modem, audio, video and network protocols such as ISDN and ATM).

b) Programs and algorithms (programming environment and signal processing).

The hardware requirements can also be divided into:

a) the main PC hardware and

b) the interface hardware for data, audio and video input and output.

The integration of all these in a robust and cost-effective manner will, in the future, constitute the modern MPC.

One important property that distinguishes modern MPCs from general-purpose PCs is that they are designed for applications dealing with real-time signals, such as speech, audio and video processing. For Amharic speech recognition, for example, the MPC would require the following:

1) Speech input device interface: a noise-free microphone, an anti-aliasing filter and an analog-to-digital converter (ADC) with sufficiently high resolution and sampling frequency, so that the input speech signal is not distorted before it reaches the signal processing algorithms.

2) The signal processing algorithms have the task of deciding, starting from the signal waveform, which sentence, word, phoneme or syllable has been spoken into the microphone. This operation is usually performed in stages. First, feature extraction is performed on the signal to filter out unwanted information and redundancy (during training, the extracted features are also stored in memory as templates). Second, these features are compared, using a robust criterion, with the previously trained and stored templates for classification. Statistical and neural-network-based classifiers are often used. A decision is made based on this comparison, and the signal is classified as a spoken word.

3) If the speech recognizer is to be used as a voice input system to the MPC (just like the IBM Personal Digital Assistant, with a vocabulary of 80000 words), the MPC has to understand what has been said and trigger an action, such as translating the spoken Amharic word into its English equivalent and, why not, speaking that equivalent through a loudspeaker. This will further require a mathematical formalization of the acoustic, linguistic, grammatical and phonetic structure of the spoken language.
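The two stages described in item 2 can be illustrated with a minimal sketch. It is not the recognizer envisaged for the EMPC: the features (frame log-energy and zero-crossing rate), the plain Euclidean distance, the 8 kHz sampling rate and the synthetic tones standing in for spoken words are all assumptions chosen to keep the example short, and the names frame_features, distance, classify and tone are hypothetical. A practical system would use richer features and a statistical or neural classifier.

```python
import math

def frame_features(signal, frame_len=80):
    """Stage 1: split the signal into frames and keep two features per frame
    (log-energy and zero-crossing rate), discarding the raw waveform."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        feats.append((math.log(energy + 1e-12), crossings / (frame_len - 1)))
    return feats

def distance(f1, f2):
    """Mean Euclidean distance between two feature sequences (frame by frame)."""
    n = min(len(f1), len(f2))
    return sum(math.dist(a, b) for a, b in zip(f1[:n], f2[:n])) / n

def classify(signal, templates):
    """Stage 2: compare the extracted features with every stored template
    and return the label of the closest one."""
    feats = frame_features(signal)
    return min(templates, key=lambda label: distance(feats, templates[label]))

def tone(freq_hz, n=800, fs=8000):
    """Synthetic stand-in for a recorded word (assumed 8 kHz sampling rate)."""
    return [math.sin(2 * math.pi * freq_hz * t / fs) for t in range(n)]

# "Training": store the features of two reference signals as templates.
templates = {"low": frame_features(tone(200)), "high": frame_features(tone(1500))}

# "Recognition": an unseen signal is assigned to the nearest template.
result = classify(tone(1400), templates)
print(result)  # -> high
```

The 1400 Hz test tone is assigned to the "high" template because its zero-crossing rate is much closer to that of the 1500 Hz reference than to that of the 200 Hz reference.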

To cope with the real-time demands of analog signal input, processing and output, MPCs require a particular form of data processing: so-called digital signal processing (DSP). For the processing of signals, DSP devices offer precision and flexibility far beyond what analog electronics and equipment can attain. A recent class of special-purpose processors, the so-called programmable DSP integrated circuit chips (PDSPs), promises to make modern MPCs realizable by offering the large computational throughput needed for processing real-time signals such as audio and video. PDSP chips are powerful number crunchers for signal processing applications. At the same time, they are so flexible that they can be reprogrammed to perform different signal processing tasks in real time. They can, for example, act as a modem, a fax, a graphics processor and a video decoder at the same time.
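To give a feel for the kind of computation a PDSP is built for, the following sketch shows a finite impulse response (FIR) filter written as an explicit multiply-accumulate loop, together with a rough estimate of the real-time load. The filter coefficients, the 64-tap and 8 kHz figures and the function names fir_filter and macs_per_second are illustrative assumptions and are not taken from any particular chip.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR filter: y[n] = sum over k of coeffs[k] * x[n - k].
    The inner loop is one multiply-accumulate (MAC) operation per tap."""
    history = [0.0] * len(coeffs)  # most recent input sample first
    out = []
    for x in samples:
        history = [x] + history[:-1]
        acc = 0.0
        for c, h in zip(coeffs, history):
            acc += c * h
        out.append(acc)
    return out

def macs_per_second(num_taps, sample_rate_hz):
    """Real-time budget: every incoming sample costs one MAC per filter tap."""
    return num_taps * sample_rate_hz

# A two-tap averaging filter smooths an alternating input.
smoothed = fir_filter([0.0, 4.0, 0.0, 4.0], [0.5, 0.5])
print(smoothed)  # -> [0.0, 2.0, 2.0, 2.0]

# Assumed example: a 64-tap filter on 8 kHz speech.
print(macs_per_second(64, 8000))  # -> 512000
```

The same loop with different coefficients implements a different filter, which is the sense in which a programmable device can be reassigned to another signal processing task without a change of hardware.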

3.0 Conclusion:

The envisaged Ethiopian MPC, which will be based on modern MPC technology, can provide, apart from the usual MPC services, the following facilities:

a) A voice input interface for an Ethiopian language, for example an Amharic speech understanding system. Since Ethiopian languages are unique and are not spoken anywhere else, and since the future development of our country's economy and culture might depend on the interaction we have with our neighbors and international investors, it seems reasonable to provide such a language understanding aid, with the capability to start a dialogue between two (or more) speakers of different languages.

b) A text-to-speech conversion system. Here the proliferating Ethiopian word processing programs can be used to advantage, extending written text beyond its limitations into fully understandable speech in an Ethiopian language. Imagine an EMPC which could read the book "Fikir Iskemekabir" with instrumental "Tizita" music in the background! The formalization of the linguistic, grammatical and phonetic aspects will in turn enrich the respective Ethiopian language.

References:

[1] Geoff Bristow, "Electronic Speech Recognition", Collins Professional and Technical Books, 1986.

[2] J. Mariani, "Recent Advances in Speech Recognition", Proceedings ICASSP '89, Glasgow, Scotland, pp. 429-441.

[3] S.E. Levinson, D.B. Roe, "A Perspective on Speech Recognition", IEEE Communications Magazine, Jan. 1990, pp. 28-34.

[4] L.R. Rabiner, "Applications of Voice Processing to Telecommunications", Proceedings of the IEEE, Feb. 1994, pp. 197-231.