Multimedia Services: Speech Recognition Service
This module provides a speech recognition service to the OWCF.
The service provides two kinds of speech recognition engines:
- a command and control engine
- an Information Retrieval (IR) based speech recognition engine, whose performance has been reported to be superior to that of classical ASR engines.
The first engine aims to recognize speech commands precisely; the second aims to match the input speech with a target described by a set of words.
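The idea of matching input against a target described by a set of words can be illustrated with a toy sketch. This is only an illustration of the matching idea, not the IR engine's actual scoring: the class `WordSetMatcher`, its methods, and the overlap-fraction score are all assumptions introduced here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical matcher: each target is described by a set of words.
class WordSetMatcher {
    private final Map<String, Set<String>> targets =
            new LinkedHashMap<String, Set<String>>();

    // Registers a target under the given name with its describing words.
    void addTarget(String name, String... words) {
        targets.put(name, new HashSet<String>(Arrays.asList(words)));
    }

    // Assumed score: fraction of the target's words found in the input.
    static double score(Set<String> targetWords, Set<String> inputWords) {
        if (targetWords.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (String word : targetWords) {
            if (inputWords.contains(word)) {
                hits++;
            }
        }
        return (double) hits / targetWords.size();
    }

    // Returns the best-scoring target name, or null if no target overlaps.
    String bestMatch(String... inputWords) {
        Set<String> input = new HashSet<String>(Arrays.asList(inputWords));
        String best = null;
        double bestScore = 0.0;
        for (Map.Entry<String, Set<String>> entry : targets.entrySet()) {
            double s = score(entry.getValue(), input);
            if (s > bestScore) {
                bestScore = s;
                best = entry.getKey();
            }
        }
        return best;
    }
}
```

Unlike a command and control grammar, this kind of matching tolerates extra or missing words in the input, since only the overlap with each target's word set is scored.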
Among the services already implemented and under integration with the OWCF is the voice mouse functionality. With this functionality, the service provides displacement information as a function of the continuous pronunciation of a user-defined phoneme.
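One possible mapping from a sustained phoneme to a displacement is sketched below. The linear relation between duration and distance, the class `VoiceMouseSketch`, and its parameters are assumptions made for illustration; the actual function used by the service is not specified here.

```java
// Hypothetical mapping from a sustained phoneme to a pointer displacement.
class VoiceMouseSketch {
    private final int dxPerSecond; // assumed horizontal speed, pixels per second
    private final int dyPerSecond; // assumed vertical speed, pixels per second

    // The direction bound to the user-defined phoneme is given as a velocity.
    VoiceMouseSketch(int dxPerSecond, int dyPerSecond) {
        this.dxPerSecond = dxPerSecond;
        this.dyPerSecond = dyPerSecond;
    }

    // Returns {dx, dy} in pixels for a phoneme held for heldMillis milliseconds.
    int[] displacement(long heldMillis) {
        if (heldMillis < 0) {
            throw new IllegalArgumentException("heldMillis must be >= 0");
        }
        int dx = (int) (dxPerSecond * heldMillis / 1000);
        int dy = (int) (dyPerSecond * heldMillis / 1000);
        return new int[] { dx, dy };
    }
}
```

Under these assumptions, the longer the user sustains the phoneme, the further the pointer moves in the direction associated with it.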
The basic idea of the speech toolkit is to provide the OWCF with a speech recognition component that shields the developer from the details of handling the audio hardware directly, which is a real-time task. As in the WUI toolkit, processing in the Speech Recognition component is organized as a process chain. The following figure shows the separation-of-concerns approach taken in the speech recognizer component.
The overall design of the Speech Recognition Toolkit follows the event-driven approach and consequently uses the publisher/subscriber design pattern. The following figure shows the basic components of the architecture, their interfaces to other systems, and the possible separation of components into different runtime environments.
The application only needs to specify the speech grammar file in order to have commands recognized. Following the event-driven approach, the Speech Recognizer component sends an event whenever it recognizes a command.
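The publisher/subscriber interaction described above might look roughly like the following sketch. All names here (`CommandRecognizedEvent`, `CommandListener`, `SpeechRecognizerStub`, `SpeechEventDemo`, and the grammar file name `commands.gram`) are hypothetical and do not reflect the toolkit's real API; the stub publishes events by hand instead of decoding audio.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event carrying the text of the recognized command.
class CommandRecognizedEvent {
    private final String command;
    CommandRecognizedEvent(String command) { this.command = command; }
    String getCommand() { return command; }
}

// Hypothetical subscriber interface implemented by the application.
interface CommandListener {
    void commandRecognized(CommandRecognizedEvent event);
}

// Hypothetical publisher standing in for the Speech Recognizer component.
class SpeechRecognizerStub {
    private final String grammarFile;
    private final List<CommandListener> listeners =
            new ArrayList<CommandListener>();

    // The application only supplies the path of the speech grammar file.
    SpeechRecognizerStub(String grammarFile) { this.grammarFile = grammarFile; }

    String getGrammarFile() { return grammarFile; }

    void addCommandListener(CommandListener listener) {
        listeners.add(listener);
    }

    // Called by hand here; a real engine would call it after decoding audio.
    void publish(String command) {
        CommandRecognizedEvent event = new CommandRecognizedEvent(command);
        for (CommandListener listener : listeners) {
            listener.commandRecognized(event);
        }
    }
}

class SpeechEventDemo {
    // Subscribes one listener and returns the commands it received.
    static List<String> run() {
        final List<String> received = new ArrayList<String>();
        SpeechRecognizerStub recognizer = new SpeechRecognizerStub("commands.gram");
        recognizer.addCommandListener(new CommandListener() {
            public void commandRecognized(CommandRecognizedEvent event) {
                received.add(event.getCommand());
            }
        });
        recognizer.publish("open door");
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The application code never touches the audio hardware in this arrangement; it registers a listener and reacts to events, which is the separation the event-driven design aims for.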
System requirements
The Speech Recognition component has the following requirements:
- Java JDK 1.5
- Linux, Windows XP, Windows CE, smartphones, Symbian, or embedded processors with audio support (XScale and ARM processors, including the port to QBIC).
Maturity Level
| Component Name | Responsible Partner | Initial Maturity Level | Current Maturity Level |
| Multimedia Services: Speech Recognition Service | Multitel | 4 | 6 |


