Dhivehi automatic speech recognition system

Abstract

This report details the work done to create a speech recognition solution for the Dhivehi language. The system was developed using the CMUSphinx speech recognition toolkit, which requires a text corpus to use as output, a phonetic dictionary (a mapping from words to their phoneme sequences), a language model (a probabilistic representation of word occurrences in the language) and an acoustic model (a mapping from voice features to text). The latter two can be trained from the provided audio and text data. Development of our ASR system was carried out in two phases. The first phase dealt only with numbers, covering real numbers from 0 (inclusive) up to, but not including, 1 trillion. The second phase dealt with the entire Dhivehi language, with two exceptions: it does not support thikijehi thaana, and it can only pick up the common Malé dialect. The system developed during the first phase achieved an accuracy rate of 75%, which barely passed our minimum acceptable rate, while the system developed during the second phase achieved an accuracy rate of 42.5%, which fell short of it.
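To make the notion of a language model as a "probabilistic representation of word occurrences" concrete, the following is a minimal sketch of a bigram model estimated by maximum likelihood. It is an illustration only and not the CMUSphinx training pipeline used in the project; the function name `train_bigram_model` and the toy corpus are hypothetical, and English number words stand in for Dhivehi text.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Estimate P(next_word | word) by maximum likelihood from whitespace-tokenised sentences."""
    context_counts = Counter()  # how often each word appears as the left-hand context
    bigram_counts = Counter()   # how often each (word, next_word) pair appears
    for sentence in sentences:
        # Sentence boundary markers let the model learn which words start and end phrases.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, nxt)] += 1
    return {
        pair: count / context_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }


# Hypothetical toy corpus of number phrases (placeholder data, not the project's corpus).
corpus = ["one hundred", "two hundred", "one thousand"]
model = train_bigram_model(corpus)
print(model[("one", "hundred")])
```

In this toy corpus "one" is followed by "hundred" in one of its two occurrences, so the estimated probability is 0.5. A real system would be trained on a much larger corpus and would typically apply smoothing so that unseen word pairs do not receive zero probability.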

Keywords

Speech recognition solution, Dhivehi Language, Language model, Acoustic model, Dhivehi phonetics, The ASR system, Number recognition, Voice pitch, Dhivehi typography, Automatic Speech Recognition

Citation

Hassan, I., Ifham, M., Rasheed, A.R., & Mohamed, Y. (2018). Dhivehi automatic speech recognition system (Project, Faculty of Engineering Science and Technology, Maldives National University). Retrieved from saruna.mnu.edu.mv
