Title: Prosody Generation In Malay Language Speech Synthesizer

Abstract: This paper describes the development of a Text-to-Speech (TTS) system for the Malay language that uses intonation generation. The system is based on two methods: concatenation of diphone waveforms, and selection of a prosodic model to control fundamental frequency and duration. Prosody generation is part of the speech control module, which acts as the interface between the output of the linguistic text-processing block and the input of the speech signal generation module. As a result, each voice segment (syllable) in a word being synthesized is assigned a set of pitch target values. Signal generation is implemented according to the prosody phrasing stream, which describes the phrase as a sequence of diphone phoneme codes with assigned duration and fundamental frequency values. To transform the base diphones to the required prosodic values, the procedures used are close to those of the Multi-Band Resynthesis Overlap and Add (MBROLA) interface. The key steps in prosody generation based on MBROLA technology for a concatenative TTS system are also described. Attention is paid to ways of increasing the naturalness of the synthesized speech.
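
The prosody phrasing stream described in the abstract can be illustrated with a minimal Python sketch. It assumes an MBROLA-style text layout in which each line holds a phoneme code, a duration in milliseconds, and optional pairs of (position in percent of the duration, fundamental frequency in Hz). The `Segment` class, the `to_pho` function, the phoneme symbols and all numeric values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Segment:
    """One entry of a prosody phrasing stream (hypothetical structure)."""
    phoneme: str       # phoneme code; symbols here are illustrative only
    duration_ms: int   # assigned segment duration in milliseconds
    # Pitch targets as (position in % of duration, F0 in Hz) pairs.
    pitch_targets: List[Tuple[int, int]] = field(default_factory=list)


def to_pho(segments: List[Segment]) -> str:
    """Serialize segments as 'phoneme duration [position F0]...' lines."""
    lines = []
    for seg in segments:
        parts = [seg.phoneme, str(seg.duration_ms)]
        for position, f0 in seg.pitch_targets:
            parts += [str(position), str(f0)]
        lines.append(" ".join(parts))
    return "\n".join(lines)


# Illustrative stream for the Malay word "saya"; "_" marks silence.
# Durations and pitch values are made up for this example.
phrase = [
    Segment("_", 100),
    Segment("s", 90),
    Segment("a", 120, [(0, 120), (100, 135)]),
    Segment("j", 70),
    Segment("a", 140, [(50, 110)]),
    Segment("_", 100),
]

pho_text = to_pho(phrase)
```

In this sketch the pitch targets are attached to the vowels of each syllable, which mirrors the idea of assigning pitch target values per voice segment; a synthesizer would then interpolate the fundamental frequency between targets.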

Authors: Siew Hock Ow, Roziati Zainuddin and David Wai Keong Loo








IBIMA 2005 Conference   www.ibima.org