Items from 1 to 20 out of 32 results

chapter

Data-driven pause prediction for speech synthesis in storytelling style speech

Parakrant Sarkar, K. Sreenivasa Rao

2015 Twenty First National Conference on Communications (NCC) > 1 - 5

2015 Twenty First National Conference on Communications (NCC)

In storyteller speech, pauses play a significant role in introducing suspense and climax. Pauses are used to emphasize keywords and emotion-salient words and to separate the phrases in an utterance. The objective of this work is to predict the position and duration of the pauses in the synthesized speech from the text-to-speech system. We analyzed the pause patterns in storyteller speech and classified...

chapter

Multilingual speech to speech translation system in bluetooth environment

M. D. Faizullah Ansari, R. S. Shaji, T. J. SivaKarthick, S. Vivek, more

2014 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT) > 1055 - 1058

2014 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT)

Voice Translator is a speech-to-speech translation application for Android mobile phones, which translates English speech to Hindi speech and vice versa. Voice Translator includes three modules: Voice Recognition, Machine Translation and Speech Synthesis. The Voice Recognition module captures the voice or speech from the mobile user through speaker, identifies it, then converts the speech into text and then...

chapter

The development of syllable based text to speech system for Tamil language

M. Karthikadevi, K.G. Srinivasagan

2014 International Conference on Recent Trends in Information Technology > 1 - 6

2014 Fourth International Conference on Recent Trends in Information Technology (ICRTIT)

Speech synthesis is one of the most significant applications in the linguistic communication process. A text-to-speech system accepts an input sentence and produces audible speech as output. Tamil can be regarded as a syllable-based language. A syllable is a unit of language that may be spoken independently of the adjacent phones. It consists of an interrupted portion of sound,...

chapter

Cross-stream dependency modeling using continuous F0 model for HMM-based speech synthesis

Xin Wang, Zhen-Hua Ling, Li-Rong Dai

2012 8th International Symposium on Chinese Spoken Language Processing > 84 - 87

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

In our previous work, we have presented a cross-stream dependency modeling method for hidden Markov model (HMM) based parametric speech synthesis. In this method, multi-space probability distribution (MSD) was adopted for F0 modeling and the voicing decision error influenced the accuracy of generated spectral features severely. Therefore, a cross-stream dependency modeling method using continuous...

chapter

Experiments on unsupervised statistical parametric speech synthesis

Jinfu Ni, Yoshinori Shiga, Hisashi Kawai, Hideki Kashioka

2012 8th International Symposium on Chinese Spoken Language Processing > 155 - 159

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

In order to build web-based voicefonts, an unsupervised method is needed to automate the extraction of acoustic and linguistic properties of speech. This paper addresses the impact of automatic speech transcription on statistical parametric speech synthesis based on a single speaker's 100 hour speech corpus, focusing particularly on two factors affecting speech quality: transcript accuracy and...

chapter

Automatic pronunciation prediction for text-to-speech synthesis of dialectal arabic in a speech-to-speech translation system

Sankaranarayanan Ananthakrishnan, Stavros Tsakalidis, Rohit Prasad, Prem Natarajan, more

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4957 - 4960

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

Text-to-speech synthesis (TTS) is the final stage in the speech-to-speech (S2S) translation pipeline, producing an audible rendition of translated text in the target language. TTS systems typically rely on a lexicon to look up pronunciations for each word in the input text. This is problematic when the target language is dialectal Arabic, because the statistical machine translation (SMT) system usually...

chapter

Support vector regression fusion scheme in phone duration modeling

Alexandros Lazaridis, Iosif Mporas, Todor Ganchev, Nikos Fakotakis

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4732 - 4735

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

A fusion scheme of phone duration models (PDMs) is presented in this work. Specifically, a support vector regression (SVR)-fusion model is fed with the predictions of a group of independent PDMs operating in parallel. The American-English KED TIMIT and the Greek WCL-1 databases are used for evaluating the PDMs and the fusion scheme. The fusion scheme contributes to the accuracy improvement over the...

chapter

Automatically assessing acoustic manifestations of personality in speech

T Polzehl, S Moller, F Metze

2010 IEEE Spoken Language Technology Workshop > 7 - 12

2010 IEEE Spoken Language Technology Workshop (SLT 2010)

In this paper, we present first results on applying a personality assessment paradigm to speech input, and comparing human and automatic performance on this task. We cue a professional speaker to produce speech using different personality profiles and encode the resulting vocal personality impressions in terms of the Big Five NEO-FFI personality traits. We then have human raters, who do not know the...

chapter

A combined approach to the polysemy problems in a Chinese to Taiwanese TTS system

Yih-Jeng Lin, Ming-Shing Yu, Chin-Yu Lin

2010 7th International Symposium on Chinese Spoken Language Processing > 455 - 459

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

This paper proposes a combined approach to the polysemy problems in a Chinese to Taiwanese text-to-speech (TTS) system. Polysemy means there are words with more than one meaning or pronunciation. For example, there are two kinds of pronunciation for the word (he) in Taiwanese. They are /yi7/ and /yin7/. The first pronunciation, /yi7/, can mean `he' or `him'; and the other one, /yin7/, means `his'...

chapter

Improving GMM-based spectral conversion with optimal conversion function selection

Hsin-Te Hwang, Wen-Liang Wu, Sin-Horng Chen

2010 7th International Symposium on Chinese Spoken Language Processing > 392 - 396

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

We address the problem in the conventional Gaussian mixture model (GMM)-based spectral conversion from the viewpoint of optimal conversion function selection. The proposed method is motivated by the observation that if the optimal conversion function based on the minimum mel-cepstral distortion (MMCD) criterion can be selected during the conversion stage, the conversion performance in terms of mel-cepstral distortion...

chapter

Effects of F0 dimensions in perception of Mandarin tones

Bin Li, Caicai Zhang

2010 7th International Symposium on Chinese Spoken Language Processing > 322 - 325

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

This study focuses on the perception of two synthesized Mandarin tones: the high level tone (Tone 1) and the high falling tone (Tone 4), which have been reported to be difficult for Cantonese learners of Mandarin. As the two tones are distinctive in F0 directions and also vary in F0 onsets, it is worth investigating why Cantonese listeners find them perceptually indistinguishable. We aim to find out what...

chapter

The automatic prediction of Chinese text's prosodic structure based on tree structure

Yili Qian

2010 International Conference on Computer Application and System Modeling (ICCASM 2010) > 13 > V13-99 - V13-103

2010 International Conference on Computer Application and System Modeling (ICCASM 2010)

The recognition of prosodic structure is an important research aspect in the field of Text-to-Speech. It is essential to improving the naturalness of machine-synthesized speech. This paper proposes an approach to predicting and assigning prosodic structure automatically for Chinese sentences based on their tree structures. It presents the modeling of a statistical language model based on the simply...

chapter

The study of Tibetan prosodic structure prediction model

Yu Hongzhi, Chen Chen, Chen Qi, Shi Jing

2010 2nd International Conference on Signal Processing Systems > 1 > V1-645 - V1-648

2010 2nd International Conference on Signal Processing Systems (ICSPS 2010)

Prosodic structure prediction plays a crucial role in the prosodic annotation of speech synthesis corpora as well as in improving the naturalness of synthesized speech. The paper studies Tibetan prosodic structure with Tibetan speech characteristics. Having analyzed a variety of variables that have an impact on Tibetan prosodic boundaries, we obtain syllable boundary grammatical information, prosodic...

chapter

Classification of voice disorders in children with cochlear implantation and hearing aid using multiple classifier fusion

Z Mahmoudi, S Rahati, M M Ghasemi, V Asadpour, more

10th International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010) > 304 - 307

2010 10th International Conference on Information Sciences, Signal Processing and their Applications (ISSPA 2010)

Speech production and speech phonetic features gradually improve in children as they obtain audio feedback after cochlear implantation or through the use of a hearing aid. In this study, voice disorders in children with cochlear implantation and hearing aids are classified. 30 Persian children participated in the study, including 6 children in levels 1 to 3 and 12 in level 4. Voice samples of 5 isolated Persian words...

chapter

WordNet Based Sindhi Text to Speech Synthesis System

Javed Ahmed Mahar, Ghulam Qadir Memon, Syed Hyder Abbass Shah

2010 Second International Conference on Computer Research and Development > 20 - 24

Second International Conference on Computer Research and Development (ICCRD 2010)

Text-to-speech (TTS) synthesis technology enables machines to convert text into audible speech and is used throughout the world to enhance the accessibility of information. The important component of any TTS synthesis system is the database of sounds. In this study, three types of sound units, i.e., phonemes, diphones and syllables, are concatenated to produce natural sound for good quality Sindhi...

chapter

Support Vector Machine for Chinese Part-Of-Speech Tagging in Speech Synthesis Systems

Xiang Wang, Jianping Zhang, Yonghong Yan

2010 International Conference on Biomedical Engineering and Computer Science > 1 - 4

International Conference on Biomedical Engineering and Computer Science (ICBECS 2010)

The paper presents support vector machine based part-of-speech (POS) tagging on a Chinese database, which is part of our speech synthesis system. The model can be classified as an SVM model and uses many sequential features to predict the POS tag. The text database was downloaded from the internet, with 1,280,000 words and 33 parts of speech. The total accuracy of our experiments is 99.31%.

chapter

Refining Unit Boundaries for Mandarin Text-to-Speech Database

Minghui Dong, Ling Cen, P. Chan, Haizhou Li

2009 International Conference on Asian Language Processing > 245 - 248

2009 International Conference on Asian Language Processing (IALP 2009)

In unit selection based text-to-speech (TTS) synthesis, the accurate position of the unit boundaries in the unit selection database is one of the factors that determine the quality of the synthesized speech. To ensure the accuracy of the boundary positions, developers often have to manually verify the speech boundaries that are generated by automatic speech recognition techniques. In order to reduce...

chapter

Symbol based concatenation approach for Text to Speech System for Hindi using vowel classification technique

P. Chaudhury, M. Rao, K. Vinod Kumar

2009 World Congress on Nature & Biologically Inspired Computing (NaBIC) > 1082 - 1087

2009 World Congress on Nature & Biologically Inspired Computing (NaBIC 2009)

Indian languages such as Hindi are phonetic in nature. The text-to-speech (TTS) system for Hindi exploits the phonetic nature of Hindi. The algorithm developed by us involves analysis of a sentence in terms of words and then symbols involving combination of pure consonants and vowel technique. Wave files are merged as required to generate the modified consonants influenced by matras,...

chapter

An efficient and robust pitch marking algorithm on the speech waveform for TD-PSOLA

Aimilios Chalamandaris, Pirros Tsiakoulis, Sotiris Karabetsos, Spyros Raptis

2009 IEEE International Conference on Signal and Image Processing Applications > 397 - 401

2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA 2009)

In a Text-to-Speech system based on time-domain techniques that employ pitch-synchronous manipulation of the speech waveforms, one of the most important issues that affect the output quality is the way the analysis points of the speech signal are estimated and the actual points, i.e. the analysis pitchmarks. In this paper we present our methodology for calculating the pitchmarks of a speech waveform,...

chapter

Evaluation of the Concatenative Turkish Text-to-Speech System

Z. Orhan, Z. Gormez

2009 2nd International Congress on Image and Signal Processing > 1 - 5

2009 2nd International Congress on Image and Signal Processing (CISP)

In this study, the framework of a concatenative text-to-speech system for Turkish is built and its evaluation techniques, namely MOS, DRT and CT, are considered. Naturalness and intelligibility of the Turkish TTS system are tested by MOS and CT-DRT respectively. Although the system uses simple techniques, it provides promising results for Turkish TTS, since the selected concatenative method is...

Keywords:
ACCURACY
SPEECH
SPEECH SYNTHESIS


INFONA - science communication portal
