Items from 1 to 20 out of 35 results

chapter

Feature diversity for emotion, language and speaker verification

S Dey, R Rajan, R Padmanabhan, H A Murthy

2011 National Conference on Communications (NCC) > 1 - 5

2011 National Conference on Communications (NCC)

In this paper we describe the utilisation of the diversity of different feature representations for speaker, emotion and language verification. The underlying principle behind the method is that some features are better at discriminating some classes and other features for other classes. Studies are done on four features and their combinations. An information theoretic procedure is described which...

chapter

Simultaneous speech recognition and speaker identification

Tobias Herbig, Franz Gerl, Wolfgang Minker

2010 IEEE Spoken Language Technology Workshop > 218 - 222

2010 IEEE Spoken Language Technology Workshop (SLT 2010)

In this paper we present a self-learning speech controlled system comprising speech recognition, speaker identification and speaker adaptation for a small number of users, e.g. five recurring speakers. A compact representation of speech and speaker characteristics is discussed. It is combined with a technique for efficient information retrieval to capture individual speech characteristics allowing...

chapter

Automatic transcription of parliamentary meetings and classroom lectures - A sustainable approach and real system evaluations -

Tatsuya Kawahara

2010 7th International Symposium on Chinese Spoken Language Processing > 1 - 6

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

Applications of automatic speech recognition (ASR) have been extended to a variety of tasks and domains, including spontaneous human-human speech. We have developed an ASR system for the Japanese Parliament (Diet), which is deployed this year. By exploiting official records made by human stenographers, we have realized an efficient training scheme of acoustic and language models, which does not require...

chapter

Excited commentator speech detection with unsupervised model adaptation for soccer highlight extraction

Yi Sun, Zhijian Ou, Wei Hu, Yimin Zhang

2010 International Conference on Audio, Language and Image Processing > 747 - 751

2010 International Conference on Audio, Language and Image Processing (ICALIP)

Soccer highlight detection has been an active research topic in recent years. In this paper, we present our effort to detect an important audio keyword, excited commentator speech, which contributes to a state-of-the-art soccer highlight extraction system. We propose an approach using a statistical classifier based on Gaussian mixture models (GMMs) with unsupervised model adaptation. The excited speech...

chapter

A spoken dialog system based on automatically-generated example database

A Ito, T Morimoto, S Makino, M Ito

2010 International Conference on Audio, Language and Image Processing > 732 - 736

2010 International Conference on Audio, Language and Image Processing (ICALIP)

Spoken dialog systems have been proposed that utilize a simple database consisting of example sentences and the corresponding reply sentences. However, it is costly to prepare this database manually. In the present study, we propose a framework in which both the example and reply sentences are automatically generated from a database description table that describes minimum information for describing...

chapter

Signal-to-Signal Ratio Independent Speaker Identification for Co-channel Speech Signals

Rahim Saeidi, Pejman Mowlaee, Tomi Kinnunen, Zheng-Hua Tan, et al.

2010 20th International Conference on Pattern Recognition > 4565 - 4568

2010 20th International Conference on Pattern Recognition (ICPR 2010)

In this paper, we consider speaker identification for the co-channel scenario in which speech mixture from speakers is recorded by one microphone only. The goal is to identify both of the speakers from their mixed signal. High recognition accuracies have already been reported when an accurately estimated signal-to-signal ratio (SSR) is available. In this paper, we approach the problem without estimating...

chapter

Fast Adaptation of Speech and Speaker Characteristics for Enhanced Speech Recognition in Adverse Intelligent Environments

T Herbig, F Gerl, W Minker

2010 Sixth International Conference on Intelligent Environments > 100 - 105

2010 6th International Conference on Intelligent Environments (IE)

In this paper we present a technique for fast adaptation of speech and speaker related information. Fast learning is particularly useful for automatic personalization of speech-controlled devices. Such a personalization of human-computer interfaces to be used in intelligent environments represents an important research issue. Speech recognition is enhanced by speaker specific profiles which are continuously...

chapter

Learning task-dependent speech variability in discriminative acoustic model adaptation

Shoei Sato, Takahiro Oku, Shinichi Homma, Akio Kobayashi, et al.

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4910 - 4913

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

We present a new discriminative method of acoustic model adaptation that deals with a task-dependent speaking style. We have focused on differences of expressions or speaking styles between tasks and set the objective of this method as improving the recognition accuracy of indistinctly pronounced phrases dependent on a speaking style. The adaptation appends subword models for frequently observable...

chapter

Recognition of phonemes and words in singing

Annamaria Mesaros, Tuomas Virtanen

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 2146 - 2149

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper studies the influence of n-gram language models in the recognition of sung phonemes and words. We train uni-, bi-, and trigram language models for phonemes and bi- and trigrams for words. The word-level language model is estimated from a textual lyrics database. In the recognition we use a hidden Markov model based phonetic recognizer adapted to singing voice. The models were tested on...

chapter

Research on Adaptive Speaker Identification Based on GMM

Yuhuan Zhou, Jinming Wang, Xiongwei Zhang

2009 International Forum on Computer Science-Technology and Applications > 2 > 330 - 332

2009 International Forum on Computer Science-Technology and Applications (IFCSTA 2009)

In this paper, an adaptive speaker identification method that combines human behavioral traits with a Gaussian mixture model (GMM) is constructed. The method can automatically select different lengths of speech for different speakers during identification according to the feedback probability estimation, so it can maintain identification accuracy while reducing the identification...

chapter

An improved recursive algorithm for automatic alignment of complex long audio

He Kejia, Liu Gang, Tang Jie, Guo Jun

2009 IEEE International Conference on Network Infrastructure and Digital Content > 690 - 694

2009 IEEE International Conference on Network Infrastructure and Digital Content (IC-NIDC 2009)

In this paper we present an approach for automatic alignment of long audio data with varied acoustic conditions to their corresponding transcripts in an effective manner. Accurate time-aligned transcripts provide easier access to audio materials by aiding applications such as the indexing, summarizing and retrieving of audio segments. Accurate time alignments are also necessary for labeling the training...

chapter

Support vector machine based speaker identification systems using GMM parameters

Vijendra Raj Apsingekar, Phillip L De Leon

2009 Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers > 1766 - 1769

2009 43rd Asilomar Conference on Signals, Systems and Computers

Speaker identification (SI) is the task of determining which of the speakers known to the system best matches the unknown voice sample. SI requires multiple decision alternatives, and implementing an SI system using SVM techniques requires a multi-class SVM classifier. In this paper, speaker model clustering is implemented on an SVM-based SI system. Here, instead of clustering the speakers,...

chapter

An efficient multistage Rover method for Automatic Speech recognition

Haihua Xu, Jie Zhu, Guanyong Wu

2009 IEEE International Conference on Multimedia and Expo > 894 - 897

2009 IEEE International Conference on Multimedia and Expo (ICME)

In this paper, we implement a multistage recognizer output voting error reduction (ROVER) method for better automatic speech recognition (ASR). The first-stage ROVER is conducted by combining three recognizers, which are trained with the maximum likelihood estimation (MLE), minimum phone error (MPE) and recently proposed boosted maximum mutual information (BMMI) criteria, respectively. After that the...

chapter

Rapid unsupervised adaptation using context independent phoneme model

S. Kobashikawa, A. Ogawa, Y. Yamaguchi, S. Takahashi

2009 IEEE 13th International Symposium on Consumer Electronics > 209 - 212

2009 IEEE 13th International Symposium on Consumer Electronics (ISCE)

Users require rapid and highly accurate speech recognition systems. Accuracy could be improved by unsupervised adaptation as provided by CMLLR (Constrained Maximum Likelihood Linear Regression). CMLLR-based batch-type unsupervised adaptation estimates a single global transformation matrix by utilizing unsupervised labeling; unfortunately, it needs prior labeling and so is not rapid. Our proposed technique...

chapter

Modeling instantaneous intonation for speaker identification using the fundamental frequency variation spectrum

K. Laskowski, Qin Jin

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4541 - 4544

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

In recent years, the field of automatic speaker identification has begun to exploit high-level sources of speaker-discriminative information, in addition to traditional models of spectral shape. These sources include pronunciation models, prosodic dynamics, pitch, pause, and duration features, phone streams, and conversational interaction. As part of this broader thrust, we explore a new frame-level...

chapter

Comparing maximum a posteriori vector quantization and Gaussian mixture models in speaker verification

T. Kinnunen, J. Saastamoinen, V. Hautamaki, M. Vinni, et al.

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4229 - 4232

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

Gaussian mixture model - universal background model (GMM-UBM) is a standard reference classifier in speaker verification. We have proposed a simplified model using vector quantization (VQ-UBM). In this study, we extensively compare these two classifiers on NIST 2005, 2006 and 2008 SRE corpora, while having a standard discriminative classifier (GLDS-SVM) as a reference point. We focus on parameter...

chapter

Applying discretized articulatory knowledge to dysarthric speech

F. Rudzicz

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4501 - 4504

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper applies two dynamic Bayes networks that include theoretical and measured kinematic features of the vocal tract, respectively, to the task of labeling phoneme sequences in unsegmented dysarthric speech. Speaker dependent and adaptive versions of these models are compared against two acoustic-only baselines, namely a hidden Markov model and a latent dynamic conditional random field. Both...

chapter

Unsupervised speaker adaptation for telephone call transcription

R. Wallace, K. Thambiratnam, F. Seide

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4393 - 4396

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

The use of the PC and Internet for placing telephone calls will present new opportunities to capture vast amounts of un-transcribed speech for a particular speaker. This paper investigates how to best exploit this data for speaker-dependent speech recognition. Supervised and unsupervised experiments in acoustic model and language model adaptation are presented. Using one hour of automatically transcribed...

chapter

Speaker Identification in Room Reverberation Using GMM-UBM

A. Akula, V.R. Apsingekar, P.L. de Leon

2009 IEEE 13th Digital Signal Processing Workshop and 5th IEEE Signal Processing Education Workshop > 37 - 41

2009 IEEE 13th Digital Signal Processing Workshop and 5th IEEE Signal Processing Education Workshop

Speaker recognition systems tend to degrade if the training and testing conditions differ significantly. Such situations may arise due to the use of different microphones, telephone and mobile handsets or different acoustic conditions. Recently, the effect of the room acoustics on speaker identification (SI) has been investigated and it has been shown that a loss in accuracy results when using clean...

chapter

A Combined Task Analysis Method for Data Selection in Mandarin Isolated Word Recognition System

Z.Y. He, Z.G. Wang, W. Li, J. Wu

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

2008 6th International Symposium on Chinese Spoken Language Processing

This paper studies the performance of the data selection with a combined task analysis method in task adaptation on Mandarin isolated word recognition. The proposed task analysis method combines coverage unit balanced task analysis with the confusability based analysis. The performance is evaluated with several experiments.

Keywords: ACCURACY, SPEECH, ADAPTATION MODEL

Publication type

book (33)
article (2)

Keywords

SPEECH RECOGNITION (18)
HIDDEN MARKOV MODELS (14)
SPEAKER RECOGNITION (13)
ACOUSTICS (11)
TRAINING (11)
FEATURE EXTRACTION (9)
COMPUTATIONAL MODELING (8)
SIGNAL PROCESSING (8)
SPEAKER IDENTIFICATION (7)
SPEECH PROCESSING (7)
ALGORITHM DESIGN AND ANALYSIS (6)
DATABASES (6)
GAUSSIAN PROCESSES (6)
MAXIMUM LIKELIHOOD ESTIMATION (6)
NOISE (6)
SIGNAL PROCESSING ALGORITHMS (6)
ARTIFICIAL NEURAL NETWORKS (5)
CONVERGENCE (5)
ESTIMATION (5)
REAL TIME SYSTEMS (5)
CLASSIFICATION ALGORITHMS (4)
COMPUTERS (4)
CONFERENCES (4)
CORRELATION (4)
DATA MODELS (4)
GMM (4)
IMAGE COLOR ANALYSIS (4)
MEL FREQUENCY CEPSTRAL COEFFICIENT (4)
OBJECT RECOGNITION (4)
OPTIMIZATION (4)
ROBUSTNESS (4)
TRANSFORMS (4)
ACOUSTIC MEASUREMENTS (3)
ACOUSTIC SIGNAL PROCESSING (3)
ADAPTIVE FILTERS (3)
AUTOMATIC SPEECH RECOGNITION (3)
COMPANIES (3)
COMPLEXITY THEORY (3)
DATA MINING (3)
DELAY (3)
EQUATIONS (3)
FILTERING (3)
HELIUM (3)
IEEE TRANSACTIONS ON SIGNAL PROCESSING (3)
INDEXES (3)
INTERNET (3)
ITERATIVE METHODS (3)
LANGUAGE MODEL ADAPTATION (3)
MANUALS (3)
MATHEMATICAL MODEL (3)
MUTUAL INFORMATION (3)
SIGNAL RESOLUTION (3)
SILICON (3)
SIMULATION (3)
SPEECH ENHANCEMENT (3)
USA COUNCILS (3)
ACOUSTIC MODEL (2)
ACOUSTIC MODEL ADAPTATION (2)
ADAPTIVE SYSTEMS (2)
ADDITIVE NOISE (2)
ANALYTICAL MODELS (2)
APPROXIMATION METHODS (2)
ATTENUATION (2)
AUDIO CODING (2)
AUDITORY SYSTEM (2)
BAYES METHODS (2)
BAYESIAN METHODS (2)
BIOLOGICAL SYSTEM MODELING (2)
BLIND SOURCE SEPARATION (2)
CAMERAS (2)
CEPSTRAL ANALYSIS (2)
COLOR (2)
COMPUTER ARCHITECTURE (2)
DATA SELECTION (2)
DATABASE MANAGEMENT SYSTEMS (2)
DECODING (2)
DISCRETE FOURIER TRANSFORMS (2)
DISTANCE MEASUREMENT (2)
ECHO CANCELLERS (2)
EDUCATIONAL INSTITUTIONS (2)
ENERGY RESOLUTION (2)
FACE RECOGNITION (2)
FILTER BANK (2)
FREQUENCY DOMAIN ANALYSIS (2)
FREQUENCY MODULATION (2)
GAIN (2)
GAUSSIAN DISTRIBUTION (2)
GAUSSIAN MIXTURE MODEL (2)
GAUSSIAN MIXTURE MODELS (2)
HEURISTIC ALGORITHMS (2)
IMAGE RESOLUTION (2)
IMAGE SEGMENTATION (2)
INSTRUMENTS (2)
INTERFERENCE (2)
INTERNET TELEPHONY (2)
LEAST SQUARES APPROXIMATIONS (2)
LIGHTING (2)
