Items from 1 to 20 out of 248 results

chapter

Automatic speech recognition models: A characteristic and performance review

U. G. Patil, S. D. Shirbahadurkar, A. N. Paithane

2016 International Conference on Computing Communication Control and automation (ICCUBEA) > 1 - 7

2016 International Conference on Computing Communication Control and automation (ICCUBEA)

This paper presents a review on few notable speech recognition models that are reported in the last decade. Firstly, the models are categorized into sparse models, learning models and domain - specific models. Subsequently, the characteristics of the models have been observed using speech constraints, algorithmic constraints and performance constraints. The performance of these models reported in...

chapter

Noise impact assessment on the accuracy of the determination of speaker’s gender by using method of the cumulant coefficients

Kostiantyn Pylypenko, Arkadiy Prodeus

2015 XI International Conference on Perspective Technologies and Methods in MEMS Design (MEMSTECH) > 102 - 106

2015 XI International Conference on Perspective Technologies and Methods in MEMS Design (MEMSTECH)

A new method of classification of a speaker’s gender based on cumulant coefficients is proposed. The effect of an additive noise and measurement error of classification signs on accuracy of classification is analyzed. The expediency of construction of an adaptive system of classification operating with considering of masking of a speech signal by noise is shown. Comparison of the proposed method of...

chapter

Part-of-speech labeling for Reuters database

R. Cretulescu, A. David, D. Morariu, L. Vintan

2015 19th International Conference on System Theory, Control and Computing (ICSTCC) > 117 - 122

2015 19th International Conference on System Theory, Control and Computing (ICSTCC)

Even if the Vector Space Model used for document representation in information retrieval systems integrates a small quantity of knowledge it continues to be used due to its computational cost, speed execution and simplicity. We try to improve this document representation by adding some syntactic information such as the parts of speech. In this paper, we have evaluated three different tagging algorithms...

chapter

A learning-based approach for Romanian syllabification and stress assignment

Diana Balc, Anamaria Beleiu, Rodica Potolea, Camelia Lemnaru

2015 IEEE International Conference on Intelligent Computer Communication and Processing (ICCP) > 37 - 42

2015 IEEE International Conference on Intelligent Computer Communication and Processing (ICCP)

This paper tackles the Romanian syllabification and stress assignment problems, and proposes an efficient machine learning based solution. We show that by designing the appropriate feature sets for each specific problem, learning algorithms achieve satisfactory accuracy rates for both problems (∼92% for syllabification, ∼85% for stress assignment), even for relatively small training set sizes. We...

chapter

A hybrid Parts Of Speech tagger for Malayalam language

Anisha Aziz T, Sunitha C

2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1502 - 1507

2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI)

Parts of speech tagging is an important research topic in Natural Language Processing research area. Since it is one among the first steps of any natural language processing (NLP) techniques such as machine translation, if any error happens for tagging the same will repeat in the whole NLP process. So far works had been done on POS tagging based on SVM, MBLP, HMM, Ngram. All of these methods were not...

chapter

Reducing morpho-phonetic confusion in sub-word based Uyghur ASR

Mijit Ablimit, Askar Hamdulla, Akbar Pattar

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 348 - 352

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

Sub-word units like morphemes are selected as the lexicon for highly inflectional languages, as they can provide better coverage and a smaller vocabulary size. However, short units shrink the context of statistical models, prone to morpho-phonetic changes, and not always outperform the word based model. When sequence of units are merged or split, unit boundaries are phonetically harmonized in the...

chapter

Feature extraction analysis on Indonesian speech recognition system

Untari N. Wisesty, Adiwijaya, Widi Astuti

2015 3rd International Conference on Information and Communication Technology (ICoICT) > 54 - 58

2015 3rd International Conference on Information and Communication Technology (ICoICT)

Speech recognition is widely applied to speech to text, speech to emotion, in order to make gadget and computer easier to use, or to help people with hearing disability. Feature extraction is one of significant step in the performance of speech recognition. Therefore, the proper selection is really needed. In this paper, we analyze feature extraction that can have good performance for Indonesian speech...

chapter

Speech event detection by non negative matrix deconvolution

Carla Lopes, Fernando Perdigao

2007 15th European Signal Processing Conference > 1280 - 1284

2007 15th European Signal Processing Conference

Support Vector Machines (SVM) are applied to the problem of detecting and classifying broad acoustic-phonetic classes (events). In this paper an approach based on Non-Negative Matrix Deconvolution (NMD) is proposed to merge frame-based SVM predictions into segmental events. To turn the SVM outputs, which are frame-based, into a signal segmented in terms of events, two different event merger methods...

chapter

Deep neural networks for cochannel speaker identification

Xiaojia Zhao, Yuxuan Wang, DeLiang Wang

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4824 - 4828

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Speaker identification (SID) in cochannel speech, where two speakers are talking simultaneously over a single recording channel, is a challenging problem. Previous studies address this problem in the anechoic environment under the Gaussian mixture model (GMM) framework. On the other hand, cochannel SID in reverberant conditions has not been addressed. This paper studies cochannel SID in both anechoic...

chapter

Speech recognition with prediction-adaptation-correction recurrent neural networks

Yu Zhang, Dong Yu, Michael L. Seltzer, Jasha Droppo

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5004 - 5008

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

We propose the prediction-adaptation-correction RNN (PAC-RNN), in which a correction DNN estimates the state posterior probability based on both the current frame and the prediction made on the past frames by a prediction DNN. The result from the main DNN is fed back to the prediction DNN to make better predictions for the future frames. In the PAC-RNN, we can consider that, given the new, current...

chapter

Weighted training for speech under Lombard Effect for speaker recognition

Muhammad Muneeb Saleem, Gang Liu, John H.L. Hansen

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4350 - 4354

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The presence of Lombard Effect in speech is proven to have severe effects on the performance of speech systems, especially speaker recognition. Varying kinds of Lombard speech are produced by speakers under influence of varying noise types [1]. This study proposes a high-accuracy classifier using deep neural networks for detecting various kinds of Lombard speech against neutral speech, independent...

chapter

Brandt's GLR method & refined HMM segmentation for TTS synthesis application

Safaa Jarifi, Dominique Pastor, Olivier Rosec

2005 13th European Signal Processing Conference > 1 - 4

2005 13th European Signal Processing Conference

In comparison with standard HMM (Hidden Markov Model) with forced alignment, this paper discusses two automatic segmentation algorithms from different points of view: the probabilities of insertion and omission, and the accuracy. The first algorithm, hereafter named the refined HMM algorithm, aims at refining the segmentation performed by standard HMM via a GMM (Gaussian Mixture Model) of each boundary...

chapter

Longer-length acoustic units for continuous speech recognition

Annika Hamalainen, Johan de Veth, Lou Boves

2005 13th European Signal Processing Conference > 1 - 4

2005 13th European Signal Processing Conference

Recent research on the TIMIT database suggests that longer-length acoustic units are better suited for modelling pronunciation variation and long-term temporal dependencies in speech than traditional phoneme-length units, yielding substantial improvements in recognition accuracy [9]. In this paper, we investigate whether similar improvements can be gained on another database, viz. excerpts from novels...

chapter

Sequential forward feature selection with low computational cost

Dimitrios Ververidis, Constantine Kotropoulos

2005 13th European Signal Processing Conference > 1 - 4

2005 13th European Signal Processing Conference

This paper presents a novel method to control the number of crossvalidation repetitions in sequential forward feature selection algorithms. The criterion for selecting a feature is the probability of correct classification achieved by the Bayes classifier when the class feature probability density function is modeled by a single multivariate Gaussian density. Let the probability of correct classification...

chapter

Error handling in multimodal biometric systems using reliability measures

Krzysztof Kryszczuk, Jonas Richiardi, Plamen Prodanov, Andrzej Drygajlo

2005 13th European Signal Processing Conference > 1 - 4

2005 13th European Signal Processing Conference

In this paper, we present a framework for predicting and correcting classification decision errors based on modality reliability measures in a multimodal biometric system. In our experiments we use face and speech experts based on a recently proposed framework which uses Bayesian networks. The expert decisions and the accompanying information on their reliability are combined in a decision module...

chapter

Automatic gender classification using the mel frequency cepstrum of neutral and whispered speech: A comparative study

G. Nisha Meenakshi, Prasanta Kumar Ghosh

2015 Twenty First National Conference on Communications (NCC) > 1 - 6

2015 Twenty First National Conference on Communications (NCC)

A whispered speech resembles an unvoiced speech due to the lack of vocal fold vibration unlike the neutral speech. Since information about the gender of a speaker typically lies in the pitch resulted from the vocal fold vibration (or source signal), identifying gender from the whispered speech is more challenging compared to that from the neutral speech. In the absence of the pitch, we study the use...

chapter

Speaker based Language Independent Isolated Speech Recognition System

Shanthi Therese S., Chelpa Lingam

2015 International Conference on Communication, Information & Computing Technology (ICCICT) > 1 - 7

2015 International Conference on Communication, Information & Computing Technology (ICCICT)

This paper presents a speaker based Language Independent Isolated Speech Recognition System (LIISRS). The most popular feature extraction technique Mel Frequency Cepstral Coefficients (MFCC) is used for training the system. Representative specific features are identified using K-Means algorithm. Distortion measure is calculated using Euclidean distance function. Pitch contour characteristics are used...

chapter

Two-stage phone recognition system using articulatory and spectral features

K E Manjunath, K. Sreenivasa Rao, M Gurunath Reddy

2015 International Conference on Signal Processing and Communication Engineering Systems > 107 - 111

2015 International Conference on Signal Processing And Communication Engineering Systems (SPACES)

In this paper, we propose a two-stage phone recognition system using articulatory and spectral features. In the first stage, articulatory features are predicted from spectral features using FeedForward Neural Networks (FFNNs). In the second stage, phone recognition is carried out using the predicted articulatory features and spectral features together. FFNNs and Hidden Markov Models are explored for...

chapter

Improvement of phone recognition accuracy using source and system features

K E Manjunath, K. Sreenivasa Rao, M Gurunath Reddy

2015 International Conference on Signal Processing and Communication Engineering Systems > 501 - 505

2015 International Conference on Signal Processing And Communication Engineering Systems (SPACES)

The goal of this work is to improve phone recognition accuracy using combination of source and system features. As speech is produced by exciting time varying vocal tract system with time varying excitation, we want to explore both source and system components of speech production system for phone recognition. The excitation source information is derived by processing linear prediction residual of...

chapter

Extracting situational awareness from microblogs during disaster events

Anirban Sen, Koustav Rudra, Saptarshi Ghosh

2015 7th International Conference on Communication Systems and Networks (COMSNETS) > 1 - 6

2015 7th International Conference on Communication Systems and Networks (COMSNETS)

Microblogging sites such as Twitter and Weibo are increasingly being used to enhance situational awareness during various natural and man-made disaster events such as floods, earthquakes, and bomb blasts. During any such event, thousands of microblogs (tweets) are posted in short intervals of time. Typically, only a small fraction of these tweets contribute to situational awareness, while the majority...

Keywords:
ACCURACY
SPEECH
TRAINING

Content availability

Available (247)
None (1)

Publication type

book (241)
article (7)

Keywords

SPEECH RECOGNITION (133)
HIDDEN MARKOV MODELS (102)
FEATURE EXTRACTION (82)
ACOUSTICS (53)
SPEECH PROCESSING (50)
SUPPORT VECTOR MACHINES (41)
DATABASES (30)
SPEAKER RECOGNITION (30)
MEL FREQUENCY CEPSTRAL COEFFICIENT (27)
NATURAL LANGUAGE PROCESSING (26)
CLASSIFICATION ALGORITHMS (24)
TRAINING DATA (22)
TESTING (21)
ARTIFICIAL NEURAL NETWORKS (20)
DATA MINING (20)
LEARNING (ARTIFICIAL INTELLIGENCE) (18)
AUTOMATIC SPEECH RECOGNITION (17)
DATA MODELS (17)
NOISE (16)
VECTORS (16)
SPEAKER IDENTIFICATION (15)
SVM (15)
GAUSSIAN PROCESSES (14)
SUPPORT VECTOR MACHINE (14)
TAGGING (14)
COMPUTATIONAL MODELING (13)
HIDDEN MARKOV MODEL (12)
MFCC (12)
PATTERN CLASSIFICATION (12)
SIGNAL PROCESSING (12)
ADAPTATION MODEL (11)
CORRELATION (11)
SPEECH SYNTHESIS (11)
ALGORITHM DESIGN AND ANALYSIS (10)
COMPUTERS (10)
CONFERENCES (10)
GMM (10)
KERNEL (10)
MATHEMATICAL MODEL (10)
ROBUSTNESS (10)
STATISTICAL ANALYSIS (10)
EDUCATIONAL INSTITUTIONS (9)
EMOTION RECOGNITION (9)
ESTIMATION (9)
GAUSSIAN MIXTURE MODEL (9)
MACHINE LEARNING (9)
NATURAL LANGUAGES (9)
NEURAL NETWORKS (9)
PATTERN RECOGNITION (9)
SIGNAL CLASSIFICATION (9)
SUPPORT VECTOR MACHINE CLASSIFICATION (9)
TRANSFORMS (9)
CEPSTRAL ANALYSIS (8)
CLASSIFICATION (8)
DECODING (8)
HMM (8)
MAXIMUM LIKELIHOOD ESTIMATION (8)
PRINCIPAL COMPONENT ANALYSIS (8)
SPEECH ANALYSIS (8)
STRESS (8)
TEXT ANALYSIS (8)
VOCABULARY (8)
CONTEXT (7)
DISCRIMINATIVE TRAINING (7)
MICROPHONES (7)
PROBABILITY (7)
SIGNAL TO NOISE RATIO (7)
ANALYTICAL MODELS (6)
DICTIONARIES (6)
EQUATIONS (6)
ERROR ANALYSIS (6)
INFORMATION RETRIEVAL (6)
LABELING (6)
PREDICTIVE MODELS (6)
VECTOR QUANTIZATION (6)
ACOUSTIC MODELING (5)
ACOUSTIC SIGNAL PROCESSING (5)
CHARACTER RECOGNITION (5)
CLUSTERING METHODS (5)
COMPLEXITY THEORY (5)
CONTEXT MODELING (5)
COVARIANCE MATRIX (5)
DECISION TREES (5)
DETECTORS (5)
ELECTRONIC MAIL (5)
ENCODING (5)
ENTROPY (5)
GAUSSIAN MIXTURE MODELS (5)
KNOWLEDGE BASED SYSTEMS (5)
LABORATORIES (5)
LANGUAGE MODEL (5)
NEURAL NETS (5)
NIST (5)
OPTIMIZATION (5)
PATTERN CLUSTERING (5)
REAL TIME SYSTEMS (5)
REVERBERATION (5)