Items from 1 to 20 out of 135 results

chapter

Noise robust speech recognition system using Mel cepstral and genetic algorithm

Garg Mamta, Arora Ajat Shatru, Gupta Savita

2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) > 3151 - 3155

This paper proposes an MFCC-based technique for analyzing audio signals, with application to speech classification. The proposed work uses multi-resolution (wavelet) analysis and spectral analysis for feature extraction. The approach combines a number of features, such as Mel Frequency Cepstral Coefficients (MFCC) and FFT coefficients, with wavelet-based features. In addition, accuracy...

chapter

Automatic speech annotation based on enhanced wavelet Packets Best Tree Encoding (EWPBTE) feature

Mohamed Hassan Mohamed, Ashraf Mohamed Ali Hassan, N.M. Hussein Hassan

2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) > 2611 - 2616

This paper introduces a fully automated Arabic phone recognition system based on the Enhanced Wavelet Packets Best Tree Encoding (EWPBTE) 15-point speech feature. WPBTE is enhanced by adding an energy component; the enhancement is implemented in Matlab and improves recognizer accuracy by 65%, which is the main contribution of this paper. EWPBTE...

chapter

Glottal pathology discrimination using ANN and SVM

Ashwini Visave, Pramod Kachare, Amutha Jeyakumar, Alice Cheeran, et al.

2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1377 - 1381

The use of modern technological advances in real-time biomedical analysis is crucial. The current work focuses on glottal pathology discrimination based on non-invasive speech analysis techniques. The primary setback in developing such a method is the irregular performance degradation of several state-of-the-art acoustic features. To address such problems, we have used the glottal-to-noise excitation ratio, which...

chapter

Feature selection experiments on emotional speech classification

Piyawat Sukhummek, Sawit Kasuriya, Thanaruk Theeramunkong, Chai Wutiwiwatchai, et al.

2015 12th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON) > 1 - 4

This paper presents experiments on feature selection for emotional speech classification. The experiments use 152 features, and minimum redundancy maximum relevance (mRMR) is applied as the feature selection method. The experiments are conducted on two corpora: Interactive Emotional Dyadic Motion Capture (IEMOCAP) and Emotional Tagged Corpus on Lakorn (EMOLA), which...

chapter

Feature extraction analysis on Indonesian speech recognition system

Untari N. Wisesty, Adiwijaya, Widi Astuti

2015 3rd International Conference on Information and Communication Technology (ICoICT) > 54 - 58

Speech recognition is widely applied to speech-to-text and speech-to-emotion tasks, to make gadgets and computers easier to use, and to help people with hearing disabilities. Feature extraction is one of the significant steps affecting the performance of speech recognition, so choosing it properly is essential. In this paper, we analyze feature extraction methods that can perform well for Indonesian speech...

chapter

Text-constrained speaker verification using fuzzy C means vector quantization

Debnath Saswati, Soni Badal, Das Pradip K.

2015 International Conference on Communications and Signal Processing (ICCSP) > 1511 - 1515

The most successful approach to speech and speaker recognition is to treat the speech signal as a stochastic pattern and to use a statistical pattern recognition technique for matching utterances. This paper studies the performance of a text-dependent speaker verification system using the Delta-Delta Mel Frequency Cepstral Coefficients (MFCC-Δ-Δ) feature vector and Fuzzy C means (FCM) speaker...

chapter

Detection of depression in adolescents based on statistical modeling of emotional influences in parent-adolescent conversations

Melissa N Stolar, Margaret Lech, Nicholas B Allen

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 987 - 991

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The current benchmark speech-based depression detection techniques rely on acoustic speech parameters collected from large sets of representative speech recordings. This study for the first time investigates depression detection based on the higher order influence model (HOIM) coefficients and emotional transition parameters derived from a relatively small set of conversational speech recordings representing...

chapter

Cepstral noise subtraction for robust automatic speech recognition

Robert Rehr, Timo Gerkmann

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 375 - 378

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The robustness of speech recognizers towards noise can be increased by normalizing the statistical moments of the Mel-frequency cepstral coefficients (MFCCs), e.g. by using cepstral mean normalization (CMN) or cepstral mean and variance normalization (CMVN). The necessary statistics are estimated over a long time window, and often a complete utterance is chosen. Consequently, changes in the background...

chapter

A unified framework for filterbank and time-frequency basis vectors in ASR frontends

Xiaoyu Liu, Stephen A. Zahorian

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4659 - 4663

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

For many years, filterbanks have been widely used as one step of frontend feature extraction for Automatic Speech Recognition (ASR). In this paper, we propose a unified framework for ASR frontends, by first moving the nonlinear amplitude scaling, and then combining the filterbank weights with the cosine basis vectors. As part of this framework, we also show that the delta terms used to encode feature...

chapter

Acoustic and para-verbal indicators of persuasiveness in social multimedia

Han Suk Shim, Sunghyun Park, Moitreya Chatterjee, Stefan Scherer, et al.

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2239 - 2243

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Persuasive communication and interaction play an important and pervasive role in many aspects of our lives. With the rapid growth of social multimedia websites such as YouTube, it has become more important and useful to understand persuasiveness in the context of online social multimedia content. In this paper, we present our results of conducting various analyses of persuasiveness in speech with...

chapter

Content based clinical depression detection in adolescents

Lu-Shih Alex Low, Namunu C. Maddage, Margaret Lech, Lisa Sheeber, et al.

2009 17th European Signal Processing Conference > 2362 - 2366

This paper studies the effectiveness of speech contents for detecting clinical depression in adolescents. We also evaluated the performances of acoustic features such as Mel frequency cepstral coefficients (MFCC), short time energy (Energy), zero crossing rate (ZCR) and Teager energy operator (TEO) using Gaussian mixture models for depression detection. A clinical data set of speech from 139 adolescents,...

chapter

Automatic gender classification using the mel frequency cepstrum of neutral and whispered speech: A comparative study

G. Nisha Meenakshi, Prasanta Kumar Ghosh

2015 Twenty First National Conference on Communications (NCC) > 1 - 6

Whispered speech resembles unvoiced speech because, unlike neutral speech, it lacks vocal fold vibration. Since information about the gender of a speaker typically lies in the pitch resulting from vocal fold vibration (the source signal), identifying gender from whispered speech is more challenging than from neutral speech. In the absence of pitch, we study the use...

chapter

Speaker based Language Independent Isolated Speech Recognition System

Shanthi Therese S., Chelpa Lingam

2015 International Conference on Communication, Information & Computing Technology (ICCICT) > 1 - 7

This paper presents a speaker-based Language Independent Isolated Speech Recognition System (LIISRS). Mel Frequency Cepstral Coefficients (MFCC), the most popular feature extraction technique, are used for training the system. Representative features are identified using the K-Means algorithm. The distortion measure is calculated using the Euclidean distance function. Pitch contour characteristics are used...

chapter

A unique approach in text independent speaker recognition using MFCC feature sets and probabilistic neural network

Khan Suhail Ahmad, Anil S. Thosar, Jagannath H. Nirmal, Vinay S. Pande

2015 Eighth International Conference on Advances in Pattern Recognition (ICAPR) > 1 - 6

This paper motivates the use of a combination of mel frequency cepstral coefficients (MFCC) and their delta derivatives (DMFCC and DDMFCC), calculated using mel-spaced Gaussian filter banks, for text-independent speaker recognition. MFCC, modeled on the human auditory system, shows robustness against noise and session changes and hence has become synonymous with speaker recognition. Our main aim is to test...

chapter

Improvement of phone recognition accuracy using source and system features

K E Manjunath, K. Sreenivasa Rao, M Gurunath Reddy

2015 International Conference on Signal Processing and Communication Engineering Systems > 501 - 505

2015 International Conference on Signal Processing And Communication Engineering Systems (SPACES)

The goal of this work is to improve phone recognition accuracy using a combination of source and system features. As speech is produced by exciting a time-varying vocal tract system with a time-varying excitation, we want to explore both the source and system components of the speech production system for phone recognition. The excitation source information is derived by processing the linear prediction residual of...

chapter

Classification of emotions from speech using implicit features

Mohit Srivastava, Anupam Agarwal

2014 9th International Conference on Industrial and Information Systems (ICIIS) > 1 - 6

Over time, human-computer interaction has extended into many other fields, such as engineering, cognition, and medicine. Speech analysis has also become an important area of study. People use this mode of interaction with machines to bridge the gap between the physical and digital worlds. Speech emotion recognition has become an integral subfield in the domain...

chapter

Feature extraction using Spectral Centroid and Mel Frequency Cepstral Coefficient for Quranic Accent Automatic Identification

Noraziahtulhidayu Kamarudin, S.A.R Al-Haddad, Shaiful Jahari Hashim, Mohammad Ali Nematollahi, et al.

2014 IEEE Student Conference on Research and Development > 1 - 6

2014 IEEE Student Conference on Research and Development (SCOReD)

This paper presents the process of Quranic Accent Automatic Identification. Recent feature extraction techniques used for Quranic verse rule (Tajweed) identification include Mel Frequency Cepstral Coefficients (MFCC), which are prone to additive noise that may degrade the classification result. Therefore, to improve the performance of MFCC, the addition of Spectral Centroid features is proposed...

chapter

Classification of emphatic consonants and their counterparts in Modern Standard Arabic using neural networks

Yasser M. Seddiq, Yousef A. Alotaibi, Sid-Ahmed Selouani

2014 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) > 73 - 77

This paper presents acoustic analysis work on Modern Standard Arabic (MSA). The problem of classifying the consonant counterparts in MSA is tackled here. The study considers four phonemes: the emphatic consonants /dˤ, ðˤ/ and their non-emphatic counterparts /d, ð/, respectively. An accurate automatic classification for those phonemes is to be achieved. Artificial neural networks (ANNs) are used for that purpose...

chapter

Analyzing the Impact of MFCC and LDA for the Development of Isolated Pashto Spoken Numbers ASR

Tanzeela, Arbab Waseem Abbas, Zakir Ali, Burhan Uddin

2014 12th International Conference on Frontiers of Information Technology > 350 - 354

2014 12th International Conference on Frontiers of Information Technology (FIT)

This paper presents an analysis of speaker-independent isolated Pashto spoken numbers for automatic speech recognition. First, a database was developed, comprising isolated Pashto numbers from sefer (0) to sul (100). Fifty speakers (25 male and 25 female, of different ages) who frequently speak the Yousafzai dialect were selected for recording. The recording has...

chapter

A new direct access framework for speaker identification system

Hery Heryanto, Saiful Akbar, Benhard Sitohang

2014 International Conference on Data and Software Engineering (ICODSE) > 1 - 5

We present a new Direct Access Framework (DAF) for speaker identification systems, which identifies a speaker based on original characteristics of the human voice. The direct access method identifies an object based on parts of the object itself, called its original characteristics. The proposed framework consists of two parts: the enrolment process and the identification process...

Keywords:
ACCURACY
SPEECH
MEL FREQUENCY CEPSTRAL COEFFICIENT

Publication type

book (132)
article (3)

Keywords

FEATURE EXTRACTION (92)
SPEECH RECOGNITION (73)
HIDDEN MARKOV MODELS (34)
SPEAKER RECOGNITION (29)
TRAINING (27)
SPEECH PROCESSING (26)
MFCC (25)
DATABASES (22)
SUPPORT VECTOR MACHINES (20)
CEPSTRAL ANALYSIS (17)
NOISE (15)
SPEAKER IDENTIFICATION (15)
CLASSIFICATION ALGORITHMS (12)
GMM (12)
EMOTION RECOGNITION (11)
GAUSSIAN PROCESSES (11)
AUTOMATIC SPEECH RECOGNITION (10)
FILTER BANKS (9)
GAUSSIAN MIXTURE MODEL (9)
MEL FREQUENCY CEPSTRAL COEFFICIENTS (9)
ROBUSTNESS (9)
ARTIFICIAL NEURAL NETWORKS (8)
COMPUTATIONAL MODELING (8)
SIGNAL TO NOISE RATIO (8)
TESTING (8)
VECTORS (8)
AUDIO SIGNAL PROCESSING (7)
DATA MINING (7)
HIDDEN MARKOV MODEL (7)
SUPPORT VECTOR MACHINE CLASSIFICATION (7)
VECTOR QUANTIZATION (7)
ACOUSTICS (6)
CORRELATION (6)
FILTER BANK (6)
GAUSSIAN MIXTURE MODELS (6)
MATHEMATICAL MODEL (6)
NOISE ROBUSTNESS (6)
SIGNAL PROCESSING (6)
SPEECH ANALYSIS (6)
ALGORITHM DESIGN AND ANALYSIS (5)
COMPUTERS (5)
CONFERENCES (5)
EQUATIONS (5)
FILTERING THEORY (5)
OPTIMIZATION (5)
PRINCIPAL COMPONENT ANALYSIS (5)
SIGNAL CLASSIFICATION (5)
ADAPTATION MODEL (4)
CEPSTRUM (4)
EDUCATIONAL INSTITUTIONS (4)
ELECTRONIC MAIL (4)
ESTIMATION (4)
FILTERING (4)
MEL-FREQUENCY CEPSTRAL COEFFICIENT (4)
MEL-FREQUENCY CEPSTRAL COEFFICIENTS (4)
MUSIC (4)
PATTERN CLUSTERING (4)
PSYCHOLOGY (4)
SPECTRAL ANALYSIS (4)
SPEECH CLASSIFICATION (4)
SPEECH ENHANCEMENT (4)
SPEECH SIGNAL (4)
STATISTICAL ANALYSIS (4)
SUPPORT VECTOR MACHINE (4)
SVM (4)
TRAINING DATA (4)
TRANSFORMS (4)
ACCELERATION (3)
ACOUSTIC MODEL (3)
ACOUSTIC SIGNAL PROCESSING (3)
ADDITIVE NOISE (3)
CHANNEL BANK FILTERS (3)
CLUSTERING METHODS (3)
COGNITION (3)
DATA MODELS (3)
DISCRETE FOURIER TRANSFORMS (3)
DISTANCE MEASUREMENT (3)
ENCODING (3)
FAST FOURIER TRANSFORMS (3)
FEATURE SELECTION (3)
FINITE IMPULSE RESPONSE FILTER (3)
GAUSSIAN MIXTURE SPEAKER MODEL (3)
GENETIC ALGORITHMS (3)
HISTOGRAMS (3)
HMM (3)
LABORATORIES (3)
MODULATION (3)
MULTILAYER NEURAL NETWORK (3)
MULTIMEDIA COMMUNICATION (3)
NOISE MEASUREMENT (3)
NYQUIST FILTER (3)
PATHOLOGY (3)
PATTERN CLASSIFICATION (3)
PATTERN RECOGNITION (3)
POLYNOMIALS (3)
ROBUST SPEECH RECOGNITION (3)
SIGNAL PROCESSING ALGORITHMS (3)

INFONA - science communication portal
