Items from 1 to 20 out of 41 results

chapter

On the use of EMD for automatic newborn cry segmentation

Lina Abou-Abbas, Leila Montazeri, Christian Gargour, Chakib Tadj

2015 International Conference on Advances in Biomedical Engineering (ICABME) > 262 - 265

2015 International Conference on Advances in Biomedical Engineering (ICABME)

Cry segmentation is an essential preprocessing step in any infant crying diagnosis system. Besides crying sounds consisting of expiration phases followed by short periods of inspiration episodes, each recording of newborn cries also includes silence sections as well as other sounds such as speech of caregivers, noise and the sound of medical equipment. This paper is devoted to a newly developed Empirical...

chapter

Cepstral noise subtraction for robust automatic speech recognition

Robert Rehr, Timo Gerkmann

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 375 - 378

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The robustness of speech recognizers towards noise can be increased by normalizing the statistical moments of the Mel-frequency cepstral coefficients (MFCCs), e.g. by using cepstral mean normalization (CMN) or cepstral mean and variance normalization (CMVN). The necessary statistics are estimated over a long time window, and often a complete utterance is chosen. Consequently, changes in the background...

chapter

A reliable speaker verification system based on LPCC and DTW

Rekha Nair, Nirmala Salam

2014 IEEE International Conference on Computational Intelligence and Computing Research > 1 - 4

2014 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC)

The human voice can serve as a password/key for access to various services. In a speaker verification system, the voice is used to verify the speaker based on features extracted from the voice signal. In automated speaker verification, the speaker's voice signal is processed to extract speaker-specific information, which is used to generate a voiceprint, also known as a template, that cannot be replicated...

chapter

Perceptual Evaluation of Voice Quality and Its Correlation with Acoustic Measurement

Farideh Jalalinajafabadi, Chaitanya Gadepalli, Frances Ascott, Jarrod Homer, et al.

2013 European Modelling Symposium > 283 - 286

2013 European Modelling Symposium (EMS)

The GRBAS scale is a widely used subjective measure of voice quality. The aim of this paper is to investigate the correlation between the 'grade', 'roughness', 'breathiness', 'asthenia' and 'strain' dimensions of this scale and the objective measurements provided by the 'Analysis of Dysphonia in speech and Voice' (ADSV) software package. To do this, voice recordings of 107 samples were collected in...

chapter

Automatic detection of Parkinson's disease using noise measures of speech

E. A. Belalcazar-Bolanos, J. R. Orozco-Arroyave, J. D. Arias-Londono, J. F. Vargas-Bonilla, et al.

Symposium of Signals, Images and Artificial Vision - 2013: STSIVA - 2013 > 1 - 5

2013 XVIII Symposium of Image, Signal Processing, and Artificial Vision (STSIVA)

Parkinson's disease (PD) is a neurodegenerative disorder that is characterized by the loss of dopaminergic neurons in the midbrain. It has been demonstrated that about 90% of the people with PD also develop speech impairments, exhibiting symptoms such as monotonic speech, low pitch intensity, inappropriate pauses, imprecision in consonants and problems in prosody; although these problems have already been identified,...

chapter

Connected-digits recognition for an under-resourced language using Hidden Markov Models

Mabu Johannes Manaileng, Madimetja Jonas Manamela

Proceedings ELMAR-2013 > 211 - 214

2013 55th International Symposium ELMAR

This paper presents the development of a speech recognition system for automatically recognizing fluently spoken digit strings in Northern Sotho. The digit strings can be isolated or connected/continuous with known or unknown length. The digit recognition system has been trained with the aim of satisfying its potential end-users. Our main research focus was to enhance the robustness of a connected-digits...

chapter

A phone segmentation method and its evaluation on Mandarin speech corpus

Dac-Thang Hoang, Hsiao-Chuan Wang

2012 8th International Symposium on Chinese Spoken Language Processing > 373 - 377

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

This paper presents a phone segmentation method that requires no prior knowledge of the text contents. The proposed method is an unsupervised phone boundary detection based on a band-energy tracing technique. It demonstrates better performance than previous works when applied to the TIMIT corpus. But the performance degrades when the method is applied to a Mandarin Chinese speech database,...

chapter

Cepstral analysis based on the glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise

Cassia Valentini-Botinhao, Ranniery Maia, Junichi Yamagishi, Simon King, et al.

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 3997 - 4000

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper we introduce a new cepstral coefficient extraction method based on an intelligibility measure for speech in noise, the Glimpse Proportion measure. This new method aims to increase the intelligibility of speech in noise by modifying the clean speech, and has applications in scenarios such as public announcement and car navigation systems. We first explain how the Glimpse Proportion measure...

chapter

Unsupervised phone segmentation method using delta spectral function

Dac-Thang Hoang, Hsiao-Chuan Wang

2011 International Conference on Speech Database and Assessments (Oriental COCOSDA) > 152 - 156

2011 Oriental COCOSDA 2011 - International Conference on Speech Database and Assessments

Unsupervised phone segmentation means that the phone boundaries in an utterance can be detected without prior knowledge of the text contents. Usually, a spectral change in the speech signal implies the existence of a phone boundary. In this paper, the Delta Spectral Function (DSF) is defined for each frame to represent the variation of band energy for a specific band. Then a number of bands that...

chapter

Hierarchical audio classification using cepstral modulation ratio regressions based on Legendre polynomials

Anil Nagathil, Peter Gottel, Rainer Martin

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2216 - 2219

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In this work we present a scalable feature set which is obtained by fitting orthogonal polynomials to the normalized modulation spectrum of cepstral coefficients and which can be easily adapted to different classification tasks. The performance of the feature set is investigated in a hierarchically structured audio signal classification experiment and compared with other approaches reported in the...

chapter

Voice source features for cognitive load classification

Tet Fei Yap, Julien Epps, Eliathamby Ambikairajah, Eric H. C. Choi

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5700 - 5703

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Previous work in speech-based cognitive load classification has shown that the glottal source contains important information for cognitive load discrimination. However, the reliability of glottal flow features depends on the accuracy of the glottal flow estimation, which is a non-trivial process. In this paper, we propose the use of acoustic voice source features extracted directly from the speech...

chapter

Investigations into the incorporation of the Ideal Binary Mask in ASR

William Hartmann, Eric Fosler-Lussier

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4804 - 4807

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

While much work has been dedicated to exploring how best to incorporate the Ideal Binary Mask (IBM) in automatic speech recognition (ASR) for noisy signals, we demonstrate that the simple use of masked speech can outperform standard spectral reconstruction methods. We explore the effects of both the accuracy of the mask estimation and the strength of the language model on our results. The relative...

chapter

Study of automatic biosounds detection and classification using SVM and GMM

Bor Jenq Chua, Xue Jun Li, Huy Dat Tran

2011 IEEE/NIH Life Science Systems and Applications Workshop (LiSSA) > 155 - 158

2011 IEEE/NIH 5th Life Science Systems and Applications Workshop (LiSSA)

Ambulatory devices can be used to detect heart diseases and save lives at critical times. These devices are based on sound classification that usually adopts a suitable data mining algorithm. This paper investigates the performance of Support Vector Machine (SVM) and Gaussian Mixture Model (GMM) classifiers in classifying sound samples. The SVM classifier makes use of a linear separating hyperplane to...

chapter

Comparison of features extracted using time-frequency and frequency-time analysis approach for text-independent speaker identification

N Sen, T Basu, S Chakroborty

2011 National Conference on Communications (NCC) > 1 - 5

2011 National Conference on Communications (NCC)

This paper compares the feature sets extracted using the time-frequency analysis approach and the frequency-time analysis approach for text-independent speaker identification. The Mel-frequency cepstral coefficient (MFCC) feature set and the Inverted Mel-frequency cepstral coefficient (IMFCC) feature set are extracted using the time-frequency analysis approach. Temporal energy subband cepstral coefficient (TESBCC) feature...

chapter

Effect of MFCC normalization on vector quantization based speaker identification

M H Shirali-Shahreza, Sajad Shirali-Shahreza

The 10th IEEE International Symposium on Signal Processing and Information Technology > 250 - 253

2010 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT 2010)

Mel Frequency Cepstral Coefficients (MFCC) are widely used in speech recognition and speaker identification. MFCC features are usually pre-processed before being used for recognition. One such pre-processing step is creating delta and delta-delta coefficients and appending them to the MFCCs to create the feature vector. Another pre-processing step is coefficient mean normalization. In this paper, the effect of these...

chapter

A novel segmentation method of Sound-Packets for Bangla speech signal

M A N R Rahaman, A Das, M Z Nayen, M S Rahman

International Conference on Electrical & Computer Engineering (ICECE 2010) > 510 - 513

2010 6th International Conference on Electrical & Computer Engineering (ICECE 2010)

This paper describes several Sound-Packet segmentation techniques, which will facilitate Automatic Speech Recognition (ASR) for Bangla speech signals. The approximate duration of a sound-packet has been determined and an envelope-detection method has been presented to determine the end-points of sound-packets. The 1st difference method, based on the moving average of the 1st difference of the signal, is then...

chapter

New robust speech recognition using DTW in noise

Zhang Yuxin, Y Miyanaga, C Siriteanu

2010 10th International Symposium on Communications and Information Technologies > 34 - 38

2010 10th International Symposium on Communications and Information Technologies (ISCIT 2010)

This paper proposes a new robust speech recognition method. Since the hidden Markov model (HMM) algorithm needs a lot of training computation, the dynamic time warping (DTW) algorithm based on a median filter is used instead in our system. Using the short-term energy method, the non-speech segments can be removed. Recognition accuracy is thus improved. The cepstral mean subtraction (CMS), running...

chapter

Speech Emotion Analysis in Noisy Real-World Environment

A Tawari, M M Trivedi

2010 20th International Conference on Pattern Recognition > 4605 - 4608

2010 20th International Conference on Pattern Recognition (ICPR 2010)

Automatic recognition of emotional states via speech signal has attracted increasing attention in recent years. A number of techniques have been proposed which are capable of providing reasonably high accuracy for controlled studio settings. However, their performance is considerably degraded when the speech signal is contaminated by noise. In this paper, we present a framework with adaptive noise...

chapter

Auditory Features Revisited for Robust Speech Recognition

F Kelly, N Harte

2010 20th International Conference on Pattern Recognition > 4456 - 4459

2010 20th International Conference on Pattern Recognition (ICPR 2010)

Auditory based front-ends for speech recognition have been compared before, but this paper focuses on two of the most promising algorithms for noise robustness in automatic speech recognition (ASR). The feature sets are Zero-Crossings with Peak Amplitudes (ZCPA) and the recently introduced Power-Law Nonlinearity and Power-Bias Subtraction (PNCC). Standard Mel-Frequency Cepstral Coefficients (MFCC)...

chapter

Influence of acoustic low-level descriptors in the detection of clinical depression in adolescents

Lu-Shih Alex Low, Namunu C Maddage, Margaret Lech, Lisa Sheeber, et al.

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5154 - 5157

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

In this paper, we report the influence that classification accuracies have in speech analysis from a clinical dataset by adding acoustic low-level descriptors (LLD) belonging to prosodic (i.e. pitch, formants, energy, jitter, shimmer) and spectral features (i.e. spectral flux, centroid, entropy and roll-off) along with their delta (Δ) and delta-delta (Δ-Δ) coefficients to two baseline features of...

Keywords:
ACCURACY
SPEECH
CEPSTRAL ANALYSIS

Publication type

book (39)
article (2)

Keywords

FEATURE EXTRACTION (24)
MEL FREQUENCY CEPSTRAL COEFFICIENT (17)
SPEECH RECOGNITION (15)
SPEAKER RECOGNITION (13)
SPEECH PROCESSING (12)
HIDDEN MARKOV MODELS (11)
MFCC (9)
NOISE (9)
TRAINING (8)
AUTOMATIC SPEECH RECOGNITION (7)
DATABASES (7)
GAUSSIAN MIXTURE MODEL (7)
GAUSSIAN PROCESSES (7)
SPEAKER IDENTIFICATION (6)
MEL-FREQUENCY CEPSTRAL COEFFICIENTS (5)
NOISE MEASUREMENT (5)
ROBUSTNESS (5)
CLASSIFICATION (4)
ESTIMATION (4)
GMM (4)
HIDDEN MARKOV MODEL (4)
MEL-FREQUENCY CEPSTRAL COEFFICIENT (4)
SIGNAL TO NOISE RATIO (4)
VECTOR QUANTIZATION (4)
ACOUSTIC SIGNAL PROCESSING (3)
ACOUSTICS (3)
AUDIO SIGNAL PROCESSING (3)
CLASSIFICATION ALGORITHMS (3)
CORRELATION (3)
INDEXES (3)
MEDICAL SIGNAL PROCESSING (3)
MEL FREQUENCY CEPSTRAL COEFFICIENTS (3)
NOISE ROBUSTNESS (3)
SIGNAL CLASSIFICATION (3)
SPEECH ANALYSIS (3)
SPEECH ENHANCEMENT (3)
SUPPORT VECTOR MACHINES (3)
TESTING (3)
ADAPTATION MODEL (2)
ADOLESCENTS (2)
ALGORITHM DESIGN AND ANALYSIS (2)
APPROXIMATION METHODS (2)
CEPSTRAL FEATURE (2)
CEPSTRUM (2)
CHANNEL BANK FILTERS (2)
CLINICAL DEPRESSION (2)
COMPUTATIONAL MODELING (2)
COMPUTERS (2)
DATA MINING (2)
DELTA COEFFICIENTS (2)
DELTA-DELTA COEFFICIENTS (2)
DISCRETE FOURIER TRANSFORMS (2)
DISEASES (2)
EDUCATIONAL INSTITUTIONS (2)
EMOTION RECOGNITION (2)
FAST FOURIER TRANSFORMS (2)
FILTER BANKS (2)
FILTERING THEORY (2)
FINITE IMPULSE RESPONSE FILTER (2)
FREQUENCY RESPONSE (2)
GAUSSIAN MIXTURE MODELS (2)
HARMONIC ANALYSIS (2)
MICROPHONES (2)
NIST (2)
NOISE REDUCTION (2)
PATTERN CLASSIFICATION (2)
PEDIATRICS (2)
POLYNOMIALS (2)
PREDICTION ALGORITHMS (2)
PSYCHOLOGY (2)
SET THEORY (2)
SIGNAL PROCESSING (2)
SILICON (2)
SPEAKER VERIFICATION (2)
SPECTRAL CENTROID (2)
SPEECH EMOTION ANALYSIS (2)
SUPPORT VECTOR MACHINE CLASSIFICATION (2)
VECTOR QUANTISATION (2)
VQ (2)
WHITE NOISE (2)
ACCELERATION (1)
ACOUSTIC ANALYSIS (1)
ACOUSTIC FEATURES (1)
ACOUSTIC LOW-LEVEL DESCRIPTORS (1)
ACOUSTIC MEASUREMENTS (1)
ACOUSTIC REFLECTION (1)
ACOUSTIC TRANSFER FUNCTION (1)
ACTIVE MICROPHONE (1)
ACTIVE OPERATION (1)
ADAPTIVE NOISE CANCELLATION (1)
ADDITIVE NOISE (1)
ADSV (1)
AFFECT ANALYSIS (1)
AFFECT EXPRESSION RECOGNITION (1)
AFFECTIVE COMPUTING (1)
AGE CLASSIFICATION (1)
AGE CLASSIFICATION METHOD (1)

INFONA - science communication portal
