Although the Vector Space Model used for document representation in information retrieval systems integrates only a small quantity of knowledge, it continues to be used because of its computational cost, execution speed, and simplicity. We try to improve this document representation by adding syntactic information such as parts of speech. In this paper, we have evaluated three different tagging algorithms...
This paper presents an analysis of speaker-independent isolated spoken Pashto numbers for automatic speech recognition. First, a database was developed; it comprises isolated Pashto numbers from sefer (0) to sul (100). Fifty speakers (25 male and 25 female, of different ages) who regularly speak the Yousafzai dialect were selected for recording. The recording has...
This paper describes an implementation of speech recognition that recognizes and suppresses ten (10) defined profane and vulgar Filipino words. The adapted speech recognition architecture was that of the Oregon Graduate Institute's (OGI) Center for Spoken Language Understanding (CSLU). It utilizes a hybrid Hidden Markov Model/Artificial Neural Network (HMM/ANN) keyword spotting framework. The feature...
The lack of speech resources for the Arabic language is one of the most important obstacles facing speech researchers. Previously, we designed two automatic speech recognition (ASR) systems, one Arabic and one English, using two corpora: TIMIT for English and West Point for Arabic. Cross-language experiments were conducted using the two systems, and the results were determined with respect...
Sampling has long been a prominent tool in statistics and analytics, first and foremost when very large amounts of data are involved. In the realm of very large file systems (and hierarchical data stores in general), however, sampling has mostly been ignored and for several good reasons. Mainly, running sampling in such an environment introduces technical challenges that make the entire sampling process...
Emotion recognition from speech has evolved into one of the most significant research areas in the field of affective computing. In this paper, two emotional speech datasets have been analyzed based on gender distinction (male and female speech). This paper introduces a new approach to speech emotion recognition based on the AdaBoost classification algorithm. Artificial neural network has been...
Speech recognition is one of the promising technologies of the future. Voice user interfaces play an important role in many real-world applications. This paper presents speaker-independent isolated digit recognition for the Malayalam language and describes some application areas of digit recognition. Mel-Frequency Cepstral Coefficients (MFCC) are used as features and a Hidden Markov Model (HMM) is used as the...
In speaker recognition tasks, one reason for reduced accuracy is the presence of closely resembling speakers in the acoustic space. To increase the discriminative power of the classifier, the system must be able to use only the features that distinguish a given speaker from his/her acoustically resembling speaker. This paper proposes a technique to reduce the confusion errors by finding...
Research in speech recognition has made considerable progress alongside the tremendous growth of technology. Speech rate is one of the important factors that affect speech recognition accuracy. In the present work, training is performed on different speech rates (normal, slow, and fast), and testing is also done on different speech rates. Error rate will increase when the major...
In this paper, we present an overview of state-of-the-art approaches for speaker identification. Due to the increased number of dialogue system applications, interest in this field has grown significantly in recent years. Nevertheless, there are many open issues in the field of automatic speaker identification. Among them the choice of the appropriate speech signal features and machine learning...
This study investigated the effects of three different carriers on Mandarin tone perception. Three tone continua were constructed: modified speech, synthesized speech, and nonspeech. Identification tests were conducted for the two speech continua, while discrimination tests were conducted for all three continua. Results showed that category boundary position differed significantly between the...
Four multiclass Support Vector Machine (SVM) methods were designed for the task of speaker-independent phoneme recognition: All-at-once, One-against-all, One-against-one, and Directed Acyclic Graph SVM (DAGSVM). Power percentages in eight Discrete Wavelet Transform (DWT) frequency bands are used as features. All tests were carried out on the TIMIT database. Comparable recognition...
A personalized emotion recognition system aims to tune the model to recognize the expressive behaviors of a targeted person. Such a system can play an important role in various domains including call center and health care applications. Adapting any general emotion recognition system for a particular individual requires speech samples and prior knowledge about their emotional content. These assumptions...
Previous studies of automated detection of Major Depression in adolescents based on acoustic speech analysis identified the glottal and Teager Energy features as the strongest correlates of depression. This study investigates the effectiveness of these features in the early prediction of Major Depression in adolescents using a fully automated speech analysis and classification system. The prediction...
This paper considers a learning framework for speech emotion classification using a discriminant function based on Gaussian mixture models (GMMs). The GMM parameter set is estimated by margin scaling with a loss function to reduce the risk of predicting emotions with high loss. Here, the loss function is defined as a function of a distance metric using Watson and Tellegen's emotion model. Margin...
Previously, we compared several objective measures to estimate the subjective speech intelligibility scores of the Japanese Diagnostic Rhyme Test (DRT). PESQ-derived MOS, segmental SNR (SNRseg), frequency-weighted segmental SNR (fwSNRseg), and composite measures were tested. We mapped these measures to their corresponding intelligibility scores using quadratic equations trained on one speaker and one...
Ambulatory devices can be used to detect heart diseases and save lives at critical times. These devices are based on sound classification, which usually adopts a suitable data mining algorithm. This paper investigates the performance of Support Vector Machine (SVM) and Gaussian Mixture Model (GMM) classifiers in classifying sound samples. The SVM classifier makes use of a linear separating hyperplane to...
In speaker recognition tasks, the main reason for reduced accuracy is the presence of closely resembling speakers in the acoustic space. The conventional GMM-based modelling technique captures unique features along with features common to various classes. Further, it ignores knowledge of the phonetic content of the speech. In order to increase the discriminative power of the classifier, the system must be able...
A feature extraction method is presented that is robust against vocal tract length changes. It uses the generalized cyclic transformations primarily used within the field of pattern recognition. Under matched training and testing conditions, the resulting accuracies are comparable to those of MFCCs. However, under mismatched training and testing conditions with respect to the mean vocal tract length...
The performance of speaker identification systems has improved due to recent advances in speech processing techniques, but there is still a need for improvement in terms of text-independent speaker identification and suitable modelling techniques for voice feature vectors. It becomes difficult for a person to recognize a voice when uncontrollable noise is added to it. In this paper, feature vectors from...