This paper investigates the use of Multi-Dimensional Voice Program (MDVP) parameters to automatically detect voice pathology in the Arabic Voice Pathology Database (AVPD). MDVP parameters are widely used by physicians and clinicians to detect voice pathology; however, MDVP is commercial software. AVPD is a newly developed speech database designed to suit a wide range of experiments in the field of...
Over time, human-computer interaction has branched into many other fields, such as engineering, cognition, and medicine. Speech analysis has also become an important area of study. People use speech to interact with machines, bridging the gap between the physical and digital worlds. Speech emotion recognition has become an integral subfield in the domain...
This paper investigates the contribution of frequency bands to automatic voice pathology detection. First, the input voice signal is passed through a number of time-domain band-pass filters whose center frequencies are spaced on an octave scale. Each filter output is then divided into overlapping frames. The autocorrelation function is applied to each block to find the first largest peak, in areas other...
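The framing and autocorrelation steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the band-pass filtering stage is omitted, the function names (`octave_centers`, `split_frames`, `autocorr`, `peak_lag`) and the `min_lag` parameter are hypothetical, and because the abstract is truncated, excluding the region around zero lag from the peak search is an assumption.

```python
import math


def octave_centers(f_low, n_bands):
    """Center frequencies one octave apart: each is double the previous one."""
    return [f_low * 2 ** k for k in range(n_bands)]


def split_frames(signal, frame_len, hop):
    """Split a signal into frames; hop < frame_len makes the frames overlap."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def autocorr(frame):
    """Unnormalized autocorrelation of one frame for lags 0..len(frame)-1."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(n)]


def peak_lag(frame, min_lag):
    """Return (lag, normalized value) of the largest autocorrelation peak.

    Assumption: lags below min_lag are skipped so that the main lobe
    around zero lag is not picked up as the peak.
    """
    r = autocorr(frame)
    best = max(range(min_lag, len(r)), key=lambda lag: r[lag])
    strength = (r[best] / r[0]) if r[0] else 0.0
    return best, strength


# Example: a sine wave with a period of 20 samples.
signal = [math.sin(2 * math.pi * n / 20) for n in range(200)]
frames = split_frames(signal, frame_len=100, hop=50)
peaks = [peak_lag(frame, min_lag=10) for frame in frames]
```

For the sine wave in the example, each of the three frames yields a peak at lag 20, which matches the period of the signal. In a full pipeline, the same peak search would run on the output of each octave-spaced band-pass filter rather than on the raw signal.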
The use of electroencephalography (EEG) in the domain of Brain-Computer Interfaces is now commonplace. EEG for imagined speech reproduction and the observation of brain responses to audio stimuli are active areas of research. In this paper, we consider the case of imagined and mouthed non-audible speech recorded with EEG electrodes. We analyze different feature extraction techniques such as Mel Frequency...
In the past decade, a lot of research has gone into automatic Speech Emotion Recognition (SER). The primary objective of SER is to improve the man-machine interface. It can also be used in lie detectors to monitor the psychophysiological state of a person. Recently, speech emotion recognition has also found applications in medicine and forensics. In this paper, seven emotions are recognized using pitch...
Speech signal processing and recognition systems have gained considerable attention in the last few years due to their widespread application. In this study, we conduct a comparative analysis of various machine learning classifiers for effective detection of Parkinson's disease from the voice disorder known as dysphonia. To investigate a robust detection process, three independent classifier topologies...
The use of digital technology is growing at a very fast pace, which has led to the emergence of systems based on cognitive infocommunications. The expansion of this sector requires combining methods in order to ensure robustness in cognitive systems.
Detecting emotional traits in call centre interactions can be beneficial to the quality management of the services provided, since this reveals the positioning of both speakers, i.e. satisfaction or frustration and anger on the customers' side, and stress detection, disappointment mitigation or failure to provide the requested service on the operators' side. This paper describes a machine learning...
Parkinson's disease (PD) is a neurodegenerative brain disorder that occurs when approximately 60% to 80% of the dopamine-producing cells are damaged. PD is the second most common neurodegenerative disorder after Alzheimer's disease. PD can be diagnosed from various signals such as EEG, gait, and speech. Approximately 90 percent of people with PD suffer from speech disorders, thus it might be considered as the easiest...
In certain situations, speech might be shifted in the frequency domain amid the presence of noise. To compensate for the spectral shift, it is important to know the amount of frequency shift present. A method based on Mel-frequency cepstral coefficients (MFCC) and a Gaussian mixture model (GMM) supervector is proposed for detecting frequency shifts in speech. MFCC or LFCC is extracted to...
Using melody and/or lyrics to query a music retrieval system is convenient for users but challenging for developers. This paper proposes efficient schemes for realizing key algorithms in such a system. Specifically, we characterize our system by adding lyrics to the query as follows: A Support Vector Machine (SVM) is employed to distinguish humming queries from singing queries. For a singing query,...
Speaker recognition is important for the successful development of speech recognizers in various real-world applications. In this paper, the speaker recognizer was developed using a sizable collection of speakers, both male and female, with pitch strength as the feature. We propose a Principal Factor Analysis (PFA) technique for dimensionality reduction for an accurate speaker recognition system...
The paper presents the support vector machine binary decision tree scheme (SVM-BDT) used for broadcast news (BN) audio classification. The SVM-BDT architecture was designed to solve the multi-class discrimination problem for the considered acoustic events: pure speech, speech with music, speech with environment sound, music, and environment sound. Its performance was investigated by using Mel-frequency cepstral...
Monaural speech separation is a very challenging task. CASA-based systems utilize acoustic features to produce a time-frequency (T-F) mask. In this study, we propose a classification approach to the monaural separation problem. Our feature set consists of pitch-based features and amplitude modulation spectrum features, which can discriminate both voiced and unvoiced speech from nonspeech interference...
Ambulatory devices can be used to detect heart diseases and save lives at critical times. These devices are based on sound classification, which usually adopts a suitable data mining algorithm. This paper investigates the performance of Support Vector Machine (SVM) and Gaussian Mixture Model (GMM) classifiers in classifying sound samples. The SVM classifier makes use of a linearly separable hyperplane to...
The Gaussian Mixture Model (GMM) is a widely used, simple, and effective modeling approach for spoken language identification. Traditionally, the EM algorithm is used to train this model. In this paper we propose a new method named WA-GMM (Weight Adapted GMM) for estimating the weights of the GMM Gaussian components using bag-of-unigram and Support Vector Machine (SVM): SVM weights which are trained on bag-of-unigram...
In this paper, a hierarchical structure is proposed for automatic gender identification (AGI). Two clustering techniques are used in this structure. The first is divisive clustering, which divides the speakers of each gender into classes of speakers. The second is agglomerative clustering, which creates a hierarchical structure. Feature reduction is done by SOAP feature...
In this paper, we propose a multi-layered feature combination approach with a support vector machine (SVM) for Chinese accent identification. The multi-layered features include both segmental and suprasegmental information, such as MFCC and pitch contour, to capture the diversity of variations in Chinese accented speech. The pitch contour is estimated using a cubic polynomial method to...
This paper investigates lexical stress detection for Chinese learners of English, where a combined differential acoustic feature is developed to represent the lexical stress of polysyllabic words in continuous speech. The frame-averaged feature and the intra-word contextual information can be input to the classifiers without normalization. The word-based stress detection method proposed in...
In our research, we found that grammatical information in the Modern Chinese Grammar Information Dictionary is very effective for revising chunk borders. The dictionary is therefore used to extract the chunk Border Revised Rules (BRR). In this paper, a new chunking method is proposed: an SVM is combined with BRR and TBL for chunking. We reduced the number of SVM feature vectors,...