In this paper, efficient feature extraction methods and a fuzzy-logic-based disorder assessment technique were used to investigate voice signals of patients suffering from functional dysphonia, hyperfunctional dysphonia, vocal cord paralysis and laryngitis. In this work, a vector of 28 acoustic parameters was the input for Principal Component Analysis, kernel Principal Component Analysis...
In spatial audio analysis-synthesis, one of the key issues is to decompose a signal into primary and ambient components based on their spatial features. Principal component analysis (PCA) has been widely employed in primary component extraction, and shifted PCA (SPCA) is employed to enhance the primary extraction for input signals involving the inter-channel time difference. However, SPCA generally...
Recent work has demonstrated the feasibility of extracting semantic categories directly from cortical measures (e.g., electroencephalography, EEG) during receptive tasks. Here, we automatically classify speech stimuli as either synonymous or non-synonymous with a prior prime in a speech-receptive task given only EEG data with up to 86.84% accuracy. An analysis of variance reveals no significant difference...
This paper motivates the use of a combination of mel frequency cepstral coefficients (MFCC) and their delta derivatives (DMFCC and DDMFCC), calculated using mel-spaced Gaussian filter banks, for text-independent speaker recognition. MFCC, modeled on the human auditory system, shows robustness against noise and session changes and hence has become synonymous with speaker recognition. Our main aim is to test...
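As a rough illustration of how delta derivatives are commonly appended to cepstral frames, the sketch below applies the standard regression formula to a list of frames. It assumes the MFCC frames have already been computed elsewhere; the function names `delta` and `stack_with_deltas`, the window `width`, and the edge-clamping choice are hypothetical and are not taken from the paper.

```python
def delta(frames, width=2):
    """Regression-based delta of a sequence of feature frames.

    frames: list of equal-length lists of floats (one list per time frame).
    width:  number of frames on each side used in the regression (assumed value).
    """
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]  # clamp at the last frame
                behind = frames[max(t - n, 0)][k]    # clamp at the first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_with_deltas(frames, width=2):
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(frames, width)
    d2 = delta(d1, width)
    return [c + a + b for c, a, b in zip(frames, d1, d2)]
```

Applied to 13-dimensional static frames, `stack_with_deltas` would yield 39-dimensional vectors, which is a common but not universal configuration.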
This paper presents the process of Quranic Accent Automatic Identification. Recent feature extraction techniques used for Quranic verse rule (Tajweed) identification include Mel Frequency Cepstral Coefficients (MFCC), which are prone to additive noise that may degrade the classification result. Therefore, to improve the performance of MFCC, the addition of Spectral Centroid features is proposed...
Parkinson's disease (PD) is a neurodegenerative brain disorder that occurs when approximately 60% to 80% of the dopamine-producing cells are damaged. PD is the second most common neurodegenerative disorder after Alzheimer's disease. PD can be diagnosed from various signals such as EEG, gait and speech. Approximately 90 percent of people with PD suffer from speech disorders; thus it might be considered as the easiest...
In this paper, principal component analysis (PCA) is applied to speech emotion recognition to improve the accuracy of the system. Traditional prosodic features, such as pitch-related and formant-related features, are extracted from the Berlin speech database [7] and a Chinese database. The collected feature data are processed by PCA to remove irrelevant information. After that,...
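The PCA step mentioned above can be sketched in a few lines. This is a minimal, dependency-free illustration that finds only the first principal component by power iteration on the sample covariance matrix; it is not the paper's implementation, and the function names, the iteration count, and the starting vector are assumptions. A production system would more likely use a full eigendecomposition from a numerical library.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (means, direction) for the dominant principal component.

    rows: list of equal-length feature vectors (lists of floats), at least two.
    Uses power iteration, which assumes the start vector is not orthogonal
    to the dominant eigenvector of the covariance matrix.
    """
    n = len(rows)
    dim = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    # Sample covariance matrix (dim x dim).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    # Deliberately non-uniform start vector (hypothetical choice).
    v = [1.0 / (j + 1) for j in range(dim)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # zero variance, or an unlucky start vector
            break
        v = [x / norm for x in w]
    return means, v


def project(rows, means, direction):
    """Project each centered row onto the given direction (one score per row)."""
    return [sum((r[j] - means[j]) * direction[j] for j in range(len(means)))
            for r in rows]
```

Keeping only the scores along the leading components discards low-variance directions, which is the sense in which PCA is often said to remove less informative feature content.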
In this study, semi-automated prediction of PD is investigated based on twenty-two voice features extracted from 147 subjects. First, the original voice features are used to recognize PD or its absence, with an MLP as the classifier and Levenberg-Marquardt and Scaled Conjugate Gradient as the training algorithms. Next, to identify the number of significant features amongst the original attributes,...
The number of speech features introduced to emotional speech recognition runs into the thousands, which makes dimensionality reduction an inevitable part of an emotional speech recognition system. The elastic net, greedy feature selection, and supervised principal component analysis are three recently developed dimensionality reduction algorithms whose application we have considered...
Voice-based call centers enable customers to query for information by speaking to agents in the call center. Most often these conversations are recorded for analysis, with the intent of identifying ways to improve the call center's performance and serve the customer better. Today, humans analyze the recorded conversations by listening to them, which...
To improve the accuracy of visual speech recognition systems, forming a subset of relevant visual features, from a large set of extracted visual cues, is of fundamental importance. In this paper, two feature selection techniques, Principal Component Analysis (PCA) and a relatively recent method, Minimum Redundancy Maximum Relevance (mRMR), are separately applied on the extracted visual features. Prominent...
The quality of shared enjoyment in interactions is a key aspect related to Autism Spectrum Disorders (ASD). This paper discusses two types of enjoyment: the first refers to humorous events and is associated with one's positive affective state and the second is used to facilitate social interactions between people. These types of shared enjoyment are objectively specified by their proximity to a voiced...
Brain-computer interfaces (BCIs) based on event-related potentials (ERP) are promising tools for communicating with patients suffering from severely disabling diseases. ERP is evoked by various stimuli, such as auditory, olfactory, and visual stimuli. Some auditory-based BCIs using synthetic tones have been proposed; however, it is still challenging to increase the number of commands in auditory-based...
Speaker recognition is important for the successful development of speech recognizers in various real-world applications. In this paper, the speaker recognizer was developed using a sizable collection of male and female speakers, with pitch strength as the feature. We propose the Principal Factor Analysis (PFA) technique for dimensionality reduction for an accurate speaker recognition system...
This article presents the task of speaker identification in a closed group. It discusses the main steps of the identification process, ranging from the proper speech features to the classification methods and statistical signal processing. However, its main focus is on tuning the final system using the KNN classification method by setting the number of neighbors and reducing the feature vector dimension...
Spoken emotion recognition is an interesting and challenging subject. In this paper, a new feature extraction method based on local Fisher discriminant analysis (LFDA) is proposed for spoken emotion recognition. LFDA is used to extract the low-dimensional discriminant embedded feature data from high-dimensional emotional speech features on spoken emotion recognition tasks. The performance of LFDA...
This paper presents two nonlinear feature dimensionality reduction methods based on neural networks for an HMM-based phone recognition system. The neural networks are trained as feature classifiers to reduce feature dimensionality as well as maximize discrimination among speech features. The outputs of different network layers are used for obtaining transformed features. Moreover, the training of the...
Using local features generally provides higher accuracies compared to a global feature vector in face identification. In this study, taking into account the fact that better multimodal systems generally include individually good experts, multimodal identification using speech and local feature based face experts is studied. Both spPCA and mPCA are considered for this purpose. Experiments on XM2VTS...
In recent years, the field of automatic speaker identification has begun to exploit high-level sources of speaker-discriminative information, in addition to traditional models of spectral shape. These sources include pronunciation models, prosodic dynamics, pitch, pause, and duration features, phone streams, and conversational interaction. As part of this broader thrust, we explore a new frame-level...
SVM is a novel statistical learning method that has been successfully applied in speaker recognition. However, feature vectors extracted from speech overlap, and noise is included in the original data space. These problems can complicate SVM training and reduce accuracy during the recognition phase. In this paper, a novel method...