Feature selection is a strategy that aims at making text classifiers more efficient and accurate. In this paper, we propose a novel feature selection method based on Tibetan grammar for Tibetan text classification. The Tibetan language expresses grammatical meaning through function words and word order, and function words account for a large proportion of the text. By analyzing the Tibetan grammar and distribution of part...
In speech development research, it is important to know how acoustic features of speech vary as a function of age, and at what age the variability and magnitude of those features start to exhibit adult-like patterns. During the first few years of life, a child's speech changes from the cries and babbles of an infant to the adult-like words and phrases of a young child. A number of acoustic studies observed...
Over time, human-computer interaction has branched into many other fields, such as engineering, cognition and medicine. Speech analysis has also become an important area of concern. People use this mode to interact with machines and so bridge the gap between the physical and digital worlds. Speech emotion recognition has become an integral subfield in the domain...
In this paper we present a new Direct Access Framework (DAF) for speaker identification systems, which identifies a speaker based on original characteristics of the human voice. The direct access method is a process that identifies an object based on parts of the object itself; these parts are called original characteristics. The proposed framework consists of two parts, the enrolment process and the identification process...
The classical front-end analysis in speech recognition is a spectral analysis that parameterizes the speech signal into feature vectors. This paper proposes a voice recognition model that is able to automatically classify and recognize a voice signal with background noise. The model uses the spectrogram, pitch period, short-time energy, zero-crossing rate, mel-frequency scale and cepstral...
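Two of the features named in the abstract above, short-time energy and zero-crossing rate, have simple frame-level definitions. The following is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the frame length and the hop size are hypothetical, and energy is taken as the mean squared amplitude of a frame, which is one common convention.

```python
def split_frames(signal, frame_len, hop):
    """Cut a sequence of samples into fixed-length frames (hypothetical helper)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame (one common definition)."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Toy signal: an alternating segment followed by a constant segment.
samples = [1.0, -1.0, 1.0, -1.0, 0.5, 0.5, 0.5, 0.5]
features = [(short_time_energy(f), zero_crossing_rate(f))
            for f in split_frames(samples, frame_len=4, hop=4)]
```

In this toy input the alternating frame has a high zero-crossing rate and the constant frame has none, which is the kind of contrast such features are typically used to capture.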
Aimed at the problem of real-time speech and music discrimination, this paper proposes a frame-level classification method that uses a novel “butterfly-like” fusion strategy based on decision trees (D-Tree). In our method, long-term features of the same type but of different time lengths are extracted to train each sub-classifier and make the fusion effective. A testing experiment indicates our approach...
Many past studies have been conducted on speech/music discrimination due to the potential applications for broadcast and other media; however, it remains possible to expand the experimental scope to include samples of speech with varying amounts of background music. This paper focuses on the development and evaluation of two measures of the ratio between speech energy and music energy: a reference...
Emotion recognition from speech has evolved into one of the most significant research areas in the field of affective computing. In this paper, two emotional speech datasets have been analyzed based on gender distinction (male and female speech). This paper introduces a new approach to speech emotion recognition based on the AdaBoost classification algorithm. An artificial neural network has been...
In this study, we used an improved version of the classical KNN algorithm that assigns each parameter of the feature vectors a weight according to its performance in the classification process. We obtained emotion recognition rates of around 65–67% for the Romanian language on the SROL database, which are comparable with the results for other languages, with non-professional...
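The idea of weighting each feature inside a KNN classifier can be illustrated with a weighted Euclidean distance. This is a minimal sketch under stated assumptions, not the authors' algorithm: the abstract does not say how the weights are derived, so they are simply passed in here, and the function names, labels and data points are hypothetical.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance in which each feature is scaled by its weight."""
    return math.sqrt(sum(w * (x - y) ** 2
                         for x, y, w in zip(a, b, weights)))


def weighted_knn_predict(train, query, weights, k=3):
    """Majority vote among the k training items nearest to the query.

    train is a list of (feature_vector, label) pairs; weights are assumed
    to be given (for example, from per-feature classification performance).
    """
    nearest = sorted(
        train,
        key=lambda item: weighted_distance(item[0], query, weights),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set.
train = [
    ((0.0, 0.0), "calm"),
    ((0.1, 5.0), "calm"),
    ((1.0, 0.1), "angry"),
    ((1.1, 4.9), "angry"),
]
query = (0.9, 5.0)

# Trusting only the first feature versus only the second one.
by_first = weighted_knn_predict(train, query, weights=(1.0, 0.0), k=1)
by_second = weighted_knn_predict(train, query, weights=(0.0, 1.0), k=1)
```

With these toy values the predicted label depends on which feature carries the weight, which shows why the choice of weights matters in such a scheme.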
In this study, semi-automated prediction of PD is investigated based on twenty-two features of voice samples extracted from 147 subjects. First, the original voice features are used to recognize PD or its absence, with an MLP as the classifier and Levenberg-Marquardt and Scaled Conjugate Gradient as the training algorithms. Next, to identify the number of significant features amongst the original attributes,...
Sound source localization plays a crucial role in many microphone array applications, ranging from speech enhancement to human-computer interfaces in reverberant, noisy environments. The steered response power (SRP) method using the phase transform (SRP-PHAT) is one of the most popular modern localization algorithms. SRP-based source localizers have proved robust; however, the methods may...
In this paper we design a system that adopts a novel approach to emotion classification from human dialogue based on text and speech context. Our main objective is to boost the accuracy of speech emotion classification by accounting for the features extracted from the spoken text. The proposed system concatenates text and speech features and feeds them as one input to the classifier. The work...
Recent research has demonstrated the potential of using an articulation-based silent speech interface for command-and-control systems. Such an interface converts articulation to words that can then drive a text-to-speech synthesizer. In this paper, we propose a novel near-time algorithm to recognize whole sentences from continuous tongue and lip movements. Our goal is to assist persons who are...
We present an objective acoustic feature selection method for automatic detection of affective sounds, based on stochastic evolutionary optimization algorithms. Particle Swarm Optimization (PSO) and Genetic Algorithms (GA) are exploited to select the most appropriate audio features from a large set of available features. We performed experiments on a dataset containing about two hours of affective sounds...
This paper presents a classifier combination to solve the telegraphese restoration problem. When more than one classifier is used, each classifier can support the others, which improves overall performance. Using the supplied development, training and testing data, the best model had an accuracy of F = 79%.
This paper proposes a method to build a robust speech emotion recognition system for consumer electronic applications. The traditional method of two-class (neutral/anger) emotion recognition is extended into a two-step hierarchical structure by using emotional characteristics and gender difference. Experimental results confirm the very stable and successful emotion classification performance over the traditional...
A novel approach was developed to recognize vowels from continuous tongue and lip movements. Vowels were classified based on movement patterns (rather than on derived articulatory features, e.g., lip opening) using a machine learning approach. Recognition accuracy on a single-speaker dataset was 94.02% with a very short latency. Recognition accuracy was better for high vowels than for low vowels....
In this paper a hierarchical structure is proposed for automatic gender identification (AGI). Two clustering techniques are used in this structure. The first is divisive clustering, which divides the speakers of each gender into several classes. The second is agglomerative clustering, which creates a hierarchical structure. Feature reduction is done by SOAP feature...
We propose speaker gender recognition achieved through score-level fusion by AdaBoost. Soft biometrics has attracted attention because fusing biometric systems with soft biometric traits may improve recognition accuracy and reduce recognition time. Gender recognition is important for speaker recognition and can provide important information to speaker recognition systems. Mel-frequency...
The field of text mining has evolved over the past years to analyze textual resources. However, it can be used in several other applications. In this research, we are particularly interested in applying text mining techniques to audio materials after converting them into text, in order to detect the speakers' emotions. We describe our overall methodology and present our experimental results. In...