Recently, speaker recognition systems have attracted considerable interest from researchers for both software and hardware solutions. Various technologies have been adopted to implement speaker recognition systems that deliver optimal response time with acceptable accuracy. Research is ongoing to provide highly durable and precise recognition systems that can be embedded into critical implementations...
Segmentation and part-of-speech tagging of Oracle inscriptions are the premise and foundation for establishing an Oracle corpus and for computer-aided textual research and explication of Oracle inscriptions. For segmentation of Oracle inscriptions, this paper proposes a positive match cut algorithm, which adopts a language analyzer based on Lucene, supplemented with an Oracle dictionary. Then the segmented words...
Digit speech recognition is important in many applications such as automatic data entry, PIN entry, voice dialing, automated banking systems, etc. This paper presents a speaker-independent speech recognition system for Malayalam digits. The system employs Mel-frequency cepstral coefficients (MFCC) as features for signal processing and a hidden Markov model (HMM) for recognition. The system is trained...
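As a rough illustration of the MFCC front end mentioned in this abstract (framing, windowing, power spectrum, mel filterbank, log compression, DCT), the following NumPy sketch computes MFCC features. All parameter values (sample rate, frame length, filter and coefficient counts) are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=8000, frame_len=256, hop=128, n_filters=20, n_ceps=13):
    # Frame the signal and apply a Hamming window
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] for i in range(n_frames)])
    frames = frames * np.hamming(frame_len)
    # Power spectrum of each frame
    n_fft = frame_len
    spec = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # Triangular mel filterbank, equally spaced on the mel scale
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        for k in range(l, c):
            fbank[i, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fbank[i, k] = (r - k) / max(r - c, 1)
    # Log filterbank energies, then a DCT-II to decorrelate them
    energies = np.log(spec @ fbank.T + 1e-10)
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_filters)))
    return energies @ dct.T

sig = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)  # 1 s of a 440 Hz tone
feats = mfcc(sig)
print(feats.shape)
```

In a full recognizer these per-frame vectors would then be fed as observation sequences to the HMM stage.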
The paper provides a novel approach to emotion recognition from the facial expression and voice of subjects. The subjects are asked to manifest their emotional state in both facial expression and voice while uttering a given sentence. Facial features include mouth opening, eye opening, and eyebrow constriction; voice features include the first three formants (F1, F2, F3) and their respective powers...
We have created and analyzed an elicited emotional database consisting of 340 emotional speech samples under four different emotions: neutral, happy, sad, and angry. Malayalam (one of the south Indian languages) was used for the experiment. The Daubechies-8 wavelet was used for feature extraction and an artificial neural network for pattern recognition. An overall recognition accuracy of 72.055% was obtained...
Indian languages such as Hindi are phonetic in nature. The text-to-speech (TTS) system for Hindi exploits this phonetic nature. The algorithm we developed analyzes a sentence in terms of words and then symbols, involving combinations of pure consonants and vowels. Wave files are merged as required to generate the modified consonants influenced by matras,...
Speech has recently been recognized as an attractive method for the measurement of cognitive load. Current speech-based cognitive load measurement systems utilize acoustic features derived from auditory-motivated frequency scales. This paper aims to investigate the distribution of speech information specific to cognitive load discrimination as a function of frequency. We found that this distribution...
The amount of speaker-specific information in a speech signal varies from frame to frame depending on the spoken text and environmental conditions. Frame selection at the preprocessing stage can therefore be an added advantage. In pre-quantization (PQ) we select a new sequence of frames Y from the original frames X such that the length of Y is less than that of X. In this paper, we first analyze a number of...
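Two simple PQ variants, decimation and energy-based frame selection, can be sketched as follows. Both are generic illustrations under assumed parameters and are not necessarily among the specific techniques this paper analyzes.

```python
import numpy as np

def prequantize_decimation(frames, k=4):
    # Simplest PQ variant: keep every k-th frame of X
    return frames[::k]

def prequantize_energy(frames, keep=0.5):
    # Energy-based PQ: keep the highest-energy fraction of frames,
    # preserving their original temporal order
    energy = np.sum(frames ** 2, axis=1)
    n_keep = max(1, int(len(frames) * keep))
    idx = np.sort(np.argsort(energy)[-n_keep:])
    return frames[idx]

# X: 100 frames of (hypothetical) 12-dimensional feature vectors
X = np.random.default_rng(1).normal(size=(100, 12))
Y = prequantize_energy(X, keep=0.3)
print(Y.shape)
```

Either way, the shorter sequence Y replaces X in the downstream matching stage, reducing computation roughly in proportion to the frames discarded.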
A number of techniques have been proposed in the literature for phoneme-based speech recognition systems. In this paper, a technique for automatic phoneme recognition using zero-crossings (ZC) and the magnitude sum function (MSF) is proposed. The number of zero-crossings and the magnitude sum function per frame are extracted, and a minimum-distance classifier is proposed to recognize the phoneme in each frame...
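The two per-frame features and the minimum-distance decision rule can be sketched as below. The reference templates for a fricative-like "s" and a vowel-like "a" are made-up values for illustration (fricatives tend toward high ZC and moderate energy, vowels toward low ZC and high energy); the paper's actual reference values are not reproduced here.

```python
import numpy as np

def zc_msf(frame):
    # Zero-crossings: count of sign changes between consecutive samples
    zc = np.sum(np.abs(np.diff(np.sign(frame))) > 0)
    # Magnitude sum function: sum of absolute sample amplitudes in the frame
    msf = np.sum(np.abs(frame))
    return np.array([zc, msf], dtype=float)

def classify(frame, templates):
    # Minimum-distance classifier over (ZC, MSF) feature vectors
    feat = zc_msf(frame)
    return min(templates, key=lambda label: np.linalg.norm(feat - templates[label]))

# Hypothetical reference templates (illustrative values only)
templates = {"s": np.array([128.0, 130.0]), "a": np.array([4.0, 160.0])}

rng = np.random.default_rng(0)
noise_frame = rng.uniform(-1, 1, 256)                        # fricative-like frame
vowel_frame = np.sin(2 * np.pi * 50 * np.arange(256) / 8000)  # vowel-like frame
print(classify(noise_frame, templates), classify(vowel_frame, templates))
```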
Sound localization systems (SLS) identify the direction of a sound source. However, most approaches focus on near-field identification, i.e., 1-2 m. In this paper we develop a novel algorithm for far-field sound localization based on the average magnitude difference function (AMDF), thereby extending the distance to 5 m. The far-field SLS is implemented on a field-programmable gate array (FPGA)...
A multi-pattern Viterbi algorithm (MPVA) is proposed to jointly decode and recognize multiple speech patterns for automatic speech recognition (ASR). The MPVA is a generalization of the Viterbi algorithm (VA) that jointly decodes multiple patterns for a given standard hidden Markov model (HMM). Unlike our previously proposed constrained multi-pattern Viterbi algorithm (CMPVA), the MPVA does not require...
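For reference, the standard single-pattern Viterbi decode that the MPVA generalizes can be sketched as below. The toy two-state HMM is illustrative only and is not from the paper.

```python
import numpy as np

def viterbi(obs, log_A, log_B, log_pi):
    # Standard Viterbi decode: most likely state path for one observation sequence.
    # log_A[i, j]: log P(state j | state i); log_B[j, o]: log P(obs o | state j).
    T, N = len(obs), log_A.shape[0]
    delta = np.full((T, N), -np.inf)   # best log-score ending in each state
    psi = np.zeros((T, N), dtype=int)  # backpointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A       # (prev state, next state)
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(N)] + log_B[:, obs[t]]
    # Backtrack from the best final state
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

# Toy 2-state, 2-symbol HMM: state 0 favors symbol 0, state 1 favors symbol 1
log_A = np.log([[0.9, 0.1], [0.1, 0.9]])
log_B = np.log([[0.9, 0.1], [0.2, 0.8]])
log_pi = np.log([0.5, 0.5])
path = viterbi([0, 0, 1, 1, 1], log_A, log_B, log_pi)
print(path)
```

The MPVA extends this single-trellis recursion to a joint trellis over multiple observation sequences against the same HMM; that joint construction is the paper's contribution and is not reproduced here.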