The development of automatic speech recognition (ASR) systems that are robust to late reverberation is an urgent task. It is well known that a late reverberation reduction algorithm used as an ASR pre-processor requires a prior estimate of the reverberation time. Blind reverberation time measurements are less accurate than those obtained from direct measurements of a known room impulse response (RIR). As a result, it is naturally expect...
Depression is considered a psychosomatic state associated with soft biometric features. People suffering from depression always behave abnormally. Depression is a clinically proven disorder that can overwhelm a person and their ability to perform even a simple task. Soft biometrics provide important information about a person but are not sufficient for verification because they lack uniqueness...
In this letter, we present a novel speech separation scheme using two microphones. We divide the inter-phase information into sub-segments and compute statistics over these directional segments. We then construct the objective function by convolving the statistics with a low-pass filter. Using a gradient descent algorithm and an ideal binary mask, we obtain the separated speech signals. The method is valid...
This paper deals with sound event classification for automatic audio-based surveillance. To improve performance, we propose a feature vector combination scheme that uses multiple feature vectors simultaneously. The performance is then evaluated using a combination of three segment-based features. The results show a significant improvement compared to the conventional method.
This study elaborates on the implementation of a strong noise suppression algorithm for speech-related applications. Three different structures of the linear minimum mean square error estimator are presented with parameter estimation. The reference voice sample is replaced with a similar one without degrading the performance.
The estimation of a filter that determines an echo path is a difficult problem when double-talk is present. We use a Blind Source Separation model based on signals' nonstationarity with a partially known mixing matrix to estimate the filter during double-talk. A second-order approximation of a log-likelihood function is used to derive a quadratic criterion. Then, we propose two methods to estimate...
Sound source localization algorithms commonly include assessment of inter-sensor (generalized) correlation functions to obtain direction-of-arrival estimates. Here, we present a classification-based method for source localization that uses discriminative support vector machine-learning of correlation patterns that are indicative of source presence or absence. Subsequent probabilistic modeling generates...
The problem of blind estimation of the room acoustic clarity index C50 from single-channel reverberant speech signals is presented in this paper. We analyze the performance of several machine learning methods for a regression task using 309 features derived from the speech signal and modeled with a Deep Belief Network (DBN), Classification And Regression Tree (CART) and Linear Regression (LR). These...
This paper presents a passive acoustic self-localization and synchronization system, which estimates the positions of wireless acoustic sensors utilizing the signals emitted by the persons present in the same room. The system is designed to utilize common off-the-shelf devices such as mobile phones. Once devices are self-localized and synchronized, the system could be utilized by traditional array...
Formants can describe the basic properties of speech efficiently with very limited parameter sets; they are therefore widely used in many speech processing applications such as coding, recognition, synthesis and enhancement. Estimating formants is harder than simply tracking the peaks of the spectrum, as the spectral peaks of the vocal tract's output depend on the shape...
An audio recording made in a real environment carries an acoustical signature that changes according to the acoustical characteristics of the environment and the recording positions. This signature, which is similar to a 3D room impulse response, contains the directions, levels and arrival times of the direct source and reflections. Although it is easy to obtain reverberant recordings by convolving...
Blind sound source separation techniques designed to work with voice signals have a performance that depends strongly on the number of coefficients of the separation system. In general, different environments require different lengths of separation filters. This paper proposes the use of reverberation time information arising from lateral blind estimation techniques for tuning the degree of freedom...
Audio source separation consists in recovering different unknown signals, called sources, by filtering their observed mixtures. In music processing, most mixtures are stereophonic songs and the sources are the individual signals played by the instruments, e.g. bass, vocals, guitar, etc. Source separation is often achieved through classical generalized Wiener filtering, which is controlled by parameters...
Many application areas that use supervised machine learning make use of multiple raters to collect target ratings for training data. Usage of multiple raters, however, inevitably introduces the risk that a proportion of them will be unreliable. The presence of unreliable raters can prolong the rating process, make it more expensive and lead to inaccurate ratings. The dominant, "static" approach...
We consider the estimation of multiple room impulse responses from the simultaneous recording of several known sources. Existing techniques are restricted to the case where the number of sources is at most equal to the number of sensors. We relax this assumption in the case where the sources are known. To this aim, we propose statistical models of the filters associated with convex log-likelihoods,...
In this paper, we analyze the difficult problem of estimating low fundamental frequencies from periodic signals, like those produced by musical instruments. The problem arises when the fundamental frequency is low for a given number of samples as this causes the harmonics to overlap in the frequency domain. Moreover, we demonstrate how the performance of estimators can generally be improved by avoiding...
Discriminative Training (DT) methods for acoustic modeling, such as MMI, MCE, and SVM, have proven effective in speaker recognition. In this paper we propose a DT method for GMM using soft frame margin estimation. Unlike other DT methods such as MMI or MCE, the soft frame margin estimation attempts to enhance the generalization capability of GMM to unseen data in case a mismatch exists between...
Speech-to-speech translation systems have made a great deal of progress in recent years. But users of such systems still face the problem of not knowing whether the system has translated their utterance correctly. Various confirmation strategies can be used to address this problem. Some of these generate a confirmation utterance for the user to approve, such as reading back the ASR result, or performing...
This paper presents a review of ICA-based approaches used for the separation of components in functional MRI sequences. In previous works, the FastICA and Infomax algorithms were investigated in more detail; therefore, in this paper we focus on methods such as "radical ICA", "SDD ICA", "Erica" and "Evd" for separation purposes. This comparative study...
In this paper, improvements to a previous work are presented. A step that removes redundant artifacts from the fingerprint mask is introduced, enhancing the final result. The proposed method is an entirely adaptive process that adjusts to each fingerprint without any further supervision by the user. Hence, the algorithm is insensitive to the characteristics of the fingerprint sensor and the various physical appearances...