This paper presents a new approach to noise reduction for voice communication over Bluetooth technology. In the literature, several authors have compared the performance of different filtering techniques, such as the well-known Spectral Subtraction (SS) and Wiener Filter (WF), using simulated data, whereas this research uses real-time data samples collected from cars subjected to a noisy environment...
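As a rough illustration of the two filters named above, the following sketch computes textbook per-bin gains for power spectral subtraction and a Wiener filter from a noisy-speech PSD and a noise PSD. It is not the paper's implementation: the function names, the spectral floor value, and the simple maximum-likelihood SNR estimate are assumptions made for this example.

```python
import numpy as np

def spectral_subtraction_gain(noisy_psd, noise_psd, floor=0.01):
    """Amplitude gain for power spectral subtraction (assumed textbook form)."""
    noisy_psd = np.asarray(noisy_psd, dtype=float)
    noise_psd = np.asarray(noise_psd, dtype=float)
    # Fraction of the observed power attributed to noise in each bin.
    ratio = noise_psd / np.maximum(noisy_psd, 1e-12)
    # Subtract the noise power and clamp to a small floor to limit musical noise.
    gain_sq = np.maximum(1.0 - ratio, floor ** 2)
    return np.sqrt(gain_sq)

def wiener_gain(noisy_psd, noise_psd, floor=0.01):
    """Wiener gain using a simple SNR estimate (noisy PSD minus noise PSD)."""
    noisy_psd = np.asarray(noisy_psd, dtype=float)
    noise_psd = np.asarray(noise_psd, dtype=float)
    snr = np.maximum(noisy_psd - noise_psd, 0.0) / np.maximum(noise_psd, 1e-12)
    return np.maximum(snr / (1.0 + snr), floor)
```

Either gain would be multiplied with the noisy short-time spectrum before resynthesis. With this SNR estimate the Wiener gain equals the square of the spectral subtraction gain wherever the floor is inactive, so it attenuates low-SNR bins more strongly.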
Estimating a speaker's physical parameters, such as height, weight and shoulder size, can assist in voice forensics by providing additional knowledge about the speaker. In this work, statistics of the components of a background GMM are employed as features for estimating the physical parameters. These features improved the performance of height and shoulder size estimation as compared to our earlier attempt...
We propose an efficient method to estimate source power spectral densities (PSDs) in a multi-source reverberant environment using a spherical microphone array. The proposed method utilizes the spatial correlation between the spherical harmonics (SH) coefficients of a sound field to estimate source PSDs. The use of the spatial cross-correlation of the SH coefficients allows us to employ the method...
In this paper, the problem of echo cancellation in long acoustic impulse responses (AIRs) is highlighted. Three of the most widely used recent NLMS-based sparse adaptive filtering algorithms are presented, and their performances in the context of acoustic echo cancellation (AEC) are studied and compared. The algorithms of interest include the improved proportionate normalized least mean square (IPNLMS),...
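For readers unfamiliar with IPNLMS, a minimal sketch of the commonly cited form of the update is given below. It is an illustration under stated assumptions, not the code evaluated in the paper: the function name `ipnlms`, the default step size `mu`, the mixing parameter `alpha`, and the regularization constants `delta` and `eps` are all choices made for this example. Setting `alpha = -1` makes every tap gain equal, which reduces the update to plain NLMS.

```python
import numpy as np

def ipnlms(x, d, num_taps, mu=0.5, alpha=0.0, delta=1e-6, eps=1e-8):
    """Identify an echo path from far-end signal x and microphone signal d."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    h = np.zeros(num_taps)      # adaptive filter estimate
    e = np.zeros(len(x))        # a priori error (echo-cancelled output)
    buf = np.zeros(num_taps)    # most recent input sample first
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        e[n] = d[n] - h @ buf
        # Per-tap gains: a uniform part plus a part proportional to |h|.
        g = (1 - alpha) / (2 * num_taps) \
            + (1 + alpha) * np.abs(h) / (2 * np.sum(np.abs(h)) + eps)
        gx = g * buf
        # Normalized update, weighted by the proportionate gains.
        h = h + mu * e[n] * gx / (buf @ gx + delta)
    return h, e
```

Because large taps receive larger gains, the filter tends to converge faster than NLMS when the impulse response is sparse, which is the usual motivation for this family of algorithms in AEC.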
In this paper, we propose a new time-frequency mask method for computational auditory scene analysis (CASA) based on convex optimization of the binary mask. In the proposed method, the pitch estimation and segment segregation in conventional CASA are completely replaced by the convex optimization of speech power. Considering the cross-correlation between the power spectra of noisy speech and noise...
While most dereverberation methods focus on how to estimate the magnitude of an anechoic signal in the time-frequency domain, we propose a method which also takes the phase into account. By applying a harmonic model to the anechoic signal, we derive a formulation to compute the amplitude and phase of each harmonic. These parameters are then estimated by our method in the presence of reverberation. As...
Objective markers obtained from acoustic analysis of speech are of great importance for clinical evaluation of voice disorders because they are non-invasive and provide a severity index of the disorder, which allows clinicians to monitor the progress of patients and quantitatively documents the degree of perceived hoarseness. The object of the present study is to introduce a fractional order long-term...
This report discusses the implementation of a computerized algorithm specifically designed to measure the syllables-per-minute rate of abnormal speech typically produced by persons suffering from an articulatory disorder known as dysarthria. This speech rate measurement application, which can also serve as a diagnostic tool in itself, has been integrated into the computerised Frenchay Dysarthria...
Artificial speech bandwidth extension (ABE) is an extremely effective means for speech enhancement at the receiver side of a narrowband telephony call. Initial approaches have incorporated deep neural networks (DNNs) into the estimation of the upper band speech representation. In this paper we propose a regression-based DNN ABE that is trained and tested on acoustically different speech databases,...
Noise reduction algorithms for head-mounted assistive listening devices are crucial to improve speech quality and intelligibility in background noise. For binaural hearing devices with one microphone per device, the noise power spectral density (PSD) is commonly estimated using various assumptions about the acoustic scenario. Since these methods lack robustness if the underlying assumptions are not...
A convolutional neural network (CNN) based classification method for broadband DOA estimation is proposed, where the phase component of the short-time Fourier transform coefficients of the received microphone signals is directly fed into the CNN and the features required for DOA estimation are learned during training. Since only the phase component of the input is used, the CNN can be trained with...
This paper describes three methods for multiple fundamental frequency estimation based on the multi-scale product analysis. The three methods use the autocorrelation of the multi-scale product analysis for the target pitch estimation. For the intrusion pitch, each method uses its own technique. The first one uses the classic comb filtering. The second method employs the rectangular comb filter followed...
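The autocorrelation step mentioned above can be illustrated with a plain single-pitch estimator. This sketch applies the autocorrelation directly to a waveform frame rather than to the multi-scale product used in the paper, and it handles only one pitch; the function name and the search range `fmin`/`fmax` are assumptions made for this example.

```python
import numpy as np

def autocorr_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate one fundamental frequency (Hz) from the autocorrelation peak."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    # Keep non-negative lags only: ac[k] is the correlation at lag k.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(ac) - 1)
    # The strongest peak inside the allowed lag range gives the pitch period.
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return fs / lag
```

For example, a frame sampled at 8 kHz that contains a 200 Hz tone and its second harmonic has a period of 40 samples, so the peak is expected near lag 40.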
This paper presents an automatic approach for parameter training for a sparsity-based pitch estimation method that has been previously published. For this pitch estimation method, the harmonic dictionary is a key parameter that needs to be carefully prepared beforehand. In the original method, extensive human supervision and involvement are required to construct and label the dictionary. In this study,...
Glottal closure instant (GCI) is an important feature in many speech processing applications. Many algorithms have been proposed for GCI estimation from speech signals. The objective of the proposed work is to provide a comprehensive analysis of the performance of various GCI estimation algorithms for singing voice in the Indian context. GCI estimation algorithms such as Dynamic Programming Phase Slope...
Due to significant signal attenuation, the speech signals collected at different distances show degradation in the estimation of speech parameters. Therefore, the work presented in this paper proposes an alternative method for improving the F0 parameter estimation from distant speech (DS) signals, which are collected through microphones at various distances. The proposed method achieves improved F0 estimation...
Blind separation of mixtures has been achieved by approximate joint diagonalization (AJD) approaches. This paper presents an approach for overdetermined blind source separation (BSS) using AJD. The approach is based on an alternative minimization of the indirect and direct least-squares criteria to the diagonal matrices in the first phase and to the mixing matrix in the second phase, respectively...
In this paper, we address the estimation of the power spectral density (PSD) matrix. Accurate estimation of the PSD matrix plays an important role in many speech enhancement methods. In traditional PSD estimation methods, only the information of previous frames is employed through a forgetting factor. In the current research, we consider the correlation of inter-band components and incorporate their information...
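The traditional forgetting-factor estimator referred to above is usually written as a first-order recursive average of outer products. The sketch below shows that baseline only, not the inter-band method proposed in the paper; it assumes each frame is a complex observation vector (for instance, the microphone channels of one frequency bin), and the function name and default forgetting factor are illustrative.

```python
import numpy as np

def recursive_psd(frames, forgetting=0.9):
    """Recursively average y y^H over frames with a forgetting factor.

    frames: array of shape (num_frames, num_channels), complex STFT
    coefficients for a single frequency bin (assumed layout).
    """
    frames = np.asarray(frames, dtype=complex)
    num_ch = frames.shape[1]
    psd = np.zeros((num_ch, num_ch), dtype=complex)
    for y in frames:
        # New estimate = old estimate decayed + instantaneous outer product.
        psd = forgetting * psd + (1.0 - forgetting) * np.outer(y, y.conj())
    return psd
```

A forgetting factor close to 1 gives a smoother but slower-tracking estimate, which is the trade-off that motivates using information beyond the previous frames.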
Hearable is a recently emerging term that describes a wireless earpiece that enhances the user's listening experience in various acoustic environments. Another important feature of hearable devices is their capability to improve speech communication in difficult social settings, which usually consist of a mixture of different non-stationary noises. In this paper, we present techniques to suppress a...
We propose a new noise estimation method using only the current frame of noisy speech. The proposed method applies an inverse comb filter to the noisy speech to suppress the power of speech, and estimates the noise from the resulting spectrum. Experiments show that spectral subtraction combined with the proposed noise estimation method is superior to the conventional speech enhancement...
The fundamental building block of spoken languages is a list of phonemes from which syllables and, hence, also words are formed. A systematic distinction between these phonemes becomes possible by the characteristic frequency components that are included in each sound. On the one hand, voiced phonemes are characterized by several sharp frequency components. On the other hand, wide, typically blurred...