Search results for: Jahn Heymann

Items from 1 to 9 out of 9 results

article

A generic neural acoustic beamforming architecture for robust multi-channel speech processing

Jahn Heymann, Lukas Drude, Reinhold Haeb-Umbach

Computer Speech & Language > 2017 > 46 > C > 374-385

Acoustic beamforming can greatly improve the performance of Automatic Speech Recognition (ASR) and speech enhancement systems when multiple channels are available. We recently proposed a way to support the model-based Generalized Eigenvalue beamforming operation with a powerful neural network for spectral mask estimation. The enhancement system has a number of desirable properties. In particular, neither...

chapter

Multi-stage coherence drift based sampling rate synchronization for acoustic beamforming

Joerg Schmalenstroeer, Jahn Heymann, Lukas Drude, Christoph Boeddeker, et al.

2017 IEEE 19th International Workshop on Multimedia Signal Processing (MMSP) > 1 - 6

Multi-channel speech enhancement algorithms rely on a synchronous sampling of the microphone signals. This, however, cannot always be guaranteed, especially if the sensors are distributed in an environment. To avoid performance degradation the sampling rate offset needs to be estimated and compensated for. In this contribution we extend the recently proposed coherence drift based method in two important...

chapter

Optimizing neural-network supported acoustic beamforming by algorithmic differentiation

Christoph Boeddeker, Patrick Hanebrink, Lukas Drude, Jahn Heymann, et al.

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 171 - 175

In this paper we show how a neural network for spectral mask estimation for an acoustic beamformer can be optimized by algorithmic differentiation. Using the beamformer output SNR as the objective function to maximize, the gradient is propagated through the beamformer all the way to the neural network which provides the clean speech and noise masks from which the beamformer coefficients are estimated...

chapter

Beamnet: End-to-end training of a beamformer-supported multi-channel ASR system

Jahn Heymann, Lukas Drude, Christoph Boeddeker, Patrick Hanebrink, et al.

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5325 - 5329

This paper presents an end-to-end training approach for a beamformer-supported multi-channel ASR system. A neural network which estimates masks for a statistically optimum beamformer is jointly trained with a network for acoustic modeling. To update its parameters, we propagate the gradients from the acoustic model all the way through feature extraction and the complex valued beamforming operation...

chapter

Neural network based spectral mask estimation for acoustic beamforming

Jahn Heymann, Lukas Drude, Reinhold Haeb-Umbach

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 196 - 200

We present a neural network based approach to acoustic beamforming. The network is used to estimate spectral masks from which the Cross-Power Spectral Density matrices of speech and noise are estimated, which in turn are used to compute the beamformer coefficients. The network training is independent of the number and the geometric configuration of the microphones. We further show that it is possible...

chapter

BLSTM supported GEV beamformer front-end for the 3rd CHiME challenge

Jahn Heymann, Lukas Drude, Aleksej Chinaev, Reinhold Haeb-Umbach

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) > 444 - 451

We present a new beamformer front-end for Automatic Speech Recognition and apply it to the 3rd-CHiME Speech Separation and Recognition Challenge. Without any further modification of the back-end, we achieve a 53% relative reduction of the word error rate over the best baseline enhancement system for the relevant test data set. Our approach leverages the power of a bi-directional Long Short-Term Memory...

chapter

Unsupervised adaptation of a denoising autoencoder by Bayesian Feature Enhancement for reverberant ASR under mismatch conditions

Jahn Heymann, Reinhold Haeb-Umbach, Pavel Golik, Ralf Schlüter

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5053 - 5057

The parametric Bayesian Feature Enhancement (BFE) and a data-driven Denoising Autoencoder (DA) both bring performance gains in severe single-channel speech recognition conditions. The first can be adjusted to different conditions by an appropriate parameter setting, while the latter needs to be trained on conditions similar to the ones expected at decoding time, making it vulnerable to a mismatch between...

chapter

Iterative Bayesian word segmentation for unsupervised vocabulary discovery from phoneme lattices

Jahn Heymann, Oliver Walter, Reinhold Haeb-Umbach, Bhiksha Raj

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4057 - 4061

In this paper we present an algorithm for the unsupervised segmentation of a lattice produced by a phoneme recognizer into words. Using a lattice rather than a single phoneme string accounts for the uncertainty of the recognizer about the true label sequence. An example application is the discovery of lexical units from the output of an error-prone phoneme recognizer in a zero-resource setting, where...

chapter

Unsupervised word segmentation from noisy input

Jahn Heymann, Oliver Walter, Reinhold Haeb-Umbach, Bhiksha Raj

2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) > 458 - 463

In this paper we present an algorithm for the unsupervised segmentation of a character or phoneme lattice into words. Using a lattice at the input rather than a single string accounts for the uncertainty of the character/phoneme recognizer about the true label sequence. An example application is the discovery of lexical units from the output of an error-prone phoneme recognizer in a zero-resource...

INFONA - science communication portal
