2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

Non-negative spectrogram decomposition and its variants have been extensively investigated for speech enhancement due to their efficiency in extracting perceptually meaningful components from mixtures. Usually, these approaches assume that training samples for one or more sources are available beforehand. However, in many real-world scenarios, it is often impossible for...

Local trajectory based speech enhancement for robust speech recognition with deep neural network

Yongbin You, Yanmin Qian, Kai Yu

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 5 - 9

Deep neural networks (DNNs) have achieved great success in automatic speech recognition (ASR), and a DNN can be regarded as a joint model combining a nonlinear feature transformation and a log-linear classifier. Recently, DNNs have been adopted as regression models to enhance distorted features in noisy conditions, and the enhanced features are used to improve the performance of DNN-based ASR. Previous work...

Fractional processing-based active noise control algorithm for impulsive noise

Muhammad Tahir Akhtar, Muhammad Asif Zahoor Raja

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 10 - 14

This paper deals with active noise control (ANC) for impulsive noise sources, for which the filtered-x least mean square (FxLMS) algorithm becomes unstable. By minimizing the fractional lower-order moment, the resulting filtered-x least mean p-power (FxLMP) algorithm computes its update vector using the sign operator and a fractional power of the residual error signal. This results in improved robustness...
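
As a rough illustration of the update described in this abstract, the sketch below performs one FxLMP weight update in plain Python. It assumes the common convention e(n) = d(n) - y'(n) and the textbook form w(n+1) = w(n) + mu * p * |e(n)|^(p-1) * sgn(e(n)) * x'(n), where x'(n) is the reference signal filtered through a secondary-path estimate. The function name `fxlmp_step`, its parameters, and the default values are hypothetical and are not taken from the paper, whose fractional-processing variant is not shown in the truncated abstract.

```python
import math


def fxlmp_step(weights, filtered_ref, error, mu=0.01, p=1.5):
    """Return the weights after one FxLMP update (illustrative sketch only).

    weights      -- current adaptive filter coefficients w(n)
    filtered_ref -- filtered reference vector x'(n), same length as weights
    error        -- residual error sample e(n), assumed to be d(n) - y'(n)
    mu           -- step size (assumed value)
    p            -- power of the minimized moment; p = 2 gives an FxLMS-style update
    """
    if error == 0.0:
        # No error means no correction; this also avoids 0.0 ** negative when p < 1.
        return list(weights)
    # Sign operator times a fractional power of the residual error.
    scale = mu * p * abs(error) ** (p - 1.0) * math.copysign(1.0, error)
    return [w + scale * x for w, x in zip(weights, filtered_ref)]


updated = fxlmp_step([0.0, 0.0], [1.0, -1.0], error=4.0, mu=0.1, p=1.5)
print(updated)
```

Because the correction grows with |e(n)|^(p-1) rather than with |e(n)| itself, a large impulsive error sample moves the weights far less for p < 2 than it would under an FxLMS-style update.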

Multi-pronounciation dictionary construction for Mandarin-English bilingual phrase speech recognition system

C. Wang, W. Shi, Y. X. Zou

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 15 - 19

Generally, in multilingual communities, non-native speakers may produce speech sounds that are either part of their own native language or formed by merging characteristics of native pronunciation with non-native pronunciation. Recently, a Two-pass phone clustering based on Confusion Matrix (TCM) approach has been proposed to address the one-to-one phone mappings between Chinese syllables and...

Improving HMM/DNN in ASR of under-resourced languages using probabilistic sampling

Meixu Song, Qingqing Zhang, Jielin Pan, Yonghong Yan

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 20 - 24

In HMM/DNN automatic speech recognition (ASR) systems, the DNNs model the posterior probabilities of triphone states. However, triphone states are unevenly distributed. In this situation, the training algorithm tends to converge to a local optimum that favors states with abundant data over states with scarce data. Thus, the imbalance of the training data degrades ASR performance, especially for...
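
The imbalance problem described here can be illustrated with a small sketch of class-aware sampling. The truncated abstract does not give the paper's exact scheme, so this is an assumption: the code uses one common formulation in which a class is drawn with probability lam / K + (1 - lam) * prior(c), interpolating between the empirical class prior (lam = 0) and a uniform distribution over the K classes (lam = 1), and a training example is then drawn uniformly from that class. The names `class_sampling_probs`, `draw_batch`, and `lam` are hypothetical.

```python
import random
from collections import defaultdict


def class_sampling_probs(labels, lam):
    """Interpolate between the empirical class prior and a uniform distribution."""
    counts = defaultdict(int)
    for label in labels:
        counts[label] += 1
    total = len(labels)
    num_classes = len(counts)
    return {c: lam / num_classes + (1.0 - lam) * n / total for c, n in counts.items()}


def draw_batch(examples, labels, lam, batch_size, rng):
    """Pick a class by the interpolated probabilities, then an example from that class."""
    by_class = defaultdict(list)
    for example, label in zip(examples, labels):
        by_class[label].append(example)
    probs = class_sampling_probs(labels, lam)
    classes = sorted(probs)  # assumes class labels are mutually comparable
    weights = [probs[c] for c in classes]
    picked = rng.choices(classes, weights=weights, k=batch_size)
    return [(rng.choice(by_class[c]), c) for c in picked]


# Toy data: state 0 has nine frames, state 1 has only one.
toy_labels = [0] * 9 + [1]
toy_frames = [f"frame{i}" for i in range(10)]
print(class_sampling_probs(toy_labels, lam=0.0))
print(class_sampling_probs(toy_labels, lam=1.0))
batch = draw_batch(toy_frames, toy_labels, lam=1.0, batch_size=4, rng=random.Random(0))
```

With lam = 0 the rare state is drawn in proportion to its share of the data, and with lam = 1 both states are drawn equally often, so the rare state's single frame is revisited much more frequently during training.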

On statistical machine translation method for lexicon refinement in speech recognition

Haihua Xu, Xiong Xiao, Eng-Siong Chng, Haizhou Li

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 25 - 29

In low-resource Automatic Speech Recognition (ASR), one usually resorts to Statistical Machine Translation (SMT) techniques to learn transformation rules that refine the grapheme lexicon. Doing so poses two challenges. One is to generate grapheme sequences from the training data as the targets, which are paired with the original transcripts to train SMT models; the other is to effectively prune the learned...

An investigation on DNN-derived bottleneck features for GMM-HMM based robust speech recognition

Yongbin You, Yanmin Qian, Tianxing He, Kai Yu

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 30 - 34

In recent years, deep neural networks (DNNs) have achieved great success as acoustic models in speech recognition. An important application of DNNs is deriving bottleneck features. In this paper, we first investigate the robustness of bottleneck features generated by three types of DNN structures on the Aurora 4 task without any explicit noise compensation. Second, we propose the node-pruning...

Nonnegative matrix factorization based noise robust speaker verification

S. H. Liu, Y. X. Zou, H. K. Ning

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 35 - 39

The performance of a speaker verification system (SVS) declines dramatically in noisy environments. To suppress the adverse impact of noise on the SVS, this paper investigates employing the nonnegative matrix factorization (NMF) technique to reconstruct the speech based on a pre-trained speech basis matrix (SBM) and noise basis matrix (NBM). The contribution of this research lies in utilizing the...
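
A minimal sketch of reconstruction with pre-trained bases, under stated assumptions: the noisy magnitude spectrogram V is approximated as [SBM NBM] H with both basis matrices held fixed, the activations H are fitted with the standard multiplicative update for the Euclidean cost, and the speech part is recovered with a Wiener-style ratio mask. The masking step, the cost function, and the names `estimate_activations` and `enhance` are assumptions made for illustration; the truncated abstract does not confirm them.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def estimate_activations(v, w, iterations=200, eps=1e-9):
    """Fit H >= 0 so that W H approximates V while the basis W stays fixed."""
    num_bases, num_frames = len(w[0]), len(v[0])
    h = [[1.0] * num_frames for _ in range(num_bases)]
    wt = transpose(w)
    wtv = matmul(wt, v)
    wtw = matmul(wt, w)
    for _ in range(iterations):
        denom = matmul(wtw, h)
        # Multiplicative update: H <- H * (W^T V) / (W^T W H)
        h = [[h[i][j] * wtv[i][j] / (denom[i][j] + eps) for j in range(num_frames)]
             for i in range(num_bases)]
    return h


def enhance(v, speech_basis, noise_basis, iterations=200, eps=1e-9):
    """Estimate the speech spectrogram from V using fixed speech and noise bases."""
    # Concatenate the bases column-wise: W = [SBM NBM].
    w = [s_row + n_row for s_row, n_row in zip(speech_basis, noise_basis)]
    h = estimate_activations(v, w, iterations, eps)
    num_speech = len(speech_basis[0])
    speech_part = matmul(speech_basis, h[:num_speech])
    total = matmul(w, h)
    # Ratio mask (assumption): keep the fraction of V explained by the speech bases.
    return [[v[f][t] * speech_part[f][t] / (total[f][t] + eps) for t in range(len(v[0]))]
            for f in range(len(v))]


# Toy case: speech occupies frequency bin 0 only, noise occupies bin 1 only.
clean = enhance([[2.0], [3.0]], speech_basis=[[1.0], [0.0]], noise_basis=[[0.0], [1.0]])
print(clean)
```

In the toy case the bases do not overlap, so the mask keeps the first bin and zeroes the second; real basis matrices overlap in frequency, and the mask then splits each bin's energy between speech and noise.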

A tag-level factor graph model for semantic music discovery

Qin Yan, Shuyu Deng, Qiuyu Tao, Luan Dong, et al.

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 40 - 44

This paper proposes a semantic music discovery system based on a tag-level factor graph (TFG) model that uses tag probability and content similarity in a unified fashion. The content similarities are calculated from extracted pitch features, while tag probabilities are obtained from our previous auto-tagging system. The TFG model consists of a set of node and edge feature functions,...

Unsupervised monaural speech enhancement using robust NMF with low-rank and sparse constraints
