Search results

Items from 1 to 20 out of 1,468 results

chapter

Development and evaluation of the program for auditory training in the correction of central auditory processing disorders

Dmitriy I. Kaplun, Denis V. Gnezdilov, George A. Efimenko, Alexey A. Pochechuev, et al.

2017 IEEE II International Conference on Control in Technical Systems (CTS) > 106 - 109

The main indication for auditory training is central auditory processing disorder (CAPD), which inevitably develops in patients with chronic sensorineural hearing loss as a consequence of auditory deprivation. Patients with CAPD have difficulty understanding complex signals, especially speech in background noise. The aim of the study was to create the optimal algorithm of auditory training...

chapter

An Improved Tibetan Lhasa Speech Recognition Method Based on Deep Neural Network

Wenbin Ruan, Zhenye Gan, Bin Liu, Yin Guo

2017 10th International Conference on Intelligent Computation Technology and Automation (ICICTA) > 303 - 306

Deep Neural Networks (DNN) are currently the dominant technique in English and Chinese speech recognition. However, Tibetan speech recognition research started late and mainly uses the Hidden Markov Model (HMM). In this paper, we show a better method that replaces Gaussian Mixture Models (GMM) with DNN in a Tibetan Lhasa dialect speech recognition system. The system contains seven layers of features...

chapter

A new speaker verification algorithm based on identification results

Khettaoui Billal, Dahimene Abdelhakim

2017 5th International Conference on Electrical Engineering - Boumerdes (ICEE-B) > 1 - 6

In this paper, a text-independent speaker recognition system based on Gaussian mixture models (GMM) was developed with a specific focus on the use of a voice activity detector (VAD) algorithm in training and testing. At the training level, a modified expectation-maximization (EM) algorithm is used. It is less prone to getting trapped around a local maximum, so it has a better chance to converge...

chapter

A low complexity solution for epilepsy detection using an improved version of the reaction-diffusion transform

Radu Dogaru, Ioana Dogaru

2017 5th International Symposium on Electrical and Electronics Engineering (ISEEE) > 1 - 6

Recognition of epileptic seizures is an important issue, and in certain circumstances it is desirable to have portable equipment implementing the algorithm in order to better monitor patients. This work considers a widely used EEG database from the University of Bonn as a reference for comparing our recognition method with others previously reported. In order to recognize epileptic seizures we combine a...

chapter

Low-Latency approximation of bidirectional recurrent networks for speech denoising

Gordon Wichern, Alexey Lukin

2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 66 - 70

The ability to separate speech from non-stationary background disturbances using only a single channel of information has increased significantly with the adoption of deep learning techniques. In these approaches, a time-frequency mask that recovers clean speech from noisy mixtures is learned from data. Recurrent neural networks are particularly well-suited to this sequential prediction task, with...

chapter

Learning vocal mode classifiers from heterogeneous data sources

Zhao Shuyang, Toni Heittola, Tuomas Virtanen

2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 16 - 20

This paper targets a generalized vocal mode classifier (speech/singing) that works on audio data from an arbitrary data source. Previous studies on sound classification are commonly based on cross-validation using a single dataset, without considering training-recognition mismatch. In our study, two experimental setups are used: matched training-recognition condition and mismatched training-recognition...

chapter

A low complexity method based on reaction-diffusion transform for ultrasound echo-based shape object classification

Mihai Bucurica, Ioana Dogaru, Radu Dogaru

2017 5th International Symposium on Electrical and Electronics Engineering (ISEEE) > 1 - 5

This paper presents improvements in accuracy for shape object classification using a new low complexity method compared to a previous implementation [1]. The method uses echoes generated by a Java platform capable of emulating sound propagation in a controlled 2D virtual environment [2][3]. Echoes originate from the ultrasonic waves generated inside a virtual environment which contains geometrical...

chapter

Modeling intra-label dynamics in connectionist temporal classification

Ashkan Sadeghi Lotfabadi, Kamaledin Ghiasi-Shirazi, Ahad Harati

2017 7th International Conference on Computer and Knowledge Engineering (ICCKE) > 367 - 371

Most sequence processing tasks can be cast as a problem of mapping a sequence of observations into a sequence of labels. This is a very difficult problem since the association between input data sequences and output label sequences is not given at the frame level. Recurrent neural networks (RNNs) equipped with connectionist temporal classification (CTC) are among the best tools devised to handle this...

chapter

An incremental intelligent object recognition system based on deep learning

Long Yan, Yongxiong Wang, Tianzhong Song, Zhong Yin

2017 Chinese Automation Congress (CAC) > 7135 - 7138

The accuracy of object recognition has been greatly improved due to the rapid development of deep learning, but deep learning generally requires a lot of training data and the training process is very slow and complex. We propose an incremental object recognition system based on deep learning techniques and speech recognition technology with high learning speed and wide applicability. The system...

chapter

Research on voiceprint recognition based on weighted clustering recognition SVM algorithm

Yang Wu, Lihong Xu, Yandong Chen, Xueyang Zhang

2017 Chinese Automation Congress (CAC) > 1144 - 1148

The support vector machine (SVM) algorithm has received much attention in voiceprint recognition research, especially for small sample datasets. However, with the increase in recognition number and in the number of speech features, the rate of model training and recognition is significantly reduced. In order to solve the problem, a new weighted clustering algorithm is proposed, which uses a “one to one” SVM model...

chapter

Application of convolution neural network to flow pattern identification of gas-liquid two-phase flow in small-size pipe

Zhiyong Yang, Haifeng Ji, Zhiyao Huang, Baoliang Wang, et al.

2017 Chinese Automation Congress (CAC) > 1389 - 1393

Flow pattern is one of the most important parameters for gas-liquid two-phase flow. In this work, a new flow pattern identification method based on a Convolution Neural Network (CNN) is presented. A 7-layer CNN structure is chosen, and the parameters of this network are determined by a training set. In order to verify the feasibility, experiments were carried out in a horizontal pipe with the inner diameter...

chapter

Does speech enhancement work with end-to-end ASR objectives?: Experimental analysis of multichannel end-to-end ASR

Tsubasa Ochiai, Shinji Watanabe, Shigeru Katagiri

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP) > 1 - 6

Recently we proposed a novel multichannel end-to-end speech recognition architecture that integrates the components of multichannel speech enhancement and speech recognition into a single neural-network-based architecture and demonstrated its fundamental utility for automatic speech recognition (ASR). However, the behavior of the proposed integrated system remains insufficiently clarified. An open...

chapter

Mel-Generalized cepstral regularization for discriminative non-negative matrix factorization

Li Li, Hirokazu Kameoka, Shoji Makino

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP) > 1 - 6

The non-negative matrix factorization (NMF) approach has been shown to work reasonably well for monaural speech enhancement tasks. This paper proposes addressing two shortcomings of the original NMF approach: (1) the objective functions for the basis training and separation (Wiener filtering) are inconsistent (the basis spectra are not trained so that the separated signal becomes optimal); (2) minimizing...

chapter

Instrumental shell for pronunciation training simulator design

Anastasiya G. Digor, Irina L. Artemeva, Ekaterina M. Lukina, Viktoriya L. Zavyalova

2017 Second Russia and Pacific Conference on Computer Technology and Applications (RPC) > 33 - 38

The article describes the conceptual design of an instrumental shell for creating a pronunciation training simulator. The project is based on linguistic knowledge and methods of speech recognition. The discrepancies in phonetic systems of different (native and non-native) languages are taken into account. The main approaches for speech recognition are analyzed and the required components of the simulator...

chapter

Adversarial learning: A critical review and active learning study

D.J. Miller, X. Hu, Z. Qiu, G. Kesidis

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP) > 1 - 6

This paper consists of two parts. The first is a critical review of prior art on adversarial learning, i) identifying some significant limitations of previous works, which have focused mainly on attack exploits and ii) proposing novel defenses against adversarial attacks. The second part is an experimental study considering the adversarial active learning scenario and an investigation of the efficacy...

chapter

Spoken word recognition using MFCC and learning vector quantization

Esmeralda C. Djamal, Neneng Nurhamidah, Ridwan Ilyas

2017 4th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI) > 1 - 6

Identification of spoken word(s) can be used to control an external device. This research performed word identification in speech using Mel-Frequency Cepstrum Coefficients (MFCC) and Learning Vector Quantization (LVQ). The output of the system operated the computer to play a song of the genre corresponding to the identified word. Identification was divided into three classes containing words such as "Klasik",...

chapter

Residual neural networks for speech recognition

Hari Krishna Vydana, Anil Kumar Vuppala

2017 25th European Signal Processing Conference (EUSIPCO) > 543 - 547

Recent developments in deep learning methods have greatly influenced the performance of speech recognition systems. In a Hidden Markov model-Deep neural network (HMM-DNN) based speech recognition system, DNNs have been employed to model senones (context dependent states of HMM), where HMMs capture the temporal relations among senones. Due to the use of deeper networks, significant improvement...

chapter

Polish whispery speech recognition — Minimum sampling frequency

Piotr Kozierski, Talar Sadalla, Szymon Drgas, Adam Dabrowski, et al.

2017 22nd International Conference on Methods and Models in Automation and Robotics (MMAR) > 611 - 615

The article presents studies on automatic whispery speech recognition. In the performed research a new corpus of whispery speech has been used. It has been checked how speech recognition quality changes with varying sampling frequency and signal frame length. It has been found that the optimal sampling frequency of whispery speech is about 32–48 kHz, while the optimal signal frame length...

chapter

Monaural source separation based on adaptive discriminative criterion in neural networks

Yang Sun, Lei Zhu, Jonathon A. Chambers, Syed Mohsen Naqvi

2017 22nd International Conference on Digital Signal Processing (DSP) > 1 - 5

Monaural source separation is an important research area which can help to improve the performance of several real-world applications, such as speech recognition and assisted living systems. Huang et al. proposed deep recurrent neural networks (DRNNs) with a discriminative criterion objective function to improve the performance of source separation. However, the penalty factor in the objective function...

chapter

Development of multilingual phone recognition system for Indian languages

K E Manjunath, K. Sreenivasa Rao, Dinesh Babu Jayagopi

2017 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES) > 1 - 6

In this paper, the development of a Multilingual Phone Recognition System (MPRS) in the context of Indian languages is described. MPRS is a language independent Phone Recognition System (PRS) that could recognise the phonetic units present in a speech utterance of any language. We have developed two bilingual and one quadrilingual PRS using four Indian languages: Kannada, Telugu, Bengali, and Odia. International...

Keywords:
TRAINING
SPEECH RECOGNITION

Content availability

Available (1,462)
None (6)

Keywords

SPEECH (1,071)
HIDDEN MARKOV MODELS (756)
ACOUSTICS (384)
FEATURE EXTRACTION (377)
ACCURACY (187)
SPEECH PROCESSING (172)
MEL FREQUENCY CEPSTRAL COEFFICIENT (160)
DATABASES (147)
DATA MODELS (137)
SPEAKER RECOGNITION (136)
NEURAL NETWORKS (132)
ARTIFICIAL NEURAL NETWORKS (130)
COMPUTATIONAL MODELING (126)
NATURAL LANGUAGE PROCESSING (124)
TRAINING DATA (123)
SUPPORT VECTOR MACHINES (117)
AUTOMATIC SPEECH RECOGNITION (116)
TESTING (100)
VOCABULARY (91)
EMOTION RECOGNITION (90)
DATA MINING (87)
MATHEMATICAL MODEL (86)
ADAPTATION MODELS (85)
HIDDEN MARKOV MODEL (82)
DECODING (79)
ADAPTATION MODEL (78)
NOISE (78)
LEARNING (ARTIFICIAL INTELLIGENCE) (74)
ERROR ANALYSIS (69)
HMM (65)
CONTEXT (64)
CLASSIFICATION ALGORITHMS (57)
MAXIMUM LIKELIHOOD ESTIMATION (57)
GAUSSIAN PROCESSES (55)
PATTERN CLASSIFICATION (54)
LATTICES (53)
NEURAL NETS (53)
ROBUSTNESS (49)
CEPSTRAL ANALYSIS (48)
MFCC (48)
NOISE MEASUREMENT (48)
VECTORS (48)
DISCRIMINATIVE TRAINING (45)
PROBABILITY (45)
MACHINE LEARNING (43)
OPTIMIZATION (43)
STATISTICAL ANALYSIS (43)
KERNEL (41)
DICTIONARIES (40)
RECURRENT NEURAL NETWORKS (40)
TRANSFORMS (39)
SIGNAL TO NOISE RATIO (38)
DEEP NEURAL NETWORK (37)
LANGUAGE MODEL (37)
ACOUSTIC MODELING (36)
CONTEXT MODELING (36)
CORRELATION (36)
DEEP NEURAL NETWORKS (35)
NEURONS (35)
VISUALIZATION (34)
ENTROPY (33)
SUPPORT VECTOR MACHINE (33)
NATURAL LANGUAGES (32)
ACOUSTIC SIGNAL PROCESSING (31)
EQUATIONS (31)
SPEECH CODING (31)
SPEECH SYNTHESIS (30)
SPEECH ENHANCEMENT (29)
SUPPORT VECTOR MACHINE CLASSIFICATION (29)
GAUSSIAN MIXTURE MODEL (28)
VECTOR QUANTIZATION (28)
ESTIMATION (27)
NIST (27)
PATTERN RECOGNITION (27)
ROBUST SPEECH RECOGNITION (27)
SIGNAL CLASSIFICATION (27)
COMPUTERS (26)
SPEAKER IDENTIFICATION (25)
ACOUSTIC MODEL (24)
ALGORITHM DESIGN AND ANALYSIS (24)
COMPUTER ARCHITECTURE (24)
HUMANS (24)
PRINCIPAL COMPONENT ANALYSIS (24)
SPEECH EMOTION RECOGNITION (24)
DETECTORS (23)
STANDARDS (23)
COVARIANCE MATRIX (22)
MULTILAYER PERCEPTRONS (22)
TEXT ANALYSIS (22)
VITERBI ALGORITHM (22)
CLUSTERING ALGORITHMS (21)
LANGUAGE MODELING (21)
NEURAL NETWORK (21)
SIGNAL PROCESSING (21)
SPEAKER VERIFICATION (20)
WORD ERROR RATE (20)
ASR (19)
CONFERENCES (19)

INFONA - science communication portal
