Items from 1 to 18 out of 18 results

chapter

Part-of-speech labeling for Reuters database

R. Cretulescu, A. David, D. Morariu, L. Vintan

2015 19th International Conference on System Theory, Control and Computing (ICSTCC) > 117 - 122

2015 19th International Conference on System Theory, Control and Computing (ICSTCC)

Even though the Vector Space Model used for document representation in information retrieval systems integrates only a small quantity of knowledge, it continues to be used because of its computational cost, execution speed and simplicity. We try to improve this document representation by adding some syntactic information such as the parts of speech. In this paper, we have evaluated three different tagging algorithms...

chapter

ASR error detection and recognition rate estimation using deep bidirectional recurrent neural networks

Atsunori Ogawa, Takaaki Hori

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4370 - 4374

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Recurrent neural networks (RNNs) have recently been applied as the classifiers for sequential labeling problems. In this paper, deep bidirectional RNNs (DBRNNs) are applied for the first time to error detection in automatic speech recognition (ASR), which is a sequential labeling problem. We investigate three types of ASR error detection tasks, i.e. confidence estimation, out-of-vocabulary word detection...

chapter

Unnecessary utterance detection for avoiding digressions in discussion

Riki Yoshida, Takuya Hiraoka, Graham Neubig, Sakriani Sakti, more

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 4

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

In this paper, we propose a method for avoiding digressions in discussion by detecting unnecessary utterances and having a dialogue system intervene. The detector is based on the features using word frequency and topic shifts. The performance (i.e. accuracy, recall, precision, and F-measure) of the unnecessary utterance detector is evaluated through leave-one-dialogue-out cross-validation. In the...

chapter

Automatic pronunciation error detection of nonnative Arabic Speech

Afnan Al Hindi, Mansour Alsulaiman, Ghulam Muhammad, Saad Al-Kahtani

2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA) > 190 - 197

2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA)

Computer assisted language learning (CALL) and, more specifically, computer assisted pronunciation training (CAPT) have received considerable attention in recent years. CAPT allows continuous feedback to the learner without requiring the sole attention of the teacher; it facilitates self study and encourages interactive use of the language in preference to rote learning. One of the important processes...

chapter

Automatic pitch accent detection using auto-context with acoustic features

Junhong Zhao, Wei-Qiang Zhang, Hua Yuan, Jia Liu, more

2012 8th International Symposium on Chinese Spoken Language Processing > 247 - 251

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

In the field of prosody event detection, many local acoustic features have been proposed for representing the prosody characteristics of a speech unit. The context information that represents some possible regularities underlying neighboring prosody events, however, hasn't been used effectively. The main difficulty in utilizing prosodic context is that it's hard to capture the long-distance sequential dependency...

chapter

Automatic speaker role labeling in AMI meetings: Recognition of formal and social roles

Ashtosh Sapru, Fabio Valente

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5057 - 5060

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

This work aims at investigating the automatic recognition of speaker role in meeting conversations from the AMI corpus. Two types of roles are considered: formal roles, fixed over the meeting duration and recognized at recording level, and social roles related to the way participants interact between themselves, recognized at speaker turn level. Various structural, lexical and prosodic features as...

chapter

Automatic error region detection and characterization in LVCSR transcriptions of TV news shows

Richard Dufour, Geraldine Damnati, Delphine Charlet

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4445 - 4448

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper addresses the issue of error region detection and characterization in LVCSR transcriptions. It is a well-known phenomenon that errors are not independent and tend to co-occur in automatic transcriptions. We are interested in automatically detecting these so-called error regions. Additionally, in the context of information extraction in TVBN shows, being able to automatically characterize...

chapter

Robust speaker turn role labeling of TV Broadcast News shows

Geraldine Damnati, Delphine Charlet

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5684 - 5687

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Speaker role recognition in TV Broadcast News shows is addressed in this paper with a particular focus on speaker turn role labeling. A mixed approach combining speaker clustering and analysis of Automatic Speech Recognition output is proposed for assigning speaker turns a role among: anchor, reporter and other. 86% classification accuracy is obtained for automatically segmented speaker turns on a...

chapter

Using spoken utterance compression for meeting summarization: A pilot study

Fei Liu, Yang Liu

2010 IEEE Spoken Language Technology Workshop > 37 - 42

2010 IEEE Spoken Language Technology Workshop (SLT 2010)

Most previous work on meeting summarization focused on extractive approaches; however, directly concatenating the extracted spoken utterances may not form a good summary. In this paper, we investigate if it is feasible to compress the transcribed spoken utterances and if using the compressed utterances benefits meeting summarization. We model the utterance compression task as a sequence labeling problem,...

chapter

Active Learning and Semi-Supervised Learning in Tibetan Language Speech Recognition

Xiuqin Pan, Yongcun Cao, Yong Lu

2010 International Conference on Artificial Intelligence and Computational Intelligence > 1 > 369 - 372

2010 International Conference on Artificial Intelligence and Computational Intelligence (AICI 2010)

A key challenge in rapidly building Tibetan language speech recognition applications is minimizing the manual effort required in transcribing and labeling speech data. Accurate labeling of Tibetan speech utterances is extremely time consuming and requires trained linguists. To alleviate this problem, we present an approach that aims at reducing the amount of manually transcribed speech data required...

chapter

Fast construction of speech recognition model based on sample selection strategy

Xiuqin Pan, Yue Zhao, Yongcun Cao

2010 IEEE International Conference on Automation and Logistics > 515 - 517

2010 IEEE International Conference on Automation and Logistics (ICAL)

In the process of building speech recognition models, accurate labeling of speech utterances is extremely time consuming and requires trained linguists. To quickly build speech recognition models in some industrial applications, we present a novel sample selection strategy that can use very few labeled speech utterances to construct an effective recognition model. The experimental results show...

chapter

Tibetan Language Speech Recognition Model Based on Active Learning and Semi-Supervised Learning

Xiuqin Pan, Yongcun Cao, Yong Lu, Yue Zhao

2010 10th IEEE International Conference on Computer and Information Technology > 1225 - 1228

2010 IEEE 10th International Conference on Computer and Information Technology (CIT)

In research on Tibetan language speech recognition, accurate labeling of Tibetan speech utterances is extremely time consuming and requires trained linguists. To alleviate this problem, we present an approach that can use few labeled Tibetan speech utterances to construct an effective recognition model. The experimental results show that our approach has better performance than traditional...

chapter

Support Vector Machine for Chinese Part-Of-Speech Tagging in Speech Synthesis Systems

Xiang Wang, Jianping Zhang, Yonghong Yan

2010 International Conference on Biomedical Engineering and Computer Science > 1 - 4

International Conference on Biomedical Engineering and Computer Science (ICBECS 2010)

The paper presents support vector machine based Part-Of-Speech tagging on a Chinese database, which is part of our speech synthesis system. The model can be classified as an SVM model and uses many sequential features to predict the POS tag. The text database was downloaded from the internet and contains 1,280,000 words and 33 parts of speech. The total accuracy of our experiments is 99.31%.

chapter

Unsupervised broadcast conversation speaker role labeling

Brian Hutchinson, Bin Zhang, Mari Ostendorf

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5322 - 5325

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

We present an approach to unsupervised speaker role labeling in talk show data that makes use of two complementary sets of features: structural features that encode the participation patterns of speakers, and lexical features, which capture characteristic phrases. Techniques for using multiple clusterings are explored, leading to more robust results. Experiments on English and Mandarin talk shows...

chapter

Unknown example detection for example-based spoken dialog system

S. Takeuchi, H. Kawanami, H. Saruwatari, K. Shikano

2009 Oriental COCOSDA International Conference on Speech Database and Assessments > 122 - 125

2009 Oriental COCOSDA International Conference on Speech Database and Assessments

In a spoken dialog system, the example-based response generation method generates a response by searching a dialog example database for the example question most similar to an input user utterance. That method has the advantage of ease of system expansion. It requires, however, a number of utterance examples whose correct responses are labeled. In this paper, we propose an approach to reducing the...

chapter

Rapid unsupervised adaptation using context independent phoneme model

S. Kobashikawa, A. Ogawa, Y. Yamaguchi, S. Takahashi

2009 IEEE 13th International Symposium on Consumer Electronics > 209 - 212

2009 IEEE 13th International Symposium on Consumer Electronics (ISCE)

Users require rapid and highly accurate speech recognition systems. Accuracy could be improved by unsupervised adaptation as provided by CMLLR (Constrained Maximum Likelihood Linear Regression). CMLLR-based batch-type unsupervised adaptation estimates a single global transformation matrix by utilizing unsupervised labeling; unfortunately, it needs prior labeling and so is not rapid. Our proposed technique...

chapter

Prosody Study with Context-Dependent Acoustic Models

Yue-Ning Hu, Min Chu

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

2008 6th International Symposium on Chinese Spoken Language Processing

In this paper, we propose to study prosody with context-dependent acoustic models. We find that we can achieve better resolution on a specific aspect by training CDM with a certain focus. For the tone recognition task, CDM with a focus on tones should be used, and it achieves a 15.2% relative error reduction compared with the traditional tri-phone models. For detecting prosody boundaries, CDM with...

chapter

Image Segmentation Based on Fussing Multi-feature and Spatial Spectral Clustering

S. P. Gou, P. J. Chen, X. Y. Yang, L. C. Jiao

2008 Congress on Image and Signal Processing > 3 > 667 - 671

International Congress on Image and Signal Processing (CISP 2008)

A new method for image feature extraction and segmentation is proposed in this paper. Abundant contour feature information of the image is expressed by the contourlet transform, while the texture features of the image are described by the wavelet transform and Gray Level Co-occurrence Matrix (GLCM). The three types of feature information compose the feature matrix. The presented method describes different image information...

Keywords:
ACCURACY
SPEECH
LABELING
