Search results

Items from 1 to 20 out of 708 results

chapter

Vietnamese news classification based on BoW with keywords extraction and neural network

Toan Pham Van, Ta Minh Thanh

2017 21st Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES) > 43 - 48

Text classification (TC) has become one of the main applications of natural language processing (NLP). Many methods have been studied for classifying text documents, such as Random Forest, Support Vector Machines and Naive Bayes. However, most of them have been applied to English documents, so research on Vietnamese text classification is still limited. By using a Vietnamese news corpus,...

chapter

Evaluating automatic methods to extract patients' supplement use from clinical reports

Yadan Fan, Lu He, Rui Zhang

2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) > 1258 - 1261

The widespread prevalence of dietary supplements has drawn extensive attention due to safety and efficacy concerns. Clinical notes document a great amount of detailed information on dietary supplement usage, thus providing a rich source for clinical research on supplement safety surveillance. Identifying the use status of dietary supplements is one of the initial steps for the ultimate goal of...

chapter

Sentiment analysis on Twitter data with semi-supervised Doc2Vec

Metin Bilgin, Izzet Fatih Senturk

2017 International Conference on Computer Science and Engineering (UBMK) > 661 - 666

Twitter is one of the most popular microblog sites developed in recent years. Sentiment in the messages shared on Twitter is analysed so that users' opinions of products and companies can be determined. Sentiment analysis helps companies to improve their products and services based on the feedback obtained from users through Twitter. In this study, the aim was to perform sentiment analysis on...

chapter

Experience Report: Log Mining Using Natural Language Processing and Application to Anomaly Detection

Christophe Bertero, Matthieu Roy, Carla Sauvanaud, Gilles Tredan

2017 IEEE 28th International Symposium on Software Reliability Engineering (ISSRE) > 351 - 360

Event logging is a key source of information on a system's state. Reading logs provides insight into its activity, helps assess whether its state is correct, and allows problems to be diagnosed. However, reading does not scale: with the number of machines steadily rising and systems growing more complex, the task of auditing systems' health based on logfiles is becoming overwhelming for system administrators. This...

chapter

Multilingual cyberbullying detection system: Detecting cyberbullying in Arabic content

Batoul Haidar, Maroun Chamoun, Ahmed Serhrouchni

2017 1st Cyber Security in Networking Conference (CSNet) > 1 - 8

In the era of the Internet and electronic devices, bullying has shifted from schools and backyards into cyberspace; it is now known as cyberbullying. Children in Arab countries suffer from cyberbullying just as children worldwide do, so concerns about cyberbullying are rising. A lot of research has been done to handle this situation. The current research is focusing on...

chapter

Semi-supervised approach for Persian word sense disambiguation

Mohamadreza Mahmoodvand, Maryam Hourali

2017 7th International Conference on Computer and Knowledge Engineering (ICCKE) > 104 - 110

Word-sense disambiguation is one of the key concepts in natural language processing. The main goal of a language is to present a specific concept to the audience. This concept is extracted from the meaning of words in that language. A system should be able to identify the role and meaning of words in order to identify the concepts in texts properly. This issue becomes more problematic if there are words...

chapter

Design and implementation of Word2Vec parallel algorithm based on HPC

Xianyong Yi, Rongge Zheng, Aoyu Wang, Hao Qin, more

2017 Chinese Automation Congress (CAC) > 585 - 590

Word2Vec (Word to Vector) processes natural language by calculating the cosine similarity. However, the serial algorithm of original Word2Vec fails to satisfy the demands of training of corpus text because of the explosive growth of information. It has become the bottleneck owing to its comparatively low processing efficiency. The High Performance Computing (HPC) specializes in improving the calculation...

chapter

Semantic processing through distributed representation of Chinese words

Wenhui Wu, Zhiting Xiao, Yanni Wang, Xuan Guo, more

2017 Chinese Automation Congress (CAC) > 2214 - 2216

In this paper, by learning the origin of the word distributed representation, knowing the distributed representation is one of the bridges of natural language processing mapping to mathematical calculations. Through the learning distributed representation model: neural network language model, CBOW model and Skip-gram model, the advantages and disadvantages of each model are clarified. Through the...

chapter

Deep learning methods for subject text classification of articles

Piotr Semberecki, Henryk Maciejewski

2017 Federated Conference on Computer Science and Information Systems (FedCSIS) > 357 - 360

This work presents a method of classification of text documents using deep neural network with LSTM (long short-term memory) units. We have tested different approaches to build feature vectors, which represent documents to be classified: we used feature vectors constructed as sequences of words included in the documents, or, alternatively, we first converted words into vector representations using...

chapter

Towards Accurate Duplicate Bug Retrieval Using Deep Learning Techniques

Jayati Deshmukh, Annervaz K. M, Sanjay Podder, Shubhashis Sengupta, more

2017 IEEE International Conference on Software Maintenance and Evolution (ICSME) > 115 - 124

Duplicate Bug Detection is the problem of identifying whether a newly reported bug is a duplicate of an existing bug in the system and retrieving the original or similar bugs from the past. This is required to avoid costly rediscovery and redundant work. In typical software projects, the number of duplicate bugs reported may run into the order of thousands, making it expensive in terms of cost and...

chapter

Investigating the Use of Code Analysis and NLP to Promote a Consistent Usage of Identifiers

Bin Lin, Simone Scalabrino, Andrea Mocci, Rocco Oliveto, more

2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM) > 81 - 90

Meaningless identifiers as well as inconsistent use of identifiers in the source code might hinder code readability and result in increased software maintenance efforts. Over the past years, effort has been devoted to promoting a consistent usage of identifiers across different parts of a system through approaches exploiting static code analysis and Natural Language Processing (NLP). These techniques...

chapter

Chunking based malayalam paraphrase identification using unfolding recursive autoencoders

R. Praveena, M Anand Kumar, K. P. Soman

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 922 - 928

Paraphrase Detection is the task of examining if two sentences convey the same meaning or not. Here, in this paper, we have chosen a sentence embedding by unsupervised RAE vectors for capturing syntactic as well as semantic information. The RAEs learn features from the nodes of the parse tree and chunk information along with unsupervised word embedding. These learnt features are used for measuring...

chapter

A learning method for coreference resolution using semantic role labeling features

G Veena, Deepa Gupta, Anna Neethu Daniel, S. Roshny

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 67 - 72

Coreference resolution plays a significant role in natural language processing systems. It is the task of identifying all the noun phrases that refer to the same real-world entity. Several studies have addressed noun phrase coreference resolution using machine learning techniques. Our paper proposes a machine learning approach using support vector machines (SVM) towards...

chapter

Detecting stance in kannada social media code-mixed text using sentence embedding

V. Srinidhi Skanda, M. Anand Kumar, K.P. Soman

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 964 - 969

Popular social media sites like Facebook, Twitter, YouTube etc. have become platforms for users to express their stance towards a target entity. The target entity can be a person, a location, an organization, a new government policy etc. The stance expressed by the user may be in favor of the target entity, against it, or sometimes neutral. For the first time, we present a stance detection system implemented...

chapter

A comparison of different part-of-speech tagging technique for text in Bahasa Indonesia

Ahmad Zuli Amrullah, Rudy Hartanto, I Wayan Mustika

2017 7th International Annual Engineering Seminar (InAES) > 1 - 5

Several methods exist for part-of-speech tagging, the problem of assigning a part-of-speech tag to each word of a text. In this paper, we experimented with part-of-speech tagging techniques for Bahasa Indonesia using statistical approaches (Unigram, Hidden Markov Models) and Brill's tagger. In this study, we used a supervised POS tagging approach requiring a large number of...

chapter

Learning Profiles in Duplicate Question Detection

Chakaveh Saedi, Joao Rodrigues, Joao Silva, António Branco, more

2017 IEEE International Conference on Information Reuse and Integration (IRI) > 544 - 550

This paper presents the results of systematic and comparative experimentation with major types of methodologies for automatic duplicate question detection when these are applied to datasets of progressively larger sizes, thus allowing to study the learning profiles of this task under these different approaches and evaluate their merits. This study was made possible by resorting to the recent release...

chapter

Sentiment analysis of student feedback using machine learning and lexicon based approaches

Zarmeen Nasim, Quratulain Rajput, Sajjad Haider

2017 International Conference on Research and Innovation in Information Systems (ICRIIS) > 1 - 6

2017 5th International Conference on Research and Innovation in Information Systems (ICRIIS)

This paper presents a combination of machine learning and lexicon-based approaches for sentiment analysis of student feedback. The textual feedback, typically collected towards the end of a semester, provides useful insights into the overall teaching quality and suggests valuable ways for improving teaching methodology. The paper describes a sentiment analysis model trained using TF-IDF and lexicon-based...

chapter

Convolutional neural networks and multimodal fusion for text aided image classification

Dongzhe Wang, Kezhi Mao, Gee-Wah Ng

2017 20th International Conference on Information Fusion (Fusion) > 1 - 7

With the exponential growth of web meta-data, exploiting multimodal online sources via standard search engine has become a trend in visual recognition as it effectively alleviates the shortage of training data. However, the web meta-data such as text data is usually not as cooperative as expected due to its unstructured nature. To address this problem, this paper investigates the numerical representation...

chapter

Prioritized active learning for malicious URL detection using weighted text-based features

Sreyasee Das Bhattacharjee, Ashit Talukder, Ehab Al-Shaer, Pratik Doshi

2017 IEEE International Conference on Intelligence and Security Informatics (ISI) > 107 - 112

Data analytics is being increasingly used in cyber-security problems, and found to be useful in cases where data volumes and heterogeneity make it cumbersome for manual assessment by security experts. In practical cyber-security scenarios involving data-driven analytics, obtaining data with annotations (i.e. ground-truth labels) is a challenging and known limiting factor for many supervised security...

chapter

Cost-Efficient Quality Assurance of Natural Language Processing Tools through Continuous Monitoring with Continuous Integration

Marc Schreiber, Bodo Kraft, Albert Zundorf

2016 IEEE/ACM 3rd International Workshop on Software Engineering Research and Industrial Practice (SER&IP) > 46 - 52

More and more modern applications make use of natural language data, e.g. Information Extraction (IE) or Question Answering (QA) systems. These applications require preprocessing through Natural Language Processing (NLP) pipelines, and the output quality of these applications depends on the output quality of NLP pipelines. If NLP pipelines are applied in different domains, the output quality decreases...

Keywords:
TRAINING
NATURAL LANGUAGE PROCESSING

INFONA - science communication portal
