This paper reports the investigations and experimental procedures conducted to design an automatic sleep classification tool based only on features extracted with wavelets from EEG, EMG and EOG (electroencephalogram, electromyogram and electrooculogram) signals, without any visual aid or context-based evaluation. Real data collected from infants were processed and classified by several traditional and bio-inspired...
Defining a boundary between inliers and outliers is a major challenge in unsupervised outlier detection. In the absence of labeled data, the true outlier set cannot be evaluated. This places the burden on both the choice of an efficient outlier detection criterion and parameter selection. While numerous unsupervised outlier detection criteria, with different parameters, have been proposed, an unsupervised...
There are numerous problems of increasing significance where a pattern can be associated with several classes simultaneously. Problems of this kind, usually called multi-label problems, should be tackled with specific techniques in order to generate models more accurate than those obtained with classical classification algorithms. This work presents the adaptation of the J48 algorithm to multi-label...
The selection of a particular neural network model belonging to the Pareto front is a problem that exists in all multi-objective algorithms. This paper proposes a novel solution to this problem based on a linear combination of the outputs of the two extremes in the Pareto front, which form an ensemble. The decision support TOPSIS method is used to determine which linear combination creates the best...
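The core idea of combining the outputs of two models can be illustrated with a minimal sketch. It assumes each extreme model emits a class-probability vector and that the combination is convex with a single weight `alpha`; these details, the function names, and the example numbers are illustrative assumptions, not taken from the paper. The TOPSIS step that selects the weight is not shown.

```python
def combine_outputs(p_a, p_b, alpha):
    """Convex combination of two models' class-probability vectors.

    alpha is a hypothetical mixing weight in [0, 1]; how the paper
    chooses it (via TOPSIS) is not reproduced here.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p_a, p_b)]


def predict(p_a, p_b, alpha):
    """Return the index of the class with the highest combined output."""
    combined = combine_outputs(p_a, p_b, alpha)
    return max(range(len(combined)), key=combined.__getitem__)


# Made-up outputs of the two extreme models for one pattern.
p_extreme_1 = [0.7, 0.3]
p_extreme_2 = [0.2, 0.8]

balanced = predict(p_extreme_1, p_extreme_2, alpha=0.5)
skewed = predict(p_extreme_1, p_extreme_2, alpha=0.9)
```

With these numbers the equal-weight ensemble picks class 1, while weighting the first model heavily picks class 0, which is why the choice of weight matters.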
This paper presents a method of applying text mining techniques and data mining tools to detect pharmaceutical spam in Twitter data. A simple method based on a manually selected list of 65 pharmaceutical discriminating words is used for labeling spam training tweets. Preliminary experimental results show that the J48 decision tree classifier outperforms the Naïve Bayesian algorithm.
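Keyword-based labeling of this kind can be sketched as follows. The word list below is a hypothetical stand-in, since the abstract does not give the 65 words, and the tokenization rule and label names are likewise assumptions made for illustration.

```python
import re

# Hypothetical stand-in for the paper's 65-word list, which the abstract does not give.
PHARMA_WORDS = {"viagra", "pills", "pharmacy", "prescription"}


def label_tweet(text, keywords=PHARMA_WORDS):
    """Label a tweet 'spam' if it contains any keyword, otherwise 'ham'."""
    # Lowercase and split into simple word tokens before matching.
    tokens = set(re.findall(r"[a-z0-9']+", text.lower()))
    return "spam" if tokens & keywords else "ham"


tweets = [
    "Cheap pills online, no prescription needed",
    "Lovely weather in Krakow today",
]
labels = [label_tweet(t) for t in tweets]
```

Tweets labeled this way could then serve as training data for a classifier such as a decision tree.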
Spinocerebellar ataxia type 2 (SCA2) is an autosomal dominant hereditary cerebellar ataxia with the highest prevalence in Cuba. Typical symptoms in SCA2 patients include changes in the latency, peak velocity, and deviation of visual saccadic movements. After applying electro-oculography-based tests to both healthy and SCA2-afflicted individuals, differences in saccade morphology were found,...
This paper studies the suitability of Extreme Learning Machines (ELM) for resolving bioinformatic and biomedical classification problems. In order to test their overall performance, an experimental study is presented based on five gene microarray datasets found in bioinformatic and biomedical domains. The Fast Correlation-Based Filter (FCBF) was applied in order to identify salient expression genes...
The classification of imbalanced data is a well-studied topic in data mining. However, there is still a lack of understanding of the factors that make the problem difficult. In this work, we study the two main reasons that make the classification of imbalanced datasets complex: overlapping and data fracture. We present a Genetic Programming-based feature extraction method driven by Rough Set Theory...
A set of distributed continual range query requests, each defining a geographical region of interest, needs to be periodically reevaluated to provide up-to-date answers. Processing these continual queries efficiently and incrementally becomes important for location-based services and applications. In this paper, we propose an efficient incremental method for continuous range query characterized by...
This paper discusses the application of two unsupervised methods to classifying soil types. Soils that are suitable for agricultural activities can be classified into four classes: hill soil, organic soil, alteration soil and alluvium soil. In addition, no specific support system is able to classify the type of soil and retrieve the information for location and suitable plants for local...
Real-life data sets often suffer from the problem of class imbalance, which thwarts the supervised learning process. In such data sets, examples of the positive (minority) class are far fewer than those of the negative (majority) class, leading to severe class imbalance. Constructing high-quality classifiers for such imbalanced training data sets is one of the major challenges in machine learning, since...
In this work we consider the situation with exact classes and fuzzy information on object features. The classification error is presented for the two-class Bayes classifier. The results are obtained under full probabilistic information. The new upper bound on the probability of error is twice as precise as the bound based on the information energy of fuzzy events.
In this paper we propose a rough classification modeling algorithm based on Ant Colony Optimization (ACO) reduction. We use ACO to compute the rough set reduct, and then employ a modified rule generation method to generate the classification rules. The rule generation algorithm used is a simplification of the Default Rules Generation Framework (DRGF), adapted to fit the ACO reduct....
Accurate Signature Detection Classification (SDC) models are in high demand for Intrusion Detection Systems (IDS) to secure networks, especially when dealing with large and complex security audit data sets. Selecting appropriate network features is one of the factors that influence the accuracy of an SDC model. Past research has shown that the Hidden Markov Chain, Genetic Algorithm,...
This paper proposes a new feature-selection strategy by integrating the Rough Set Theory (RST) and Particle Swarm Optimisation (PSO) algorithms to generate a set of discriminatory features for the classification problem. The proposed method is seen as a marriage between filter and wrapper approaches in which the RST is used to pre-reduce the feature set before optimisation by PSO, a meta-heuristic...
Individual protection, whether physical or mental, is very important for anyone living in a risky environment. Insurance is one form of individual protection against accidents, fire, critical illness or death. Insurance companies play a critical role in providing competitive insurance products with flexible features that depend on customer requirements. In order to compete with other competitors and fulfill...
This paper deals with the problem of designing efficient classifiers for a special case of incremental concept drift. We focus on classification based on a multiple classifier system. For the problem under consideration, we propose four simple combination methods and evaluate them via computer experiments.
Combined pattern recognition is a promising direction in designing effective classifier systems. There are several approaches to collective decision-making; among them, voting methods, where the decision is a combination of individual classifiers' outputs, are quite popular. This article focuses on the problem of designing a fuser that uses the continuous outputs of individual classifiers to make a decision...
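The difference between voting on class labels and fusing continuous outputs can be shown with a small sketch. It assumes each classifier returns a support vector over the classes and uses a plain unweighted mean as the fuser; the fuser design studied in the article is not reproduced here, and all names and numbers are illustrative.

```python
from collections import Counter


def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


def majority_vote(supports):
    """Each classifier votes for its top class; the most common vote wins."""
    votes = [argmax(s) for s in supports]
    return Counter(votes).most_common(1)[0][0]


def mean_fuser(supports):
    """Average the continuous supports per class, then pick the top class."""
    n_classes = len(supports[0])
    mean = [sum(s[c] for s in supports) / len(supports) for c in range(n_classes)]
    return argmax(mean)


# Made-up supports from three classifiers for a two-class problem.
supports = [[0.9, 0.1], [0.45, 0.55], [0.4, 0.6]]

voted = majority_vote(supports)
fused = mean_fuser(supports)
```

Here two weakly confident classifiers outvote one strongly confident classifier under majority voting, while the mean fuser follows the confident one, so the two rules disagree on this pattern.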
Feature selection is a very important preprocessing step in data classification. By applying it we are able to reduce the dimensionality of the problem by removing redundant or irrelevant data. High-dimensional data sets are becoming common nowadays, especially in bioinformatics, biology, signal processing and text classification, increasing the need for efficient feature selection methods. In this paper...
In this paper, we present an approach to automatically extract and classify opinions in texts. We propose a similarity measure that calculates semantic distances between a word and predefined subgroups of seed words. We have evaluated our algorithm on the corpus of the semantic evaluation campaign “SemEval 2007”, and we obtained best Precision and F1 values of 62% and 61%, respectively. As an improvement of 20%...