Background: Many relevancy filters have been proposed to select training data for building cross-project defect prediction (CPDP) models. However, up to now, there is no consensus about which relevancy filter is better for CPDP. Goal: In this paper, we conduct a thorough experiment to compare nine relevancy filters proposed in the recent literature. Method: Based on 33 publicly available data sets,...
We develop an intelligent credit rating system that can provide debtors' rating information without involving credit rating agencies. Several models are used for credit scoring in our work, including Duffie's model, logistic regression, and random forest. We compare the performance of these models and build an in-depth understanding of the evaluation of credit rating. Furthermore, we propose a...
Based on an analysis of the disadvantages of traditional KNN, a lazy learner that classifies data directly by majority vote among the classes of the K nearest neighbors, a new Sigmoid-weighted classification algorithm, WKS (Weighted KNN Based On Sigmoid), is proposed. WKS provides a new method for learning and training, since each training data d_i ∈ D contributes to the correct classification...
Electrical systems with photovoltaic (PV) modules require special design considerations because of unpredictable and sudden changes in weather conditions, such as the solar irradiation level and the cell operating temperature. Therefore, this study presents a practical and reliable approach for the prediction of PV power output using an intelligent technique, namely the Cuckoo Search Algorithm -...
The enhancement of speech degraded by the non-stationary noise types that typify real-world conditions has remained a challenging problem for several decades. However, the recent use of data-driven methods for this task has brought great performance improvements. In this paper, we develop a speech enhancement framework based on the extreme learning machine. Experimental results show that the proposed...
This paper presents a novel approach for remaining useful life (RUL) prediction of rotating machinery using hierarchical deep neural networks (DNN). The different health stages are classified by a DNN-based health stage classifier trained on segmented degradation signals. This method builds several RUL predictors based on the health stages of the degradation process. Instead of modeling the entire...
In this paper, we propose a classification model for learning state based on individual biometric data. In particular, we use pupil size as the biometric data, collected from 72 participants. We also deploy the support vector machine (SVM) in conjunction with k-fold validation as an analysis tool. In order to improve the performance of the SVM, we remove outliers from the...
Impressive image captioning results are achieved in domains with plenty of training image and sentence pairs (e.g., MSCOCO). However, transferring to a target domain with significant domain shifts but no paired training data (referred to as cross-domain image captioning) remains largely unexplored. We propose a novel adversarial training procedure to leverage unpaired data in the target domain. Two...
We propose a new method for rank-level fusion in biometric identification. Our method applies the pool adjacent violators (PAV) algorithm after the ranks have been transformed to approximate scores. We then show that our method outperforms various approaches that are commonly used in biometric rank-level fusion on the NIST BSSR1 multimodal database.
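For readers unfamiliar with the pool adjacent violators algorithm named above, the following is a minimal generic sketch of PAV as least-squares isotonic regression. It is not the authors' fusion method: the abstract does not say how ranks are converted to scores or how PAV is applied to them, so the function name `pool_adjacent_violators` and the sample inputs are assumptions made purely for illustration.

```python
from typing import List, Sequence


def pool_adjacent_violators(values: Sequence[float]) -> List[float]:
    """Return the non-decreasing sequence closest to `values` in least squares.

    Generic PAV sketch (hypothetical helper, not taken from the paper).
    """
    # Each block holds [sum of pooled values, number of pooled values].
    blocks: List[List[float]] = []
    for v in values:
        blocks.append([float(v), 1])
        # Pool backwards while the previous block's mean exceeds the last block's mean.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    # Expand every block back to one fitted value per original element.
    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))
    return fitted


if __name__ == "__main__":
    # The adjacent violators 3 and 2 are pooled to their mean, 2.5.
    print(pool_adjacent_violators([1, 3, 2, 4]))
```

The stack of blocks keeps the procedure linear in the input length, because each value is pushed once and merged at most once.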
The amount of data is increasing rapidly in today's era. Data can take the form of text, pictures, voice, and video. Social media is one factor in this increase, as everybody expresses themselves, gives opinions, and even complains on social media. The first step is data collection using the Twitter API with the name of each candidate in the Jakarta Governor Election. The collected data then become the input for the preprocessing step...
In this paper we present a human emotion classification technique based on EEG signals from a single electrode. We designed an EEG experiment in which we collected training data. Applying the Daubechies 8 wavelet (db8), delta, theta, alpha, beta and gamma wave signals are obtained by decomposition and used as codewords for a Bag-of-Words model. We represented each signal with its codeword histogram and...
Keystroke dynamics is a biometric characteristic that depends on the typing style of users. In the past thirty years, dozens of classifiers have been proposed for distinguishing people using keystroke dynamics; many have obtained excellent results in evaluation. However, a more common case is that only normal instances are available and none of the rare classes are observed. This leads us to use...
Traditional machine learning approaches are based on the premise that the training and testing samples come from a common probability distribution. Transfer learning refers to situations where this assumption does not necessarily hold. Integrating biological data measured on diverse platforms is a major challenge. Transfer learning is a natural candidate for achieving such integration. In this paper,...
As web services spread around the world, new threats are increasing. A misuse intrusion detection system cannot provide enough protection for web services, because it only detects previously known attacks and cannot detect new, unknown ones. Web logs contain a lot of valuable information that is useful in preventing intrusion. In this paper, we present a...
Facial recognition is a challenging problem in the areas of image processing and machine learning. Since its widespread applications make facial recognition a valuable research topic, this work tries to develop new facial recognition systems that have both high recognition accuracy and fast running speed. Efforts are made to design facial recognition systems by combining different algorithms. Comparisons...
Among the dependency parsing algorithms available in MALTParser and MSTParser, the best accuracy for parsing the Indonesian language is achieved by the Chu-Liu-Edmonds algorithm. This is due to the long-distance relation between head and dependent in Indonesian sentences. Most inaccurate parsing results are caused by the non-verb sentence root score, as there are many cases of Indonesian sentences having a...
Color is one of the attributes that play a role in identifying specific objects. Color processing includes extracting information about the spectral properties of an object's surface and finding the best match among a set of known descriptions in order to perform recognition. Therefore, correct classification of Fuji apples is needed to obtain good-quality fruit. The fuzzy model is one...
With the evolution of large-scale computer data, every corner of society is filled with a variety of text information. Indeed, manual management of such large volumes of information can no longer keep pace with the rapid development of society. Therefore, technology for the efficient management and accurate positioning of vast quantities of text information has become a hot topic in the research community. In...
The Case-Based Reasoning (CBR) model has been widely used to solve problems in various cases. This study aims to explain the implementation of the K-Nearest Neighbor (KNN) algorithm in the Case-Based Reasoning model. The research showed that the KNN algorithm is suitable for use in the CBR model. The results of this study measure the accuracy level of automatic answer identity formation and search...
This paper addresses the problem of decision making when there is no knowledge, or only very vague knowledge, about the probability models associated with the hypotheses. Such scenarios occur, for example, in the Internet of Things (IoT), environmental surveillance and data analytics. The probability models are learned from the data by empirical distributions that provide an accurate approximation of the true model. Hence,...