In data mining, link prediction in networks is one of the areas of greatest interest today. Results on the link prediction problem can be applied in many fields, such as studying genetically transferred diseases, online marketing, e-commerce services, discovering the structure of criminal networks, and friend requests in social networks. However, most researchers have focused on predicting the existence...
Thyroid nodules are common findings and thyroid cancer is projected to be one of the leading causes of cancer in women. The EHR includes the data needed to connect clinical research with patient outcomes. The objective of this project was to develop and validate a usable informatics tool for clinicians and researchers to record, analyze, and manipulate the clinical and research...
With the rapid advances in IoT technologies, the role of IoT gateways becomes even more important. Therefore, improving the reliability, availability and serviceability (RAS) of IoT gateways is crucial. Nowadays, Linux is widely adopted for core enterprise systems not only because it is a free operating system but also because it offers advantages with regard to operational stability. With many Linux...
This paper presents a fast, low-complexity neural network for estimating wastewater indices that are difficult to measure. The model, called a soft sensor, is composed of two parts: one estimates the principal linear output, and the other corrects the estimation error to obtain better accuracy. The selection of features, which greatly affects computational scale and prediction accuracy, is also discussed...
A symptom is the physical indication of an unstable state or the beginning of diseases. Symptom analysis is an essential factor in the medical area, where it is used for disease diagnosis, drug prescription, and the development of new pharmaceuticals. Commensurate with its importance, symptom analysis has been the subject of various studies in recent years. However, prior literature on this topic...
Event relation knowledge is important for deep language understanding and inference. Previous work has established automatic acquisition methods for event relations that focus on common-sense knowledge acquisition from large-scale unlabeled corpora. However, in the case of domain-specific knowledge acquisition, such a method cannot acquire much knowledge due to the limited amount of available knowledge...
Many promising malware research projects focus on malware behaviour analysis; however, in the end they tend to build new detection systems and stick to measuring detection ratios. Our approach focuses on malware behavioural analysis for defining (characterising) malicious software at a rather high level of abstraction, in order to break the endless cycle of evolving malware and malware analysts trying...
In this paper, we formally prove that the classification rules formed on the basis of contrast patterns are guaranteed to be of high quality. We propose to use the new ‘Sets of Contrasting Rules’ pattern for the identification of local differences between the classes of a dataset. Being essentially a contrast pattern formed of several classification rules, the ‘Sets of Contrasting Rules’ pattern is...
In this paper, we propose a new iris feature extraction method that uses local thresholding with block size fitting to achieve a reliable iris authentication technique. The dispersion index is used to analyze the degree of variation in the pixel intensities. By calculating the pixel intensity variance for the block size, it is possible to quantify the degree of contrast and brightness in the block...
This paper proposes combining data mining and natural language processing technology to analyze students' learning behavior and content in the interactive parts of MOOCs, to uncover their learning interests, difficulties, and tendencies, and to evaluate the effectiveness of their homework, through the interaction between teachers and students, student posts, homework or answer content, prevention of cheating behavior,...
For the multi-index decision problem with uncertain information, this paper introduces the definition of the interval distance of three-parameter interval grey numbers, proposes a relative degree of grey incidence based on this interval distance, constructs a grey incidence decision-making model with three-parameter interval grey numbers, measures the relative degree...
A wireless sensor network supports reliable monitoring of a given network based on the data transmitted from its sensors. However, this is feasible only when the delivered data are fairly reliable and undamaged. Since data errors are not uncommon in wireless sensor networks, the validity of decision making strongly depends on the stability of the data analysis...
The detection of duplicate bug reports can help reduce the processing time of handling field crashes. This is especially important for software companies with a large client base, where multiple customers can submit bug reports caused by the same faults. There exist several techniques for the detection of duplicate bug reports; many of them rely on some sort of classification technique applied to...
With the arrival of the big data era, data mining techniques have been widely used to build models for cyber security applications such as spam filtering, malware or virus detection, and intrusion detection. This project proposes a novel approach that uses randomness to improve the robustness of data mining models used in cyber security applications against attacks that try to evade detection by adapting...
The following paper investigates a multilevel approach to data integration using the widely accepted Consensus Theory. We focus on an issue related to an initial classification of raw input data into groups that can be integrated in parallel. A final consensus is a result of the integration of obtained partial outcomes. Our main research concerns an application of Fleiss' kappa value, which in the...
Modern mining approaches should be able to properly deal with the increased availability of structured data. Here we focus on the problem of processing streams of trees. Specifically, we cope with classification tasks. We show that by adopting a double concept drifting reaction mechanism in the context of a kernel-based ensemble of classifiers, it is actually possible to have an effective and efficient...
Continuous Integration (CI) implies that a whole developer team works together on the mainline of a software project. CI systems automate software builds. Sometimes a developer checks in code that breaks the build. A broken build might not be a problem by itself, but it has the potential to disrupt co-workers, and hence it affects the performance of the team. In this study, we investigate the...
Methods for cleaning dirty data typically rely on additional information about the data, such as user-specified constraints that specify when a database is dirty. These constraints often involve domain restrictions and illegal value combinations. Traditionally, a database is considered clean if all constraints are satisfied. However, many real-world scenarios only have a dirty database available...
Electricity theft occurs around the world in both developed and developing countries and may range up to 40% of the total electricity distributed. More generally, electricity theft belongs to non-technical losses (NTL), which occur during the distribution of electricity in power grids. In this paper, we build features from the neighborhood of customers. We first split the area in which the customers...
Context: Cyber-physical systems (CPS) seamlessly integrate computational and physical components. Adaptability, realized through feedback loops, is a key requirement to deal with uncertain operating conditions in CPS. Objective: We aim at assessing state-of-the-art approaches to handle self-adaptation in CPS at the architectural level. Method: We conducted a systematic literature review by searching four...