Network attack graphs are an analysis tool that can be used to determine the impact that security vulnerabilities have on a network. It is important, then, for attack graphs to be able to represent enough information to aid this analysis. Moreover, they must be able to handle and integrate new vulnerabilities as they are discovered by the security community. We developed a prototype tool...
The aim of this paper is to introduce a semantic methodology that uses an ontology to improve the results of data mining over a database of judicial decisions. An intelligent, automatic method for finding sentences in lawsuits related to the one on trial is presented. A judicial ontology is built both with and without rules from experts. The method can improve judicial celerity, seeking to solve the yearning...
The aim of this paper is to show the strengths and weaknesses of process mining tools in post-delivery validation. This is illustrated with two use cases from a real-world system. We also indicate what kind of research is needed to make process mining tools more usable for validation purposes.
Developing new ideas and algorithms or comparing new findings in the field of requirements engineering and management requires a dataset to work with. Collecting the required data is time-consuming, tedious, and may involve unforeseen difficulties. The need for datasets often forces researchers to collect data themselves in order to evaluate their findings. However, comparing results with other publications...
In this work, we analyze the usefulness of the normalized compression distance (NCD) as a similarity measure for bird species identification from audio samples. As a first approach, we review the effect of different compression methods from 7z and the CompLearn Toolkit over subsets of bird audio samples obtained from the xeno-canto database. The performance of each compression method was measured by applying...
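The NCD referenced in this abstract has a standard definition based only on compressed lengths. As a minimal sketch, using Python's standard zlib compressor as a stand-in for the 7z and CompLearn methods the paper actually evaluates:

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(s) is the compressed length of s.
    """
    cx = len(zlib.compress(x, 9))
    cy = len(zlib.compress(y, 9))
    cxy = len(zlib.compress(x + y, 9))
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Similar inputs yield values near 0 and dissimilar ones values near 1; for species identification, the audio samples (or some byte encoding of them) would be compared pairwise and the smallest distances taken as candidate matches.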
The authors discuss the problem of distributed knowledge acquisition for the construction of complete and consistent knowledge bases in integrated expert systems, in particular dynamic integrated expert systems, via the sharing of knowledge sources of different types (databases as electronic media, experts, and problem-oriented texts). This work is focused on models and methods of distributed knowledge...
In this study, we present a method for extracting and representing the knowledge of presentation slide creators based on slide contents published on a slide-sharing service. The proposed method regards the number of views, downloads, and likes from other users as the users' rating of a presentation slide, and extracts the slide creator's knowledge in terms of the usefulness and knowledge...
Recruitment and selection of new employees rank among the important processes of human potential management and development. The process of employee selection in particular prepares the conditions for successful work performance and determines the future development of the organization. In the unique sector of private security, the precise realization of employee selection can solve one of the most...
Nowadays, in the period of digitalization and the development of the knowledge economy, the majority of activities result in an increase in data that should be captured. In every area of business, there is a growing need to extract knowledge from data in a timely manner, in order to make the shifts that ensure a competitive advantage. Thus, knowledge of the methods and techniques of big data processing...
The Unified Modeling Language (UML) is widely taught in academia and has good acceptance in industry. However, no ample dataset of UML diagrams is publicly available. Our aim is to offer a dataset of UML files, together with metadata of the software projects to which the UML files belong. To that end, we have systematically mined over 12 million GitHub projects to find UML files in them....
In Data Mining (DM) projects, more specifically in the Data Understanding and the Data Preparation phases, several techniques found in the literature are used to detect and handle data quality problems such as missing data, outliers, inconsistent data or time-variant data. However, the main limitation in the application of these techniques is the complexity caused by a lack of anticipation in the...
The identification of vulnerabilities relies on detailed information about the target infrastructure. Gathering the necessary information is a crucial step that requires intensive scanning or mature expertise and knowledge about the system, even when the information is already available in a different context. In this paper we propose a new method to detect vulnerabilities that reuses...
As the popularity of mobile smart devices continues to climb, the complexity of “apps” continues to increase, making the development and maintenance process challenging. Current bug tracking systems lack key features to effectively support the construction of reports with actionable information that leads directly to a bug’s resolution. In this demo we present the implementation of a novel bug reporting...
Organizations are collecting and storing vast and growing amounts of generated event data. Within the field of Process Mining, this data has been used to discover, analyze, and enhance processes from different domains. For this purpose, hundreds of techniques are available in different tools. These techniques mostly focus on single processes. On the other hand, there are several proposals...
Many studies analyze issue tracking repositories to understand and support software development. To facilitate such analyses, we share a Mozilla issue tracking dataset covering a 15-year history. The dataset includes three extracts and multiple levels for each extract. The three extracts were retrieved through two channels: a front-end (web user interface (UI)) and a back-end (official database dump)...
Stack Overflow is a popular question answering site that is focused on programming problems. Despite efforts to prevent asking questions that have already been answered, the site contains duplicate questions. This may cause developers to unnecessarily wait for a question to be answered when it has already been asked and answered. The site currently depends on its moderators and users with high reputation...
Many software development projects have introduced mandatory code review for every change to the code. This means that the project needs to devote a significant effort to review all proposed changes, and that their merging into the code base may get considerably delayed. Therefore, all those projects need to understand how code review is working, and the delays it is causing in time to merge. This is...
Software regression testing verifies existing features of a software product when it is modified or new features are added to it. Because of its nature, regression testing is a costly process. Different approaches have been proposed to reduce the costs of this activity, among them minimization, prioritization, and selection of test cases. Recently, soft computing techniques, such as data...
This paper gives details about a web-based department automation system to be implemented at the educational-institution level for maintaining faculty details and records. The proposed application aims at providing an efficient and hassle-free working environment for the faculty of the organization, as it reduces the amount of paperwork involved. This system is based on the modern approach of data mining...
Software tends to evolve over time, and so does its test-suite. Regression testing aims to assess that software evolution has not compromised the working of the existing software components. However, as the software and consequently the test-suite grow in size, executing the entire test-suite for each new build becomes infeasible. Techniques like test-suite selection, test-suite minimisation...