This paper addresses the problem of automated extraction of structured data records from web pages. In particular, we focus on the extraction of posts from online forum sites. We show that variability in the HTML structure within user-generated content in forum posts can negatively affect the extraction accuracy and propose the integration of a deep learning node classifier in the popular Mining Data...
Due to the increasing popularity of cooking-recipe sharing sites and the success of complex network science, attention has recently been devoted to developing an effective network-based method of analyzing the characteristics of ingredient combinations used in recipes. Unlike previous approaches dealing with static properties, we aim at analyzing the dynamical changes in ingredient pairs jointly used...
Mining software repositories has frequently been investigated in recent research. Software modifications in repositories are often recurring changes, that is, similar but different changes across multiple locations. It is not easy for developers to find all the relevant locations to maintain such changes, including bug fixes, new feature additions, and refactorings. Performing recurring changes is tedious and...
Ubiquitous availability of human mobility data has opened up new possibilities to address a multitude of application domains. However, so far, the visual analysis of this data has been hindered by the limited ability to explore and query complex movement sequences and to create models that allow meaningful aggregation. To address this problem, this paper presents a novel analytical approach that allows...
Data Jacket (DJ) is a technique for sharing information about data and for considering the potential value of datasets, with the data itself hidden, by describing the summary of data in natural language. In DJs, variables are described by variable labels (VLs), which are the names/meanings of variables. In the previous study, the matrix-based method for inferring VLs in DJs whose VLs are unknown,...
We present a novel and configurable synthetic data generator for evolving region trajectories that emulates certain characteristics of a given input dataset, such as the spatial position, velocity, lifespan, and geometry shape and size. This tool aims to facilitate faster prototyping and evaluation of new spatiotemporal data mining algorithms that operate on a specific type of trajectory data, of...
This paper presents a novel hierarchical spatiotemporal orientation representation for spacetime image analysis. It is designed to combine the benefits of the multilayer architecture of ConvNets and a more controlled approach to spacetime analysis. A distinguishing aspect of the approach is that unlike most contemporary convolutional networks no learning is involved; rather, all design decisions are...
Stack Overflow is a learning community for software developers to share and solve programming problems with each other. However, women are often deterred from contributing questions or answers. Research external to programming communities suggests the presence of peers can increase activity from underrepresented users in unfamiliar spaces. To investigate the concept of peer parity, we studied how women...
In this paper, a syllabus visualization tool based on a standard curriculum is proposed. To visualize the differences among syllabuses, the proposed tool uses correspondence analysis. Using the proposed tool, we can see the relationship between syllabuses and the standard curriculum (CS2013).
In atmospheric sciences, sizes of data sets grow continuously due to increasing resolutions. A central task is the comparison of spatiotemporal fields, to assess different simulations and to compare simulations with observations. A significant information reduction is possible by focusing on geometric-topological features of the fields or on derived meteorological objects. Due to the huge size of...
Block-based programming environments make learning to program easier by allowing learners to focus on concepts rather than syntax. However, these environments offer little support when learners encounter difficulty with programming concepts themselves, especially in the absence of instructors. Textual programming environments increasingly use AI and data mining to provide intelligent, adaptive support...
Android's flexible communication model allows interactions among third-party apps, but it also leads to inter-app security vulnerabilities. Specifically, malicious apps can eavesdrop on interactions between other apps or exploit the functionality of those apps, which can expose a user's sensitive information to attackers. While the state-of-the-art tools have focused on detecting inter-app vulnerabilities...
Phishers often exploit users' trust in the appearance of a site by using webpages that are visually similar to an authentic site. In the past, various research studies have tried to identify and classify the factors contributing towards the detection of phishing websites. The focus of this research is to establish a strong relationship between those identified heuristics (content-based) and the legitimacy...
As technology advances, Computer Vision plays an important role in enhancing smart computing systems to help people overcome obstacles in their daily lives. One common and troublesome problem is the limit of human memory, especially when it comes to remembering things such as personal items. It is annoying for people to waste their time finding lost items manually by recall or notes. This motivates...
Communication among agile software development teams is knit around the requirements (user stories) and is considered vital for information sharing. Researchers have studied communication among agile teams from various perspectives, including team distribution, distance, and communication patterns. It is worth noting that most of the advances made in this domain are for Scrum teams. However,...
Usability and user experience (UUX) strongly affect software quality and success. User reviews allow software users to report UUX issues. However, this information can be difficult to access due to the varying quality of the reviews, their large number, and their unstructured nature. In this work we propose an approach to automatically detect the UUX strengths and issues of software features according to...
Most transport corporations today already use some business intelligence solutions. However, using advanced data mining methods may result in higher efficiency and an improved travel experience. This paper briefly reviews the potential vendors and technologies to help select the best possible predictive analytics method for transport management purposes.
Correcting the strokes extracted from Chinese characters is a necessary step toward detecting writing errors automatically. Visualization of extracted strokes is the prerequisite for manual correction. Therefore, visualization and adaptive correction methods are proposed. To reduce the cognitive burden of correcting, color, brightness, saturation and order number are comprehensively...
The crime matching process usually involves the tedious, time-consuming, and information-intensive task of eliciting plausible associations among actors in crimes to identify potential suspects. To assist this procedure, this paper demonstrates the use of associative search, a relatively new search mining instrument, to elicit plausible associations from the information. We...
Recognition of vehicle types in real-life traffic scenarios is a challenging task due to the diversity of vehicles and uncontrolled environments. Efficient methods and feature representations are needed to cope with these challenges. In this paper, we address the vehicle type classification problem in real-life traffic scenarios and propose a multimodal method that uses efficient representations of...