The application of Information Retrieval (IR) techniques to software traceability link recovery has been the focus of many studies. These studies have formulated the task of establishing valid trace links between two types of software artifacts as a retrieval problem, where one type of artifact is selected as the set of queries and the other as the corpus. Previous work selected the sets of queries and...
We argue that verbose queries used for software retrieval contain many terms that follow specific discourse rules, yet hinder retrieval. We report the results of an empirical study on the effect of removing such terms from verbose queries in the context of Text Retrieval-based concept location. In the study, we remove terms from 424 queries, generated from bug reports of nine open source systems....
In this project we propose a new approach for emotion recognition using web-based similarity (e.g. confidence, PMI and PMING). We aim to extract basic emotions from short sentences with emotional content (e.g. news titles, tweets, captions), performing a web-based quantitative evaluation of semantic proximity between each word of the analyzed sentence and each emotion of a psychological model (e.g...
Many researchers use convolutional neural networks with small rectangular filters for music (spectrograms) classification. First, we discuss why there is no reason to use this filter setup by default and second, we point out that more efficient architectures could be implemented if the characteristics of the music features are considered during the design process. Specifically, we propose a novel design...
The article describes a computational experiment and further research work in the area of identifying destructive information influence in social networks. The problem of distribution of suicidal content via open sources is presented. On the basis of calculations, conclusions were drawn about the prospects of using the methods of information retrieval in the task of identification of the...
Information Retrieval (IR) identifies trace links based on textual similarities among software artifacts. However, the vocabulary mismatch problem between different artifacts hinders the performance of IR-based approaches. A growing body of work addresses this issue by combining IR techniques with code dependency analysis such as method calls. However, so far the performance of combined approaches...
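As a minimal sketch of the idea that IR identifies trace links from textual similarity, the snippet below ranks candidate artifacts against a query artifact using cosine similarity over raw term counts. It is not the method of any paper listed here; the function names, artifact names, and sample texts are all hypothetical, and plain term frequency is an assumption standing in for whatever weighting a real study would use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_trace_links(query, corpus):
    """Rank corpus artifacts by textual similarity to the query artifact."""
    q = Counter(tokenize(query))
    scores = [
        (name, cosine_similarity(q, Counter(tokenize(text))))
        for name, text in corpus.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical requirement (query) and code artifacts (corpus).
requirement = "The user can reset a forgotten password by email"
artifacts = {
    "PasswordReset.java": "reset password send email token user",
    "InvoicePrinter.java": "print invoice pdf total tax",
}
ranked = rank_trace_links(requirement, artifacts)
```

An artifact that shares no terms with the query scores zero, which is the vocabulary mismatch problem in its simplest form.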
A search system that allows the users to search and find the most interesting software artifacts based on the current context of the user is highly desirable. This paper sets forth the requirements of contextual search for software engineering. A context for software engineering is defined by four dimensions in. A contextual search system is presented to address the requirements. We conclude that...
LSI and LDA are widely used techniques to uncover the underlying topical structure of text. They traditionally rely on bag-of-words representation of documents and term frequency-based (TF) weighting schemes. In this paper, we represent documents as graph-of-words to capture the relationships between close words and propose the number of contexts of co-occurrences as alternative term weights (TW)...
Measuring the similarity between strings plays an increasingly important role in many applications such as information retrieval, short answer grading, and conversational agent software. There has been much recent research interest in applying string similarity within Arabic language applications; however, the use of string similarity in Arabic poses substantial challenges, such as the complexity...
Relevance is one of the most interesting topics in the information retrieval domain. In this paper, we introduce another method of relevance calculation. We propose to use the implicit opinion of users to calculate relevance. The implicit judgment of users is injected into the documents by calculating different kinds of weighting. These weightings cover several criteria, such as the user's weight in the query's...
Among the enormous variety of data in recent years, transportation data contain significant potential for understanding the information requirements and intention of passengers. In this paper, we propose a new information ranking method for passenger intention prediction and service recommendation. The method includes three main features, which include (1) predicting the intention of a user based...
In recent years, the fast growth of Web pages and the constant evolution of internet technologies have led to a significant increase in the number of pedagogical resources. Thus, the indexing and search problems have become crucial. To overcome this problem, it was proposed to use information coming from the norms and standards of educational metadata. However, this solution does not solve completely...
This paper presents a method named SoSVMRank, which integrates the social information of a Web document to generate a high-quality summarization. In order to do that, the summarization was formulated as a learning to rank task, in which the order of a sentence or comment was determined by its informative information. The informative information was measured by a set of local and social features in...
This paper describes a new kind of knowledge representation and mining system which we are calling the Semantic Knowledge Graph. At its heart, the Semantic Knowledge Graph leverages an inverted index, along with a complementary uninverted index, to represent nodes (terms) and edges (the documents within intersecting postings lists for multiple terms/nodes). This provides a layer of indirection between...
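To illustrate the idea of terms as nodes and intersecting postings lists as edges, here is a minimal sketch, assuming a toy in-memory inverted index. It does not reproduce the Semantic Knowledge Graph itself and omits the uninverted index; the function names and sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it (its postings list)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def edge(index, term_a, term_b):
    """Return the documents shared by two terms: the intersection of their postings lists."""
    return index.get(term_a, set()) & index.get(term_b, set())


# Hypothetical documents; ids and texts are illustrative only.
docs = {
    1: "java developer spring",
    2: "java barista coffee",
    3: "spring coffee",
}
index = build_inverted_index(docs)
shared = edge(index, "java", "spring")
```

A non-empty intersection means the two terms are connected through at least one document, and the size of the intersection can serve as a simple edge weight.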
Healthcare practitioners are increasingly using search functionality embedded in Electronic Medical Record (EMR) software to search for relevant evidence summaries at point of care. We introduce a learning to rank approach that exploits information carried in EMR data and UpToDate user accounts to (significantly) improve ranking results, compared to a comparable model that does not exploit such features.
Similar bugs are bugs that require handling of many common code files. Developers can often fix similar bugs in a shorter time and with higher quality since they can focus on fewer code files. Therefore, similar bug recommendation is a meaningful task which can improve development efficiency. Rocha et al. propose the first similar bug recommendation system named NextBug. Although NextBug performs better...
This paper presents a method for semantic class disambiguation for all words. Unlike the ordinary word sense disambiguation, a set of semantic classes or coarse grained senses is defined as a common sense inventory, then universal classifiers to select an appropriate semantic class of a target word in a given context, which can be applicable to all words, are trained by supervised learning. In the...
When using Information Retrieval (IR) systems, users often present search queries made of ad-hoc keywords. It is then up to the information retrieval systems (IRS) to obtain a precise representation of the user's information need and the context (preferences) of the information. To address this problem, we investigate optimization of IRS to individual information needs in order of relevance. The goal...
Search engines have turned out to be the most effective tools for acquiring useful information from the Internet. Different techniques are employed for searching desired documents from the World Wide Web. Most of them either change the entered query or apply different ranking methods to adapt the search results to a particular person. This paper introduces a system (PSQCR) that produces personalized...
The main problem of rule-based information extraction techniques is that the extraction rules tend to be designed for specific information or document structures; hence they cannot be directly used on another without proper modifications. Semi-structured documents like tables present another challenge to information extraction; since there are no standards on how to design them, the structure...