In this paper, we present a novel method for predicting gene functional interactions. We study the effectiveness of various raw and derived features from neural word embeddings learned from biomedical literature. Our evaluation results demonstrate that the information captured in neural word embeddings is very useful and that our learned classification models are capable of predicting gene functional interactions...
During maintenance, software developers deal with numerous change requests made by the users of a software system. Studies show that developers find it challenging to select appropriate search terms from a change request during concept location. In this paper, we propose a novel technique, QUICKAR, that automatically suggests helpful reformulations for a given query by leveraging the crowdsourced...
The Semantic Web aims at building a Web where data is enriched with meaningful annotations. In other words, data is semantically organized in such a way that both humans and machines can understand and query it, aiming at the creation of dynamic Web pages. Ontologies, as a keystone of the Semantic Web, have gained wide acceptance as an information model, which can be used for several purposes, such...
Filtering and finding items of interest in large volumes of data, such as products in an e-commerce application or invoices in an ERP web platform, can be a burdensome task, both for novice users who lack insight into how the data is modeled and for experienced users who are accustomed to the system but whose filtering needs are usually significantly more complex. Natural language processing...
The Multiple Listing Service, commonly known as the MLS, is the single most important database where real estate agents and brokers list real estate properties for sale. Agents commonly include textual comments pertinent to the property. Although the information content of comments varies, it is usually expressed in good faith and in many cases is helpful in shedding light on the overall...
Learning vocabulary is both critical and boring for many EFL (English as a foreign language) learners, and one major difficulty is the lack of suitable context for learning new words. Although videos can provide a rich context for learning vocabulary, learning from them often takes more time than learning from textual media. This paper presents the EVOV system, which aims to exploit the role of video in vocabulary learning...
This paper investigates the problem of defining the acoustic-phonetic unit set for flexible-vocabulary continuous speech recognition systems. As an alternative to the classical modeling approach with biphones and triphones, a set of stationary/transitory state units is defined that is limited enough in number to represent a closed set trainable once and for all. A major benefit of these units is...
The article presents a case study of applying data cleansing methods and segmentation procedures to correct and enhance the structure of a fire service domain corpus. We present our approach to correcting misspellings and the results obtained, as well as the method used to segment the corpus into sentences.
In this paper, we present a new scheme for symbol retrieval based on relational bag-of-features (BOFs), which are computed between the extracted visual primitives. Our feature consists of pairwise spatial relations from all possible combinations of individual visual primitives. The key characteristic of the overall process is to use topological information to guide directional relations. Consequently,...
Model checking problems for first-order and monadic second-order logic on graphs have received considerable attention in the past, not least due to their connections to problems in algorithmic graph structure theory. While the model checking problem for these logics on general graphs is computationally intractable, it becomes tractable on important classes of graphs such as those of bounded tree-width,...
When codeword frequency meets geographical location in landmark search applications, is it still discriminative for the search procedure? In this paper, we give a systematic investigation of how geographical location affects the effectiveness of codeword frequency. We explain why the standard IDF in BoW models is less effective in location-related search applications [11][12]. Consequently,...
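As background for the abstract above, the standard inverse document frequency (IDF) it refers to can be sketched in a few lines. This is only the textbook weighting, log(N / df), not the location-aware variant the paper investigates; the function name `inverse_document_frequency` and the toy codeword lists are assumptions made for illustration.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Return {codeword: log(N / df)} for a list of codeword lists.

    N is the number of documents (here: images) and df is the number of
    documents that contain the codeword at least once.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # Count each codeword once per document, however often it occurs.
        doc_freq.update(set(doc))
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


# Hypothetical toy data: three images described by quantized codewords.
images = [["w1", "w2"], ["w1", "w3"], ["w1", "w2", "w2"]]
idf = inverse_document_frequency(images)
```

A codeword that occurs in every image, such as `w1` here, receives weight zero, while rarer codewords receive larger weights. The weight depends only on how many images contain the codeword and takes no account of where those images were taken.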
In this paper, we report ongoing research on what we call locally complete XML databases. These data banks consist of an XML data tree together with a specification of the nodes of the tree that are completely represented in the document. The specification of completeness is defined by means of the de facto standard node selection language for XML, XPath, which makes this approach easy to implement...
The size of the publicly indexable World Wide Web (WWW) has probably surpassed 14.3 billion documents, and as yet growth shows no sign of leveling off. Search engines encounter the problem of ambiguity in words; therefore, search engines use ontologies to find pages with words that are syntactically different but semantically similar. The knowledge provided by an ontology is extremely useful in defining...
We consider the task of merging datasets that have been organized using different but aligned taxonomies. We assume such a merge is intended to create a single dataset that unambiguously describes the information in the source datasets using the alignment. We also assume that the merged result should reflect the observations of the datasets as specifically as possible. Typically, there will be no...
This paper investigates using lexical cohesion to generate a moderately fluent semantic summary from a collection of documents written in Chinese. Based on a cohesion analysis algorithm that uses the relationships among words in the HowNet knowledge database, the system computes concept frequency rather than word frequency as a measure of importance. It merges the analysis of lexical...
This paper first proposes a multi-agent architecture to mediate access to data sources. The mediator follows the classical approach to process user queries. However, in the background, it post-processes query results to gradually construct matchings between the export schemas and the mediated schema. The central theme of the paper is an extensional schema matching strategy based on similarity functions...