In this paper, we explore the task of creating an intelligent information system for content formation. We developed general recommendations for the design of such a system. Our system has special modules for processing information resources. We created software for the formation, control, and implementation of content and ran experiments on “Online newspaper” and “Online magazine” systems...
In this paper we study the problem of key phrase extraction from short texts written in Russian. As texts we consider messages posted on Internet car forums related to the purchase or repair of cars. The main assumption made is: the construction of lists of stop words for key phrase extraction can be effective if performed on the basis of a small, expert-marked collection. The results show that even...
The clustering of web resources is a significant and difficult task. Previous work has suggested ways to unify web information sources, trying to solve problems like heterogeneity of content, lack of structure, availability, distribution, quantity and quality. However, some domains are more complex than others, and there the solution is an even more cumbersome problem, due to the quantity and variety...
With the rapid development of the Internet, how to obtain valuable information from massive volumes of messages has become a major problem to be solved in the era of information explosion. This paper introduces the development route of information extraction technology, and discusses four categories of Chinese entity relation extraction technologies in depth. Finally, the advantages and disadvantages of different...
Within the context of the ISO/IEEE 11073 family of standards for point-of-care (PoC) medical device communication, a communication protocol specification for a distributed system of PoC medical devices and medical IT systems that need to exchange data, or safely control networked PoC medical devices by profiling Web Service specifications, is defined by this standard. Additional Web Service specifications...
The article describes a computational experiment and further research work in the area of identifying destructive information influence in social networks. The problem of the distribution of suicidal content via open sources is presented. On the basis of calculations, conclusions were drawn about the prospects of using information retrieval methods in the task of identification of the...
A recommender system is an information system that predicts the intent of a user by observing the behavior of all users. The idea of collaborative filtering lies in producing a set of recommendations based on similarity as well as knowledge of users' relationships to items. In this paper, we combine some traditional similarity metrics to find three types of similar users which are super similar,...
The classification of text documents into a number of pre-defined categories has many application scenarios, for example the classification of news items into thematic sections. Documents to be classified are commonly represented by a bag-of-words feature vector. The bag-of-words model cannot handle two language phenomena: synonymy and polysemy. Besides, the dimensions of feature vectors are orthogonal...
We study decentralized searches in large-scale, self-organized peer-to-peer networks and investigate the influences of network size and degree distribution (neighborhood size) on search efficiency. Experimental results show that searches are efficient and scalable in large networks, especially with large neighborhood sizes (degrees). Analysis of the data supports a proposed scalability model, in which...
Relation discovery is a crucial task in the ontology learning process. The classical approaches to relation extraction, based on statistical, syntactical or pattern matching techniques, typically focus on the taxonomic aspect. The discovery of non-taxonomic relationships is often neglected. We extend these approaches by taking into account the document structure, which bears additional knowledge. This...
The K-12 learning space is evolving both in the United States and internationally. Students are given increasingly frequent access to the internet through various platforms such as desktop computers, laptops, tablets, and other mobile devices. Some schools are distributing mobile devices to students in order to facilitate the integration of technology in the classroom. These devices have a web filter...
Question answering (QA) is the task of automatically answering a question posed in natural language. It is applied to several domains and is a specific type of information retrieval with three components: question processing, information retrieval, and answer extraction. By analysing the user question, we intend to improve the precision of question answering systems by focusing namely...
Orientation plays an increasingly important role in determining students' futures, which makes it necessary to automate it and make it accessible, both in terms of immediate availability throughout the world and in offering several languages that correspond to the language used by the majority of students or persons in need of guidance. These problems can...
When using Information Retrieval (IR) systems, users often present search queries made of ad-hoc keywords. It is then up to the information retrieval systems (IRS) to obtain a precise representation of the user's information need and the context (preferences) of the information. To address this problem, we investigate optimization of IRS to individual information needs in order of relevance. The goal...
This paper presents preliminary results of a research project that aims to combine ontology information retrieval technology and process mining tools. The ontologies describing both data domains and data sources are used to search news on the Internet and to extract facts. Process mining tools allow finding regularities and relations between single events or event types to construct formal models...
The innovative brand “The internet of Me” is a recent research area that highlights the prevalence of personalization across the internet and focuses on user habits and actions tracked from their interaction with web content. This paradigm presents an efficient way to define the user experience and preferences useful for e-commerce, marketing, social, and search purposes. In this paper we are interested...
Synonym-based searching is considered to be a complicated problem, as text mining from unstructured web data is challenging. Finding useful information that matches the user's need in the bulk of web pages is a cumbersome task. In this paper, a novel and practical synonym retrieval technique is proposed for addressing this problem. For replacement of semantics, user intent is taken into consideration...
Social networks enable knowledge sharing, which inevitably raises the question of expertise analysis. Many online profiles claim expertise, but possessing true expertise is rare. We characterize expertise as projected expertise (claims of a person), perceived expertise (how the crowd perceives the individual) and true expertise (factual). StockTwits, an investor-focused microblogging platform, allows...
We evaluate the suitability of latent and explicit semantic spaces of documents for Information Retrieval (IR) tasks using a dataset obtained from the Q&A community Stackexchange. In addition, the ability of the latent semantic spaces to reconstruct human relevance judgments is explored. The latent semantic spaces are generated with Latent Dirichlet Allocation (LDA), while explicit semantic spaces...
One of the most crucial problems in any Natural Language Processing (NLP) task is the representation of time. This includes applications such as Information Retrieval (IR) techniques, Information Extraction (IE) and Question Answering (QA) systems. This paper deals with temporal information involving several forms of inference in the Arabic language.