In this work, we describe the design, development, and deployment of NEREA (Named Entity Recognizer for spEcific Areas), an automatic Named Entity Recognizer and Disambiguation system, developed in collaboration with professional documentalists. The aim of NEREA is to keep accurate and current information about the entities mentioned in a local repository, and then support building appropriate infoboxes,...
The aim of this study is to construct a system for pre-processing text, including tokenization and correction of statements from social networks, in order to adapt them for further information extraction and sentiment analysis. The correction algorithm uses dictionary methods, statistical methods, and a rule-based approach. The system was tested on 50 statements containing different kinds of errors....
Entity Linking (EL) search and labeling are important research topics with various web applications. The challenge is to find and link the important concepts from web text to online encyclopedia databases, rather than just simple person and place names. This paper presents a new approach to linking concrete concepts from English texts with Wiki entities. Using part-of-speech tagging to detect concrete concepts,...
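The abstract above mentions using part-of-speech tagging to detect concrete concepts before linking them. A minimal sketch of that idea, not the paper's implementation: noun sequences in POS-tagged text are taken as candidate concepts and matched against a toy dictionary standing in for a Wiki entity index (the tokens, tags, and entity ids below are invented for the example).

```python
# Pre-tagged input; a real system would run a POS tagger over raw text.
TAGGED = [("machine", "NN"), ("learning", "NN"), ("improves", "VBZ"),
          ("search", "NN"), ("results", "NNS")]

# Hypothetical entity index; a real system would query an encyclopedia.
WIKI_INDEX = {"machine learning": "Q2539", "search": "Q671718"}

def candidate_concepts(tagged):
    """Group consecutive noun tokens (NN/NNS tags) into candidate phrases."""
    phrases, current = [], []
    for word, tag in tagged:
        if tag.startswith("NN"):
            current.append(word)
        elif current:
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases

def link_concepts(tagged, index):
    """Link candidate phrases (longest sub-spans first) to entity ids."""
    links = {}
    for phrase in candidate_concepts(tagged):
        words = phrase.split()
        for size in range(len(words), 0, -1):
            for start in range(len(words) - size + 1):
                span = " ".join(words[start:start + size])
                if span in index and span not in links:
                    links[span] = index[span]
    return links
```

For the sample input, `link_concepts(TAGGED, WIKI_INDEX)` links the noun phrase "machine learning" and the single noun "search" to their entity ids, while skipping the verb and unindexed nouns.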
Formal concept analysis (FCA) is a mathematical research field, introduced in the early 1980s by Rudolf Wille, that has been applied in several different knowledge areas, including Computer Science. FCA is a data analysis theory that identifies conceptual structures within data sets, or formal contexts. In this work, we propose an FCA-based approach to build minimal implication rules-based computational...
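To make the FCA terminology above concrete: a formal context is a set of objects, a set of attributes, and an incidence relation between them, and a formal concept is an (extent, intent) pair in which each side determines the other. The brute-force enumeration below illustrates this on a tiny invented context; it is not the approach proposed in the paper.

```python
from itertools import combinations

# A tiny formal context: objects and the attributes each one has.
OBJECTS = ["duck", "eagle", "carp"]
ALL_ATTRS = {"flies", "swims"}
ATTRS = {"duck": {"flies", "swims"}, "eagle": {"flies"}, "carp": {"swims"}}

def common_attrs(objs):
    """Attributes shared by every object in objs (all attrs if objs is empty)."""
    sets = [ATTRS[o] for o in objs]
    return set.intersection(*sets) if sets else set(ALL_ATTRS)

def objects_with(attrs):
    """Objects that possess every attribute in attrs."""
    return {o for o in OBJECTS if attrs <= ATTRS[o]}

def formal_concepts():
    """Enumerate (extent, intent) pairs: closing any object set yields one."""
    concepts = set()
    for r in range(len(OBJECTS) + 1):
        for objs in combinations(OBJECTS, r):
            intent = frozenset(common_attrs(list(objs)))
            extent = frozenset(objects_with(intent))
            concepts.add((extent, intent))
    return concepts
```

This context yields four concepts, e.g. ({duck}, {flies, swims}) and ({duck, eagle}, {flies}); real FCA algorithms (such as NextClosure) avoid the exponential enumeration used here for clarity.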
The advent of Web 2.0 gave birth to a new kind of application where content is generated through the collaborative contribution of many different users. This form of content generation is believed to yield data of higher quality, since the “wisdom of the crowds” makes its way into the data. However, a number of specific data quality issues appear within such collaboratively generated data. Apart...
The Internet has created new opportunities for peer-to-peer (P2P) social lending platforms which have the potential to transform the way microfinance institutions (MFIs) raise and allocate funds used for poverty reduction. Depending upon where decision making rights are allocated, there is the potential for identification bias whereby lenders may be motivated to give to specific projects with which...
With the emergence of the Internet, a large quantity of data is generated by the communication network, largely triggered by human activity. Adding to this, emerging technologies such as the Internet of Things (IoT), in which a large number of devices connect to the Internet, are accelerating the rate of data generation. There are also future predictions that the number of devices connected...
The paper presents a category classification of the mobile travel applications currently accessible to tourists in the application stores of the most popular mobile operating systems (Android and iOS). The most interesting category is "Travel Guides", which combines the "Information Resources" and "Location-Based Services" categories. The authors propose the application "Tourist assistant...
Current search engine performance needs to be improved, because the results a search engine suggests are often determined by the popularity of a given page for its associated keywords and do not match specific user expectations. Previous research has indicated that only 20% to 45% of common search results are relevant. The search becomes harder when the keyword used is a homograph, and...
Evolution simulation is a popular problem not only in biology but in computer science as well. This paper will introduce the principles of evolution units. An artificial organism, a digital evolutionary machine (DEM), will be constructed based on these principles. Properties and abilities of DEMs will be shown, along with resolved and pending questions. After the theoretical approach, some experimental...
The main challenge of question answering is that the lack of task structure prohibits the use of simplified assumptions as in task-oriented dialogue systems. This problem was tackled by integrating a dialogue management environment into a question answering system. Firstly, Wizard of Oz studies were conducted to discover how users describe their music information needs in contextual situations as...
This work presents a study of web programming on an integrated system, emphasizing the facilities offered by the environment of the SAP integrated system. Because the SAP NetWeaver Application Server has a three-tier architecture, it is possible to develop web business applications of the ABAP (Advanced Business Application Programming) and/or Java type. The web...
This paper proposes a new layer for a novel IPTV service focused on the distribution of personalized multimedia content over IP networks, based on the concept of content-zapping in contrast to traditional channel-zapping. Our system aims to continuously aggregate the multimedia content available from its providers and suggest it efficiently to its end users. Thus, the system will allow users...
Web sites are often a mixture of static pages and programs that integrate relational databases as a back-end. As they evolve to meet ever-changing user needs, new versions of programs, interactions, and functionalities may be added and existing ones may be removed or modified. Web sites require configuration and programming attention to assure the security, confidentiality, and trust of the published information...
Due to the importance of high-quality customer service, many companies use intelligent helpdesk systems (e.g., case-based systems) to improve customer service quality. However, these systems face two challenges: 1) Case retrieval measures: most case-based systems use traditional keyword-matching-based ranking schemes for case retrieval and have difficulty capturing the semantic meanings of cases...
The Internet has gained huge popularity over the last decade. It offers its users reliable, efficient, and exciting online services. However, users reveal a lot of their personal information by using these services. Websites that collect information state their data practices in their privacy policies. However, it is difficult to ensure whether the policies are properly enforced in these practices
For better user personalization, this paper explores a method for collecting and processing multi-source, heterogeneous data on network users' behavior, and analyzes network user behavior in context using the theory of Situation Awareness. Then, a new efficient algorithm that uses Internet context is presented to support the user-interest model. Further, a structure model...
In recent years, mid-roll advertisements, which insert a video advertisement in the middle of the video content, have gradually appeared. However, mid-roll advertisements usually interrupt video viewing because the advertisement is inserted at a fixed or random time. To solve this issue, we propose an algorithm for mid-roll advertisement insertion based on audience comments. The algorithm determines...
The proliferation of the deep Web offers users a great opportunity to search for high-quality information on the Web. As a necessary step in deep Web data integration, the goal of duplicate entity identification is to discover duplicate records across the integrated Web databases for further applications (e.g., price-comparison services). However, most existing works address this issue only between two data...
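The duplicate entity identification task described above can be illustrated with a common baseline technique (not the method of the paper, whose description is truncated): records whose token-level Jaccard similarity exceeds a threshold are flagged as referring to the same real-world entity. The product records and the 0.6 threshold below are invented for the example.

```python
def tokens(record):
    """Lower-cased word tokens of a record string."""
    return set(record.lower().split())

def jaccard(a, b):
    """Jaccard similarity of two token sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def find_duplicates(records, threshold=0.6):
    """Return index pairs of records treated as the same entity."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if jaccard(tokens(records[i]), tokens(records[j])) >= threshold:
                pairs.append((i, j))
    return pairs

# Toy integrated catalog from two hypothetical Web databases.
catalog = [
    "Canon EOS 90D DSLR Camera",
    "canon eos 90d dslr camera body",
    "Nikon D7500 DSLR Camera",
]
```

Here `find_duplicates(catalog)` pairs the two Canon listings (similarity 5/6) and leaves the Nikon record alone; production systems typically add blocking to avoid the quadratic pairwise comparison.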
The Web has been flooded with highly heterogeneous data sources that freely offer their data to the public. Careful design and compliance to standards is a way to cope with the heterogeneity. However, any agreement and compliance is practically hard to achieve across different communities. In this work we describe a framework that enables the exploitation of content across different scientific disciplines...