Visual Analytics (VA) is a discipline that integrates computational and human efforts, allowing for effective data exploration with interactive and insightful user interfaces. Human-Data Interaction (HDI) is a relatively new term that we interpret here as the interactive interface between a human and the visual representation of the data, analogous to human-computer interaction. Advanced user interfaces...
We present a framework to analyze various aspects of models for video question answering (VideoQA) using customizable synthetic datasets, which are constructed automatically from gameplay videos. Our work is motivated by the fact that existing models are often tested only on datasets that require excessively high-level reasoning or mostly contain instances accessible through single frame inferences...
Currently, front-end web developers spend countless hours overcoming programming challenges while debugging unexpected asynchronous behaviors, writing code to interact with a framework's API, or fixing faults. Such problems demand rethinking programming tools, and for that, we systematically analyzed 301 posts from Stack Overflow, and sought to identify the programming activities developers struggled...
Natural language questions are inherently compositional, and many are most easily answered by reasoning about their decomposition into modular sub-problems. For example, to answer “is there an equal number of balls and boxes?” we can look for balls, look for boxes, count them, and compare the results. The recently proposed Neural Module Network (NMN) architecture [3, 2] implements this approach to...
Digital organizations are now ruled by the flexibility offered by remote working environments. Being able to work from different places with digital tools very often empowers people not only to perform their work better and faster, but also to reach higher levels of satisfaction in balancing work and private life. However, from an organizational perspective, there are concerns regarding...
Computer programming concepts are inherently abstract and intangible, so students form weak mental models of them and find them hard to grasp intuitively. This remains a key factor in the general underperformance of students in computer programming courses. The pedagogical use of metaphors is widely acknowledged as a means of addressing this challenge. As...
In this paper, we experimentally examine the relationship between visual cognition difficulty and target-tracking eye movements, which are recorded during moving-target cognition. Generally, such eye movements are observed when humans perceive a moving object, and they vary widely due to many factors, such as target shape, background, and illumination conditions. Several systems have been proposed...
In this paper we report on our study of the performance of Deep Reinforcement Learning (DRL) agents in performing tasks that are illustrative for human Sensor Operators (SOs) in Remotely Piloted Aircraft Systems (RPASs). Our hypothesis is that the descriptive and predictive qualities of the agent's learning process potentially allow us to identify human task requirements, training needs, selection...
While formal mathematical reasoning is the cornerstone of computer science, undergraduates often fail to appreciate the value of mathematical proof in their studies. To alleviate this problem, we propose a novel pedagogy uniting logical reasoning with proofs of program correctness along with a proof assistant, ORC2A, that helps students author proofs in this domain. One of the defining features of...
Multiple-choice (MC) questions are an important form of test for assessing students' academic achievement, especially in e-learning applications. However, the classical evaluation metrics for MC questions (such as the correctness ratio) consider only the correctness of the final selection and ignore the solving progress of the testee. In the existing literature, the eye-tracking based visual attention...
Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes. As a result, these black-box models often learn to exploit biases in the data rather than learning to perform visual reasoning. Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program...
In visual question answering (VQA), an algorithm must answer text-based questions about images. While multiple datasets for VQA have been created since late 2014, they all have flaws in both their content and the way algorithms are evaluated on them. As a result, evaluation scores are inflated and predominantly determined by answering easier questions, making it difficult to compare different methods...
A game space is designed and placed by considering the beats (events or quests) that induce the behaviors of a player. The beats of an FPS game can be divided into three types. The placement of the pieces of a space can be evaluated by analyzing the visual cognition information obtained by the player for each kind of beat. In this paper, players were divided into two groups (Novices...
A natural image usually conveys rich semantic content and can be viewed from different angles. Existing image description methods are largely restricted by small sets of biased visual paragraph annotations, and fail to cover rich underlying semantics. In this paper, we investigate a semi-supervised paragraph generative framework that is able to synthesize diverse and semantically coherent paragraph...
Problem-based learning that uses simulations and virtual reality tools is one of the approaches integrated into medical education to prepare medical students for both bedside teaching and their later clinical practice. On the other hand, the implementation of innovative didactic materials may also be useful for veterinary medicine students. Thus, veterinary topics can be introduced...
Requirements Engineering (RE) is closely tied to other development activities and is at the heart and foundation of every software development process. This makes RE the most data- and communication-intensive activity compared to other development tasks. The highly demanding communication makes task switching and interruptions inevitable in RE activities. While task switching often allows us to perform...
This paper presents a serious game designed for the evaluation and cognitive training of children suffering from profound intellectual and multiple disabilities (PIMD), also known as multihandicap. The specific characteristics of these children must be taken into account in the choice of both the game feedback and the interfaces.
In this article, we develop two visual impression models, a recognition model and a generalization model, to simulate the cognition process of the human visual system. We show how the visual impression learned with a deep neural network can be efficiently transferred to other visual recognition tasks. By reusing the hidden layers trained in an unsupervised way, we show that we can largely reduce the number...
We propose Dual Attention Networks (DANs) which jointly leverage visual and textual attention mechanisms to capture fine-grained interplay between vision and language. DANs attend to specific regions in images and words in text through multiple steps and gather essential information from both modalities. Based on this framework, we introduce two types of DANs for multimodal reasoning and matching,...
We introduce the task of Multi-Modal Machine Comprehension (M3C), which aims at answering multimodal questions given a context of text, diagrams and images. We present the Textbook Question Answering (TQA) dataset that includes 1,076 lessons and 26,260 multi-modal questions, taken from middle school science curricula. Our analysis shows that a significant portion of questions require complex parsing...