Search results

chapter

Perceptual Analysis of Perspective Projection for Viewport Rendering in 360° Images

Falah Jabar, Joao Ascenso, Maria Paula Queluz

2017 IEEE International Symposium on Multimedia (ISM) > 53 - 60

Omnidirectional, also referred to as 360°, visual content provides an immersive experience since it allows users to view a visual scene from different directions. The overall content typically covers a full sphere, and omnidirectional videos or images are processed to obtain a projection on a 2D plane of a fraction of the sphere (aka viewport), which is shown to the user. Therefore, users can look...

chapter

Summarization of News Videos Considering the Consistency of Auditory and Visual Contents

Ichiro Ide, Ye Zhang, Ryunosuke Tanishige, Keisuke Doman, et al.

2017 IEEE International Symposium on Multimedia (ISM) > 193 - 199

Since news videos are valuable sources of multimedia information on real-world events, there is a demand for viewing them efficiently. However, there is a problem that summarization methods based on auditory contents do not take into account the visual contents. In the case of news videos, due to its presentation style where audio contents and visual contents do not necessarily come from the same...

chapter

Multi-level Multiple Attentions for Contextual Multimodal Sentiment Analysis

Soujanya Poria, Erik Cambria, Devamanyu Hazarika, Navonil Mazumder, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 1033 - 1038

Multimodal sentiment analysis involves identifying sentiment in videos and is a developing field of research. Unlike current works, which model utterances individually, we propose a recurrent model that is able to capture contextual information among utterances. In this paper, we also introduce attention-based networks for improving both context learning and dynamic feature fusion. Our model shows...

chapter

Assessing the Intuitiveness of Qualitative Contribution Relationships in Goal Models: An Exploratory Experiment

Sotirios Liaskos, Alexis Ronse, Mehrnaz Zhian

2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) > 466 - 471

[Background]: Developing conceptual models is an integral part of the requirements engineering (RE) process. Goal models are requirements engineering conceptual models that allow diagrammatic representation of stakeholder intentions and how they affect each other. A specific goal modeling language construct, the contribution of goal satisfaction of one goal to another, plays a central role in supporting...

chapter

A Lightweight Discriminative Tracker Based on Classification and Similarity

Weinong Wang, Fei Wang, Yu Guo

2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA) > 1 - 8

Convolutional neural network (CNN) based trackers have achieved significant performances in tracking recently. Most existing CNN-based trackers regard tracking as a classification or similarity searching problem. The two methods have their respective superiorities and limitations because of different supervised objectives. In this paper, we propose a multi-task CNN for visual tracking, not only fully...

chapter

A Novel Quality Metric Using Spatiotemporal Correlational Data of Human Eye Maneuver

Pallab Kanti Podder, Manoranjan Paul, Manzur Murshed

2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA) > 1 - 8

The popularly used subjective estimator, mean opinion score (MOS), is often biased by the testing environment, viewers' mode, domain expertise, and many other factors that may actively influence the actual assessment. We therefore devise a no-reference subjective quality assessment metric by exploiting the nature of human eye browsing on videos. The participants' eye-tracker recorded gaze-data indicate...

chapter

MarioQA: Answering Questions by Watching Gameplay Videos

Jonghwan Mun, Paul Hongsuck Seo, Ilchae Jung, Bohyung Han

2017 IEEE International Conference on Computer Vision (ICCV) > 2886 - 2894

We present a framework to analyze various aspects of models for video question answering (VideoQA) using customizable synthetic datasets, which are constructed automatically from gameplay videos. Our work is motivated by the fact that existing models are often tested only on datasets that require excessively high-level reasoning or mostly contain instances accessible through single frame inferences...

chapter

Discovery-based praxes: Channelling the user-interface of an industrial-strength programming environment to formally teach programming

Prasun Dewan

2017 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) > 341 - 342

Many instructors consider a programming environment replete with a large variety of interactive objects and commands a liability for teaching introductory programming. In fact, such an environment is an important pedagogical tool, whose instructional capabilities can be amplified by discovery-based programming praxes — predefined programs with embedded comments that instruct students to browse and...

chapter

A New Pooling Strategy Based on Local Feature Distribution: A Case Study for Human Action Classification

Raquel Almeida, Zenilton Kleber Goncalves Do Patrocinio, Silvio Jamil F. Guimaraes

2017 30th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI) > 149 - 154

Mid-level representations are used to map sets of local features into one global representation for a given media descriptor. In visual pattern recognition tasks, Bag-of-Words (BoW) is one popular strategy, among many methods available in the literature, due mainly to its simplicity in concept and implementation. Despite the overall good results achieved by BoW in many tasks, the method is unstable in...

chapter

Transitive Invariance for Self-Supervised Visual Representation Learning

Xiaolong Wang, Kaiming He, Abhinav Gupta

2017 IEEE International Conference on Computer Vision (ICCV) > 1338 - 1347

Learning visual representations with self-supervised learning has become popular in computer vision. The idea is to design auxiliary tasks where labels are free to obtain. Most of these tasks end up providing data to learn specific kinds of invariance useful for recognition. In this paper, we propose to exploit different self-supervised approaches to learn representations invariant to (i) inter-instance...

chapter

Hide-and-Seek: Forcing a Network to be Meticulous for Weakly-Supervised Object and Action Localization

Krishna Kumar Singh, Yong Jae Lee

2017 IEEE International Conference on Computer Vision (ICCV) > 3544 - 3553

We propose ‘Hide-and-Seek’, a weakly-supervised framework that aims to improve object localization in images and action localization in videos. Most existing weakly-supervised methods localize only the most discriminative parts of an object rather than all relevant parts, which leads to suboptimal performance. Our key idea is to hide patches in a training image randomly, forcing the network to seek...

chapter

Spatio-Temporal Person Retrieval via Natural Language Queries

Masataka Yamaguchi, Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada

2017 IEEE International Conference on Computer Vision (ICCV) > 1462 - 1471

In this paper, we address the problem of spatio-temporal person retrieval from videos using a natural language query, in which we output a tube (i.e., a sequence of bounding boxes) which encloses the person described by the query. For this problem, we introduce a novel dataset consisting of videos containing people annotated with bounding boxes for each second and with five natural language descriptions...

chapter

Generative Modeling of Audible Shapes for Object Perception

Zhoutong Zhang, Jiajun Wu, Qiujia Li, Zhengjia Huang, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 1260 - 1269

Humans infer rich knowledge of objects from both auditory and visual cues. Building a machine of such competency, however, is very challenging, due to the great difficulty in capturing large-scale, clean data of objects with both their appearance and the sound they make. In this paper, we present a novel, open-source pipeline that generates audiovisual data, purely from 3D object shapes and their...

chapter

Teacher training for the creation of accessible courses at atutor

Fernando Martinez Rodriguez

2017 Twelfth Latin American Conference on Learning Technologies (LACLO) > 1 - 8

This article shares the obtained results in the teacher training phase for the analysis, development and publication of accessible courses making use of the learning management platform: ATutor. This training phase is carried out within the framework of a research project: “Didactic and technological development in teaching scenarios for the training of teachers who welcome diversity: factors for...

chapter

TALL: Temporal Activity Localization via Language Query

Jiyang Gao, Chen Sun, Zhenheng Yang, Ram Nevatia

2017 IEEE International Conference on Computer Vision (ICCV) > 5277 - 5285

This paper focuses on temporal localization of actions in untrimmed videos. Existing methods typically train classifiers for a pre-defined list of actions and apply them in a sliding window fashion. However, activities in the wild consist of a wide combination of actors, actions and objects; it is difficult to design a proper activity list that meets users’ needs. We propose to localize activities...

chapter

FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras

Shanghang Zhang, Guanhang Wu, Joao P. Costeira, Jose M. F. Moura

2017 IEEE International Conference on Computer Vision (ICCV) > 3687 - 3696

In this paper, we develop deep spatio-temporal neural networks to sequentially count vehicles from low quality videos captured by city cameras (citycams). Citycam videos have low resolution, low frame rate, high occlusion and large perspective, making most existing methods lose their efficacy. To overcome limitations of existing methods and incorporate the temporal information of traffic video, we...

chapter

Joint Detection and Recounting of Abnormal Events by Learning Deep Generic Knowledge

Ryota Hinami, Tao Mei, Shin'ichi Satoh

2017 IEEE International Conference on Computer Vision (ICCV) > 3639 - 3647

This paper addresses the problem of joint detection and recounting of abnormal events in videos. Recounting of abnormal events, i.e., explaining why they are judged to be abnormal, is an unexplored but critical task in video surveillance, because it helps human observers quickly judge if they are false alarms or not. To describe the events in the human-understandable form for event recounting, learning...

chapter

Mutual Enhancement for Detection of Multiple Logos in Sports Videos

Yuan Liao, Xiaoqing Lu, Chengcui Zhang, Yongtao Wang, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 4856 - 4865

Detecting logo frequency and duration in sports videos provides sponsors an effective way to evaluate their advertising efforts. However, general-purpose object detection methods cannot address all the challenges in sports videos. In this paper, we propose a mutual-enhanced approach that can improve the detection of a logo through the information obtained from other simultaneously occurring logos...

chapter

Anticipating Daily Intention Using On-wrist Motion Triggered Sensing

Tz-Ying Wu, Ting-An Chien, Cheng-Sheng Chan, Chan-Wei Hu, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 48 - 56

Anticipating human intention by observing one’s actions has many applications. For instance, picking up a cellphone, then a charger (actions) implies that one wants to charge the cellphone (intention) (Fig. 1). By anticipating the intention, an intelligent system can guide the user to the closest power outlet. We propose an on-wrist motion triggered sensing system for anticipating daily intentions,...

chapter

Beyond Standard Benchmarks: Parameterizing Performance Evaluation in Visual Object Tracking

Luka Cehovin Zajc, Alan Lukezic, Ales Leonardis, Matej Kristan

2017 IEEE International Conference on Computer Vision (ICCV) > 3343 - 3351

Object-to-camera motion produces a variety of apparent motion patterns that significantly affect performance of short-term visual trackers. Despite being crucial for designing robust trackers, their influence is poorly explored in standard benchmarks due to weakly defined, biased and overlapping attribute annotations. In this paper we propose to go beyond pre-recorded benchmarks with post-hoc annotations...

INFONA - science communication portal
