Affective virtual spaces are of interest for many VR applications in the areas of wellbeing, art, education, and entertainment. Creating content for virtual environments is a laborious task requiring multiple skills such as 3D modeling, texturing, animation, lighting, and programming. One way to facilitate content creation is to automate sub-processes such as the assignment of textures and materials within virtual...
Generative Adversarial Networks (GANs) have been shown to produce synthetic face images of compelling realism. In this work, we present a conditional GAN approach to generate contextually valid facial expressions in dyadic human interactions. In contrast to previous work employing conditions related to facial attributes of generated identities, we focused on dyads in an attempt to model the relationship...
Eye gaze is an important non-verbal cue for human affect analysis. Recent gaze estimation work indicated that information from the full face region can benefit performance. Pushing this idea further, we propose an appearance-based method that, in contrast to a long-standing line of work in computer vision, only takes the full face image as input. Our method encodes the face image using a convolutional...
Recent advances in video understanding are enabling incredible developments in video search, summarization, automatic captioning and human-computer interaction. Attention mechanisms are a powerful way to steer focus onto different sections of the video. Existing mechanisms are driven by prior training probabilities and require input instances of identical temporal duration. We introduce an intuitive...
Despite significant progress in the development of human action detection datasets and algorithms, no current dataset is representative of real-world aerial view scenarios. We present Okutama-Action, a new video dataset for aerial view concurrent human action detection. It consists of 43 fully-annotated, minute-long sequences with 12 action classes. Okutama-Action features many challenges missing in...
Crowd behaviour analysis is a challenging task in computer vision, mainly due to the high complexity of the interactions between groups and individuals. This task is particularly crucial given the magnitude of manual monitoring required for effective crowd management. Within this context, a key challenge is to conceive a highly generic, fine-grained, and context-independent characterisation of crowd behaviours...
In this paper, we introduce Key-Value Memory Networks to a multimodal setting and a novel key-addressing mechanism to deal with sequence-to-sequence models. The proposed model naturally decomposes the problem of video captioning into vision and language segments, dealing with them as key-value pairs. More specifically, we learn a semantic embedding (v) corresponding to each frame (k) in the video,...
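The snippet above describes frames as keys and semantic embeddings as values in a Key-Value Memory Network. The model's actual key-addressing mechanism is not detailed in the excerpt; the following is a minimal generic sketch of a key-value memory read (softmax attention over keys, weighted sum of values), with all names and toy dimensions chosen for illustration:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - x.max())
    return e / e.sum()

def key_value_read(query, keys, values):
    """Address memory slots by key similarity to the query, then return
    the attention-weighted sum of the values and the weights themselves."""
    weights = softmax(keys @ query)   # attention distribution over slots
    return weights @ values, weights

# Toy memory: 4 one-hot frame keys (k) paired with 16-d semantic values (v).
keys = np.eye(4, 8)
values = np.arange(64, dtype=float).reshape(4, 16)
query = np.zeros(8)
query[2] = 5.0                        # query strongly matches key 2

read, weights = key_value_read(query, keys, values)
```

Because the query aligns with key 2, the read vector is dominated by the corresponding value embedding; in a captioning setting the read would then condition the language decoder.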
Recently, Long Short-Term Memory (LSTM) has become a popular choice for modeling individual dynamics in single-person action recognition. However, existing RNN models focus only on capturing the temporal dynamics of person-person interactions, either by naively combining the activity dynamics of individuals or by modeling the interactions as a whole. This neglects the inter-related dynamics of how person-person interactions...
Convolutional neural networks (CNNs) have drawn increasing interest in visual tracking owing to their powerful feature extraction. Most existing CNN-based trackers treat tracking as a classification problem. However, these trackers are sensitive to similar distractors because their CNN models mainly focus on inter-class classification. To address this problem, we use self-structure information of...
Tracking-by-detection has become a popular tracking paradigm in recent years. Because detections within this framework are treated as points in the tracking process, data-association ambiguities arise, especially in crowded scenarios. To cope with this issue, we extended the multiple hypothesis tracking approach by incorporating a novel enhancing detection model that includes detection-scene...
Face detection is a classical problem in computer vision. It is still a difficult task due to many nuisances that naturally occur in the wild. In this paper, we propose a multi-scale fully convolutional network for face detection. To reduce computation, the intermediate convolutional feature maps (conv) are shared by every scale model. We up-sample and down-sample the final conv map to approximate...
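The abstract above states that the intermediate conv feature maps are computed once and shared across scale models, with the final conv map up- and down-sampled to approximate the other scales. The network itself is not shown in the excerpt; the sketch below only illustrates that sharing idea with nearest-neighbour resampling of a dummy feature map (all names and shapes are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

def resample_nearest(fmap, scale):
    """Nearest-neighbour resampling of a (H, W, C) feature map; a cheap
    stand-in for up-/down-sampling a shared conv map to another scale."""
    h, w, c = fmap.shape
    nh = max(1, int(round(h * scale)))
    nw = max(1, int(round(w * scale)))
    rows = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    # np.ix_ indexes the first two axes and keeps the channel axis intact.
    return fmap[np.ix_(rows, cols)]

shared = np.random.rand(16, 16, 8)   # final conv map, computed only once
pyramid = {s: resample_nearest(shared, s) for s in (0.5, 1.0, 2.0)}
```

The point of the design is that the expensive convolutions run once; each scale-specific detector head then reads its own resampled copy of the shared map.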
Constrained Local Models (CLMs) are a well-established family of methods for facial landmark detection. However, they have recently fallen out of favor to cascaded regression-based approaches. This is in part due to the inability of existing CLM local detectors to model the very complex individual landmark appearance that is affected by expression, illumination, facial hair, makeup, and accessories...
Traditional point-tracking algorithms such as the KLT use local 2D information aggregation for feature detection and tracking, which degrades their performance at the object boundaries that separate multiple objects. Recently, CoMaL features have been proposed to handle such cases. However, they were introduced with only a simple tracking framework in which the points are re-detected in each frame and matched...
We present a framework for robust face detection and landmark localisation of faces in the wild, which has been evaluated as part of 'the 2nd Facial Landmark Localisation Competition'. The framework has four stages: face detection, bounding box aggregation, pose estimation and landmark localisation. To achieve a high detection rate, we use two publicly available CNN-based face detectors and two proprietary...
This paper introduces our submission to the 2nd Facial Landmark Localisation Competition. We present a deep architecture to directly detect facial landmarks without using face detection as an initialization. The architecture consists of two stages, a Basic Landmark Prediction Stage and a Whole Landmark Regression Stage. At the former stage, given an input image, the basic landmarks of all faces are...
The plenoptic function, also known as the light field or the lumigraph, contains the information about the radiance of all optical rays that go through all points in space in a scene. Since no camera can capture all this information, one of the main challenges in plenoptic imaging is light field reconstruction, which consists of interpolating the ray samples captured by the cameras to create a dense...
The necessity of depth in efficient neural network learning has led to a family of designs referred to as very deep networks (e.g., GoogLeNet has 22 layers). As the depth increases even further, the need for appropriate tools to explore the space of hidden representations becomes paramount. For instance, beyond the gain in generalization, one may be interested in checking the change in class compositions...
We generalize Richardson-Lucy (RL) deblurring to 4-D light fields by replacing the convolution steps with light field rendering of motion blur. The method deals correctly with blur caused by 6-degree-of-freedom camera motion in complex 3-D scenes, without performing depth estimation. We introduce a novel regularization term that maintains parallax information in the light field while reducing noise...
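For orientation, classical 2-D Richardson-Lucy deconvolution iterates a multiplicative update of the following form (standard textbook notation, not the symbols of this paper, whose contribution is to replace the convolution steps with light-field rendering of motion blur):

```latex
\hat{I}^{(t+1)} \;=\; \hat{I}^{(t)} \odot
\left( \left[ \frac{B}{\hat{I}^{(t)} * P} \right] * \tilde{P} \right)
```

where $B$ is the observed blurred image, $\hat{I}^{(t)}$ the current estimate of the sharp image, $P$ the blur kernel (point-spread function), $\tilde{P}$ the kernel flipped in both axes, $*$ denotes convolution, and the division and $\odot$ are elementwise.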
The quantity and diversity of data in Light-Field videos makes this content valuable for many applications such as mixed and augmented reality or post-production in the movie industry. Some of these applications require a large parallax between the different views of the Light-Field, making multi-view capture a better option than plenoptic cameras. In this paper, we propose a dataset and a complete...