2017 IEEE International Conference on Computer Vision (ICCV)

IEEE

Learning Action Recognition Model from Depth and Skeleton Videos

Hossein Rahmani, Mohammed Bennamoun

2017 IEEE International Conference on Computer Vision (ICCV) > 5833 - 5842

Depth sensors open up possibilities of dealing with the human action recognition problem by providing 3D human skeleton data and depth images of the scene. Analysis of human actions based on 3D skeleton data has become popular recently, due to its robustness and view-invariant representation. However, the skeleton alone is insufficient to distinguish actions which involve human-object interactions...

Deep Clustering via Joint Convolutional Autoencoder Embedding and Relative Entropy Minimization

Kamran Ghasedi Dizaji, Amirhossein Herandi, Cheng Deng, Weidong Cai, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 5747 - 5756

In this paper, we propose a new clustering model, called DEeP Embedded Regularized ClusTering (DEPICT), which efficiently maps data into a discriminative embedding subspace and precisely predicts cluster assignments. DEPICT generally consists of a multinomial logistic regression function stacked on top of a multi-layer convolutional autoencoder. We define a clustering objective function using relative...

Localizing Moments in Video with Natural Language

Lisa Anne Hendricks, Oliver Wang, Eli Shechtman, Josef Sivic, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 5804 - 5813

We consider retrieving a specific temporal segment, or moment, from a video given a natural language text description. Methods designed to retrieve whole video clips with natural language determine what occurs in a video but not when. To address this issue, we propose the Moment Context Network (MCN) which effectively localizes natural language queries in videos by integrating local and global video...

3DCNN-DQN-RNN: A Deep Reinforcement Learning Framework for Semantic Parsing of Large-Scale 3D Point Clouds

Fangyu Liu, Shuaipeng Li, Liqiang Zhang, Chenghu Zhou, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 5679 - 5688

Semantic parsing of large-scale 3D point clouds is an important research topic in the computer vision and remote sensing fields. Most existing approaches utilize hand-crafted features for each modality independently and combine them in a heuristic manner. They often fail to adequately consider the consistency and complementary information among features, which makes it difficult for them to capture high-level...

Image2song: Song Retrieval via Bridging Image Content and Lyric Words

Xuelong Li, Di Hu, Xiaoqiang Lu

2017 IEEE International Conference on Computer Vision (ICCV) > 5650 - 5659

Images are usually taken to express some kind of emotion or purpose, such as love or celebrating Christmas. A better way is to combine the image with a relevant song to amplify the expression, which has recently drawn much attention on social networks. Hence, automatic song selection is desirable. In this paper, we propose to retrieve semantically relevant songs just by...

End-to-End Face Detection and Cast Grouping in Movies Using Erdös-Rényi Clustering

SouYoung Jin, Hang Su, Chris Stauffer, Erik Learned-Miller

2017 IEEE International Conference on Computer Vision (ICCV) > 5286 - 5295

We present an end-to-end system for detecting and clustering faces by identity in full-length movies. Unlike works that start with a predefined set of detected faces, we consider the end-to-end problem of detection and clustering together. We make three separate contributions. First, we combine a state-of-the-art face detector with a generic tracker to extract high quality face tracklets. We then...

Refractive Structure-from-Motion Through a Flat Refractive Interface

Francois Chadebecq, Francisco Vasconcelos, George Dwyer, Rene Lacher, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 5325 - 5333

Recovering 3D scene geometry from underwater images involves the Refractive Structure-from-Motion (RSfM) problem, where the image distortions caused by light refraction at the interface between different propagation media invalidate the single-viewpoint assumption. Direct use of the pinhole camera model in RSfM leads to inaccurate camera pose estimation and consequently drift. RSfM methods have...

DCTM: Discrete-Continuous Transformation Matching for Semantic Flow

Seungryong Kim, Dongbo Min, Stephen Lin, Kwanghoon Sohn

2017 IEEE International Conference on Computer Vision (ICCV) > 4539 - 4548

Techniques for dense semantic correspondence have provided limited ability to deal with the geometric variations that commonly exist between semantically similar images. While variations due to scale and rotation have been examined, there is a lack of practical solutions for more complex deformations such as affine transformations because of the tremendous size of the associated solution space. To...

Moving Object Detection in Time-Lapse or Motion Trigger Image Sequences Using Low-Rank and Invariant Sparse Decomposition

Moein Shakeri, Hong Zhang

2017 IEEE International Conference on Computer Vision (ICCV) > 5133 - 5141

Low-rank and sparse representation based methods have attracted wide attention in background subtraction and moving object detection, where moving objects in the scene are modeled as pixel-wise sparse outliers. Since in real scenarios moving objects are also structurally sparse, recently researchers have attempted to extract moving objects using structured sparse outliers. Although existing methods...

Dense and Low-Rank Gaussian CRFs Using Deep Embeddings

Siddhartha Chandra, Nicolas Usunier, Iasonas Kokkinos

2017 IEEE International Conference on Computer Vision (ICCV) > 5113 - 5122

In this work we introduce a structured prediction model that endows the Deep Gaussian Conditional Random Field (G-CRF) with a densely connected graph structure. We keep memory and computational complexity under control by expressing the pairwise interactions as inner products of low-dimensional, learnable embeddings. The G-CRF system matrix is therefore low-rank, allowing us to solve the resulting...

Makeup-Go: Blind Reversion of Portrait Edit

Ying-Cong Chen, Xiaoyong Shen, Jiaya Jia

2017 IEEE International Conference on Computer Vision (ICCV) > 4511 - 4519

Virtual face beautification (or makeup) has become a common operation in camera and image-processing apps, which is actually deceiving. In this paper, we propose the task of restoring a portrait image from this process. As the first attempt along this line, we assume unknown global operations on human faces and aim to tackle the two issues of skin smoothing and skin color change. These two tasks, intriguingly,...

Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval

Yuming Shen, Li Liu, Ling Shao, Jingkuan Song

2017 IEEE International Conference on Computer Vision (ICCV) > 4117 - 4126

Cross-modal hashing is usually regarded as an effective technique for large-scale textual-visual cross retrieval, where data from different modalities are mapped into a shared Hamming space for matching. Most of the traditional textual-visual binary encoding methods only consider holistic image representations and fail to model descriptive sentences. This renders existing methods inappropriate to...

Cross-Modal Deep Variational Hashing

Venice Erin Liong, Jiwen Lu, Yap-Peng Tan, Jie Zhou

2017 IEEE International Conference on Computer Vision (ICCV) > 4097 - 4105

In this paper, we propose a cross-modal deep variational hashing (CMDVH) method for cross-modality multimedia retrieval. Unlike existing cross-modal hashing methods which learn a single pair of projections to map each example as a binary vector, we design a couple of deep neural networks to learn non-linear transformations from image-text input pairs, so that unified binary codes can be obtained. We...

Online Video Deblurring via Dynamic Temporal Blending Network

Tae Hyun Kim, Kyoung Mu Lee, Bernhard Schölkopf, Michael Hirsch

2017 IEEE International Conference on Computer Vision (ICCV) > 4058 - 4067

State-of-the-art video deblurring methods are capable of removing non-uniform blur caused by unwanted camera shake and/or object motion in dynamic scenes. However, most existing methods are based on batch processing and thus need access to all recorded frames, rendering them computationally demanding and time-consuming and thus limiting their practical use. In contrast, we propose an online (sequential)...

Learning Dense Facial Correspondences in Unconstrained Images

Ronald Yu, Shunsuke Saito, Haoxiang Li, Duygu Ceylan, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 4733 - 4742

We present a minimalist but effective neural network that computes dense facial correspondences in highly unconstrained RGB images. Our network learns a per-pixel flow and a matchability mask between 2D input photographs of a person and the projection of a textured 3D face model. To train such a network, we generate a massive dataset of synthetic faces with dense labels using renderings of a morphable...

Efficient Algorithms for Moral Lineage Tracing

Markus Rempfler, Jan-Hendrik Lange, Florian Jug, Corinna Blasse, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 4705 - 4714

Lineage tracing, the joint segmentation and tracking of living cells as they move and divide in a sequence of light microscopy images, is a challenging task. Jug et al. [21] have proposed a mathematical abstraction of this task, the moral lineage tracing problem (MLTP), whose feasible solutions define both a segmentation of every image and a lineage forest of cells. Their branch-and-cut algorithm,...

FLaME: Fast Lightweight Mesh Estimation Using Variational Smoothing on Delaunay Graphs

W. Nicholas Greene, Nicholas Roy

2017 IEEE International Conference on Computer Vision (ICCV) > 4696 - 4704

We propose a lightweight method for dense online monocular depth estimation capable of reconstructing 3D meshes on computationally constrained platforms. Our main contribution is to pose the reconstruction problem as a non-local variational optimization over a time-varying Delaunay graph of the scene geometry, which allows for an efficient, keyframeless approach to depth estimation. The graph can...

Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources

Adrian Bulat, Georgios Tzimiropoulos

2017 IEEE International Conference on Computer Vision (ICCV) > 3726 - 3734

Our goal is to design architectures that retain the groundbreaking performance of CNNs for landmark localization and at the same time are lightweight, compact and suitable for applications with limited computational resources. To this end, we make the following contributions: (a) we are the first to study the effect of neural network binarization on localization tasks, namely human pose estimation...

Fast Face-Swap Using Convolutional Neural Networks

Iryna Korshunova, Wenzhe Shi, Joni Dambre, Lucas Theis

2017 IEEE International Conference on Computer Vision (ICCV) > 3697 - 3705

We consider the problem of face swapping in images, where an input identity is transformed into a target identity while preserving pose, facial expression and lighting. To perform this mapping, we use convolutional neural networks trained to capture the appearance of the target identity from an unstructured collection of his/her photographs. This approach is enabled by framing the face swapping problem...
