2017 IEEE International Conference on Multimedia and Expo (ICME)

book

IEEE

chapter

Select-additive learning: Improving generalization in multimodal sentiment analysis

Haohan Wang, Aaksha Meghawat, Louis-Philippe Morency, Eric P. Xing

2017 IEEE International Conference on Multimedia and Expo (ICME) > 949 - 954

Multimodal sentiment analysis is drawing an increasing amount of attention these days. It enables mining of opinions in video reviews which are now available aplenty on online platforms. However, multimodal sentiment analysis has only a few high-quality data sets annotated for training machine learning algorithms. These limited resources restrict the generalizability of models, where, for example,...

chapter

Estimating political leanings from mass media via graph-signal restoration with negative edges

Benjamin Renoust, Gene Cheung, Shin'Ichi Satoh

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1009 - 1014

Politicians in the same political party often share the same views on social issues and legislative agendas. By mining patterns in TV news co-appearances and Twitter followers, in this paper we estimate political leanings (left / right) of unknown individuals, and detect outlier politicians who have views different from their colleagues in the same party, from a graph signal processing (GSP) perspective...

chapter

DLML: Deep linear mappings learning for face super-resolution with nonlocal-patch

Tao Lu, Lanlan Pan, Junjun Jiang, Yanduo Zhang, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1362 - 1367

Learning-based face super-resolution approaches rely on a representative dictionary as a self-similarity prior from training samples to estimate the relationship between the low-resolution (LR) and high-resolution (HR) image patches. The most popular approaches learn a mapping function directly from LR patches to HR ones but neglect the multi-layered nature of the image degradation process (resolution down-sampling)...

chapter

Deep convolutional recurrent neural network with attention mechanism for robust speech emotion recognition

Che-Wei Huang, Shrikanth Shri Narayanan

2017 IEEE International Conference on Multimedia and Expo (ICME) > 583 - 588

We present a deep convolutional recurrent neural network for speech emotion recognition based on the log-Mel filterbank energies, where the convolutional layers are responsible for the discriminative feature learning. Based on the hypothesis that a better understanding of the internal configuration within an utterance would help reduce misclassification, we further propose a convolutional attention...

chapter

A unified model for improving depth accuracy in kinect sensor

Li Peng, Yanduo Zhang, Huabing Zhou, Deng Chen, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 223 - 228

The Microsoft Kinect sensor has been widely used in many applications, but it suffers from the drawback of low depth accuracy. In this paper, we present a unified depth modification model to improve the Kinect depth accuracy by registering depth and color images in an iterative manner. Specifically, in each iteration, we first establish a coarse correspondence based on the feature descriptor of the...

chapter

Decoder-side HEVC quality enhancement with scalable convolutional neural network

Ren Yang, Mai Xu, Zulin Wang

2017 IEEE International Conference on Multimedia and Expo (ICME) > 817 - 822

The latest High Efficiency Video Coding (HEVC) standard has been increasingly used to generate video streams over the Internet. However, the decoded HEVC video streams may incur severe quality degradation, especially at low bit-rates. Thus, it is necessary to enhance the visual quality of HEVC videos at the decoder side. To this end, we propose in this paper a Decoder-side Scalable Convolutional Neural Network (DS-CNN)...

chapter

Deep learning for multimodal-based video interestingness prediction

Yuesong Shen, Claire-Hélène Demarty, Ngoc Q. K. Duong

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1003 - 1008

Predicting the interestingness of media content remains an important but challenging research subject. The difficulty comes first from the fact that, besides being a high-level semantic concept, interestingness is highly subjective and its global definition has not yet been agreed upon. This paper presents the use of up-to-date deep learning techniques for solving the task. We perform experiments with both...

chapter

Impact of video resolution changes on QoE for adaptive video streaming

Avsar Asan, Werner Robitza, Is-haka Mkwawa, Lingfen Sun, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 499 - 504

HTTP adaptive streaming (HAS) has become the de facto standard for video streaming to ensure continuous multimedia service delivery under irregularly changing network conditions. Many studies have already investigated the detrimental impact of various playback characteristics on the Quality of Experience of end users, such as initial loading, stalling or quality variations. However, dedicated studies tackling...

chapter

Knowledge-guided recurrent neural network learning for task-oriented action prediction

Liang Lin, Lili Huang, Tianshui Chen, Yukang Gan, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 625 - 630

This paper aims at task-oriented action prediction, i.e., predicting a sequence of actions towards accomplishing a specific task under a certain scene, which is a new problem in computer vision research. The main challenges lie in how to model task-specific knowledge and integrate it in the learning procedure. In this work, we propose to train a recurrent long short-term memory (LSTM) network for handling...

chapter

The OUC-vision large-scale underwater image database

Muwei Jian, Qiang Qi, Junyu Dong, Yinlong Yin, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1297 - 1302

In this paper, a large-scale underwater image database for underwater salient object detection or saliency detection is presented in detail. This database is called the OUC-VISION underwater image database, which contains 4400 underwater images of 220 individual objects. Each object is captured with four pose variations (the frontal-, the opposite-, the left-, and the right-views of each underwater...

chapter

A deep convolutional neural network approach for complexity reduction on intra-mode HEVC

Tianyi Li, Mai Xu, Xin Deng

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1255 - 1260

The High Efficiency Video Coding (HEVC) standard significantly saves coding bit-rate over the preceding H.264 standard, but at the expense of extremely high encoding complexity. In fact, the coding tree unit (CTU) partition consumes a large proportion of HEVC encoding complexity, due to the brute-force search for rate-distortion optimization (RDO). Therefore, we propose in this paper a complexity...

chapter

Efficient low rank matrix approximation via orthogonality pursuit and ℓ² regularization

Siyuan Li, Jiawan Zhang, Xiaojie Guo

2017 IEEE International Conference on Multimedia and Expo (ICME) > 871 - 876

Low rank matrix approximation, in the presence of missing data and outliers, has previously shown its significance as a theoretic foundation in a wide spectrum of tabulated information processing applications. To fit low rank models, minimizing the nuclear norm of matrices is a popular scheme, the computational load of which, however, is heavy. While bilinear factorization can largely mitigate the...

chapter

Edge-preserving disparity map estimation from stereo videos for bokeh synthesis

Wei-Lun Lan, Shih-Hsuan Yao, Shang-Hong Lai

2017 IEEE International Conference on Multimedia and Expo (ICME) > 745 - 750

We present a new method of estimating disparity maps from stereo videos for bokeh effect synthesis. In this work, we develop an improved method, based on total variation regularization with the robust L¹ norm in the data fidelity term (TV-L¹) [4], to estimate an edge-preserving disparity map without stereo rectification. The proposed algorithm improves the TV-L¹ approach by incorporating structure edge detection,...

chapter

End-to-end learning for dimensional emotion recognition from physiological signals

Gil Keren, Tobias Kirschstein, Erik Marchi, Fabien Ringeval, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 985 - 990

Dimensional emotion recognition from physiological signals is a highly challenging task. Common methods rely on hand-crafted features that do not yet provide the performance necessary for real-life application. In this work, we exploit a series of convolutional and recurrent neural networks to predict affect from physiological signals, such as electrocardiogram and electrodermal activity, directly...

chapter

An accurate deep convolutional neural networks model for no-reference image quality assessment

Bahetiyaer Bare, Ke Li, Bo Yan

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1356 - 1361

The goal of image quality assessment (IQA) is to use computational models to measure the consistency between image quality and subjective evaluations. In recent years, convolutional neural networks (CNNs) have been widely used in the image processing community and have achieved performance leaps over non-CNN-based methods. In this work, we describe an accurate deep CNN model for no-reference IQA. Taking...

chapter

Learning a multi-center convolutional network for unconstrained face alignment

Zhiwen Shao, Hengliang Zhu, Yangyang Hao, Min Wang, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 109 - 114

In this paper, we propose a novel multi-center convolutional neural network for unconstrained face alignment. To utilize structural correlations among different facial landmarks, we determine several clusters based on their spatial position. We pre-train our network to learn generic feature representations. We further fine-tune the pre-trained model to emphasize locating a certain cluster of landmarks...

chapter

Optimized video coding for omnidirectional videos

Minhao Tang, Yu Zhang, Jiangtao Wen, Shiqiang Yang

2017 IEEE International Conference on Multimedia and Expo (ICME) > 799 - 804

The ever-widening application of virtual reality requires ultra-high-resolution omnidirectional videos (OVs) to be transmitted over the wired and wireless Internet at low cost (i.e., bitrate). Various solutions have been proposed to intelligently reduce the bitrate, e.g., adapting the spatial resolution of the video for different directions of the panorama with regard to the current direction that the...

chapter

Compressing deep neural networks for efficient visual inference

Shiming Ge, Zhao Luo, Shengwei Zhao, Xin Jin, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 667 - 672

The deployments of deep neural network models on mobile or embedded devices have been challenged due to two main reasons: 1) the large model size for storage, and 2) the large memory bandwidth for inference. To address these issues, this paper develops a deep neural network compression framework to reduce the resource usage for efficient visual inference. By reviewing the trained deep model, we propose...

chapter

Robust and real-time deep tracking via multi-scale domain adaptation

Xinyu Wang, Hanxi Li, Yi Li, Fumin Shen, more

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1338 - 1343

Visual tracking is a fundamental problem in computer vision. Recently, some deep-learning-based tracking algorithms have achieved record-breaking performance. However, due to the high complexity of deep learning, most deep trackers suffer from low tracking speed, and thus are impractical in many real-world applications. Some new deep trackers with smaller network structures achieve high efficiency...
