The CNN-RNN design pattern is increasingly applied to a variety of image annotation tasks, including multi-label classification and captioning. Existing models use the weakly semantic CNN hidden layer or its transform as the image embedding that provides the interface between the CNN and RNN. This leaves the RNN overstretched with two jobs: predicting the visual concepts and modelling their...
Over the last decade, machine learning algorithms have proven to be useful tools for exploring neural representations of percepts and concepts in the brain. An important but often neglected next step is to relate neural representations to human behavior. Here, we introduce a novel approach to definitively linking neural representations to structural properties of stimuli as well as human behavior...
In this paper, we propose to learn object representations with inference from temporal correlation in videos to achieve effective visual tracking. Unlike traditional methods, which perform feature learning either at the image level or based on intuitive temporal constraints, we employ a recurrent network with Long Short-Term Memory (LSTM) units to directly learn temporally correlated representations of...
This study addresses neural decoding of code-modulated visual evoked potentials (c-VEPs). The c-VEP was recently developed and applied to brain-computer interfaces (BCIs). A c-VEP BCI exhibits faster communication speed than existing VEP-based BCIs. In a c-VEP BCI, canonical correlation analysis (CCA), which maximizes the correlation between an averaged signal and single-trial signals, is often used for...
Recent advances in compact feature representation and feature learning have provided an efficient framework for several visual analysis tasks, such as object recognition. However, when multiple cameras with overlapping fields of view are employed, other visual analysis tasks such as depth estimation can be supported and object recognition accuracy can be improved. In this paper the problem...
The videos captured by different visual sensors in a visual sensor network are first compressed using the block-based compressive sensing algorithm. All the videos are encoded independently at different subrates and transmitted to a host workstation for reconstruction. Then, the proposed multi-phase joint reconstruction framework is applied to improve the reconstruction of lower-subrate videos. In this...
Given the scarce resources in visual sensor networks, video coding solutions must address the resulting constraints, namely limited energy, memory, and bandwidth. The distributed video coding (DVC) paradigm is able to address these limitations by shifting most of the complexity to the decoder, typically a central location with a significant amount of resources. In a DVC context, the...
The state of the art in digital watermarking of visual data is briefly reviewed. A communication perspective is adopted to identify the main issues in digital watermarking and to present the most common solutions adopted by the research community. We first consider the various approaches to watermark embedding and hiding. The communication channel is then taken into account, and the main research...
The application of multivariate approaches to neuroimaging data analysis is providing cognitive neuroscientists with a new perspective on the neural substrate of conceptual knowledge. In this paper we show how the combined use of decoding models and of representational similarity analysis (RSA) can enhance our ability to investigate the inter-categorical distinctions as well as the intra-categorical...
We present a survey of recent research on multiview image compression and transmission techniques developed for Wireless Multimedia Sensor Networks (WMSNs). We classify them into two categories with respect to the coding methods adopted: (i) in-network processing with joint coding schemes, and (ii) distributed source coding schemes. The survey also includes a comprehensive evaluation of the...
Distributed video coding (DVC) is a suitable coding scheme for coping with the limited memory capacity and power constraints of video sensor nodes in a Wireless Multimedia Sensor Network (WMSN). To further improve the coding performance, we propose a DVC scheme that takes into account the human visual system (HVS), which seldom senses any changes below the just noticeable distortion (JND) threshold around a pixel due to...
Brain activity patterns as well as anatomical structure differ from person to person. Although anatomical normalization techniques have been used for functional magnetic resonance imaging studies, there are no standard methods to deal with individual differences in activity patterns. In this study, we propose a method to convert brain activity patterns from one person to another by predicting the...
This paper presents a novel algorithm for computing the relative motion between images from compressed linear measurements. We propose a geometry-based correlation model that describes the relative motion between images by the translational motion of visual features. We focus on the problem of estimating the motion field from a reference image and a highly compressed image given by means of random projections,...
In a partial report paradigm, subjects observe a cluttered field during a brief presentation and, after some time (typically ranging from 100 ms to a second), are asked to report a subset of the presented elements. A vast buffer of information is transiently available to be broadcast which, if not retrieved in time, fades rapidly without reaching consciousness. An interesting feature of this experiment...
In low-bit-rate video communications, packet loss may easily cause whole-frame loss, which in turn leads to an annoying frame-drop phenomenon. In this paper, a novel error concealment algorithm, called disparity-based frame difference projection (DFDP), is specifically developed for stereoscopic video to recover the lost frames at the decoder. The proposed DFDP contains three key components: 1)...