We tackle the problem of learning robotic sensorimotor control policies that can generalize to visually diverse and unseen environments. Achieving broad generalization typically requires large datasets, which are difficult to obtain for task-specific interactive processes such as reinforcement learning or learning from demonstration. However, much of the visual diversity in the world can be captured...
Unsupervised learning of visual similarities is of paramount importance to computer vision, particularly due to lacking training data for fine-grained similarities. Deep learning of similarities is often based on relationships between pairs or triplets of samples. Many of these relations are unreliable and mutually contradicting, implying inconsistencies when trained without supervision information...
Convolutional neural networks (CNNs) have shown great success in computer vision, approaching human-level performance when trained for specific tasks via application-specific loss functions. In this paper, we propose a method for augmenting and training CNNs so that their learned features are compositional. It encourages networks to form representations that disentangle objects from their surroundings...
In recent years, deep convolutional neural networks have achieved state-of-the-art performance in various computer vision tasks such as classification, detection or segmentation. Due to their outstanding performance, CNNs are increasingly used in the field of document image analysis as well. In this work, we present a CNN architecture that is trained with the recently proposed PHOC representation...
Evidence is mounting that ConvNets are the best representation learning method for recognition. In the common scenario, a ConvNet is trained on a large labeled dataset and the feed-forward units activation, at a certain layer of the network, is used as a generic representation of an input image. Recent studies have shown this form of representation to be astoundingly effective for a wide range of...
Logging food and calorie intake has been shown to facilitate weight management. Unfortunately, current food logging methods are time-consuming and cumbersome, which limits their effectiveness. To address this limitation, we present an automated computer vision system for logging food and calorie intake using images. We focus on the "restaurant" scenario, which is often a challenging aspect...
Near-duplicate image discovery is the task of detecting all clusters of images which duplicate at a significant region. Previous work generally takes a divide-and-conquer approach composed of two steps: generating cluster seeds using min-hashing, and growing the seeds by searching the entire image space with the seeds as queries. Since the computational complexity of the seed growing step is generally...
In [1] Lienhart and Maydt introduced the calculation of Haar wavelets rotated by 45°. They showed that the extended set of possible wavelets improves the object recognition method of Viola et al. [2]. In this paper, we introduce a novel integral image structure which holds the information of a standard and a 45° rotated integral image. We use this image structure to improve a simple and efficient...
The adoption of NAO humanoid robots in the RoboCup Standard Platform League (SPL) brought a new set of computer vision challenges to the league. This paper presents a new color indexing mode and a study of the impact that reducing the color spectrum to be processed has on classification, segmentation and object detection in a NAO robot playing in the SPL. The experiments...
We present the Delft Assessment Instrument for Strabismus in Young children (DAISY), a device designed to measure angles of strabismus in young children quickly and accurately. DAISY allows for unrestrained head movements by means of a triple-camera vision system that simultaneously estimates the head rotation and the eye pose. The device combines two different methods to record bilateral eye position:...
Date fruits are small fruits that are abundant and popular in the Middle East, and have a growing international presence. There are many different types of dates, each with different features. Sorting of dates is a key process in the date industry, and can be a tedious job. In this paper, we present a method for automatic classification of date fruits based on computer vision and pattern recognition...
The visual vocabulary, which is the key component of the Bag-of-Words (BoW) model, plays an important role in representing visual content, in terms of both effectiveness and efficiency. Although various construction methods have been proposed in previous work, less effort has been devoted to discovering the key factors that impact the performance of a visual vocabulary, especially in the case of building large scale...
Intuitive and easily interpretable performance measures, repeatability and matching performance, for local feature detectors and descriptors were introduced by Mikolajczyk et al. [10, 9]. They, however, measured performance in a wide baseline setting that does not correspond to the visual object categorisation problem which is a popular application of the detectors and descriptors. The limitation...
The visual attention of the human visual system has a significant impact on the performance of video and image processing, encoding and decoding, quality assessment, machine vision, and so on. Visual attention is classified into two types: task-driven and data-driven. Understanding the difference between these two types of visual attention is essential for developing computational models. And so far the influential factors for...
We review some recent techniques for 3D tracking and occlusion handling for computer vision-based augmented reality. We discuss what their limits for real applications are, and why object recognition techniques are certainly the key to further improvements.