Recognizing 3D objects in the presence of clutter and occlusion is a challenging task. This paper presents a 3D free form object recognition system based on a novel local surface feature descriptor. For a randomly selected feature point, a local reference frame (LRF) is defined by calculating the eigenvectors of the covariance matrix of a local surface, and a feature descriptor called rotational projection...
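The LRF construction mentioned in this abstract (eigenvectors of the covariance matrix of a local surface) can be illustrated with a minimal sketch. The abstract is truncated, so the neighborhood definition, the axis ordering, and the sign handling below are assumptions for illustration, not the paper's procedure; `local_reference_frame` and its `radius` parameter are hypothetical names.

```python
import numpy as np

def local_reference_frame(points, center, radius):
    """Estimate a local reference frame at `center` (assumed construction).

    Returns a 3x3 array whose rows are orthonormal axes, ordered from the
    direction of largest spread to the direction of smallest spread.
    """
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)

    # Assumption: the "local surface" is every point within `radius` of center.
    distances = np.linalg.norm(points - center, axis=1)
    neighbors = points[distances <= radius]
    if len(neighbors) < 3:
        raise ValueError("need at least 3 neighbors to define a frame")

    # Covariance (scatter) matrix of the neighbors about the feature point.
    offsets = neighbors - center
    covariance = offsets.T @ offsets / len(neighbors)

    # eigh returns eigenvalues in ascending order with eigenvectors as columns.
    _, eigenvectors = np.linalg.eigh(covariance)
    axes = eigenvectors[:, ::-1].T.copy()

    # Flip the last axis if needed so the frame is right-handed.
    if np.linalg.det(axes) < 0:
        axes[2] *= -1.0
    return axes

# Usage: a flat patch that is wider in x than in y.
xs = np.linspace(-2.0, 2.0, 9)
ys = np.linspace(-1.0, 1.0, 5)
patch = np.array([(x, y, 0.0) for x in xs for y in ys])
frame = local_reference_frame(patch, center=(0.0, 0.0, 0.0), radius=10.0)
```

For the flat patch in the usage lines, the third axis lines up with the surface normal (the z direction) up to sign. Eigenvector signs are inherently ambiguous, and this sketch does not attempt to resolve that ambiguity.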
Many computer vision tasks such as large-scale image retrieval and nearest-neighbor classification perform similarity searches using Approximate Nearest Neighbor (ANN) indexes. These applications rely on the quality of ANN retrieval for success. Popular indexing methods for ANN queries include forests of kd-trees (KDT) and hierarchical k-means (HKM). The dominance of these two methods has led to implementations...
Event recognition has been an important topic in computer vision research due to its many applications. However, most of the work has focused on videos taken from a fixed camera, known environments and basic events. Here, we focus on classification of unconstrained web videos into much higher-level activities. We follow the approach of constructing fixed length feature vectors from local feature...
In this paper we propose the use of nonuniformly resized image patch exemplars for solving low level vision problems like denoising and super-resolution. While patch-based methods have been shown to be successful for several such applications, these methods have so far assumed uniform sizes for image patches. In this paper we address this restriction. We use an integral image representation for efficient...
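The integral image representation named in this abstract can be sketched in a few lines. The abstract is truncated before it says how the representation is used, so the code below only shows the general technique (constant-time sums over rectangles of any size); the function names `integral_image` and `box_sum` are hypothetical.

```python
def integral_image(image):
    """Build a summed-area table with one extra row and column of zeros.

    table[y][x] holds the sum of image[0:y][0:x].
    """
    height = len(image)
    width = len(image[0]) if height else 0
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table

def box_sum(table, top, left, bottom, right):
    """Sum of image rows top..bottom-1 and columns left..right-1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])

# Usage: sums over patches of different sizes cost four lookups each.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
whole = box_sum(table, 0, 0, 3, 3)         # entire 3x3 image
lower_right = box_sum(table, 1, 1, 3, 3)   # 2x2 patch
column_strip = box_sum(table, 0, 1, 2, 2)  # 2x1 patch
```

Because each rectangle sum needs only four table lookups regardless of its size, a table like this is a natural fit when patch sizes vary.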
We address the problem of automatic face detection and tracking in uncontrolled scenarios using a pan-tilt-zoom (PTZ) network camera, which could prove most helpful in forensic applications. The detected faces are associated with the corresponding people and trajectories. The dynamic nature of real-world scenarios and real-time restrictions complicate our task. Different from previous work which use...
We propose a segmentation algorithm for the purposes of large-scale flower species recognition. Our approach is based on identifying potential object regions at the time of detection. We then apply a Laplacian-based segmentation, which is guided by these initially detected regions. More specifically, we show that 1) recognizing parts of the potential object helps the segmentation and makes it more...
To recognize faces in video, face appearances have been widely modeled as piece-wise local linear models which linearly approximate the smooth yet non-linear low dimensional face appearance manifolds. The choice of representations of the local models is crucial. Most of the existing methods learn each local model individually, meaning that they only anticipate variations within each class. In this...
Over the years, a large number of methods have been proposed to analyze human pose and motion information from images, videos, and recently from depth data. Most methods, however, have been evaluated on datasets that were too specific to each application, limited to a particular modality, and more importantly, captured under unknown conditions. To address these issues, we introduce the Berkeley Multimodal...
Recognizing continuous action composition in human behavior is an important and yet challenging problem. In this paper we tackle the task by developing both reliable image features and classification algorithms. For image features, we introduce the Embedded Optical Flow (EOF) feature based on embedding optical flow using Locality-constrained Linear Coding with weighted average pooling. The EOF feature...
This paper presents a framework for N-view triangulation of scene points, which improves processing time and final reprojection error with respect to standard methods, such as linear triangulation. The framework introduces an angular error-based cost function, which is robust to outliers and inexpensive to compute, and designed such that simple adaptive gradient descent can be applied for convergence...
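To make the idea of an angular error concrete, here is a minimal sketch of one plausible formulation: the angle between the ray observed by a camera and the ray from that camera's center to a candidate scene point. The truncated abstract does not give the paper's actual cost function, so this form, the squared-sum aggregation, and the names `angular_error` and `total_angular_cost` are all assumptions.

```python
import math

def angular_error(camera_center, observed_ray, point):
    """Angle in radians between the observed ray and the ray to `point`."""
    to_point = [p - c for p, c in zip(point, camera_center)]
    dot = sum(a * b for a, b in zip(to_point, observed_ray))
    norm = (math.sqrt(sum(a * a for a in to_point))
            * math.sqrt(sum(b * b for b in observed_ray)))
    if norm == 0.0:
        raise ValueError("point coincides with camera center or ray is zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.acos(cosine)

def total_angular_cost(views, point):
    """Sum of squared angular errors over (camera_center, observed_ray) pairs."""
    return sum(angular_error(center, ray, point) ** 2 for center, ray in views)

# Usage: two cameras whose observed rays both pass through (0, 0, 5).
views = [
    ((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    ((5.0, 0.0, 5.0), (-1.0, 0.0, 0.0)),
]
cost_at_intersection = total_angular_cost(views, (0.0, 0.0, 5.0))
cost_elsewhere = total_angular_cost(views, (1.0, 1.0, 4.0))
```

A cost of this shape is zero where all rays agree and grows as the candidate point moves away, which is the kind of objective a gradient descent scheme could minimize.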
The increasing popularity of video consumption from mobile devices requires an effective video coding strategy. To cope with diverse communication networks, video services often need to maintain sustainable quality when the available bandwidth is limited. One strategy for visually optimised video adaptation is to implement region-of-interest (ROI) based scalability, whereby important...
This paper presents a fully automatic system which exploits the dynamics of 3D videos and is capable of recognizing six basic facial expressions. Local video-patches of variable lengths are extracted from different locations of the training videos and represented as points on the Grassmannian manifold. An efficient spectral clustering based algorithm is used to separately cluster points for each...
The Anti-Nuclear Antibody (ANA) clinical pathology test is commonly used to identify the existence of various diseases. A hallmark method for identifying the presence of ANAs is the Indirect Immunofluorescence method on Human Epithelial (HEp-2) cells, due to its high sensitivity and the large range of antigens that can be detected. However, the method suffers from numerous shortcomings, such as being...
We propose a new action and gesture recognition method based on spatio-temporal covariance descriptors and a weighted Riemannian locality preserving projection approach that takes into account the curved space formed by the descriptors. The weighted projection is then exploited during boosting to create a final multiclass classification algorithm that employs the most useful spatio-temporal regions...
A recent trend in computer vision is to represent images through covariance matrices, which can be treated as points on a special class of Riemannian manifolds. A popular way of analysing such manifolds is to embed them in Euclidean spaces, a process which can be interpreted as warping the feature space. Embedding manifolds is not without problems, as the manifold structure may not be accurately preserved...
Automatic evaluation of human facial attractiveness is a challenging problem that has received relatively little attention from the computer vision community. Previous work in this area has posed attractiveness as a classification problem. However, for applications that require fine-grained relationships between objects, learning to rank has been shown to be superior to the direct interpretation...
Inventories of traffic signs are acquired from street-level images in a semi-automated fashion, employing object detection and classification techniques. This is a challenging task, as signs are captured from different viewpoints and under various weather conditions. Furthermore, many similar signs exist, only differing in minor details, and moreover, sign-like objects occur frequently. Consequently,...
Real-world scenes involve many objects that interact with each other in complex semantic patterns. For example, a bar scene can be naturally described as having a variable number of chairs of similar size, close to each other and aligned horizontally. This high-level interpretation of a scene relies on semantically meaningful entities and is most generally described using relational representations...