Underwater image formation is degraded by several factors, making the ocean a challenging environment for image processing. This paper aims to improve the visual servoing capability of an autonomous underwater vehicle by using pre-processing algorithms to improve image quality. We used artificial fiducial markers to feed the visual controller. Therefore, three different methods for...
This paper presents a performance comparison of several state-of-the-art visual feature extraction algorithms applied in a poorly structured environment such as that found on the planet Mars. So far, no systematic evaluation of feature extraction algorithms in extraterrestrial environments is available. The algorithms in this paper are evaluated using the Devon Island dataset which is said to have one...
Since news videos are valuable sources of multimedia information on real-world events, there is a demand for viewing them efficiently. However, summarization methods based on auditory content do not take the visual content into account. In the case of news videos, due to their presentation style where audio contents and visual contents do not necessarily come from the same...
Multi-object model-free tracking is challenging because the tracker is not aware of the objects' type (it is not allowed to use object detectors) and needs to distinguish each object from the background as well as from other similar objects. Most existing methods keep updating their appearance model individually for each target, and their performance is hampered by sudden appearance change and/or occlusion. We propose...
Computer Vision and Machine Learning are key to developing autonomous robots. While taking part in an IEEE Open Challenge, in which the robots need to recognize a miniature cow, we saw a solution in these areas. The main contribution of this paper is the algorithm implemented to identify and follow a known object, the miniature cow. We are constructing an application based on Image Processing...
This paper introduces a novel approach for modeling visual relations between pairs of objects. We call a relation a triplet of the form (subject; predicate; object), where the predicate is typically a preposition (e.g. 'under', 'in front of') or a verb ('hold', 'ride') that links a pair of objects (subject; object). Learning such relations is challenging as the objects have different spatial configurations...
In this paper, we investigate a weakly-supervised object detection framework. Most existing frameworks focus on using static images to learn object detectors. However, these detectors often fail to generalize to videos because of the existing domain shift. Therefore, we investigate learning these detectors directly from boring videos of daily activities. Instead of using bounding boxes, we explore...
We propose “Areas of Attention”, a novel attention-based model for automatic image captioning. Our approach models the dependencies between image regions, caption words, and the state of an RNN language model, using three pairwise interactions. In contrast to previous attention-based approaches that associate image regions only to the RNN state, our method allows a direct association between caption...
Research efforts have been devoted to extraction and visualization of vortices in an unsteady (turbulent) flow. Characterizing the behaviors of the flow, vortices are identifiable as regions using a vortex detector known as the lambda2-criterion. Isosurface visualization renders vortex regions based on a chosen isovalue. However, it is highly challenging to choose one isovalue suitable for visualizing...
Nowadays, visual features play a key role, as they can provide a concise representation of visual data that is efficient for multiple tasks, notably content retrieval and object recognition. In parallel, visual sensors have been improving, targeting richer acquisitions of the light in a visual scene. In this context, the so-called light field cameras, which have recently emerged, are able to go beyond...
Fruit flies are of huge biological and economic importance for agriculture in many countries around the world, especially Brazil. Brazil is the third largest fruit producer in the world, with 44 million tons in 2016. The direct and indirect losses caused by fruit flies can exceed USD 2 billion, making these pests one of the biggest problems in world agriculture. In Brazil, it is estimated...
Visual sensors are widely used in automatic parking systems, so this paper proposes an algorithm for the visual detection of available parking slots. The proposed system consists of two stages: parking slot recognition and slot occupancy classification. The parking slot recognition stage generates parking slots using the corner features of parking slot markings. The slot occupancy classification stage...
This paper addresses the problem of joint detection and recounting of abnormal events in videos. Recounting of abnormal events, i.e., explaining why they are judged to be abnormal, is an unexplored but critical task in video surveillance, because it helps human observers quickly judge whether or not they are false alarms. To describe the events in a human-understandable form for event recounting, learning...
This paper presents a framework for localization or grounding of phrases in images using a large collection of linguistic and visual cues. We model the appearance, size, and position of entity bounding boxes, adjectives that contain attribute information, and spatial relationships between pairs of entities connected by verbs or prepositions. Special attention is given to relationships between people...
A major impediment in rapidly deploying object detection models for instance detection is the lack of large annotated datasets. For example, finding a large labeled dataset containing instances in a particular kitchen is unlikely. Each new environment with new instances requires expensive data collection and annotation. In this paper, we propose a simple approach to generate large annotated instance...
Weakly supervised object localization, where only image labels instead of bounding boxes are available during training, remains challenging. Object proposal is an effective component in localization, but is often computationally expensive and incapable of joint optimization with some of the remaining modules. To the best of our knowledge, this paper is the first to integrate weakly supervised...
Instead of computing HOG features on cells or blocks, extracting HOG features at corner points is proposed for a multiple-object visual tracking system in which single or multiple moving objects can be classified. Background subtraction and corner feature extraction are applied to track and classify the moving objects. First, moving objects are detected in the form of regions from background...
Region-based Convolutional Neural Network (CNN) detectors such as Faster R-CNN or R-FCN have already shown promising results for object detection by combining a region proposal subnetwork and a classification subnetwork. Although R-FCN has achieved higher detection speed while maintaining detection performance, the global structure information is ignored by the position-sensitive...
Imagery texts are usually organized as a hierarchy of several visual elements, i.e. characters, words, text lines and text blocks. Among these elements, the character is the most basic one for various languages such as Western, Chinese, Japanese, mathematical expressions, etc. It is natural and convenient to construct a common text detection engine based on character detectors. However, training character...
Current motion estimation methods are usually based on computing optical flow from individual images or short sequences. As these methods do not require extracting visual descriptions at points of interest, correspondence can be deduced only from the position of such points. In this paper, we propose an alternative motion estimation method solely using a binary visual descriptor. By tuning...