Kotenseki is a collection of classical and ancient Japanese literature. It comprises image books that tell Japanese stories through comic drawings of different subjects, such as humans, nature, and animals. To preserve them effectively for posterity, a search system is important. We propose an efficient CBIR system to help users easily access the information and have an enjoyable...
Affordance learning, in general, aims to identify the purpose, use, and ways of interacting with an object, based on information gained from observing the object. Most existing affordance learning approaches assume the target object has been cropped individually from images. However, the object often cannot be easily separated from others due to occlusion or noise. Actually, two or more neighboring...
For remote sensing image understanding, target detection is one of the most important tasks. In this paper, we propose an object detection method that combines region proposal detection via an active contour model with detection based on one-class classification. The large-scale remote sensing image is split into several connected components. Then, the proposed algorithm detects the object from...
We present Deeply Supervised Object Detector (DSOD), a framework that can learn object detectors from scratch. State-of-the-art object detectors rely heavily on off-the-shelf networks pre-trained on large-scale classification datasets like ImageNet, which incurs learning bias due to the differences in both the loss functions and the category distributions between classification and detection tasks...
Subitizing (i.e., instant judgement of the number of objects) and detection of salient objects are inborn human abilities. These two tasks influence each other in the human visual system. In this paper, we delve into the complementarity of these two tasks. We propose a multi-task deep neural network with weight prediction for salient object detection, where the parameters of an adaptive weight layer are dynamically...
To meet users' needs, home service robots are increasingly working outdoors. As a result, new requirements are placed on the detection and recognition performance of home service robots. Compared with indoor environments, outdoor environments are more complex, which makes it difficult to detect objects. But extracting features with the Histogram of Oriented Gradients (HOG) method cannot work well in...
Object detection in Very High Resolution (VHR) optical remote sensing images is a challenging task because objects are usually dense and tiny. Random orientations, varied backgrounds, and unpredictable noise make traditional image processing methods perform poorly. In this paper, we propose using state-of-the-art Region-based Fully Convolutional Networks to solve object detection tasks in aerial...
With the development of unmanned aerial vehicles (UAVs) and the relevant techniques, UAVs have become common and popular for civilian applications such as remote sensing tasks, because they are cheap, flexible, and easy to set up. Car park occupancy analysis is important for authorities making decisions on the design, planning, and management of car parks. To have a quick knowledge of current...
There is a large demand in the area of video surveillance, especially in people detection, which has caused a large increase in the amount of research and resources in this field. As training images and annotations are not always available, it is important to consider the cost involved in creating the detector models. For example, for elderly people detection, the detector must take into account...
Pedestrian detection is an important topic in object detection. Compared with other object detectors, YOLOv2 achieves high accuracy and fast speed for general object detection; however, its accuracy degrades when detecting crowded pedestrians. In this paper, drawing on the skip structure of FCN, we tailor the YOLOv2 network to improve the accuracy in detecting small pedestrians which appear in groups...
We present an Automatic License Plate Recognition system designed around Convolutional Neural Networks (CNNs) and trained over synthetic plate images. We first design CNNs suitable for plate and character detection, sharing a common architecture and training procedure. Then, we generate synthetic images that account for the varying illumination and pose conditions encountered with real plate images...
This paper proposes an object detection strategy using the deep reinforcement learning method Double DQN, in which, given an image window, a deep reinforcement learning agent is trained to determine which predefined region candidates to focus its attention on. In the Double DQN framework, the first DQN is used to select an action to search the target region and the second is used to evaluate the selected action...
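The select-then-evaluate split described in the abstract above can be illustrated with the standard Double DQN target computation. This is a minimal sketch of the general technique, not the paper's implementation: the function name `double_dqn_target`, the plain-list Q-value inputs, and the default discount factor are all assumptions made for illustration.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.9,
    done: bool = False,
) -> float:
    """Return the Double DQN learning target for one transition.

    next_q_online: Q-values of the next state from the first (selecting) network.
    next_q_target: Q-values of the next state from the second (evaluating) network.
    All names and the default gamma are hypothetical, chosen for this sketch.
    """
    if done:
        # Terminal transition: no future value to bootstrap from.
        return reward
    # The first network selects the action...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...and the second network evaluates that selected action.
    return reward + gamma * next_q_target[best_action]
```

Decoupling selection from evaluation in this way is what distinguishes Double DQN from a single-network DQN, where the same Q-values are used for both the argmax and the value estimate.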
In many computer vision tasks, for example saliency prediction or semantic segmentation, the desired output is a foreground map that predicts pixels where some criterion is satisfied. Despite the inherently spatial nature of this task, commonly used learning objectives do not incorporate the spatial relationships between misclassified pixels and the underlying ground truth. The Weighted F-measure, a...
In light of the powerful learning capability of deep neural networks (DNNs), deep (convolutional) models have been built in recent years to address the task of salient object detection. Although training such deep saliency models can significantly improve the detection performance, it requires large-scale manual supervision in the form of pixel-level human annotation, which is highly labor-intensive...
We aim to tackle a novel vision task called Weakly Supervised Visual Relation Detection (WSVRD) to detect “subject-predicate-object” relations in an image with object relation groundtruths available only at the image level. This is motivated by the fact that it is extremely expensive to label the combinatorial relations between objects at the instance level. Compared to the extensively studied problem,...
Cascade is a widely used approach that rejects obvious negative samples at early stages to learn a better classifier and achieve faster inference. This paper presents the chained cascade network (CC-Net). In this CC-Net, there are many cascade stages. Preceding cascade stages are placed at shallow layers. Easy hard examples are rejected at shallow layers so that the computation for deeper or wider layers is...
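The early-rejection idea behind a cascade, as described in the abstract above, can be sketched generically as follows. This is an illustration of a plain classifier cascade under assumed names (`cascade_classify`, `Stage`), not the CC-Net architecture itself: each stage is a hypothetical scoring function paired with a threshold, and a sample stops being processed as soon as one stage rejects it.

```python
from typing import Callable, List, Tuple

# A stage is a (scoring function, rejection threshold) pair; both are
# hypothetical stand-ins for the per-stage classifiers of a real cascade.
Stage = Tuple[Callable[[float], float], float]


def cascade_classify(sample: float, stages: List[Stage]) -> Tuple[bool, int]:
    """Run a sample through the stages in order.

    Returns (accepted, stages_evaluated). A sample scoring below a stage's
    threshold is rejected immediately, so later stages are never computed.
    """
    for index, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(sample) < threshold:
            return False, index
    return True, len(stages)
```

The second element of the result shows how much computation was spent: samples rejected at the first stage never reach the later, typically more expensive stages.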
A major impediment in rapidly deploying object detection models for instance detection is the lack of large annotated datasets. For example, finding a large labeled dataset containing instances in a particular kitchen is unlikely. Each new environment with new instances requires expensive data collection and annotation. In this paper, we propose a simple approach to generate large annotated instance...
In order to realize autonomous landing of the unmanned aerial vehicle (UAV) in power patrolling, a vision-based method using Faster Regions with Convolutional Neural Network (Faster R-CNN) for UAVs is studied. In this paper, we design a landing sign that combines concentric circles and a pentagon, and propose a Faster R-CNN recognition algorithm which can be used to identify the target...
Progress in Multiple Object Tracking (MOT) has been historically limited by the size of the available datasets. We present an efficient framework to annotate trajectories and use it to produce a MOT dataset of unprecedented size. In our novel path supervision, the annotator loosely follows the object with the cursor while watching the video, providing a path annotation for each object in the sequence...
Extending state-of-the-art object detectors from image to video is challenging. The accuracy of detection suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, and rare poses. Existing work attempts to exploit temporal information at the box level, but such methods are not trained end-to-end. We present flow-guided feature aggregation, an accurate and end-to-end learning...