This paper deals with automatic estimation of the horizon in videos from fixed surveillance cameras. The proposed algorithm is fully automatic in the sense that no user input is needed per-camera and it works with various scenes (indoor, outdoor, traffic, pedestrian, livestock, etc.). The algorithm detects moving objects, tracks them in time, assesses some of their geometric properties related to...
Activity recognition from first-person (ego-centric) videos has recently gained attention due to the increasing ubiquity of wearable cameras. There has been a surge of efforts adapting existing feature descriptors and designing new descriptors for first-person videos. An effective activity recognition system requires the selection and use of complementary features and appropriate kernels for each...
In this paper, we propose a method to detect abnormal events in human group activities. Our main contribution is a strategy that learns from very few videos by isolating the action and using supervised learning. First, we subtract the background of each frame by modeling each pixel as a mixture of Gaussians (MoG), so as to concentrate the higher-order learning only on the foreground....
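The per-pixel Gaussian background model mentioned in this abstract can be sketched as follows. This is a deliberately simplified single-Gaussian variant of the MoG idea (a real MoG keeps several weighted Gaussians per pixel), not the paper's implementation; the scene and parameters are invented for illustration:

```python
import numpy as np

def update_background(frame, mean, var, alpha=0.05, k=2.5):
    """One step of a per-pixel Gaussian background model.

    Each pixel keeps a running mean and variance; pixels more than
    k standard deviations from the mean are flagged as foreground.
    alpha is the learning rate of the running statistics.
    """
    diff = frame - mean
    foreground = np.abs(diff) > k * np.sqrt(var)
    # Update the model only where the pixel matched the background,
    # so moving objects do not corrupt the background statistics.
    bg = ~foreground
    mean[bg] += alpha * diff[bg]
    var[bg] += alpha * (diff[bg] ** 2 - var[bg])
    return foreground

# Toy example: a static 4x4 scene with one "moving object" pixel.
rng = np.random.default_rng(0)
mean = np.full((4, 4), 100.0)
var = np.full((4, 4), 4.0)
frame = mean + rng.normal(0.0, 1.0, (4, 4))  # sensor noise
frame[1, 2] = 200.0  # bright object enters the scene
mask = update_background(frame, mean, var)   # True only at (1, 2)
```

The foreground mask produced this way is what the subsequent learning stage would be restricted to.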
This article gives a more robust justification for the use of the Bhattacharyya distance in the first of the three perception modules of our Automated Sport Analysis System, named ACE. This first module performs the temporal segmentation of television video broadcasts, aiming to break the video down into shots delimited by scene boundaries. An evaluation of seven other histogram...
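The histogram distance this abstract refers to can be illustrated with a minimal sketch (not ACE's actual implementation; the histograms and threshold idea below are hypothetical). It uses the common Hellinger-style form of the Bhattacharyya distance, as popularized by OpenCV's `HISTCMP_BHATTACHARYYA`:

```python
import numpy as np

def bhattacharyya_distance(p, q):
    """Distance between two histograms p and q.

    Uses d = sqrt(1 - BC), where BC = sum_i sqrt(p_i * q_i) is the
    Bhattacharyya coefficient over normalized histograms. Identical
    histograms give d = 0; fully disjoint ones give d = 1.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()  # normalize so each histogram sums to 1
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))
    return float(np.sqrt(max(0.0, 1.0 - bc)))  # clamp rounding error

# A shot boundary can be flagged when the distance between the
# colour histograms of consecutive frames exceeds a threshold.
hist_a = [10, 20, 30, 40]
hist_b = [40, 30, 20, 10]
d_same = bhattacharyya_distance(hist_a, hist_a)  # 0.0
d_diff = bhattacharyya_distance(hist_a, hist_b)  # strictly between 0 and 1
```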
In this paper, we propose a method that automatically estimates surgical phases in a specified workflow with a multi-camera system. More specifically, our goal is to output an appropriate phase label for each one-second segment of input video captured by multiple cameras in an operating room. The fundamental idea behind our work lies in constructing a hidden Markov model based on motion features, which are...
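The phase-labelling idea (a hidden Markov model whose hidden states are surgical phases and whose per-second observations are quantized motion features) can be illustrated with a generic Viterbi decoder. The states, probabilities, and observation symbols below are hypothetical, not taken from the paper:

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden-state sequence for a discrete HMM.

    obs: list of observation indices; pi: initial state probs;
    A[i, j]: transition prob from state i to j; B[i, o]: emission
    prob of symbol o in state i. Runs in log space so that long
    observation sequences (e.g. one symbol per second) do not underflow.
    """
    T, N = len(obs), len(pi)
    logA, logB = np.log(A), np.log(B)
    delta = np.log(pi) + logB[:, obs[0]]
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + logA        # scores[i, j]: come from i, go to j
        back[t] = scores.argmax(axis=0)       # best predecessor for each state
        delta = scores.max(axis=0) + logB[:, obs[t]]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):             # backtrack to recover the sequence
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical 2-phase workflow: phase 0 mostly emits motion symbol 0,
# phase 1 mostly emits symbol 1; phases rarely switch.
pi = np.array([0.9, 0.1])
A = np.array([[0.95, 0.05], [0.05, 0.95]])
B = np.array([[0.8, 0.2], [0.2, 0.8]])
labels = viterbi([0, 0, 1, 1, 1], pi, A, B)  # -> [0, 0, 1, 1, 1]
```

The sticky transition matrix encodes the prior that surgical phases persist for many seconds, which smooths over noisy per-second observations.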
Wearable cameras used to record daily life are attracting researchers' attention, and a large number of ego-related applications have been developed in recent years. Hand detection is one of the key steps for tasks such as gesture recognition, action recognition, and understanding hand-based interaction in egocentric videos, since humans are accustomed to interacting with objects using their hands...
Human action recognition is a way of retrieving videos that emerged from Content-Based Video Retrieval (CBVR). It is currently a growing area of research in the field of computer vision. Human action recognition has gained popularity because of its wide applicability to the automatic retrieval of videos of a particular action using visual features. The most common stages of action recognition include: object...
In this paper, we propose a robust multi-object tracking algorithm for acquiring object-oriented multi-angle videos, which takes advantage of two different tracking techniques: subdivided color-histogram-based tracking and labeling-based tracking. Object models based on color histograms are further subdivided to differentiate similar color regions. The other tracking technique utilizes...
The performance of computer vision algorithms for pedestrian detection decreases in night vision systems. Since the intensity of luminance at night is extremely low compared to daytime, even humans cannot recognize objects properly at night. The most recent generation of night vision systems uses near-infrared cameras to address the problem, but these still have poor visibility and application...
Video enhancement is important for video security surveillance systems because videos and images of outdoor street scenes get degraded when captured in severe climatic conditions such as fog, dust storms, and mist. Drivers often turn on their vehicles' headlights, and streetlights are frequently lit, which decreases visibility and leads to colour-shift problems. Due to improper visibility...
Video-based person re-identification has become a hot topic in research on computer vision and intelligent surveillance; it is more robust to variations in a person's appearance than single-shot methods and exploits space-time information. However, most existing spatiotemporal features were proposed for action recognition, and they mainly focus on the exact spatial...
First-person videos (FPVs) captured by wearable cameras often contain heavy distortions, including motion blur, rolling shutter artifacts and rotation. Existing image and video quality estimators are inefficient for this type of video. We develop a method specifically to measure the distortions present in FPVs, without using a high quality reference video. Our local visual information (LVI) algorithm...
This study proposes a vision-based infant monitoring system. The input videos are obtained from one PT (pan-tilt) IP (Internet Protocol) camera set at a high point in a room. The proposed system consists of three major stages: tracked-object (infant) initialization, infant tracking, and PT IP camera control. First, a codebook background-subtraction algorithm is applied to extract the infant on the...
We describe a methodology that is designed to match key point and region-based features in real-world images, acquired from long-running security cameras with no control over the environment. We detect frame duplication and images from static scenes that have no activity to prevent processing saliently identical images, and describe a novel blur-sensitive feature detection method, a combinatorial...
The great advancement and popularity of multimedia technology facilitate uploading and accessing thousands of videos through the internet in seconds. This demands efficient video data management techniques for ease of storage and browsing, which in turn require an efficient method of annotation. Automatic cut detection is the elementary and essential step of any automatic annotation...
Due to the popularity of multimedia technology and the digital world, thousands of videos are accessed through the internet in seconds. Most of the videos available on the internet for public access are non-edited. Efficient searching and storage require an efficient method of annotation. Automatic cut detection is the first stage of any automatic annotation process. In this paper we address the...
The preview frame of an online video plays a critical role for a user to quickly decide whether to watch the video. However, the preview frames of most online videos such as those shared on social media platforms are either selected heuristically (e.g., the first or middle frame of a video) or manually by users or experienced editors. In this paper, we investigate the challenging automatic preview...
Of increasing interest to the computer vision community is to recognize egocentric actions. Conceptually, an egocentric action is largely identifiable by the states of hands and objects. For example, “drinking soda” is essentially composed of two sequential states where one first “takes up the soda can”, then “drinks from the soda can”. While existing algorithms commonly use manually defined states...
In this paper, we propose a novel kernel function for recognizing objects in RGB-D egocentric videos. In order to effectively exploit the varied object appearance in a video, we take a set-based recognition approach and represent the target object using the set of frames contained in the video. Our kernel function measures the similarity of two sets by the minimum distance between the sparse affine...
We present a framework and algorithm to analyze first-person RGBD videos captured by a robot while it physically interacts with humans. Specifically, we explore the reactions and interactions of persons facing a mobile robot from a robot-centric view. This new perspective offers social awareness to robots, enabling interesting applications. As far as we know, there is no public 3D dataset for this...