Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.
In computer vision, video-based approaches have been widely explored for the early classification and the prediction of actions or activities. However, it remains unclear whether this modality (as compared to 3D kinematics) can still be reliable for the prediction of human intentions, defined as the overarching goal embedded in an action sequence. Since the same action can be performed with different...
We propose a principled approach for the learning of causal conditions from actions and activities taking place in the physical environment through visual input. Causal conditions are the preconditions that must exist before a certain effect can ensue. We propose to consider diachronic and synchronic causal conditions separately for the learning of causal knowledge. Diachronic condition captures the...
Motion analysis is often restricted to a laboratory setup with multiple cameras and force sensors, which requires expensive equipment and knowledgeable operators. It therefore lacks simplicity and flexibility. We propose an algorithm combining monocular 3D pose estimation with physics-based modeling to introduce a statistical framework for fast and robust 3D motion analysis from 2D video data. We...
Following the recent progress in image classification and captioning using deep learning, we develop a novel natural language person retrieval system based on an attention mechanism. More specifically, given the description of a person, the goal is to localize the person in an image. To this end, we first construct a benchmark dataset for natural language person retrieval. To do so, we generate bounding...
We propose AcFR, an active face recognition system that employs a convolutional neural network and acts consistently with human behaviors in common face recognition scenarios. AcFR comprises two main components: a recognition module and a controller module. The recognition module uses a pre-trained VGG-Face net to extract facial image features along with a nearest neighbor identity recognition algorithm...
Recent work in computer graphics has explored the synthesis of indoor spaces with furniture, accessories, and other layout items. In this work, we bridge the gap between the physical and virtual worlds: Given an input image of an interior or exterior space, and a general user specification of the desired furnishings and layout constraints, our method automatically furnishes the scene with a realistic...
In the physical world, cause and effect are inseparable: ambient conditions trigger humans to perform actions, thereby driving status changes of objects. In video, these actions and statuses may be hidden due to ambiguity, occlusion, or because they are otherwise unobservable, but humans nevertheless perceive them. In this paper, we extend the Causal And-Or Graph (C-AOG) to a sequential model representing...
The production of sports highlight packages summarizing a game’s most exciting moments is an essential task for broadcast media. Yet, it requires labor-intensive video editing. We propose a novel approach for auto-curating sports highlights, and use it to create a real-world system for the editorial aid of golf highlight reels. Our method fuses information from the players’ reactions (action recognition...
The surge in video production calls for new browsing frameworks. Chiefly in sports, TV companies have years of recorded match archives to exploit, and sports fans are looking for replays, summaries, or collections of events. In this work, we design a new multi-resolution motion feature for video abstraction. This descriptor is based on optical flow singularities tracked along the video. We use these singlets...
Estimating action quality, the process of assigning a "score" to the execution of an action, is crucial in areas such as sports and health care. Unlike action recognition, which has millions of examples to learn from, the action quality datasets that are currently available are small, typically comprising only a few hundred samples. This work presents three frameworks for evaluating Olympic...
A convolutional neural network (CNN) has been designed to interpret player actions in ice hockey video. The hourglass network is employed as the base to generate player pose estimation and layers are added to this network to produce action recognition. As such, the unified architecture is referred to as action recognition hourglass network, or ARHN. ARHN has three components. The first component is...
Due to recent advances in technology, the recording and analysis of video data have become an increasingly common component of athlete training programmes. Today it is easy and affordable to set up a fixed camera and record athletes in a wide range of sports, such as diving, gymnastics, golf, and tennis. However, the manual analysis of the obtained footage is a time-consuming task which...
Human pose analysis is known to be an effective means of evaluating athletes' performance. Marker-less 3D human pose estimation is one of the most practical methods to acquire human pose, but it lacks the accuracy required for precise performance analysis in sports. In this paper, we propose a human pose estimation algorithm that utilizes multiple types of random forests to enhance...
Analyzing joint movements of an athlete helps to improve the pose of the athlete. Human pose estimation (HPE) algorithms regress the locations of parts such as wrists, ankles and knees. In this paper, we propose a network that combines global and local information for HPE using a 2D image. Unlike previous works that have used global or local information separately, we use the combined information...