In this paper, we present a dynamic programming approach to voice transformation (VT). The goal of VT is to modify the speech of a source speaker such that it is perceived as if spoken by a target speaker. The speech model used in this work is based on the MELP (Mixed Excitation Linear Prediction) speech coding algorithm. The designed system obtains speaker-specific codebooks of line spectral frequencies...
The Direction of Arrival estimation algorithm ESPRIT is capable of estimating the angles of arrival of N narrowband source signals using M > N anechoic sensor mixtures from a uniform linear array (ULA). Using a similar parameter estimation step, the DUET Blind Source Separation algorithm can demix N > 2 speech signals using M = 2 anechoic mixtures of the signals. We introduce here the DUET-ESPRIT...
We develop a novel algorithmic representation of textures using the statistics of multiple spectral components of images. Histograms of filter responses are viewed as elements of a non-parametric statistical manifold, and local texture patterns are compared using a geodesic metric derived from Riemannian information geometry. Several region-based image segmentation experiments are carried out to test...
In this paper, we propose an improvement of the classical image multi-thresholding methods. The goal is to achieve the precise determination of homogeneous zones in numerical images by pixel classification. The thresholds and the modes are obtained by minimization of a new energy of gravitational clustering initialized with the significant peaks of a cumulative histogram. Then, the best modes and...
This study is concerned with reconstruction of complex-valued components comprising a linear mixing model of unknown real-valued sources, given a set of their complex-valued mixtures. We adopt previous results in the area of Blind Source Separation (BSS) of linear mixtures, based on sparse representation by means of a multiscale framework such as wavelet packets, and exploit the properties of sparse...
A multidimensional filtering technique is proposed using fuzzy logic ideas and based on local statistics. The local multivariate histogram of the multichannel image is computed using the Parzen estimation technique. The maximum and minimum of the histogram are used as parameters to describe the signal shape. The method is organized around a fuzzy control system. Experimental results, in true color...
We propose a novel method for applying real-valued independent component analysis (ICA) to complex-valued multi-sensor data that comprises instantaneous linear mixtures of co-channel non-Gaussian independent signals. We examine, for non-ideal practical conditions, the optimality of cumulant-based ICA and blind signal separation approaches in terms of the standard criterion for statistical independence...
A primary concern of future high performance systems is the way data movement is managed; the sheer scale of data to be processed directly affects the achievable performance these systems can attain. However, the increasingly complex but inherently symbiotic relationships between upcoming scientific applications and high-performance architectures necessitate increasingly informative and flexible tools...
In this paper we propose a spatio-temporal digital video hashing scheme. In the proposed scheme, the digital video is treated as a three-dimensional signal. An Edge Orientation Histogram (EOH) is computed for each frame in the digital video, then a temporal discrete cosine transform (DCT) of the magnitude of each EOH bin is computed. A subset of the resulting DCT coefficients is then pseudo-randomly...
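The two steps this abstract states explicitly (a per-frame edge orientation histogram, then a DCT along the time axis for each bin) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the choice of 8 bins, the gradient-magnitude weighting, and the unnormalized DCT-II are all assumptions, and the pseudo-random coefficient selection is left out because the abstract is truncated before describing it.

```python
import numpy as np

def edge_orientation_histogram(frame, n_bins=8):
    """Histogram of edge orientations for one grayscale frame.

    Assumption: each pixel votes with its gradient magnitude, and
    orientations are folded into [0, pi).
    """
    gy, gx = np.gradient(frame.astype(float))
    magnitude = np.hypot(gx, gy)
    orientation = np.mod(np.arctan2(gy, gx), np.pi)
    hist, _ = np.histogram(orientation, bins=n_bins,
                           range=(0.0, np.pi), weights=magnitude)
    return hist

def dct_ii(x):
    """Unnormalized DCT-II along axis 0, written with NumPy only."""
    n = x.shape[0]
    k = np.arange(n)[:, None]   # frequency index
    t = np.arange(n)[None, :]   # time index
    basis = np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    return basis @ x

def temporal_dct_of_eoh(video, n_bins=8):
    """video has shape (frames, height, width).

    Returns an array of shape (frames, n_bins): for each EOH bin,
    the DCT coefficients of that bin's magnitude over time.
    """
    eoh = np.stack([edge_orientation_histogram(f, n_bins) for f in video])
    return dct_ii(eoh)

# Usage on a synthetic clip (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
clip = rng.random((16, 32, 32))
coefficients = temporal_dct_of_eoh(clip)
```

For a clip whose frames are all identical, every coefficient other than the first (DC) row is zero up to rounding, which is a quick sanity check that the transform runs along the time axis.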
Advances in RGB-D cameras that are capable of tracking human body movement in the form of a skeleton have contributed to growing interest in skeleton-based human action recognition. However, the tracking performance of a single camera is prone to occlusion and is view dependent. In this study, we use fused skeletal data obtained from two views for recognizing human action. We perform a substitutive...
Automated classification of music signals is an active area of research. It can act as the fundamental step for various applications like archival, indexing and retrieval of music data. In this work, a simple methodology is presented to categorize music signals based on their genre. In order to capture the characteristics of the music signals of different genres, the signal is first decomposed to extract...
Appearance-based human re-identification is challenging due to different camera characteristics, varying lighting conditions, pose variations across camera views, etc. Recent studies have revealed that color information plays a critical role in performance. However, two problems remain unclear: (1) how do different color descriptors perform under the same scene in the re-identification problem? and (2)...
The TRECVID report of 2010 [14] evaluated video shot boundary detectors as achieving "excellent performance on [hard] cuts and gradual transitions." Unfortunately, while re-evaluating the state of the art in shot boundary detection, we found that these detectors need to be improved because the characteristics of consumer-produced videos have changed significantly since the introduction of mobile...
We present a framework and algorithm to analyze first person RGBD videos captured from the robot while physically interacting with humans. Specifically, we explore reactions and interactions of persons facing a mobile robot from a robot centric view. This new perspective offers social awareness to the robots, enabling interesting applications. As far as we know, there is no public 3D dataset for this...
In this paper, we present a challenging dataset for the purpose of segmentation and change detection in photographic images of mountain habitats. We also propose a baseline algorithm for habitat segmentation to allow for performance comparison. The dataset consists of high resolution image pairs of historic and repeat photographs of mountain habitats acquired in the Canadian Rocky Mountains for ecological...
As mobile devices become more widespread, consumers increasingly demand greater image quality and longer battery lives. However, improving photo capabilities comes at the expense of battery life. In this paper, a power-constrained contrast enhancement method formulated as an optimization problem is proposed. The proposed locality condition is used to improve the conventional histogram equalization,...
This paper introduces a highly efficient local spatiotemporal descriptor, called gradient boundary histograms (GBH). The proposed GBH descriptor is built on simple spatio-temporal gradients, which are fast to compute. We demonstrate that it can better represent local structure and motion than other gradient-based descriptors, and significantly outperforms them on large realistic datasets. A comprehensive...
Holistic approaches to face recognition are not robust to illumination, scale, occlusion and age variations. Various studies indicate that the performance of holistic approaches degrades as the face database size increases. In this paper, we propose a user-specific, landmark-geometry-based approach that assigns weights to different geometrical distances according to their role in the face recognition process...
Despite the tremendous importance and availability of large video collections, support for video retrieval is still rather limited and is mostly tailored to very concrete use cases and collections. In image retrieval, for instance, standard keyword search on the basis of manual annotations and content-based image retrieval, based on the similarity to query image(s), are well established search paradigms,...
The Non-Perspective Three Point Pose (NP3P) problem is a generalization of the classical three point pose problem to the case of multi-camera systems that have no common projection center. In this paper, we develop a simple, minimal algebraic solution to the NP3P problem where the projection rays of three points may have arbitrary but known directions. This problem is known to have a maximum of eight solutions...