Fig 2 - uploaded by Fatih Porikli
Object representation by multiple covariance matrices of subregions. 


Source publication
Conference Paper
Full-text available
Mathematical formulation of certain natural phenomena exhibits group structure on topological spaces that resemble the Euclidean space only on a small enough scale, which prevents incorporation of conventional inference methods that require global vector norms. More specifically in computer vision, such underlying notions emerge in differentiable p...

Context in source publication

Context 1
... $N$ is the number of pixels and $\bar{\mu}$ is the mean vector of the corresponding features within the region $R$. Note that this is not the covariance of two image regions, but the covariance of the image features within a single region. Refer to [5] for more details. Such a descriptor provides a natural way of fusing multiple features without a weighted average. Instead of evaluating the first-order statistics of feature distributions through histograms, it embodies the second-order characteristics. The noise corrupting individual samples is largely filtered out by averaging over the multitude of pixels. The descriptor also provides spatial scale and feature shift invariance.

The covariance matrix can be computed very quickly from feature images using the integral image representation [6]. For $d$-dimensional features, after constructing $d(d+1)/2$ tensors of integral images, corresponding to each feature dimension and to the product of any two feature dimensions, the covariance matrix of any arbitrary rectangular region can be computed in $O(d^2)$ time, independent of the region size.

The space of region covariance descriptors is not a vector space. For example, it is not closed under multiplication with negative scalars. The descriptors constitute the space $\mathrm{Sym}_d^{0,+}$ of positive semi-definite matrices. By adding a small diagonal matrix (or by guaranteeing that no two features in the feature vectors are exactly identical), they can be transformed into $\mathrm{Sym}_d^{+}$, which is a Riemannian manifold, so that the Riemannian metrics (10, 11) can be applied.

A first example using the region covariance descriptor is pattern search: locating a given object of interest in an arbitrary image. To find the most similar region in the image, distances between the descriptors of the object and of candidate regions are computed. Each pixel of the image is converted to a 9-dimensional feature vector $F(x, y) = [x, y, R, G, B, I_x, I_y, I_{xx}, I_{yy}]$, where $R, G, B$ are the RGB color values, and $I_x, I_y$ and $I_{xx}, I_{yy}$ are first- and second-order spatial derivatives. An object is represented by a collection of partial region covariance matrices, as shown in Figure 2.
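The integral-image computation described above can be sketched as follows. This is a minimal illustration in NumPy, not the authors' implementation. The function names `integral_tensors` and `region_covariance`, the `(H, W, d)` feature-image layout, and the `eps` regularizer value are assumptions made for the example. For simplicity, the second-order tensor stores all d x d products rather than only the d(d+1)/2 unique ones.

```python
import numpy as np

def integral_tensors(F):
    """Build zero-padded integral images from a feature image F of shape (H, W, d).

    P[y, x] holds the sum of feature vectors over rows < y and columns < x.
    Q[y, x] holds the sum of feature outer products over the same pixels.
    """
    H, W, d = F.shape
    P = np.zeros((H + 1, W + 1, d))
    Q = np.zeros((H + 1, W + 1, d, d))
    P[1:, 1:] = F.cumsum(axis=0).cumsum(axis=1)
    outer = F[..., :, None] * F[..., None, :]      # per-pixel outer products
    Q[1:, 1:] = outer.cumsum(axis=0).cumsum(axis=1)
    return P, Q

def region_covariance(P, Q, y0, x0, y1, x1, eps=1e-6):
    """Covariance of the features in rows y0..y1-1 and columns x0..x1-1.

    Uses four corner lookups per tensor, so the cost is O(d^2) regardless
    of the region size. eps * I is added so the result is strictly
    positive definite (hypothetical default value).
    """
    n = (y1 - y0) * (x1 - x0)
    s = P[y1, x1] - P[y0, x1] - P[y1, x0] + P[y0, x0]   # sum of features
    S = Q[y1, x1] - Q[y0, x1] - Q[y1, x0] + Q[y0, x0]   # sum of outer products
    C = (S - np.outer(s, s) / n) / (n - 1)
    return C + eps * np.eye(s.shape[0])
```

With `eps=0.0`, the result should agree with `np.cov` applied directly to the pixels of the same rectangle, which is a convenient sanity check.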
In the first phase, only the covariance matrix of the whole region from the source image is computed. The target image is searched at all locations and at different scales for a region with a similar covariance matrix. A brute-force search is feasible because the covariance of an arbitrary region can be obtained efficiently. Instead of scaling the target image, the size of the search window is changed. Keeping the best matching locations and scales, the search among these initial detections is repeated in the second phase using the covariance matrices of partially occluded subregions. The distance of the object model and a candidate region is computed ...
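The first search phase can be sketched as below. This is a hedged illustration only. The excerpt does not show which distance the source uses, so the sketch assumes the commonly used affine-invariant distance on positive definite matrices, the square root of the sum of squared logarithms of the generalized eigenvalues. The names `spd_distance`, `region_cov`, and `search`, and the default scales, stride, and `eps`, are hypothetical. For brevity the window covariance is computed directly with `np.cov`; an integral-image lookup would replace it in a fast implementation. The second phase over subregions is not shown.

```python
import numpy as np

def spd_distance(X, Y):
    """Affine-invariant distance between SPD matrices X and Y (assumed metric)."""
    L = np.linalg.cholesky(X)
    Linv = np.linalg.inv(L)
    M = Linv @ Y @ Linv.T                      # same eigenvalues as inv(X) @ Y
    lam = np.linalg.eigvalsh((M + M.T) / 2)    # symmetrize against round-off
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def region_cov(F, y, x, h, w, eps=1e-6):
    """Regularized covariance of the features in an h x w window of F (H, W, d)."""
    d = F.shape[2]
    X = F[y:y + h, x:x + w].reshape(-1, d)
    return np.cov(X, rowvar=False) + eps * np.eye(d)

def search(F, C_model, base_hw, scales=(0.8, 1.0, 1.25), stride=4, keep=5):
    """Brute-force search: vary the window size instead of rescaling the image.

    Returns the `keep` best candidates as (distance, y, x, h, w) tuples,
    sorted by increasing distance to the model covariance.
    """
    h0, w0 = base_hw
    H, W, _ = F.shape
    hits = []
    for s in scales:
        h, w = int(round(h0 * s)), int(round(w0 * s))
        for y in range(0, H - h + 1, stride):
            for x in range(0, W - w + 1, stride):
                dist = spd_distance(C_model, region_cov(F, y, x, h, w))
                hits.append((dist, y, x, h, w))
    hits.sort()
    return hits[:keep]
```

Keeping several best candidates rather than only the top one mirrors the description above, where the retained locations and scales are re-examined in the second phase.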

Similar publications

Conference Paper
Full-text available
A robust hand gesture detection and recognition algorithm using dynamic time warping and multi-class probability estimates is proposed. Quaternion based directional features of the hand are extracted using the color-depth camera Kinect. The directional features utilized have position and orientation invariance. Dynamic time warping of the signal se...
Article
Full-text available
A novel approach to detection of stationary objects in the video stream is presented. Stationary objects are those separated from the static background but remaining motionless for a prolonged time. Extraction of stationary objects from images is useful in automatic detection of unattended luggage. The proposed algorithm is based on detection of i...
Article
Full-text available
In this paper we extend the "shape, illumination and reflectance from shading" (SIRFS) model [3, 4], which recovers intrinsic scene properties from a single image. Though SIRFS performs well on images of segmented objects, it performs poorly on images of natural scenes, which contain occlusion and spatially-varying illumination. We therefore pre...

Citations

... Building upon the manifold assumption, manifold learning methods aim to learn a lower-dimensional representation of data while retaining as much of its inherent information as possible. Manifold learning has been useful in many fields such as image classification and object detection [1,2,3], image synthesis and enhancement [4,5], video analysis [6,7], 3D data processing [8], analyzing single-cell RNA-sequencing data [9], and more [10]. In particular, manifold learning has been used in nonlinear dimensionality reduction and data visualization. Classical approaches for nonlinear dimension reduction are Isomap [11], MDS [12], Locally Linear Embedding (LLE) [13] and Laplacian Eigenmaps [14]. ...
Preprint
Data visualization via dimensionality reduction is an important tool in exploratory data analysis. However, when the data are noisy, many existing methods fail to capture the underlying structure of the data. The method called Empirical Intrinsic Geometry (EIG) was previously proposed for performing dimensionality reduction on high dimensional dynamical processes while theoretically eliminating all noise. However, implementing EIG in practice requires the construction of high-dimensional histograms, which suffer from the curse of dimensionality. Here we propose a new data visualization method called Functional Information Geometry (FIG) for dynamical processes that adapts the EIG framework while using approaches from functional data analysis to mitigate the curse of dimensionality. We experimentally demonstrate that the resulting method outperforms a variant of EIG designed for visualization in terms of capturing the true structure, hyperparameter robustness, and computational speed. We then use our method to visualize EEG brain measurements of sleep activity.
... Recent work reported in [26] on covariance tracking uses a covariance matrix (constructed from pixel-wise features inside the object region) that belongs to $P_n$ in order to describe the appearance of the target being tracked. This covariance descriptor has proved to be robust in both video detection [35,33] and tracking [26,24,39,18,36,15,19,5]. The covariance descriptor is a compact feature representation of the object with relatively low dimension compared to other appearance models such as the histogram model in [9]. ...
... One major challenge in covariance tracking is how to recursively estimate the covariance template (a covariance descriptor that serves as the target appearance template) based on the input video frames. In [26] and also in [24,19], the Karcher mean of sample covariance descriptors from a fixed number of video frames is used as the covariance template. This method is based on the natural Riemannian distance, the GL-invariant distance (Sec. ...
Article
Full-text available
We address the problem of video tracking using covariance descriptors constructed from simple features extracted from a given image sequence. Theoretically, this can be posed as a tracking problem in the space of $n \times n$ symmetric positive definite (SPD) matrices, denoted by $P_n$. A novel probabilistic dynamic model in $P_n$ based on Riemannian geometry and probability theory is presented in conjunction with a geometric (intrinsic) recursive filter for tracking a time sequence of SPD matrix measurements in a Bayesian framework. This newly developed filtering method can be used for the covariance descriptor updating problem in covariance tracking, leading to new and efficient video tracking algorithms. To show the accuracy and efficiency of our tracker in comparison to the state-of-the-art, we present synthetic experiments on $P_n$ and several real data experiments for tracking in video sequences.
Article
Individuating and locating repetitive patterns in still images is a fundamental task in image processing, typically achieved by means of correlation strategies. In this paper we provide a solid solution to this task using a differential geometry approach, operating on a Lie algebra and exploiting a mixture of templates. The proposed method asks the user to locate a few instances of the target patterns (seeds), which become visual templates used to explore the image. We propose an iterative algorithm that locates patches similar to the seeds in three steps: first clustering the detected patches to generate templates of different classes, then looking for the affine transformations, living on a Lie algebra, that best link the templates and the detected patches, and finally detecting new patches with a convolutional strategy. The process ends when no new patches are found. We show that our method is able to process heterogeneous unstructured images with multiple visual motifs and extremely crowded scenarios with high precision and recall, outperforming all the state-of-the-art methods.
Article
In recent years, there has been extensive research on sparse representation of vector-valued signals. In the matrix case, the data points are merely vectorized and treated as vectors thereafter (for example, image patches). However, this approach cannot be used for all matrices, as it may destroy the inherent structure of the data. Symmetric positive definite (SPD) matrices constitute one such class of signals, where their implicit structure of positive eigenvalues is lost upon vectorization. This paper proposes a novel sparse coding technique for positive definite matrices, which respects the structure of the Riemannian manifold and preserves the positivity of their eigenvalues, without resorting to vectorization. Synthetic and real-world computer vision experiments with region covariance descriptors demonstrate the need for and the applicability of the new sparse coding model. This work serves to bridge the gap between the sparse modeling paradigm and the space of positive definite matrices.