Figure 3 - uploaded by Parvez Ahammad
De-correlating SIFT descriptors. Here, we show the covariance matrix of the SIFT descriptors before and after applying PCA to de-correlate them (for better visualization, we have applied the logarithm to the absolute values of the covariance matrix). It is clear from 3(a) that the coefficients of the SIFT descriptor are highly correlated. After applying PCA, however, most of the correlation between coefficients has been removed, as seen in 3(b).


Source publication
Article
Full-text available
We consider the problem of communicating compact descriptors for the purpose of establishing visual correspondences between two cameras operating under rate constraints. Establishing visual correspondences is a critical step before other tasks such as camera calibration or object recognition can be performed in a network of cameras. We verify that...

Contexts in source publication

Context 1
... to how SIFT descriptors are computed, their coefficients are highly correlated. This is clearly demonstrated in Figure 3(a). We approximately de-correlate the coefficients by applying a linear transform to the descriptor prior to encoding. ...
Context 2
... transform is estimated by applying PCA to a set of 12514 descriptors computed over a collection of 6 training images. Figure 3(b) shows that applying the estimated linear transform does a good job of de-correlation; the same estimated transform will be used in our experiments. For notational convenience, we will henceforth assume that D_i^A and D_j^B refer to the transformed descriptors and N_ji^BA refers to the innovation noise between transformed descriptors as in (2). ...
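The de-correlation step described above can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' code: the 12514 real SIFT descriptors are replaced by synthetic correlated 128-D vectors, and the PCA transform is taken as the eigenvectors of the sample covariance.

```python
import numpy as np

# Hypothetical stand-in data: the paper estimates the transform from 12514
# SIFT descriptors over 6 training images; here we synthesize correlated
# 128-D vectors by mixing independent coordinates.
rng = np.random.default_rng(0)
mixing = rng.standard_normal((128, 128))
descriptors = rng.standard_normal((2000, 128)) @ mixing.T

# Estimate the de-correlating linear transform via PCA (eigenvectors of
# the sample covariance matrix).
cov_before = np.cov(descriptors, rowvar=False)
_, eigvecs = np.linalg.eigh(cov_before)
transformed = (descriptors - descriptors.mean(axis=0)) @ eigvecs

# After the transform the covariance is (numerically) diagonal; this is
# what Figure 3(b) visualizes via log|cov|.
cov_after = np.cov(transformed, rowvar=False)
off_before = np.abs(cov_before - np.diag(np.diag(cov_before))).sum()
off_after = np.abs(cov_after - np.diag(np.diag(cov_after))).sum()
print(off_after < 1e-6 * off_before)  # True: off-diagonal mass removed
```

The same estimated `eigvecs` matrix would then be applied to every descriptor at both cameras prior to encoding.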

Similar publications

Article
Full-text available
Images of co-planar points in 3-dimensional space taken from different camera positions are a homography apart. Homographies are at the heart of geometric methods in computer vision and are used in geometric camera calibration, 3D reconstruction, stereo vision and image mosaicking among other tasks. In this paper we show the surprising result that...
Article
Full-text available
Camera parameters can't be estimated accurately using traditional calibration methods if the camera is substantially defocused. To tackle this problem, an improved approach based on three phase-shifting circular grating (PCG) arrays is proposed in this paper. Rather than encoding the feature points into the intensity, the proposed method encodes th...
Conference Paper
Full-text available
There is one common problem of satellites which is quite widely discussed: the lifespan. An idea to extend the operational lifespan of a satellite in orbit, instead of replacing it with a new one, is embodied in On-Orbit Servicing (OOS) projects. The typical scenario of OOS is safe and reliable Rendezvous and Docking (RvD) of the approaching chaser to the...
Article
Full-text available
A novel calibration method based on polar coordinate is proposed. The world coordinates are expressed in the form of polar coordinates, which are converted to world coordinates in the calibration process. In the beginning, the calibration points are obtained in polar coordinates. By transformation between polar coordinates and rectangular coordinat...
Article
Full-text available
This paper presents a novel two-step camera calibration method in a GPS/INS/Stereo Camera multi-sensor kinematic positioning and navigation system. A camera auto-calibration is first performed to obtain the lens distortion parameters, up-to-scale baseline length and the relative orientation between the stereo cameras. Then, the system calibration i...

Citations

... However, the rate of the syndrome code has to be chosen by trial and error to balance security, false positive and false negative performance. In our previous work, we proposed the novel use of distributed source coding (DSC) in the problem of establishing visual correspondences between cameras in a rate-efficient manner (Yeo et al. 2008a). We found that descriptors of corresponding features are highly correlated, and described a framework for applying DSC with transform coding in feature matching given a particular matching constraint. ...
... Different rate constraints can be satisfied by varying the quantization step size used. DSC Descriptors are encoded using the encoding procedure outlined in our earlier work on using distributed source coding with transform coding (Yeo et al. 2008a). The received messages are decoded using descriptors from camera A as side-information. ...
... For both the baseline and DSC schemes, we consider quantization step sizes ranging from 1.95 × 10⁻³ to 6.25 × 10⁻². In the DSC scheme, we use α = 1.718 and a 24-bit CRC (Yeo et al. 2008a). For both the RP and RP-LDPC schemes, we vary the number of random projections used from 64 to 1024 (per descriptor). ...
Article
Full-text available
Establishing visual correspondences is a critical step in many computer vision tasks involving multiple views of a scene. In a dynamic environment and when cameras are mobile, visual correspondences need to be updated on a recurring basis. At the same time, the use of wireless links between camera motes imposes tight rate constraints. This combination of issues motivates us to consider the problem of establishing visual correspondences in a distributed fashion between cameras operating under rate constraints. We propose a solution based on constructing distance-preserving hashes using binarized random projections. By exploiting the fact that descriptors of regions in correspondence are highly correlated, we propose a novel use of distributed source coding via linear codes on the binary hashes to more efficiently exchange feature descriptors for establishing correspondences across multiple camera views. A systematic approach is used to evaluate rate vs. visual correspondence retrieval performance; under a stringent matching criterion, our proposed methods demonstrate superior performance to a baseline scheme employing transform coding of descriptors.
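The binarized-random-projection hashing described in this abstract can be sketched as follows. This is an illustrative toy, not the paper's implementation: the projection count (256), descriptor dimension (128), and noise level are hypothetical choices, and correlated descriptors are simulated rather than extracted from images.

```python
import numpy as np

rng = np.random.default_rng(1)

def binary_hash(desc, proj):
    """Binarize random projections: one bit per direction (sign test)."""
    return (desc @ proj.T) >= 0.0

# Hypothetical setup: 128-D descriptors hashed with 256 random directions.
proj = rng.standard_normal((256, 128))

d_a = rng.standard_normal(128)               # feature seen at camera A
d_b = d_a + 0.05 * rng.standard_normal(128)  # correlated view at camera B
d_c = rng.standard_normal(128)               # unrelated feature

h_a, h_b, h_c = (binary_hash(d, proj) for d in (d_a, d_b, d_c))

# The hash is distance-preserving: corresponding descriptors disagree on
# far fewer bits than unrelated ones, so Hamming distance between hashes
# can stand in for Euclidean distance between descriptors.
print(np.sum(h_a != h_b), np.sum(h_a != h_c))
```

Because corresponding hashes differ in only a few bits, camera B can send a syndrome of a linear code over its hash instead of the hash itself, and camera A can decode it using its own hash as side information, which is the rate saving the abstract refers to.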
... In this paper, we propose a method which uses binarized random projections and linear codes. One application where such a problem needs to be solved is that of determining visual correspondences in a distributed fashion between cameras in a wireless camera network [1, 2, 3]. This is a critical step for computer vision tasks such as camera calibration, novel view rendering, object recognition and scene understanding. ...
... To determine correspondences, Cheng et al. introduced a feature digest which applies Principal Components Analysis (PCA) at each camera on feature descriptors and then sends only the top principal components [1]. Yeo et al. exploited the correlation between descriptors of features in correspondence for rate savings by using distributed source coding (DSC) [2]. The higher-dimensional case can always be reduced to a 2-D case, in the plane formed by x, y, and the origin. The angle subtended by the rays from the origin to x and y in this plane can be found using simple trigonometry to be θ = 2 sin⁻¹(δ/2). ...
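The relation θ = 2 sin⁻¹(δ/2) above can be checked numerically. The sketch below constructs two unit vectors separated by a chosen chord length δ and verifies, by Monte Carlo, the standard random-hyperplane result that a sign projection separates them with probability θ/π; the sample count and tolerance are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two unit vectors at Euclidean (chord) distance delta, built directly in
# the 2-D plane that the excerpt's reduction argument describes.
delta = 0.6
theta = 2.0 * np.arcsin(delta / 2.0)  # angle recovered from chord length
x = np.array([1.0, 0.0])
y = np.array([np.cos(theta), np.sin(theta)])
assert np.isclose(np.linalg.norm(x - y), delta)  # chord length matches

# Monte Carlo: a random sign projection gives differing bits for x and y
# with probability theta / pi, linking Hamming to Euclidean distance.
proj = rng.standard_normal((200000, 2))
mismatch = np.mean((proj @ x >= 0) != (proj @ y >= 0))
print(abs(mismatch - theta / np.pi) < 0.01)  # True (up to sampling error)
```

This is the geometric fact that lets the binary-hash Hamming distance serve as a proxy for the Euclidean distance criterion between descriptors.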
Conference Paper
Full-text available
We investigate a practical approach to solving one instantiation of a distributed hypothesis testing problem under severe rate constraints that shows up in a wide variety of applications such as camera calibration, biometric authentication and video hashing: given two distributed continuous-valued random sources, determine if they satisfy a certain Euclidean distance criterion. We show a way to convert the problem from continuous-valued to binary-valued using binarized random projections and obtain rate savings by applying a linear syndrome code. In finding visual correspondences, our approach uses just 49% of the rate of scalar quantization to achieve the same level of retrieval performance. To perform video hashing, our approach requires only a hash rate of 0.0142 bpp to identify corresponding groups of pictures correctly.
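The linear syndrome coding mentioned above can be illustrated with a toy (7,4) Hamming code; the paper's actual codes and block lengths differ, and the one-bit-noise assumption here is purely for illustration. The encoder transmits only the 3-bit syndrome of its 7-bit hash block, and the decoder recovers the block using its correlated side information.

```python
import numpy as np

# Parity-check matrix of the (7,4) Hamming code: column i is the binary
# representation of i+1, so any single-bit error has a unique syndrome.
H = np.array([[(i + 1) >> k & 1 for i in range(7)] for k in range(3)])

def syndrome(bits):
    return (H @ bits) % 2

# Camera B's 7-bit hash block, and camera A's side information assumed to
# differ in at most one position (the "highly correlated" regime).
h_b = np.array([1, 0, 1, 1, 0, 0, 1])
h_a = h_b.copy()
h_a[4] ^= 1  # one bit of innovation noise

# Encoder sends only the 3-bit syndrome of h_b instead of all 7 bits.
s = syndrome(h_b)

# Decoder: syndrome of (h_a XOR h_b) identifies the differing position.
diff = (syndrome(h_a) + s) % 2
pos = int(diff @ np.array([1, 2, 4])) - 1  # invert the column mapping
recovered = h_a.copy()
if pos >= 0:
    recovered[pos] ^= 1
print(np.array_equal(recovered, h_b))  # True: 7 bits conveyed in 3
```

The rate saving scales with how strongly the two sources are correlated: the more bits the side information already agrees on, the fewer syndrome bits a stronger code needs.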
... Furthermore, performing PCA locally is computationally taxing. Yeo et al. exploited the correlation between descriptors of features in correspondence for further rate gains by using distributed source coding (DSC) [5]. Their framework also allows for a principled way of performing bit allocation based on estimated descriptor statistics. ...
... The first scheme, which we term "Plain", is that of simply quantizing the descriptor coefficients after applying a linear de-correlating transform [4]; thus the quantization chosen will impact the rate used. The second scheme, which we term "DSC", not only quantizes the descriptor coefficients, but also uses a DSC framework to exploit correlation between corresponding descriptors for additional rate savings [5]. ...
Conference Paper
Full-text available
We consider the problem of establishing visual correspondences in a distributed and rate-efficient fashion by broadcasting compact descriptors. Establishing visual correspondences is a critical task before other vision tasks can be performed in a wireless camera network. We propose the use of coarsely quantized random projections of descriptors to build binary hashes, and use the Hamming distance between binary hashes as the matching criterion. In this work, we derive the analytic relationship of Hamming distance between the binary hashes to Euclidean distance between the original descriptors. We present experimental verification of our result, and show that for the task of finding visual correspondences, sending binary hashes is more rate-efficient than prior approaches.
... We do assume that the cameras have been calibrated and are fixed. However, even if this is not the case, there exist solutions for continuously calibrating cameras in a distributed and rate-efficient manner [26,140], as we will discuss in Part II of this dissertation. The work presented in this chapter is joint work with Kannan Ramchandran, and has been presented in part in [146,149,148]. ...
Article
We present a novel framework for robustly delivering video data from distributed wireless camera networks that are characterized by packet drops. The main focus in this work is on robustness, which is imminently needed in a wireless setting. We propose two alternative models to capture inter-view correlation among cameras with overlapping views. The view-synthesis-based correlation model requires at least two other camera views and relies on both disparity estimation and view interpolation. The disparity-based correlation model requires only one other camera view and makes use of epipolar geometry. With the proposed models, we show how inter-view correlation can be exploited for robustness through the use of distributed source coding. The proposed approach has low encoding complexity, is robust while satisfying tight latency constraints and requires no intercamera communication. Our experiments show that on bursty packet erasure channels, the proposed H.263+ based method outperforms baseline methods such as H.263+ with forward error correction and H.263+ with intra refresh by up to 2.5 dB. Empirical results further support the relative insensitivity of our proposed approach to the number of additional available camera views or their placement density.
Chapter
The problem of wide-area surveillance from a set of cameras without reliance on calibration has been approached by many through the computation of “connectivity/visibility graphs.” However, even after constructing these graphs, it is not possible to recognize features such as holes in the coverage. The difficulty is not due to the techniques used for finding connectivity between camera views, but rather due to the lack of information in visibility graphs. We propose a refined combinatorial representation of a network using simplices instead of edges and provide a mathematical framework (along with simulation and experimental results) to show that this representation contains, at the very least, accurate topological information such as the number of holes in network coverage. We also discuss ways in which this construct can be used for tracking, path identification, and coordinate-free navigation.