David Nistér's research while affiliated with University of Kentucky and other places

Publications (60)

Article
In this paper, we present results and experiments with several methods for bundle adjustment, producing the fastest bundle adjuster ever published in terms of computation and convergence. From a computational perspective, the fastest methods naturally handle the block-sparse pattern that arises in a reduced camera system. Adapting to the naturally...
Article
Full-text available
We present an approach for automatic 3D reconstruction of outdoor scenes using computer vision techniques. Our system collects video, GPS and INS data which are processed in real-time to produce geo-registered, detailed 3D models that represent the geometry and appearance of the world. These models are generated without manual measurements or marke...
Article
Full-text available
We investigate two interactive techniques for registering an image to 3D digital terrain and building models. Registering an image enables a variety of applications, including slide-shows with context, automatic annotation, and photo enhancement. To perform the registration, we investigate two modes of interaction. In the overlay interface, an ima...
Article
In this paper, we formulate a stereo matching algorithm with careful handling of disparity, discontinuity and occlusion. The algorithm works with a global matching stereo model based on an energy-minimization framework. The global energy contains two terms, the data term and the smoothness term. The data term is first approximated by a color-weight...
Conference Paper
In this paper we present a new algorithm for computing Maximally Stable Extremal Regions (MSER), as invented by Matas et al. The standard algorithm makes use of a union-find data structure and takes quasi-linear time in the number of pixels. The new algorithm provides exactly identical results in true worst-case linear time. Moreover, the new algor...
Article
Full-text available
The paper presents a system for automatic, geo-registered, real-time 3D reconstruction from video of urban scenes. The system collects video streams, as well as GPS and inertia measurements in order to place the reconstructed models in geo-registered coordinates. It is designed using current state of the art real-time modules for all processing ste...
Article
Assume that we have two perspective images with known intrinsic parameters except for an unknown common focal length. It is a minimally constrained problem to find the relative orientation between the two images given six corresponding points. To this problem, which to the best of our knowledge was unsolved, we present an efficient solver. Through nu...
Conference Paper
Full-text available
In this paper we present a novel approach to directly recover the location of both microphones and sound sources from time-difference-of-arrival measurements only. No approximation solution is required for initialization and in the absence of noise our approach is guaranteed to always recover the exact solution. Our approach only requires solving l...
Conference Paper
In this paper we present a highly scalable vision-based localization and mapping method using image collections. A topological world representation is created online during robot exploration by adding images to a database and maintaining a link graph. An efficient image matching scheme allows real-time mapping and global localization. The compact i...
Conference Paper
This paper presents minimal solutions for the geometric parameters of a camera rotating about its optical centre. In particular we present new 2 and 3 point solutions for the homography induced by a rotation with 1 and 2 unknown focal length parameters. Using tests on real data, we show that these algorithms outperform the standard 4 point linear h...
Conference Paper
This paper presents a general method, based on Galois theory, for establishing that a problem can not be solved by a 'machine' that is capable of the standard arithmetic operations, extraction of radicals (that is, m-th roots for any m), as well as extraction of roots of polynomials of degree smaller than n, but no other numerical operations. The m...
Conference Paper
We present an autocalibration algorithm for upgrading a projective reconstruction to a metric reconstruction by estimating the absolute dual quadric. The algorithm enforces the rank degeneracy and the positive semidefiniteness of the dual quadric as part of the estimation procedure, rather than as a post-processing step. Furthermore, the method all...
Conference Paper
In this paper we investigate how to scale a content based image retrieval approach beyond the RAM limits of a single computer and to make use of its hard drive to store the feature database. The feature vectors describing the images in the database are binned in multiple independent ways. Each bin contains images similar to a representative pro...
Conference Paper
We present a new post-processing step to enhance the resolution of range images. Using one or two registered and potentially high-resolution color images as reference, we iteratively refine the input low-resolution range image, in terms of both its spatial resolution and depth precision. Evaluation using the Middlebury benchmark shows across-the-...
Conference Paper
Given five motion vectors observed in a calibrated camera, what is the rotational and translational velocity of the camera? This problem is the infinitesimal motion analogue to the five-point relative orientation problem, which has previously been solved through the derivation of a tenth-degree polynomial and extraction of its roots. Here, we present...
Article
Full-text available
This paper introduces a multi-view stereo matcher that generates depth in real-time from a monocular video stream of a static scene. A key feature of our processing pipeline is that it estimates global camera gain changes in the feature tracking stage and efficiently compensates for these in the stereo stage without impacting the real-time performa...
Conference Paper
Full-text available
We present a viewpoint-based approach for the quick fusion of multiple stereo depth maps. Our method selects depth estimates for each pixel that minimize violations of visibility constraints and thus remove errors and inconsistencies from the depth maps to produce a consistent surface. We advocate a two-stage process in which the first stage ge...
Conference Paper
This paper shows that structure from motion is NP-hard for most sensible cost functions when missing data is allowed. The result provides a fundamental limitation of what is possible to achieve with any structure from motion algorithm. Even though there are recent, promising attempts to compute globally optimal solutions, there is no hope of obtain...
Article
It is a well known classical result that given the image projections of three known world points it is possible to solve for the pose of a calibrated perspective camera to up to four pairs of solutions. We solve the generalised problem where the camera is allowed to sample rays in some arbitrary but known fashion and is not assumed to perform a cen...
Article
Full-text available
The paper introduces a data collection system and a processing pipeline for automatic geo-registered 3D reconstruction of urban scenes from video. The system collects multiple video streams, as well as GPS and INS measurements in order to place the reconstructed models in geo-registered coordinates. Besides high quality in terms of both geometry an...
Conference Paper
Full-text available
We present a stereo algorithm that achieves high quality results while maintaining real-time performance. The key idea is simple: we introduce an adaptive aggregation step in a dynamic-programming (DP) stereo framework. The per-pixel matching cost is aggregated in the vertical direction only. Compared to traditional DP, our approach reduces the...
Article
This paper presents a novel version of the five-point relative orientation algorithm given in Nister [Nister, D., 2004. An efficient solution to the five-point relative pose problem, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26 (6), 756–770]. The name of the algorithm arises from the fact that it can operate even on the minima...
Article
We present a method for estimating the global uncertainty of epipolar geometry with applications to autonomous vehicle navigation. Such uncertainty information is necessary for making informed decisions regarding the confidence of a motion estimate, since we must otherwise accept the estimate without any knowledge of the probability that the estim...
Article
Full-text available
Suppose two perspective views of four world points are given and that the intrinsic parameters are known but the camera poses and the world point positions are not. We prove that the epipole in each view is then constrained to lie on a curve of degree ten. We derive the equation for the curve and establish many of the curve's properties. For exampl...
Conference Paper
In this paper, we formulate an algorithm for the stereo matching problem with careful handling of disparity, discontinuity and occlusion. The algorithm works with a global matching stereo model based on an energy-minimization framework. The global energy contains two terms, the data term and the smoothness term. The data term is first approximated...
Conference Paper
A recognition scheme that scales efficiently to a large number of objects is presented. The efficiency and quality are exhibited in a live demonstration that recognizes CD-covers from a database of 40000 images of popular music CDs. The scheme builds upon popular techniques of indexing descriptors extracted from local regions, and is robust to back...
Article
We present a system that estimates the motion of a stereo head or a single moving camera based on video input. The system operates in real-time with low delay and the motion estimates are used for navigational purposes. The front end of the system is a feature tracker. Point features are matched between pairs of frames and linked into image traject...
Conference Paper
In this paper, we present a belief propagation based global algorithm that generates high quality results while maintaining real-time performance. To our knowledge, it is the first BP based global method that runs at real-time speed. Our efficiency performance gains mainly from the parallelism of graphics hardware, which leads to a 45 times speedu...
Conference Paper
We present an attempt to determine whether the shape of a generic central-projection camera, such as the eye of an insect or a log-polar camera, can be determined from two motion flows resulting from purely rotational motions with non-collinear axes. Our first contribution is to write the smooth non-parametric calibration problem as a differentia...
Conference Paper
In this paper we investigate the status of bundle adjustment as a component of a real-time camera tracking system and show that with current computing hardware a significant amount of bundle adjustment can be performed every time a new frame is added, even under stringent real-time constraints. We also show, by quantifying the failure rate over lon...
Article
In this paper we investigate the status of bundle adjustment as a component of a real-time camera tracking system and show that with current computing hardware a significant amount of bundle adjustment can be performed every time a new frame is added, even under stringent real-time constraints. We also show, by quantifying the failure rate over lon...
Article
A system capable of performing robust live ego-motion estimation for perspective cameras is presented. The system is powered by random sample consensus with preemptive scoring of the motion hypotheses. A general statement of the problem of efficient preemptive scoring is given. Then a theoretical investigation of preemptive scoring under a simple i...
Conference Paper
In this paper, we develop a theory of non-parametric self-calibration. Recently, schemes have been devised for non-parametric laboratory calibration, but not for self-calibration. We allow an arbitrary warp to model the intrinsic mapping, with the only restriction that the camera is central and that the intrinsic mapping has a well-defined non-sing...
Conference Paper
Full-text available
We present a solution for optimal triangulation in three views. The solution is guaranteed to find the optimal solution because it computes all the stationary points of the (maximum likelihood) objective function. Internally, the solution is found by computing roots of multivariate polynomial equations, directly solving the conditions for stationar...
Conference Paper
We present a quality assessment procedure for correspondence estimation based on geometric coherence rather than ground truth. The procedure can be used for performance evaluation of correspondence extraction schemes developed by researchers, as well as for online learning and adaptation aimed at better system performance. A very important aspect o...
Article
We propose a general framework for aligning continuous (oblique) video onto 3D sensor data. We align a point cloud computed from the video onto the point cloud directly obtained from a 3D sensor. This is in contrast to existing techniques where the 2D images are aligned to a 3D model derived from the 3D sensor data. Using point clouds enables the a...
Conference Paper
Full-text available
Assume that we have two perspective images with known intrinsic parameters except for an unknown common focal length. It is a minimally constrained problem to find the relative orientation between the two images given six corresponding points. We present an efficient solution to the problem and show that there are 15 solutions in general (including...
Article
In this paper we explore the relative efficiency of various data-driven sampling techniques for estimating the epipolar geometry and its global uncertainty. We explore standard fully data-driven methods, specifically the five-point, seven-point, and eight-point methods. We also explore what we refer to as partially data-driven methods, where in th...
Article
Full-text available
We present a method to obtain the solutions to the generalized 6-point relative pose problem. The problem is to find the relative positions of two generalized cameras so that six corresponding image rays meet in space. Here, a generalized camera is a camera that captures some arbitrary set of rays and does not adhere to the central perspective...
Article
A method for upgrading a projective reconstruction to metric is presented. The method compares favourably to state of the art algorithms and has been found extremely reliable for both large and small reconstructions in a large number of experiments on real data. The notion of a twisted pair is generalized to the uncalibrated case. The reconstructio...
Conference Paper
This paper gives a summary of automatic passive reconstruction of 3D scenes from images and video. Features are tracked between the images. Relative camera poses are then estimated based on the feature tracks. Both the feature tracking and the robust method used to estimate the camera poses have recently been shown to provide robust real-time perfor...
Article
An efficient algorithmic solution to the classical five-point relative pose problem is presented. The problem is to find the possible solutions for relative camera pose between two calibrated views given five corresponding points. The algorithm consists of computing the coefficients of a tenth degree polynomial in closed form and, subsequently, fin...
Conference Paper
Full-text available
Suppose that two perspective views of four world points are given, that the intrinsic parameters are known, but the camera poses and the world point positions are not. We prove that the epipole in each view is then constrained to lie on a curve of degree ten. We give the equation for the curve and establish many of the curve's properties. For exa...
Conference Paper
We propose a general framework for aligning continuous (oblique) video onto 3D sensor data. We align a point cloud computed from the video onto the point cloud directly obtained from a 3D sensor. This is in contrast to existing techniques where the 2D images are aligned to a 3D model derived from the 3D sensor data. Using point clouds enables the a...
Conference Paper
We present a system that estimates the motion of a stereo head or a single moving camera based on video input. The system operates in real-time with low delay and the motion estimates are used for navigational purposes. The front end of the system is a feature tracker. Point features are matched between pairs of frames and linked into image traject...
Conference Paper
It is a well known classical result that given the image projections of three known world points it is possible to solve for the pose of a calibrated perspective camera to up to four pairs of solutions. We solve the generalised problem where the camera is allowed to sample rays in some arbitrary but known fashion and is not assumed to perform a cen...
Conference Paper
A system capable of performing robust live ego-motion estimation for perspective cameras is presented. The system is powered by random sample consensus with preemptive scoring of the motion hypotheses. A general statement of the problem of efficient preemptive scoring is given. Then a theoretical investigation of preemptive scoring under a simple i...
Article
An efficient algorithmic solution to the classical five-point relative pose problem is presented. The problem is to find the possible solutions for relative camera motion between two calibrated views given five corresponding points. The algorithm consists of computing the coefficients of a tenth degree polynomial and subsequently finding its roots....
Article
In a typical security and monitoring system a large number of networked cameras are installed at fixed positions around a site under surveillance. There is generally no global view or map that shows the guard how the views of different cameras relate to one another. Individual cameras may be equipped with pan, tilt and zoom capabilities, and the gu...
Conference Paper
Full-text available
Videos and 3D models have traditionally existed in separate worlds and as distinct representations. Although texture maps for 3D models have been traditionally derived from multiple still images, real-time mapping of live videos as textures on 3D models has not been attempted. This paper presents a system for rendering multiple live videos in real-...
Conference Paper
The topic of this first panel session was algorithms and computations. Bill Triggs chaired the discussion and David Nister, Kenichi Kanatani, Jean Ponce and Zhengyou Zhang also participated. Each panelist discussed the issues that he felt were going to be important in the future. The panel session was followed by some questions and discussions whic...
Conference Paper
A method for upgrading a projective reconstruction to metric is presented. The reconstruction is first transformed by considering cheirality so that the convex hull of the set of camera projection centres is the same as in the metric counterpart. The method then proceeds iteratively and starting from such a reconstruction is a necessary condition f...
Conference Paper
This paper considers projective reconstruction with a hierarchical computational structure of trifocal tensors that integrates feature tracking and geometrical validation of the feature tracks. The algorithm was embedded into a system aimed at completely automatic Euclidean reconstruction from uncalibrated handheld amateur video sequences. The algo...
Conference Paper
A frame decimation scheme is proposed that makes automatic extraction of Structure and Motion (SaM) from handheld sequences more practical. Decimation of the number of frames used for the actual SaM calculations keeps the size of the problem manageable, regardless of the input frame rate. The proposed preprocessor is based upon global motion estima...
Conference Paper
This paper presents techniques and animations developed from 1991 to 2000 that use digital photographs of the real world to create 3D models, virtual camera moves, and realistic computer animations. In these projects, images are used to determine the ...

Citations

... The FE processes images and estimates motions. The BE uses preintegrated inertial measurements [28] and manages the topological consistency through local and/or global optimizations of 3D poses and points [29]. The 3D reconstruction module provides a model of shapes and gives geometric properties. ...
... Dense stereo matching to find the disparity for every pixel between two or more images has been actively researched for decades. [1][2][3][4][5][6][7][8][9][10] Stereo matching algorithms are classified into global and local approaches. 1 Local methods utilize the color or intensity values within a finite support window to determine the disparity for each pixel. ...
... The use of inertial measurement needs strong initialization techniques to properly bootstrap the system. For monocular-only cases, the gold standard is to use the 5-point or 8-point algorithm [28]. VIN systems, by contrast, are far more complicated, as they need to estimate metric scale, gravity vector, and bias terms for a fairly accurate estimation. ...
... SLAM algorithms that estimate camera pose from RGB camera frames are called Visual SLAM algorithms and are commonly used to render virtual content in a persistent position in physical space for augmented reality applications [34]. Visual SLAM algorithms work by detecting unique 'feature points' visible to onboard cameras that persist from frame to frame and then triangulate the camera pose from changes in the positions of feature points over time [35]. The tracked location of points together with the data from the inertial sensor can triangulate the position and orientation of the headset as it moves in space. ...
... IV, we will validate our signature's performance on large-scale retrieval tests. For training, 16,000 matching and 16,000 non-matching image pairs are collected from the Oxford Buildings Data Set [19] and the University of Kentucky Benchmark Data Set [20]. For testing, 8,000 matching and 8,000 non-matching image pairs are collected from the Stanford Media Cover Data Set [21] and the Zurich Buildings Data Set [22]. ...
... Recognition built on a vocabulary tree, with an indexing scheme that hierarchically quantizes descriptors from image key points and is used to indicate image similarity, is described in [23]. An indexing descriptor is computed for local regions. ...
... On the other hand, standard geometric lifting [40] transforms each 3D point into a randomly oriented line whose direction follows a uniform distribution on the unit sphere. Speciale et al. [40] showed it is possible to estimate camera poses from such original line cloud (OLC) by formulating the problem as finding the relative pose between two general cameras [42]. Successive works inspired by this approach include privacy-preserving SfM and SLAM [9,10,38]. ...
... This set of non linear local searching algorithms presents several drawbacks, and therefore it is not common to use them without a previous alignment based on descriptors or another estimation source, such as robot odometry. Due to the non-convex optimization, if the estimated initialization is not sufficiently precise, the probability of falling in a local minimum increases and the convergence becomes slower [35]. ...
... The retrieved water line has to be transferred into object space, which can be expressed as annotation, a term commonly used in computer graphics, of 3D data using registered 2D images, Kehl et al., 2016, Chen et al., 2008. Popular use cases can be found in augmented reality (AR) where virtual image content is projected into the real world using mobile devices as interfaces, e.g. ...
... The conclusions of [14] indicate that the single camera method is less reliable due to inherent ambiguities in calculating the essential matrix between pairs of images to initialise the structure and motion [15]. These ambiguities have been further verified for some specific motions such as pure sideways translation [16]. Later work focuses on stereo camera rigs for the calculation of VO [2, 17, 18]. ...