Fig 10 - uploaded by Amnon Shashua
The left two columns show some of the facial expressions captured with the stationary pair of cameras. The rightmost column shows some virtual views. The preprocessing stage of computing the seed tensor was done only once, for the upper pair of images; for each additional pair we start from the same seed tensor.

Source publication
Article
Full-text available
Presents a new method for synthesizing novel views of a 3D scene from two or three reference images in full correspondence. The core of this work is the use and manipulation of an algebraic entity, termed the "trilinear tensor", that links point correspondences across three images. For a given virtual camera position and orientation, a new trilinear tensor can be computed based on the original tensor of the reference images; the desired view is then created from the new tensor and the point correspondences across two of the reference images.

Context in source publication

Context 1
... parameters and reproject the novel view. The result is a fly-through around a "talking head", as can be seen in Fig 10. Notice that all the examples contain a considerable degree of extrapolation (i.e., views that are outside the viewing cone of the original two model views). ...
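To make the tensor-driven transfer concrete, here is a minimal sketch of trilinear point transfer, in the spirit of the paper's method but not its implementation: given the 3x3x3 tensor linking the two reference views to the virtual view, a corresponding point pair is mapped into the virtual view by contracting the tensor with the first point and a line through the second. The function name and the choice of line are illustrative assumptions.

```python
import numpy as np

def transfer_point(T, p1, p2):
    """Transfer a correspondence (p1, p2) into the virtual view.

    T  : 3x3x3 trilinear tensor linking views 1, 2 and the virtual view,
         indexed as T[i, j, k].
    p1 : homogeneous point (3,) in reference view 1.
    p2 : homogeneous point (3,) in reference view 2.
    """
    # Any line through p2 works in the noise-free case; here a vertical
    # line. In practice a line perpendicular to the epipolar line of p1
    # is the numerically safer choice.
    l2 = np.array([p2[2], 0.0, -p2[0]])
    # p3^k ~ sum_{i,j} p1^i * l2_j * T[i, j, k]
    p3 = np.einsum('i,j,ijk->k', p1, l2, T)
    return p3 / p3[2]
```

Applying this transfer to every pixel of a dense correspondence map is what "reproject the novel view" amounts to in the excerpt above.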

Similar publications

Conference Paper
Full-text available
A novel method is proposed for image interpolation. It is assumed that the pixel correlation between local regions across scales would remain similar. In addition, this a priori similarity could be extracted from a set of available image data that have the same content but different resolutions. A simple architecture is devised to estimate the corr...
Article
Full-text available
Soft hyperelastic composite structures that integrate soft hyperelastic material and linear elastic hard material can undergo large deformations while isolating high strain in specified locations to avoid failure. This paper presents an effective topology optimization-based methodology for seeking the optimal united layout of hyperelastic composite...
Article
Full-text available
This paper presents the formulation of a computational model for overlapping multiple sprinklers. Input data required by the model are the water application profile, desired geometrical layout and degree of spacing. A two-step process of interpolation and superimposition was employed to convert the water application depths on radial lines onto squ...
Conference Paper
Full-text available
We present a new image morphing approach in which the output sequence is regenerated from small pieces of the two source (input) images. The approach does not require manual correspondence, and generates compelling results even when the images are of very different objects (e.g., a cloud and a face). We pose the morphing task as an optimization wit...
Article
Full-text available
This paper presents a multidimensional interpolation method and a numerical integration for a bounded region using boundary integral equations and a polyharmonic function. In the method using B-spline, points must be assigned in a gridiron layout for two-dimensional cases. In the method presented in this paper, using the polyharmonic function, arbi...

Citations

... The original NeRF method employs a Multilayer Perceptron (MLP) [32] and implicit volume rendering techniques to render images from novel viewpoints. NeRF achieves better visual quality than traditional novel view synthesis methods [1][2][3]. Subsequent work on improving and optimizing NeRF demonstrates its excellent potential. However, the original NeRF approach demands considerable computational resources and training time, and it can only handle small-scale scenes. ...
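For readers unfamiliar with the volume rendering step the excerpt refers to, this is a minimal sketch of NeRF's standard quadrature: the MLP's per-sample densities and colors along a ray are composited into one pixel color via accumulated transmittance. Variable names are illustrative.

```python
import numpy as np

def render_ray(sigmas, colors, deltas):
    """Composite one ray's samples into a pixel color (NeRF quadrature).

    sigmas : (N,)   densities predicted by the MLP at N ray samples.
    colors : (N, 3) RGB values predicted at the same samples.
    deltas : (N,)   distances between consecutive samples.
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)            # per-segment opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))  # transmittance T_i
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)     # composited color
```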
Article
Full-text available
Recently, there have been significant developments in the realm of novel view synthesis relying on radiance fields. By incorporating the Splatting technique, a new approach named Gaussian Splatting has achieved superior rendering quality and real-time performance. However, the training process of the approach incurs significant performance overhead, and the model obtained from training is very large. To address these challenges, we improve Gaussian Splatting and propose Frequency-Importance Gaussian Splatting. Our method reduces the performance overhead by extracting the frequency features of the scene. First, we analyze the advantages and limitations of the spatial sampling strategy of the Gaussian Splatting method from the perspective of sampling theory. Second, we design the Enhanced Gaussian to express high-frequency information more effectively while reducing the performance overhead. Third, we construct a frequency-sensitive loss function to enhance the network's ability to perceive the frequency domain and optimize the spatial structure of the scene. Finally, we propose a Dynamically Adaptive Density Control Strategy based on the degree of reconstruction of the scene background, which dynamically adapts the spatial sample point generation strategy according to the training results and prevents the generation of redundant data in the model. We conducted experiments on several commonly used datasets, and the results show that our method has significant advantages over the original method in terms of memory overhead and storage usage while maintaining the image quality of the original method.
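The abstract does not give the form of its frequency-sensitive loss; as a hedged illustration only, one plausible instance penalizes FFT-magnitude differences between the rendered and ground-truth images, up-weighting the high frequencies that plain pixel losses tend to blur. The radial weighting and the function name below are assumptions, not the paper's definition.

```python
import numpy as np

def frequency_loss(rendered, target, w_high=2.0):
    """Illustrative frequency-domain loss (not the paper's exact form).

    Compares FFT magnitudes of rendered and ground-truth images, with a
    weight that grows linearly with spatial frequency.
    """
    F_r = np.fft.fftshift(np.fft.fft2(rendered, axes=(0, 1)), axes=(0, 1))
    F_t = np.fft.fftshift(np.fft.fft2(target, axes=(0, 1)), axes=(0, 1))
    h, w = rendered.shape[:2]
    yy, xx = np.mgrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2) / np.hypot(h / 2, w / 2)
    weight = 1.0 + (w_high - 1.0) * radius       # 1 at DC, w_high at corners
    if rendered.ndim == 3:
        weight = weight[:, :, None]              # broadcast over channels
    return float(np.mean(weight * np.abs(np.abs(F_r) - np.abs(F_t))))
```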
... Novel view synthesis [2,3,15] from dynamic scenes aims to generate photorealistic views at arbitrary viewpoints and any time step with given images from one or more cameras as input. It provides the possibility to generate freeview rendering using finite-view input and brings a lifelike representation. ...
Preprint
Full-text available
Representing and synthesizing novel views in real-world dynamic scenes from casual monocular videos is a long-standing problem. Existing solutions typically approach dynamic scenes by applying geometry techniques or utilizing temporal information between several adjacent frames without considering the underlying background distribution in the entire scene or the transmittance over the ray dimension, limiting their performance on static and occlusion areas. Our approach $\textbf{D}$istribution-$\textbf{D}$riven neural radiance fields offers high-quality view synthesis and a 3D solution to $\textbf{D}$etach the background from the entire $\textbf{D}$ynamic scene, which is called $\text{D}^4$NeRF. Specifically, it employs a neural representation to capture the scene distribution in the static background and a 6D-input NeRF to represent dynamic objects, respectively. Each ray sample is given an additional occlusion weight to indicate the transmittance lying in the static and dynamic components. We evaluate $\text{D}^4$NeRF on public dynamic scenes and our urban driving scenes acquired from an autonomous-driving dataset. Extensive experiments demonstrate that our approach outperforms previous methods in rendering texture details and motion areas while also producing a clean static background. Our code will be released at https://github.com/Luciferbobo/D4NeRF.
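As a sketch of the two-field design this abstract describes (a static background representation plus a 6D-input NeRF for dynamic objects, blended by a per-sample occlusion weight), the following hedged example composites the two along a ray; the exact formulation in $\text{D}^4$NeRF may differ, and all names are illustrative.

```python
import numpy as np

def composite_ray(w_occ, sig_d, col_d, sig_s, col_s, deltas):
    """Blend a dynamic and a static radiance field along one ray (assumed
    composition, not the paper's exact formulation).

    w_occ : (N,) per-sample occlusion weight in [0, 1] (1 = dynamic).
    sig_* : (N,) densities; col_* : (N, 3) colors; deltas : (N,) spacing.
    """
    sigma = w_occ * sig_d + (1.0 - w_occ) * sig_s        # blended density
    color = w_occ[:, None] * col_d + (1.0 - w_occ[:, None]) * col_s
    alpha = 1.0 - np.exp(-sigma * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    return ((trans * alpha)[:, None] * color).sum(axis=0)
```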
... Chum et al. [29] applied OPG in wide-baseline stereo matching, not restricted to a translation. Avidan and Shashua [30] apparently make use of OPG concepts expressed by the tensor algebra and recover epipolar points for resolving the visibility problems in view synthesis. Nardi et al. [31] presented an approach for seeing through visual occlusions while creating a single image from a set of uncalibrated images taken from multiple viewpoints. ...
... Our approach resolves contradictions between two or more image-pair occlusion resolutions, apparently like Nardi et al. [31], since the FP-PIV of a set of images is constructed by matching point correspondences, and all matchable points are transferable points. In contrast, in the proposed variety formulation there is no such requirement of epipole estimation or fundamental matrix computation, as described in [28], [30], and [31]. We implicitly used image-space parameterization of a set of images to maintain projective coherence at a novel viewpoint. ...
Article
Full-text available
This paper presents a new hybrid Kinect-variety based synthesis scheme that renders artifact-free multiple views for autostereoscopic/auto-multiscopic displays. The proposed approach does not explicitly require dense scene depth information for synthesizing novel views from arbitrary viewpoints. Instead, the integrated framework first constructs a consistent minimal image-space parameterization of the underlying 3D scene. The compact representation of scene structure is formed using only implicit sparse depth information of a few reference scene points extracted from raw RGB-D data. Views from arbitrary positions can be inferred by moving the novel camera in the parameterized space, enforcing Euclidean constraints on the reference scene images under a full-perspective projection model. Unlike state-of-the-art DIBR methods, where input depth map accuracy is crucial for high-quality output, the proposed algorithm does not depend on precise per-pixel geometry information. It therefore simply sidesteps recovering and refining incomplete or noisy depth estimates with advanced filling or upscaling techniques. Our approach performs fairly well in unconstrained indoor/outdoor environments, where the performance of range sensors or dense depth-based algorithms could be seriously affected by complex scene geometry. We demonstrate that the proposed hybrid scheme provides guarantees on completeness and optimality with respect to the inter-view consistency of the algorithm. In the experimental validation, we performed a quantitative evaluation as well as a subjective assessment on scenes with complex geometric or surface properties. A comparison with the latest representative DIBR methods is additionally performed to demonstrate the superior performance of the proposed scheme.
... When cameras are calibrated, i.e. when both internal and external parameters are available, given the depth of an image point it is straightforward to transfer it to any other view. When dealing with 3DS conversion of monocular footage, however, calibration data is hardly available, which makes the problem more challenging for several reasons. First of all, depth cannot be used, and suitable depth-proxies must be defined, together with proper warping functions based on fundamental matrices [17], trilinear tensors [18], or plane-parallax representation [19,20]. Second, specifying the external orientation (position and attitude) of virtual views is unnatural, since they are embedded in a projective frame, linked to the Euclidean one by an unknown projective transformation. ...
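The calibrated transfer the excerpt contrasts against fits in a few lines: back-project the pixel using its depth and the intrinsics, move the 3D point into the virtual camera's frame, and project again. This is a minimal sketch; the function name and argument conventions are illustrative.

```python
import numpy as np

def reproject(x, depth, K, K2, R, t):
    """Calibrated point transfer to a virtual view.

    x     : (2,) pixel in the reference view.
    depth : scalar depth of the point in the reference camera frame.
    K, K2 : 3x3 intrinsics of the reference and virtual cameras.
    R, t  : rotation (3x3) and translation (3,) of the virtual camera
            relative to the reference one.
    """
    X = depth * (np.linalg.inv(K) @ np.array([x[0], x[1], 1.0]))  # back-project
    x2 = K2 @ (R @ X + t)                                         # project
    return x2[:2] / x2[2]
```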
Conference Paper
Full-text available
In this paper we present a method for the generation of 3D Stereo (3DS) pairs to be used for 3D visualization of sequences of historical aerial photographs. The goal of our work is to provide a 3D visualization solution even when the existing images are a monocular sequence. Each input image is processed using neighboring views and a synthetic image is rendered. The synthetic image and the original input image form a 3D stereo pair. We validate our method by showing results on real images taken from a historical photo archive. The reported results are promising and they corroborate the idea of generating 3DS data from monocular footage.
... The uncalibrated view synthesis (UVS) is less explored and more challenging for several reasons. First of all, depth cannot be used in uncalibrated situations, and suitable depth-proxies must be defined, together with proper warping functions based on fundamental matrices [16], trilinear tensors [2], or plane-parallax representation [13, 22]. Second, specifying the external orientation (position and attitude) of virtual views is unnatural, since they are embedded in a projective frame, linked to the Euclidean one by an unknown projective transformation. ...
Conference Paper
In this paper we confront the problem of uncalibrated view synthesis, i.e. rendering novel images from two, or more images without any knowledge on camera parameters. The method builds on the computation of planar parallax and focuses on the application of converting a monocular image sequence to a 3D stereo video, a problem that requires the positioning of the virtual camera outside the actual motion trajectory. The paper addresses both geometric and practical issues related to the rendering. We validate our method by showing both quantitative and qualitative results.
... For example, given the internal and external parameters of the camera, and the depth of a scene point (with respect to the camera), it is easy to obtain the position of the point in any synthetic view [13]. Where no knowledge of the imaging device can be assumed, uncalibrated point transfer techniques utilize image-to-image constraints such as fundamental matrices [11], homographies [3], trifocal tensors [2], plane+parallax [10], or relative affine structure [15,16] to re-project pixels from a small number of reference images to a given view. All these techniques require some user interaction in order to specify a position for the virtual camera. ...
Conference Paper
Full-text available
This paper presents a novel approach to uncalibrated view synthesis that overcomes the sensitivity to the epipole of existing methods. The approach follows an interpolate-then-derectify scheme, as opposed to the previous derectify-then-interpolate strategy. Both approaches generate a trajectory in an uncalibrated framework that is related to a specific Euclidean counterpart, but our method yields a warping map that is more resilient to errors in the estimate of the epipole, as is confirmed by synthetic experiments.
... Our algorithm calculates the projective depth from the viewpoint of geometrical meaning. It first uses the trifocal tensor [18] to find corresponding points from the three adjacent images and reconstructs the points between the two adjacent ones, and then calculates the projective depth according to this new corollary. It should be noted that errors will inevitably be introduced when integrating the partial models. ...
Article
The authors propose a new algorithm for 3-D reconstruction from a sequence of images based on a new corollary that the projective depth can be comprehended as a scalar factor between two 3-D points when reconstructed from three images. This algorithm constructs the partial models of an object using two consecutive images in the image sequence and integrates the obtained partial models to a complete one on the basis of this new corollary. In order to avoid accumulation of errors in the integration process, this algorithm modifies the reconstruction results based on a simplified Iterative Closest Point (ICP) algorithm. We have carried out two groups of experiments based on images captured from a library environment. In one group of experiments, sparse points are used to reconstruct regular objects; in the other group of experiments, dense points are employed. We compared experimental results of the proposed algorithm with the optimization method using the fundamental matrix, which demonstrated that the proposed algorithm yielded better efficiency and accuracy of the 3-D reconstruction. Experiments also showed that the reconstruction errors of the proposed method were within 5%.
... For example, in [5] a virtual 3D camera path can be synthesized, provided that parallax information can be referred to the homography of the plane at infinity. The availability of three or more images has also been exploited for the purpose of novel view synthesis through trifocal tensors [6], [7]. Our approach propagates visual information through the basic tools of epipolar geometry and fundamental matrices. ...
Conference Paper
Full-text available
We present an approach for merging into a single super-image a set of uncalibrated images of a general 3D scene taken from multiple viewpoints. To this aim, the content of each image is augmented with visual information taken from the others, while maintaining projective coherence. The approach extends the usual mosaicing techniques to image collections with 3D parallax, and operates like a virtual sensor provided with an enlarged field of view and the capability of seeing through visual occlusions in an "X-ray" fashion. Fundamental matrices are used to transfer visual information through the vertices of an image graph. A dense stereo paradigm is employed to achieve photorealism by partitioning image pairs into corresponding regions. Results in oriented projective geometry are then exploited to both detect and handle occlusions by assessing the visibility properties of each transferred point.
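The fundamental-matrix transfer the abstract mentions can be illustrated briefly: a point seen in two views lands in a third view at the intersection of the two epipolar lines it induces there. A hedged sketch, with illustrative names:

```python
import numpy as np

def epipolar_transfer(F13, F23, x1, x2):
    """Transfer a correspondence (x1, x2) into a third view by intersecting
    the two epipolar lines it induces there.

    F13, F23 : fundamental matrices mapping views 1 and 2 to view 3.
    x1, x2   : homogeneous points (3,) in views 1 and 2.
    Degenerates when the two epipolar lines coincide (points near the
    trifocal plane), which is one reason occlusion handling needs care.
    """
    l1 = F13 @ x1            # epipolar line of x1 in view 3
    l2 = F23 @ x2            # epipolar line of x2 in view 3
    x3 = np.cross(l1, l2)    # intersection of the two lines
    return x3 / x3[2]
```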
... However, the gaping disconnect between high-bandwidth image sensors (up to 1280 × 1024 pixels @ 15 fps) and low-bandwidth communication channels (a maximum of 250 kbps per IEEE 802.15.4 channel including overhead) makes the exchange of all captured views across the cameras impractical (Chen et al. 2008). Many computer vision tasks relevant to camera networks, such as calibration procedures (Hartley and Zisserman 2000; Ma et al. 2004), localization (Se et al. 2002), vision graph building (Cheng et al. 2007), object recognition (Ferrari et al. 2004; Lowe 2004; Berg et al. 2005), novel view rendering (Avidan and Shashua 1998; Shum and Kang 2000) and scene understanding (Franke and Joos 2000; Schaffalitzky ...). [Fig. 1 caption: Problem setup. We address a "dense" wireless camera network that has many cameras observing the scene of interest.] ...
Article
Full-text available
Establishing visual correspondences is a critical step in many computer vision tasks involving multiple views of a scene. In a dynamic environment and when cameras are mobile, visual correspondences need to be updated on a recurring basis. At the same time, the use of wireless links between camera motes imposes tight rate constraints. This combination of issues motivates us to consider the problem of establishing visual correspondences in a distributed fashion between cameras operating under rate constraints. We propose a solution based on constructing distance-preserving hashes using binarized random projections. By exploiting the fact that descriptors of regions in correspondence are highly correlated, we propose a novel use of distributed source coding via linear codes on the binary hashes to more efficiently exchange feature descriptors for establishing correspondences across multiple camera views. A systematic approach is used to evaluate rate vs. visual correspondence retrieval performance; under a stringent matching criterion, our proposed methods demonstrate superior performance to a baseline scheme employing transform coding of descriptors.
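The binarized random projection step is simple to sketch: project each descriptor onto random directions and keep only the signs, so that Hamming distance between hashes approximates angular distance between the original descriptors. A minimal illustration (the distributed source coding layer is omitted):

```python
import numpy as np

def binary_hash(descriptors, n_bits=64, seed=0):
    """Distance-preserving binary hashes via binarized random projections.

    descriptors : (n, d) array of (roughly zero-mean) feature descriptors.
    Returns an (n, n_bits) array of {0, 1} codes; matching then reduces
    to cheap Hamming comparisons, and these short codes are what the
    rate-constrained cameras would exchange.
    """
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((descriptors.shape[1], n_bits))
    return (descriptors @ R > 0).astype(np.uint8)
```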
... There are methods for synthesizing novel views of a 3D scene from two reference images in full correspondence. In [8], an algebraic entity termed the trilinear tensor links point correspondences across three images. For any given virtual camera position and orientation, a new trilinear tensor can be computed based on the original tensor of the reference images, and the desired view can be created using it and the point correspondences across two of the reference images. ...
Article
Full-text available
We propose a method for creating a virtual world that integrates synthetic objects with images captured with a camera. A virtual camera is defined to produce different views of the 3D synthetic objects integrated with the real image. The system includes image capturing, creation of a 3D scene composited with the real images, and finally the coherent integration between them. In particular, we use the Open Inventor graphics library, but the system could be implemented with any computer graphics tool such as VRML, Java 3D, etc.