Fig 8
Horizontal pans in the start and end clips affect empty areas in full 3D static transitions. Right: still-image content is projected onto the scene from the transition start and end frames, so the dominant content direction (shown as a green arrow) differs from the full 3D dynamic case (Figure 7) and is consistent between frames.

Source publication
Article
Emerging interfaces for video collections of places attempt to link similar content with seamless transitions. However, the automatic computer vision techniques that enable these transitions have many failure cases which lead to artifacts in the final rendered transition. Under these conditions, which transitions are preferred by participants and w...

Similar publications

Conference Paper
Owing to the movie industry's wide use of 2D-to-3D conversion techniques, the problem of synthesizing stereoscopic views using depth-image-based rendering (DIBR) is extremely important. A major challenge for DIBR is processing semitransparent edges near depth-map discontinuities. Existing approaches either can only deal with simple cases in which t...
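To make the DIBR setting concrete, here is a minimal sketch of forward-warping a view with a per-pixel depth map, assuming a rectified horizontal stereo setup where disparity = baseline × focal / depth. The function and parameter names are illustrative, and the sketch deliberately ignores the semitransparent-edge problem this paper addresses; disoccluded pixels are simply left as holes.

```python
import numpy as np

def dibr_forward_warp(image, depth, baseline, focal, min_depth=0.1):
    """Forward-warp one view horizontally using per-pixel depth (DIBR).

    Assumes a rectified setup: disparity = baseline * focal / depth.
    image: (H, W, 3) array; depth: (H, W) array in scene units.
    Disoccluded pixels remain zero (holes) and would need inpainting;
    semitransparent edges are not handled here.
    """
    h, w, _ = image.shape
    out = np.zeros_like(image)
    disparity = baseline * focal / np.maximum(depth, min_depth)  # in pixels
    xs = np.arange(w)
    for y in range(h):
        xt = np.round(xs + disparity[y]).astype(int)
        valid = (xt >= 0) & (xt < w)         # drop pixels shifted off-frame
        order = np.argsort(-depth[y])        # far-to-near drawing order
        order = order[valid[order]]
        out[y, xt[order]] = image[y, order]  # nearer pixels overwrite farther
    return out
```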
Preprint
Rendering articulated objects while controlling their poses is critical to applications such as virtual reality or animation for movies. Manipulating the pose of an object, however, requires the understanding of its underlying structure, that is, its joints and how they interact with each other. Unfortunately, assuming the structure to be known, as...
Article
Traditional stereoscopic displays assume the viewer to be standing at a specific location, that is, the same pose (relative to the screen) as the stereo camera pair that depicted the scene (physically or virtually). Even for basic applications, such as movies or games, this leads to visual inconsistencies as soon as the user moves his head. Moreover...
Chapter
Expanded and modified version of a presentation in the “Große Werke des Films” lecture series at the University of Augsburg in January 2017. The chapter is in part an introduction to the movie and in part an exemplary discussion of the mechanisms of canonicity that make movies (and by extension, other media artefacts) ‘great’.
Article
Communication is one of the most important parts of human life. We cannot live without communicating with each other. There are many forms and mediums of communication, among which cinema is considered one of the most important. Not only this, it also has the power to influence the masses and bring about changes in the socie...

Citations

... Tompkin et al. [112] conducted a study comparing different types of transitions in a video to determine users' preferences. For their study, they captured their own videos. ...
Thesis
In multi-view capture, the focus of attention can be controlled by the viewers rather than by a director, which implies that each viewer can observe a unique point of view. This requires placing cameras around the scene to be captured, which can be very expensive. Generating virtual cameras to replace some of the real cameras in the scene reduces the cost of setting up multi-view video. This thesis focuses on generating virtual video transitions in scenes captured by multi-view video, to virtually move from one real viewpoint to another in the same scene. The fewer real cameras we use, the less expensive the multi-view setup becomes; however, the baseline between cameras grows. View synthesis methods have attracted our attention as an approach to this problem. However, in the literature, these methods still suffer from visual artifacts in the final rendered image due to occlusions in the new target virtual view. As a first step, we propose a hybrid approach to view synthesis. We first warp the reference views while correcting the occlusions, then merge the pre-processed views via a simple convolutional architecture. Warping the reference views reduces the distance between them and the size of the convolutional filters, and thus reduces the complexity of the network. Next, we present a second hybrid approach: we merge the pre-warped views via a residual encoder-decoder with a Siamese encoder to keep the parameter count low. We also propose a hole-inpainting algorithm to fill in disocclusions in the warped views. In addition, we focus on the quality of user experience for video transitions and on the dataset. First, we create a dataset for the quality of experience of video transitions. Second, we propose a learning-based multi-view synthesis optimizer. Finally, we subjectively evaluate the proposed view synthesis approaches on 8 different video sequences through a series of subjective tests.
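As a rough illustration of the merge step described above, the sketch below blends two pre-warped reference views using their disocclusion masks. It is a hypothetical stand-in for the thesis's learned (convolutional) merge, not the actual method; the mask-based averaging rule is assumed.

```python
import numpy as np

def merge_warped_views(warp_left, warp_right, hole_left, hole_right):
    """Blend two pre-warped reference views into one virtual view.

    hole_left / hole_right are (H, W) boolean masks marking disoccluded
    (invalid) pixels in each warped view. Where both views are valid we
    average; where only one is valid we copy it; pixels invalid in both
    are returned in a residual hole mask for a later inpainting pass.
    """
    wl = warp_left.astype(np.float32)
    wr = warp_right.astype(np.float32)
    out = np.zeros_like(wl)
    both = ~hole_left & ~hole_right
    only_left = ~hole_left & hole_right
    only_right = hole_left & ~hole_right
    out[both] = 0.5 * (wl[both] + wr[both])
    out[only_left] = wl[only_left]
    out[only_right] = wr[only_right]
    return out, hole_left & hole_right  # image plus remaining-hole mask
```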
... Importantly, our goal is not to show view-dependent as superior, since finding such a result may not be possible or relevant. For example, related studies found cinematography effect preferences difficult to assess, since they are influenced by aspects like emotional attachment [11] or contextual factors like scene content and camera motion [42]. Consider how a dissolve transition is not better than a wipe transition; each may be used in different contexts and for different creative purposes. ...
Conference Paper
"View-dependent effects'' have parameters that change with the user's view and are rendered dynamically at runtime. They can be used to simulate physical phenomena such as exposure adaptation, as well as for dramatic purposes such as vignettes. We present a technique for adding view-dependent effects to 360 degree video, by interpolating spatial keyframes across an equirectangular video to control effect parameters during playback. An in-headset authoring tool is used to configure effect parameters and set keyframe positions. We evaluate the utility of view-dependent effects with expert 360 degree filmmakers and the perception of the effects with a general audience. Results show that experts find view-dependent effects desirable for their creative purposes and that these effects can evoke novel experiences in an audience.
... An experiment was also performed for precomputed videos, so that the impact of the user's interaction and the dynamic aspects of free viewing could be judged. More recently, this work was extended to transitions between videos [37]. Similar studies were also performed in the context of panoramas [23]. ...
Article
Light fields have become a popular representation of three-dimensional scenes, and there is interest in their processing, resampling, and compression. As those operations often result in a loss of quality, there is a need to quantify it. In this work, we collect a new dataset of dense reference and distorted light fields, together with the corresponding quality scores scaled in perceptual units. The scores were acquired in a subjective experiment using an interactive light-field viewing setup. The dataset contains typical artifacts that occur in the light-field processing chain due to light-field reconstruction, multi-view compression, and the limitations of automultiscopic displays. We test a number of existing objective quality metrics to determine how well they can predict the quality of light fields. We find that existing image quality metrics provide good measures of light-field quality, but require dense reference light fields for optimal performance. For the more complex task of comparing two distorted light fields, their performance drops significantly, which reveals the need for new, light-field-specific metrics.
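As a concrete example of applying an existing image metric to a dense light field, the sketch below computes a mean per-view PSNR over a grid of sub-aperture views. The (U, V, H, W, C) layout and the simple mean pooling are assumptions for illustration, not the study's exact protocol.

```python
import numpy as np

def light_field_psnr(reference, distorted, peak=255.0):
    """Mean per-view PSNR over a (U, V) grid of sub-aperture views.

    reference, distorted: (U, V, H, W, C) arrays. Each view is scored
    with plain image PSNR and the scores are averaged over the grid.
    """
    diff = reference.astype(np.float64) - distorted.astype(np.float64)
    mse = np.maximum((diff ** 2).mean(axis=(2, 3, 4)), 1e-12)  # per view
    psnr_per_view = 10.0 * np.log10(peak ** 2 / mse)           # (U, V)
    return psnr_per_view.mean()
```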
Article
Packet loss is a significant cause of visual impairments in video broadcasting over packet-switched networks. There are several subjective and objective video quality assessment methods focused on the overall perception of video quality. However, less attention has been paid to the visibility of packet loss artifacts appearing in spatially and temporally limited regions of a video sequence. In this paper, we present the results of a subjective study using a methodology in which a video sequence is displayed on a touchscreen and users tap the positions where they observe artifacts. We also analyze objective features derived from those artifacts and propose different models for combining those features into an objective metric for assessing the noticeability of the artifacts. The results show that the proposed metric predicts the visibility of packet loss impairments with reasonable accuracy. The proposed method can be applied to develop packetization and error recovery schemes that minimize the subjectively experienced distortion in error-prone networked video systems.
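A generic sketch of the feature-combination idea: a logistic model mapping per-artifact features to a visibility probability. The feature names and coefficients below are purely illustrative, not the published model.

```python
import numpy as np

def artifact_visibility(features, weights, bias):
    """Map per-artifact features to a visibility probability (logistic model)."""
    score = features @ weights + bias
    return 1.0 / (1.0 + np.exp(-score))  # sigmoid -> probability in (0, 1)

# Two hypothetical artifacts described by three illustrative features:
# [spatial extent, duration, local motion energy], all normalized to [0, 1].
feats = np.array([[0.8, 0.2, 0.6],
                  [0.1, 0.9, 0.1]])
weights = np.array([2.0, 1.0, 1.5])  # illustrative coefficients
print(artifact_visibility(feats, weights, bias=-1.5))
```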
Article
The ultimate goal of many image-based modeling systems is to render photo-realistic novel views of a scene without visible artifacts. Existing evaluation metrics and benchmarks focus mainly on the geometric accuracy of the reconstructed model, which is, however, a poor predictor of visual accuracy. Furthermore, using only geometric accuracy by itself does not allow evaluating systems that either lack a geometric scene representation or utilize coarse proxy geometry. Examples include a light field and most image-based rendering systems. We propose a unified evaluation approach based on novel view prediction error that is able to analyze the visual quality of any method that can render novel views from input images. One key advantage of this approach is that it does not require ground truth geometry. This dramatically simplifies the creation of test datasets and benchmarks. It also allows us to evaluate the quality of an unknown scene during the acquisition and reconstruction process, which is useful for acquisition planning. We evaluate our approach on a range of methods, including standard geometry-plus-texture pipelines as well as image-based rendering techniques, compare it to existing geometry-based benchmarks, demonstrate its utility for a range of use cases, and present a new virtual rephotography-based benchmark for image-based modeling and rendering systems.
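The novel view prediction error can be illustrated with a leave-one-out loop: hold each input photo out, re-render it from its camera pose using the remaining images, and score the difference in image space. The render_fn interface and the RMSE scoring below are assumptions for illustration.

```python
import numpy as np

def novel_view_prediction_error(images, poses, render_fn):
    """Leave-one-out novel-view prediction error.

    images, poses: lists of input photos ((H, W, 3) arrays) and their
    camera poses. render_fn(pose, images, poses) -> image is an assumed
    interface to the system under test: it renders a novel view at
    `pose` using only the given inputs. Scoring uses per-pixel RMSE;
    any image-space comparison works, since no ground-truth geometry
    is required.
    """
    errors = []
    for i in range(len(images)):
        others_img = images[:i] + images[i + 1:]   # hold photo i out
        others_pose = poses[:i] + poses[i + 1:]
        prediction = render_fn(poses[i], others_img, others_pose)
        err = np.sqrt(np.mean((prediction.astype(np.float64)
                               - images[i].astype(np.float64)) ** 2))
        errors.append(err)
    return float(np.mean(errors))
```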