Figure 6 - uploaded by Belle Tseng
Dynamic Field or Frame Prediction Mode Selection of Interlaced Video Coding. For scenes containing stereoscopic objects, temporal disparity changes are examined to determine whether objects move at variable or constant depth. For images with limited object motion, the frame picture structure is selected. At relatively constant disparity values, where objects move in the direction of constant depth, the interlaced frame maintains the same composition in the temporal sequence, so prediction with the frame picture structure works better, as shown in Figure 6a. For objects changing depth position, the disparity between the left and right images changes as well; as a result, the better selection is the field picture structure, as demonstrated by Figure 6b. Utilizing the frame picture structure eliminates the extra transmission of similar data from the left and right channels, thus contributing more bits to correcting prediction errors. To our advantage, the frame picture structure is selected most of the time, because temporal scene movements tend to produce continuous and small changes in disparity values.
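The selection rule described in the caption can be sketched as a simple decision on how much the disparity drifts over a temporal window: roughly constant disparity favors the frame picture structure, drifting disparity favors the field picture structure. The function name, the use of disparity range as the drift measure, and the threshold value are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of the field/frame mode-selection rule:
# pick frame picture structure when disparity stays roughly constant
# over the temporal window, field picture structure otherwise.
# The threshold and the drift measure (max - min) are assumptions.

def select_picture_structure(disparities, threshold=1.0):
    """disparities: per-frame average left/right disparity values."""
    spread = max(disparities) - min(disparities)
    return "frame" if spread <= threshold else "field"

# Objects moving at constant depth -> disparity barely changes -> frame mode.
print(select_picture_structure([10.2, 10.1, 10.3, 10.2]))  # frame
# Objects changing depth -> disparity drifts -> field mode.
print(select_picture_structure([10.0, 12.5, 15.1, 17.8]))  # field
```

In a real encoder this decision would be made per picture (or per macroblock pair) from estimated disparity vectors rather than precomputed averages.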


Source publication
Article
Full-text available
Three approaches of an MPEG-2 compatible coding technique are presented for stereoscopic sequences. The first method utilizes the spatial scalability structure and the second employs the temporal scalability syntax. The scalability extensions of the video coding standard make the processing easier to accommodate the transmission of a stereoscopic v...

Contexts in source publication

Context 1
... images with limited object motion, the frame picture structure is selected. At relatively constant disparity values, where objects move in the direction of constant depth, the interlaced frame maintains the same composition in the temporal sequence, so prediction with the frame picture structure works better, as shown in Figure 6a. For objects changing depth position, the disparity between the left and right images changes as well. ...
Context 2
... objects changing depth position, the disparity between the left and right images changes as well. As a result, the better selection is the field picture structure, as demonstrated by Figure 6b. ...

Similar publications

Article
Full-text available
The NASA Solar Terrestrial Relations Observatory (STEREO) mission will place two spacecraft into solar orbits with sufficient separation to provide remote sensing instruments with a stereoscopic view of the heliosphere extending from the lower solar corona to beyond one astronomical unit. Analysis of the stereographs returned from the two spacecraf...
Article
Full-text available
We use rotation stereoscopy to estimate the height of a steady-state solar feature relative to the photosphere, based on its apparent motion in the image plane recorded over several days of observation. The stereoscopy algorithm is adapted to work with either one- or two-dimensional data (i.e., from images or from observations that record the proje...
Article
Full-text available
Presented herein is a novel stereoscopic three-dimensional (3D) display system with active color filter glasses. This system provides full-color 3D images by applying the time-multiplexing technique on the original anaglyph method. By switching between the opposite anaglyph statuses, a full-color anaglyph is presented. A liquid crystal panel from a...
Article
Full-text available
This paper presents the development of a oriented-object framework that uses techniques of Virtual Reality. Initially designed to the construction of puncture examination applications, the framework owns functionalities for previously implemented collision detection, stereoscopy and deformation. The main results are presented, discussing the diffic...
Article
Full-text available
Along with the success of the digitally revived stereoscopic cinema, events beyond 3D movies become attractive for movie theater operators, i.e. interactive 3D games. In this paper, we present a case that explores possible challenges and solutions for interactive 3D games to be played by a movie theater audience. We analyze the setting and showcase...

Citations

... In stereoscopic video, scalability usually refers to keeping the non-stereoscopic bitstream as the base layer and putting the residual stereoscopic signal in one or more enhancement layers. In this context, the conventional spatial and temporal scalabilities may then be applied to each of these layers [13]. Finally, frame compatible video format [14] is a class of 3D video formats in which the left and right views are packed together in a single frame and with half of the resolution of the full coded frame, which can then also benefit from conventional scalability [15]. ...
Conference Paper
Full-text available
Both three dimensional (3D) and multi-view video technologies have made noticeable progress and become more popular in recent years. 3D video expands the user's experience beyond conventional 2D video by adding the sensation of depth, while multi-view video shows the same scene from different viewpoints. In both cases, huge amounts of data need to be compressed and transmitted, making it challenging to support heterogeneous mobile devices with limited bandwidth and processing power. Scalable Multi-view Video Coding (SMVC) is one of the main techniques that address this challenge by scaling down the video. However, in addition to the conventional scalable modalities of temporal, spatial, quality, and complexity in 2D video, SMVC has many more modalities, adding a much higher dimension to the difficult decision-making process in the video scalability engine. In this paper, we use Grounded Theory to systematically extract various scalable modalities in multi-view 3D video and we find, in addition to some known modalities, some new modalities specific to mobile multi-view 3D video. The usefulness of these scalable modalities in applications specific to mobile multi-view 3D video is also shown.
... Passive methods usually concern the task of generating a 3D model given multiple 2D photographs of a scene. In general they do not require very expensive equipment, but quite often a specialized set-up ([19], [20], [21]). Passive methods are commonly employed by Model-Based Rendering techniques. ...
Article
Full-text available
Graphic rendering is expensive in terms of computation. We investigate distributing it by applying the powerful computing technique called grid computing, and showing how this technology has a great effectiveness and high performance. The paper shows how to develop a java drawing framework for drawing in the distributed environment by dividing the work upon nodes in grid computing and selecting the best nodes for job assignments to have the jobs executed in the least amount of time. Schedulers are limited in individual capability, but when deployed in large numbers can represent a strong force similar to a colony of ants or swarm of bees. The paper also presents a mechanism for load balancing based on swarm intelligence such as Ant colony optimization and Particle swarm Optimization.
... One technique for stereoscopic image compression is to exploit redundancies in the left and right eye views, which differ mainly in horizontal disparity. For example, MPEG-2, a compression standard for high-quality video applications, includes tools for predicting aspects of one view from another in multiview applications such as stereoscopic video (Puri, Kollarits, & Haskell, 1997;Tseng & Anastassiou, 1994). However, with these tools it is possible to achieve only modest compression gains (less than 30%) relative to simply doubling the bandwidth by independently coding left and right eye views. ...
Article
Full-text available
For efficient storage and transmission of stereoscopic images over bandwidth-limited channels, compression can be achieved by degrading 1 monocular input of a stereo pair and maintaining the other at the desired quality. The desired quality of the fused stereoscopic image can be achieved, provided that binocular vision assigns greater weight to the nondegraded input. A psychophysical matching procedure was used to determine if such over-weighting occurred when the monocular degradation included blur or blocking artifacts. Over-weighting of the nondegraded input occurred for blur, but under-weighting of the nondegraded input occurred for blockiness. Some participants exhibited ocular dominance, but this did not affect the blur results. The authors conclude that blur, but not blockiness, is an acceptable form of monocular degradation.
... The importance of reliable disparity estimation derives from two applications: the recovery of 3-D structure in computer vision [9], [7], where disparities (together with camera parameters) are used to recover the depth of the corresponding 3-D point, and redundancy elimination in stereoscopic image compression and processing [22]. In the latter case, to assure efficient transmission, cross-image correlation is exploited by means of disparity-compensated prediction, very much like motion-compensated prediction. ...
Article
In a typical disparity (or motion) estimation algorithm developed for interimage prediction, an interpolation of intensities is applied to one of the two images used. Therefore, nonfiltered intensities of the image being predicted are compared with low-pass-filtered intensities of the other image of the stereo pair. Consequently, noise and detail suppression in the two images are unequal. In this paper we propose to apply the same (balanced) filtering to both images. In addition to image smoothing that helps avoid unreliable intensity matches, a low-pass filter is used to carry out intensity interpolation at the same time; the computation of subpixel attributes is consistent with low-pass filtering of both images, unlike arbitrary linear or cubic interpolation applied to one image only. The proposed approach lends itself naturally to a multiresolution implementation. We apply the new approach to stereo disparity estimation based on sliding blocks. Using synthetic and natural data we experimentally compare the new approach with the traditional sliding-block method. For standard stereoscopic images we demonstrate up to 2.4 dB reduction of disparity-compensated prediction error over the traditional sliding-block method.
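The core of the "balanced filtering" idea above is that both images of the stereo pair receive the same smoothing before matching, so noise and detail suppression are equal. A minimal one-dimensional sketch, where the 3-tap kernel and the pure-Python convolution are illustrative assumptions rather than the filter used in the paper:

```python
# Sketch of balanced filtering: apply the SAME low-pass filter to both
# rows of a stereo pair before block matching, instead of interpolating
# (and thereby smoothing) only one of them. The 3-tap binomial kernel
# is an assumption for illustration.

def lowpass_1d(row, kernel=(0.25, 0.5, 0.25)):
    n = len(row)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - 1, 0), n - 1)  # clamp indices at borders
            acc += w * row[j]
        out.append(acc)
    return out

def balanced_filter(left_row, right_row):
    # Identical smoothing on both rows -> equal noise/detail suppression.
    return lowpass_1d(left_row), lowpass_1d(right_row)

left, right = [0, 0, 10, 0, 0], [0, 10, 0, 0, 0]
l, r = balanced_filter(left, right)
print(l)  # [0.0, 2.5, 5.0, 2.5, 0.0]
```

Disparity search would then compare the filtered `l` and `r` rows, so both sides of every intensity match have seen the same suppression.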
... Researchers at Columbia University [94] have investigated means for incorporating stereo video compression into the MPEG II coding standard. The main concept is to encode one channel in the base layer, perform a prediction operation to estimate the other channel, and transmit prediction error information in the enhancement layer. ...
Article
Today's computer users are becoming increasingly sophisticated, demanding richer and fuller machine interfaces. This is evidenced by the fact that viewing and manipulating a single stream of full-size video along with its associated audio stream is becoming commonplace. However, multiple media streams will become a necessity to meet the increasing demands of future applications. An example which requires multiple media streams is an application that supports multi-viewpoint audio and video, which allows users to observe a remote scene from many different perspectives so that a sense of immersion is experienced. Although desktop audio and video open many exciting possibilities, their use in a computer environment only becomes interesting when computational resources are expended to manipulate them in an interactive manner. We feel that user interaction will also be an extremely important component of future multimedia systems, and the methods of interaction will become increasingly complex. In addition, future applications will make significant demands on the network in terms of bandwidth, quality of service guarantees, latency, and connection management. Based on these trends we feel that an architecture designed to support future multimedia applications must provide support for several key features. The need for numerous media streams is clearly the next step forward in terms of creating a richer environment. Support for non-trivial, fine-grain interaction with the media data is another important requirement, and distributing the system across a network is imperative so that multiple participants can become involved. Finally, as a side effect of the network and multi-participant requirements, integral support for and use of multicast will be a prime architectural component. The goal of our work is to design and implement a complete system architecture capable of supporting applications with these requirements.
... For example, all the enhancement frames can be configured to be B-pictures (Bidirectionally predictive-coded picture) as shown in Fig.3. Thus, the disparity vectors can be transmitted as prediction vectors which are embedded in the basic MC framework of MPEG-2 [19], [20]. ...
Article
Full-text available
SUMMARY This paper surveys the results of various studies on 3-D image coding. Themes are focused on efficient compression and display-independent representation of 3-D images. Most of the works on 3-D image coding have been concentrated on the compression methods tuned for each of the 3-D image formats (stereo pairs, multi-view images, volumetric images, holograms and so on). For the compression of stereo images, several techniques concerned with the concept of disparity compensation have been developed. For the compression of multi-view images, the concepts of disparity compensation and epipolar plane image (EPI) are the efficient ways of exploiting redundancies between multiple views. These techniques, however, heavily depend on the limited camera configurations. In order to consider many other multi-view configurations and other types of 3-D images comprehensively, a more general platform for the 3-D image representation is introduced, aiming to outgrow the framework of 3-D "image" communication and to open up a novel field of technology, which should be called the "spatial" communication. Especially, the light ray based method has a wide range of application, including efficient transmission of the physical world, as well as
Chapter
Coding of the stereoscopic video source has received significant interest recently. The MPEG committee decided to form an ad hoc group to define a new profile which is referred to as Multiview Profile (MVP) [4]. The importance of multiview video representation is also recognized by the MPEG-4 committee as one of the eight functionalities to be addressed in the near future. In this paper, we will first review the technical results using temporal scalability (disparity analysis) in MPEG-2 as pioneered by [9] and [10]. Based on temporal scalability, the concept is further generalized to an affine transformation to consider the deformation and foreshortening due to the change of viewpoint. Estimation of the affine parameters is crucial for the performance of the estimator. In this paper we propose a novel technique to find a convergent solution which results in the least mean square errors. Our results show that about 40 percent of the macroblocks in a picture have benefited from using the affine transformation. In our approach, the additional computational complexity is minimal since a pyramidal scheme is used. In one of our experiments, only four iterations are necessary to find a convergent solution. The improvement in prediction gain is found to be around 0.77 dB.
Article
In this paper, we propose an efficient block-based disparity estimation algorithm for multiple view image coding in EE2 and EE3 in 3DAV. The proposed method emphasizes visual quality improvement to satisfy the requirements for multiple view generation. Therefore, we perform adaptive disparity estimation that constructs variable blocks by considering given image features. By examining neighboring features, the desired block search range is set up to decrease complexity and side information compared with quad-tree coding alone, applying both binary-tree and quad-tree coding to account for the large disparities typical of stereo images. The experimental results show that the proposed method improves PSNR by about 1 to 2 dB compared to existing methods and decreases computational complexity by up to 68 percent compared with FBMA.
Conference Paper
Multi-view 3D video is currently attracting growing attention in several applications such as the 3DTV, free-view point video and entertainment industry where it can be used to provide multi-perspective viewing and 3D scene experiences. In multi-view 3D video, several 3D video sequences should be captured simultaneously from the same scene but through different viewing angles. One of the major challenges in this field is how to transmit the large amount of data of a multi-view 3D video sequence over error prone channels to heterogeneous devices with different bandwidth, resolution, and processing power, while maintaining a high visual quality. Scalable Multi-view 3D Video Coding (SMVC) is one of the methods to address this challenge. But there are many difficulties in SMVC that makes it impractical in most 3D video applications. In this work, we propose an adaptive framework to use SMVC in various 3D video applications effectively. The current prototype shows enhanced capability in handling the existing 3D video applications.
Article
Disparity vectors are used for the reconstruction of the right image sequence from the left one (and vice versa) in a 3DTV transmission system, replacing the full transmission of both stereo channels with a compensated scheme. An improved matching algorithm for disparity-compensated video coding is presented in this paper. The algorithm requires transmitting the same amount of disparity data as the conventional block matching algorithm (BMA) while achieving much higher prediction accuracy through a refinement scheme that captures the fine disparity variation between the left and right image sequences.
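The conventional BMA that this abstract builds on can be sketched as a horizontal search minimizing the sum of absolute differences (SAD) between a block of the left image and shifted candidates in the right image. The function names, block size, and search range below are illustrative assumptions, not the paper's refinement scheme:

```python
# Minimal sketch of conventional block matching for disparity estimation:
# for a block in the left row, search horizontally in the right row for
# the shift minimizing the sum of absolute differences (SAD).
# Block size and search range are illustrative assumptions.

def sad(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def match_block(left_row, right_row, start, block=4, max_disp=8):
    target = left_row[start:start + block]
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):  # search positive disparities only
        pos = start - d
        if pos < 0:
            break  # candidate block would fall off the image edge
        cost = sad(target, right_row[pos:pos + block])
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

left  = [0, 0, 0, 5, 9, 5, 0, 0, 0, 0, 0, 0]
right = [0, 5, 9, 5, 0, 0, 0, 0, 0, 0, 0, 0]
print(match_block(left, right, start=3))  # 2 (pattern shifted 2 pixels)
```

The decoder then reconstructs each right-image block from the left image at the signaled disparity, so only the vectors and the prediction residual need to be transmitted.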