Fig 3 - uploaded by Sebastian Knorr
Pictorial depth cues in a 2D image. Visible depth cues: linear perspective, relative and known size, texture gradient, atmospheric scattering, relative height in picture, and interposition. 


Source publication
Article
Full-text available
Three-dimensional television (3D-TV) is the next major revolution in television. A successful rollout of 3D-TV will require a backward-compatible transmission/distribution system, inexpensive 3D displays, and an adequate supply of high-quality 3D program material. With respect to the last factor, the conversion of 2D images/videos to 3D will play a...

Context in source publication

Context 1
... perception could be related to the physical characteristics of the Human Visual System (HVS), such as the perception of depth by accommodation, or could be learned from experience, like the perception acquired from the relative height of objects in the picture, perspective, shadows, and other pictorial cues [22], [23]. An example is shown in Fig. 3, which presents an image where a clear depth order can be extracted by using pictorial depth ...

Citations

... Among the non-deep-learning methods are skeleton line/edge tracking [10,11], object segmentation [12,13], bilateral filtering [14,15], trilateral filtering [16], planar transformation between images [17], motion information between consecutive frames [18,19], the Welsch M-estimator [20], residual-driven optimization [21], and depth from motion/optical flow [5]. A survey on 2D to 3D video conversion can be found in [22]. ...
Article
Full-text available
Algorithms for converting 2D to 3D are gaining importance following the hiatus brought about by the discontinuation of 3D TV production; this is due to the high availability and popularity of virtual reality systems that use stereo vision. In this paper, several depth image-based rendering (DIBR) approaches using state-of-the-art single-frame depth generation neural networks and inpaint algorithms are proposed and validated, including a novel very fast inpaint (FAST). FAST significantly exceeds the speed of currently used inpaint algorithms by reducing computational complexity, without degrading the quality of the resulting image. The role of the inpaint algorithm is to fill in missing pixels in the stereo pair estimated by DIBR. Missing estimated pixels appear at the boundaries of areas that differ significantly in their estimated distance from the observer. In addition, we propose parameterizing DIBR using a single, easy-to-interpret adaptable parameter that can be adjusted online according to the preferences of the user who views the visualization. This single parameter governs both the camera parameters and the maximum binocular disparity. The proposed solutions are also compared with a fully automatic 2D to 3D mapping solution. The algorithm proposed in this work, which features intuitive disparity steering, the foundational deep neural network MiDaS, and the FAST inpaint algorithm, received considerable acclaim from evaluators. The mean absolute error of the proposed solution shows no statistically significant difference from state-of-the-art approaches like Deep3D and other DIBR-based approaches using different inpaint functions. Since both the source codes and the generated videos are available for download, all experiments can be reproduced, and one can apply our algorithm to any selected video or single image to convert it.
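The abstract above describes the core DIBR problem: warping pixels by disparity leaves "holes" at depth discontinuities, which an inpaint step must fill. The following is a minimal illustrative sketch of that pipeline, not the paper's FAST algorithm: it shifts pixels of a single row-image by a depth-proportional disparity, then fills disocclusion holes by naively propagating the last-seen background pixel.

```python
import numpy as np

def warp_and_fill(image, depth, max_disp=16):
    """Toy DIBR step: shift each pixel horizontally by a disparity
    proportional to its depth value, then fill disocclusion holes.

    image : (H, W) grayscale array
    depth : (H, W) array in [0, 1], 1 = closest to the camera
    """
    h, w = image.shape
    warped = np.zeros_like(image)
    filled = np.zeros((h, w), dtype=bool)
    disp = (depth * max_disp).astype(int)
    # Paint far-to-near so nearer pixels overwrite farther ones.
    for d in range(0, max_disp + 1):
        ys, xs = np.nonzero(disp == d)
        xt = xs + d
        ok = xt < w
        warped[ys[ok], xt[ok]] = image[ys[ok], xs[ok]]
        filled[ys[ok], xt[ok]] = True
    # Naive "inpaint": propagate the last seen (background) pixel
    # left-to-right into each hole.
    for y in range(h):
        last = 0.0
        for x in range(w):
            if filled[y, x]:
                last = warped[y, x]
            else:
                warped[y, x] = last
    return warped
```

Real inpaint algorithms (including FAST) are far more sophisticated, but the hole locations — boundaries between regions of very different estimated depth — arise exactly as in this sketch.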
... Although 2D to 3D conversion display technology [6,7] can be used to create 3D content, some problems arise because designers only use the source image and the corresponding depth image to construct a realistic three-dimensional image. We further explored the reasons and found that the quality of the depth image directly affects not only the authenticity of the converted stereoscopic image but also the comfort of the viewer. ...
Article
3D displays and interactive technology play an important role in the Metaverse. This paper addresses and analyzes the 3D display and depth-sensing technologies applied in the Metaverse. Combining key 2D-to-3D conversion algorithms with depth-information reconstruction based on structured light and time-of-flight sensing realizes the 3D vision of the Metaverse. It also introduces 3D interactive technologies in detail, explaining Metaverse interaction through single-lens, dual-lens, and depth-camera techniques. Forward-looking 3D display and interactive technology will enable a more complete Metaverse world. We hope the coming Metaverse era will create a better future.
... 3D images have been used for various applications and will attract more attention in the multimedia field as advanced 3D displays for virtual reality, augmented reality, etc. are developed. The depth map represents two-dimensional distance information from a camera to a subject, and it is key information for creating 3D image content [1][2][3]. There are various ways to obtain a depth map. ...
Conference Paper
A technique for removing unnecessary patterns from captured images by using a generative network is studied. The patterns, composed of lines and spaces, are superimposed onto the blue component of an RGB color image when the image is captured for the purpose of acquiring a depth map. The superimposed patterns become unnecessary after the depth map is acquired. We tried to remove these unnecessary patterns using a generative adversarial network (GAN) and an autoencoder (AE). The experimental results show that the patterns can be removed by a GAN or an AE to the point of being invisible. They also show that the GAN performs much better than the AE, with a PSNR over 45 and an SSIM of about 0.99. From these results, we demonstrate the effectiveness of the technique with a GAN.
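The PSNR figure quoted above is the standard peak signal-to-noise ratio on 8-bit images. As a reference for how such a number is computed (a generic sketch, not the paper's evaluation code):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB for images with the given peak
    value; the pattern-removal result above reports values over 45 dB."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(test, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

At 45 dB the mean squared error is roughly 2 gray levels squared on an 8-bit scale, which is why the removed patterns are described as invisible.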
... The methods and algorithms used are based on analysis of the original images. At the same time, creating their depth maps aids the 3D restoration and drives the transformation of the original image into a stereo pair [1,2,4-11]. ...
... In practice, to create high-quality pseudo-3D stereo, technologies with "manual" marking of source frames are used to create depth maps [1,2,4-6]. However, this makes it practically impossible to use them for autonomous online 3D pseudo-stereo generation. ...
... Several methods and algorithms used for "image-based 3D synthesis" rely on analysis of the original content (images, video) and the creation of its depth maps. This is the basis for depth-image-based rendering (DIBR) "pseudo-3D conversion", the transformation of the original image into a stereo pair [1-11]. ...
Preprint
This article discusses a computer system for creating 3D pseudo-stereo images and videos, using hardware and software support to accelerate the synthesis process based on General-Purpose Graphics Processing Unit (GPGPU) technology. Based on the general strategy of 3D pseudo-stereo synthesis previously proposed by the authors, the Compute Unified Device Architecture (CUDA) method covers the main implementation stages of 3D pseudo-stereo synthesis: (i) a study of the practical implementation; (ii) the synthesis characteristics for obtaining images; (iii) video in Ultra-High-Definition (UHD) 4K resolution using the Graphics Processing Unit (GPU). In tests with 4K content on evaluation systems with a GPU, average accelerations of 60.6 and 6.9 times are obtained for images and videos, respectively. The results are consistent with previously identified forecasts for processing 4K image frames, confirming the possibility of running 3D pseudo-stereo synthesis algorithms in real time with the support of modern Graphics Processing Units/Graphics Processing Clusters (GPU/GPC).
Article
Full-text available
This article discusses the study of a computer system for creating pseudo-stereo 3D images and videos using hardware and software support to accelerate the synthesis process based on General-Purpose Graphics Processing Unit technology. Based on the general strategy of pseudo-stereo 3D synthesis previously proposed by the authors, the Compute Unified Device Architecture method considers the main implementation stages of pseudo-stereo 3D synthesis: (1) a study of the practical implementation; (2) the synthesis characteristics for obtaining images; (3) video in Ultra-High-Definition 4K resolution using the Graphics Processing Unit. In tests of 4K content on evaluation systems with a GPU, average accelerations of 60.6 and 6.9 times are obtained for images and videos, respectively. The research results are consistent with previously identified forecasts for processing 4K image frames and confirm the possibility of synthesizing pseudo-stereo 3D algorithms in real time with the powerful support of modern Graphics Processing Units and Graphics Processing Clusters.
Article
Full-text available
This article discusses a computer system for creating 3D pseudo-stereo images and videos, using hardware and software support to accelerate the synthesis process based on General-Purpose Graphics Processing Unit (GPGPU) technology. Based on the general strategy of 3D pseudo-stereo synthesis previously proposed by the authors, the Compute Unified Device Architecture (CUDA) method covers the main implementation stages of 3D pseudo-stereo synthesis: (i) a study of the practical implementation; (ii) the synthesis characteristics for obtaining images; (iii) video in Ultra-High-Definition (UHD) 4K resolution using the Graphics Processing Unit (GPU). In tests with 4K content on evaluation systems with a GPU, average accelerations of 60.6 and 6.9 times are obtained for images and videos, respectively. The results are consistent with previously identified forecasts for processing 4K image frames, confirming the possibility of running 3D pseudo-stereo synthesis algorithms in real time with the support of modern Graphics Processing Units/Graphics Processing Clusters (GPU/GPC).
... In the depth image, a higher intensity value denotes that objects are closer to the shooting camera. As illustrated in Fig. 3, the DIBR system consists of the following steps: preprocessing of depth image, depth normalization and 3D image warping, and hole-filling [30]. For natural synthesized image generation, preprocessing the depth image is needed before virtual view rendering [3]. ...
... This process consists of the following two steps: 1) using the depth values, the points in the center image are reprojected into 3D space; 2) these 3D points are projected onto the image planes of the virtual left and right cameras [4]. The parallel camera configuration is typically utilized for view synthesis because, unlike the convergent camera configuration, it does not generate vertical disparities [30]. Fig. 4 illustrates 3D image generation in a parallel camera configuration, where c_l, c_c, and c_r denote the left, center, and right cameras, respectively. ...
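The two-step warping described above can be sketched for a single pixel as follows. This is an illustrative simplification under assumed conventions (principal point at the origin, left camera shifted by +t_x/2, right by -t_x/2), not code from the cited works:

```python
import numpy as np  # not strictly needed here, kept for consistency

def reproject_to_virtual(x_c, y_c, Z, f, tx):
    """Two-step view synthesis for a parallel camera configuration:
    1) back-project a center-view pixel to a 3D point using its depth Z;
    2) project that point into virtual cameras offset by +-tx/2 along x.
    With parallel cameras only the x-coordinate changes, so no vertical
    disparity is produced. All names here are illustrative."""
    # Step 1: back-project to 3D (pinhole model, principal point at origin).
    X = x_c * Z / f
    # Step 2: project into the left (+tx/2) and right (-tx/2) cameras.
    x_l = f * (X + tx / 2.0) / Z
    x_r = f * (X - tx / 2.0) / Z
    return (x_l, y_c), (x_r, y_c)
```

Note that the two steps collapse algebraically to a pure horizontal shift of f*tx/(2*Z), which is why near pixels (small Z) move more than far ones.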
Article
Full-text available
Depth-image-based rendering (DIBR), where arbitrary views are synthesized from a center image and depth image, has received much attention in the three-dimensional (3D) research field. With advances in depth-acquisition techniques and the proliferation of 3D glasses and 3D display devices, there is a growing demand for schemes to protect the copyrights of DIBR 3D images. Digital watermarking is a typical protection technology and designing a watermarking method for DIBR 3D images is a challenging task because the synchronization of watermarks can easily be broken in the process of generating synthetic images. To address this issue, we propose a non-subsampled contourlet transform (NSCT)-based blind watermarking for DIBR 3D images. To ensure the proposed method has properties of robustness against the DIBR process, we conduct an analysis of robustness for NSCT subbands against DIBR attacks. Based on the analysis results, we select subbands that are robust against DIBR attacks and embed watermark in the low coefficients of NSCT subbands using quantization-based embedding. While ensuring robustness, the proposed method also improves the imperceptibility of watermarks by adjusting their embedding strength according to computed perceptual masking values. Through experiments, we show that the proposed method is robust against both desynchronization attacks of the DIBR process and common attacks including signal processing operations and geometric distortions. The high imperceptibility of our method is also verified by several evaluation metrics in a subjective and objective manner.
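The "quantization-based embedding" mentioned above is commonly realized as quantization index modulation (QIM): a coefficient is snapped onto one of two interleaved lattices depending on the watermark bit. The sketch below shows the idea on a single coefficient; it is a generic illustration, not the paper's NSCT-subband implementation with perceptual masking:

```python
import numpy as np

def qim_embed(coeff, bit, step=8.0):
    """Snap a coefficient onto the lattice {k*step} for bit 0 or
    {k*step + step/2} for bit 1 (quantization index modulation)."""
    q = np.round((coeff - bit * step / 2.0) / step)
    return q * step + bit * step / 2.0

def qim_extract(coeff, step=8.0):
    """Recover the bit by checking which lattice the coefficient is closer to."""
    d0 = abs(coeff - np.round(coeff / step) * step)
    d1 = abs(coeff - (np.round((coeff - step / 2.0) / step) * step + step / 2.0))
    return 0 if d0 <= d1 else 1
```

A larger quantization step tolerates larger perturbations (e.g., from the DIBR warp or compression) before the extracted bit flips, at the cost of imperceptibility; adjusting the step per coefficient via perceptual masking is exactly the trade-off the abstract describes.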
... A typical strategy for DIBR is simply reprojecting pixels based on depth values to a new, synthetic camera, but such methods are susceptible to large "holes" at disocclusions. Much work has been done to fill these holes (e.g., [38], [39], [40], [41]), but visual artifacts still remain in the case of general scenes. ...
Preprint
We describe a technique that automatically generates plausible depth maps from videos using non-parametric depth sampling. We demonstrate our technique in cases where past methods fail (non-translating cameras and dynamic scenes). Our technique is applicable to single images as well as videos. For videos, we use local motion cues to improve the inferred depth maps, while optical flow is used to ensure temporal depth consistency. For training and evaluation, we use a Kinect-based system to collect a large dataset containing stereoscopic videos with known depths. We show that our depth estimation technique outperforms the state-of-the-art on benchmark databases. Our technique can be used to automatically convert a monoscopic video into stereo for 3D visualization, and we demonstrate this through a variety of visually pleasing results for indoor and outdoor scenes, including results from the feature film Charade.
... Estimating depth from a single image is an ill-posed problem that is still interesting to solve. Indeed, many three-dimensional structures can have the same two-dimensional projection, yet finding the real one is useful in various situations, such as automatically converting a 2D film into 3D [5][6][7]. To that end, a few algorithms inspired by the way humans use monocular cues, as well as their prior visual experience, have been proposed [8][9][10][11][12]. ...
Article
Full-text available
Among the various cues that help us understand and interact with our surroundings, depth is of particular importance. It allows us to move in space and grab objects to complete different tasks. Therefore, depth prediction has been an active research field for decades and many algorithms have been proposed to retrieve depth. Some imitate human vision and compute depth through triangulation on correspondences found between pixels or handcrafted features in different views of the same scene. Others rely on simple assumptions and semantic knowledge of the structure of the scene to get the depth information. Recently, numerous algorithms based on deep learning have emerged from the computer vision community. They implement the same principles as the non-deep-learning methods and leverage the ability of deep neural networks to automatically learn important features that help to solve the task. By doing so, they produce new state-of-the-art results and show encouraging prospects. In this article, we propose a taxonomy of deep learning methods for depth prediction from 2D images. We retained the training strategy as the sorting criterion. Indeed, some methods are trained in a supervised manner, which means depth labels are needed during training, while others are trained in an unsupervised manner. In that case, the models learn to perform a different task, such as view synthesis, and depth is only a by-product of this learning. In addition to this taxonomy, we also evaluate nine models on two similar datasets without retraining. Our analysis showed that (i) most models are sensitive to sharp discontinuities created by shadows or colour contrasts and (ii) the post-processing applied to the results before computing the commonly used metrics can change the model ranking. Moreover, we showed that most metrics agree with each other and are thus redundant.
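The abstract's point (ii) — that pre-metric post-processing can change model rankings — is easy to see with two standard metrics. The sketch below computes absolute relative error and RMSE with an optional median-scaling step; this is a generic illustration of common practice, not the survey's exact evaluation protocol:

```python
import numpy as np

def depth_metrics(pred, gt, median_align=True):
    """Two common depth-prediction metrics. `median_align` mimics the usual
    post-processing that rescales a prediction by the ratio of median
    depths; toggling it can change how models compare, as the survey
    observes."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    if median_align:
        pred = pred * (np.median(gt) / np.median(pred))
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    return abs_rel, rmse
```

A model that predicts depth up to an unknown global scale scores perfectly with median alignment and poorly without it, which is one way rankings shift between protocols.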
... For 3D image warping, the commonly used parallel camera configuration is considered for the DIBR system. In 3D rendering operation, the pixelwise mapping from the center image to the virtual left and right images can be performed by the following formulas [29,30]: ...
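The snippet truncates the formulas themselves. For the parallel camera configuration, the mapping commonly cited in the DIBR literature (stated here as an assumption; it is not necessarily the exact form given in [29,30]) is:

```latex
x_{l} = x_{c} + \frac{f\, t_{x}}{2 Z}, \qquad
x_{r} = x_{c} - \frac{f\, t_{x}}{2 Z}, \qquad
y_{l} = y_{r} = y_{c},
```

where $t_{x}$ is the baseline between the virtual cameras, $f$ is the focal length, and $Z$ is the depth of the pixel; the parallel setup leaves the vertical coordinate unchanged.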
Article
Full-text available
The quality evaluation of stereoscopic images is becoming increasingly important in 3D multimedia applications, and watermarking-based quality evaluation is a promising technology for 3D content. In this work, we propose a novel quality evaluation scheme for depth-image-based rendering (DIBR) 3D images. The scheme uses watermarking to evaluate the quality of watermarked images under various distortions. The watermark is embedded into selected coefficients of the dual-tree complex wavelet transform sub-bands of the center view. The virtual left and right views are synthesized from the watermarked center view and the associated depth map using the DIBR technique at the receiver side. The watermark can be detected from the three views individually, and the quality of these views under various attacks can be estimated by examining the degradation of the corresponding extracted watermarks. In addition, quality evaluation is made possible by checking the generated mapping curve, which maps the normalized correlation of the extracted watermark to the quality measure of the watermarked image under distortions. There is a high correlation between the estimated quality and the calculated quality. The experimental results demonstrate that the proposed scheme performs well for DIBR 3D images under JPEG compression, JPEG 2000 compression, Gaussian noise, and Gaussian blur. Moreover, it exhibits superior evaluation accuracy compared with other related methods.
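The quantity at the heart of the mapping curve described above is the normalized correlation between the reference watermark and the one extracted from an attacked view. As a reference, its standard definition (a generic sketch, independent of the paper's embedding details):

```python
import numpy as np

def normalized_correlation(w_ref, w_ext):
    """Normalized correlation between reference and extracted watermarks;
    values near 1 indicate an intact watermark, and in the scheme above
    this value is mapped through a pre-generated curve to a quality
    estimate of the distorted view."""
    w_ref = np.asarray(w_ref, float)
    w_ext = np.asarray(w_ext, float)
    return float(np.sum(w_ref * w_ext) /
                 (np.linalg.norm(w_ref) * np.linalg.norm(w_ext)))
```

Because stronger distortions flip more watermark bits, the normalized correlation decreases monotonically with degradation, which is what makes it usable as a proxy for image quality.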