Figure 1 - uploaded by Adrian Barbu
An example of a clean and noisy image pair as well as their corresponding blue channel. The noise present is the result of the low-light environment. The images were taken using a Canon PowerShot S90.


Source publication
Article
Full-text available
Many modern and popular state-of-the-art image denoising algorithms are trained and evaluated using images corrupted by artificial noise. These trained algorithms and their evaluation on synthetic data may lead to incorrect conclusions about their performance on real noise. In this paper we introduce a benchmark dataset of uncompressed color imag...

Contexts in source publication

Context 1
... Mi2Raw Camera app was used to capture the RAW images for the Xiaomi Mi3 (in DNG format). An example of one of the images in the dataset can be seen in Figure 1. In the end we collected 51 scenes for the S90, 40 for the T3i, and another 40 for the Mi3. ...
Context 2
... then looked to see whether the standard estimate, which uses the difference image between the reference image and the other calibration images, provided results similar to those we obtained using our methodology from equations (2). As Figure 10 shows, our estimated σ values are less biased and have smaller variance than the standard estimation of σ from the difference images. The average relative error is 1.58% for our method of estimation and 36.22% for the standard method. ...
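For concreteness, the "standard estimation" referred to above is usually computed from the pixelwise difference of two aligned captures of the same static scene: if both captures carry i.i.d. zero-mean noise of standard deviation σ, their difference has variance 2σ². A minimal numpy sketch of that baseline (the function name and the equal-σ assumption are ours, not the paper's):

```python
import numpy as np

def sigma_from_difference(img_a, img_b):
    """Baseline noise estimate from two aligned captures of a static scene.

    If img_a = S + n1 and img_b = S + n2, with n1 and n2 i.i.d. zero-mean
    noise of standard deviation sigma, then Var(img_a - img_b) = 2 * sigma**2,
    so sigma is approximately std(img_a - img_b) / sqrt(2).
    """
    diff = img_a.astype(np.float64) - img_b.astype(np.float64)
    return diff.std() / np.sqrt(2.0)
```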
Context 3
... our data acquisition methodology we have both clean and noisy images and are able to infer the noise level in the image more accurately. This is why in Figure 11 (right) we can see that all three estimation methods based on our data are very close to each other, while the Foi estimation is quite far off. Also, note that this evaluation was on a special scene with a uniform background of continuously changing intensity and no edges. ...

Similar publications

Preprint
Full-text available
Deep neural networks (DNNs) have achieved significant success in image restoration tasks by directly learning a powerful non-linear mapping from corrupted images to their latent clean ones. However, there still exist two major limitations for these deep learning (DL)-based methods. Firstly, the noise contained in real corrupted images is very com...

Citations

... In our implementation, the feature extraction network is pretrained on ImageNet (Russakovsky et al., 2015), the CBDNet is pre-trained on BSD500 (Martin et al., 2001), Waterloo (Ma et al., 2017), MIT-Adobe FiveK (Bychkovsky et al., 2011) and RENOIR dataset (Anaya & Barbu, 2018). We train PIE end-to-end while fixing the weights of the feature extraction network. ...
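"Fixing the weights of the feature extraction network" amounts to freezing the backbone so it receives no gradient updates while the rest of PIE trains end-to-end. A minimal PyTorch sketch; torchvision's VGG16 is an illustrative stand-in, since the quoted context does not name the extractor:

```python
import torchvision

# Freeze an ImageNet-pretrained backbone so that only the remaining
# modules are updated during end-to-end training. VGG16 is a stand-in;
# the feature extraction network is not named in the quoted context.
backbone = torchvision.models.vgg16(weights="IMAGENET1K_V1").features
for p in backbone.parameters():
    p.requires_grad = False  # excluded from gradient updates
```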
Article
Full-text available
In this paper, we propose a physics-inspired contrastive learning paradigm for low-light enhancement, called PIE. PIE primarily addresses three issues: (i) To resolve the problem of existing learning-based methods often training an LLE model with strict pixel-correspondence image pairs, we eliminate the need for pixel-correspondence paired training data and instead train with unpaired images. (ii) To address the disregard for negative samples and the inadequacy of their generation in existing methods, we incorporate physics-inspired contrastive learning for LLE and design the Bag of Curves (BoC) method to generate more reasonable negative samples that closely adhere to the underlying physical imaging principle. (iii) To overcome the reliance on semantic ground truths in existing methods, we propose an unsupervised regional segmentation module, ensuring regional brightness consistency while eliminating the dependency on semantic ground truths. Overall, the proposed PIE can effectively learn from unpaired positive/negative samples and smoothly realize non-semantic regional enhancement, which is clearly different from existing LLE efforts. Besides the novel architecture of PIE, we explore the gain of PIE on downstream tasks such as semantic segmentation and face detection. Training on readily available open data and extensive experiments demonstrate that our method surpasses the state-of-the-art LLE models across six independent cross-scene datasets. PIE runs fast with reasonable GFLOPs in test time, making it easy to use on mobile devices. Code available
... The VELIE dataset is divided into two parts: paired (P-VELIE) and unpaired (UP-VELIE). For the P-VELIE portion, we adopted the method from RENOIR [52], using a fixed Nikon Z8 camera (Nikon Corporation, Tokyo, Japan) to capture the same scene with different apertures, designating the normal-lighting images as the ground truth and the others as low-light scenarios. This fixed shooting mode allowed us to collect 500 low-light driving-scene image pairs, which were split into 485 training and 15 testing pairs. ...
Article
Full-text available
In Advanced Driving Assistance Systems (ADAS), Automated Driving Systems (ADS), and Driver Assistance Systems (DAS), RGB camera sensors are extensively utilized for object detection, semantic segmentation, and object tracking. Despite their popularity due to low costs, RGB cameras exhibit weak robustness in complex environments, particularly underperforming in low-light conditions, which raises a significant concern. To address these challenges, multi-sensor fusion systems or specialized low-light cameras have been proposed, but their high costs render them unsuitable for widespread deployment. On the other hand, improvements in post-processing algorithms offer a more economical and effective solution. However, current research in low-light image enhancement still shows substantial gaps in detail enhancement on nighttime driving datasets and is characterized by high deployment costs, failing to achieve real-time inference and edge deployment. Therefore, this paper leverages the Swin Vision Transformer combined with a gamma transformation integrated U-Net for the decoupled enhancement of initial low-light inputs, proposing a deep learning enhancement network named Vehicle-based Efficient Low-light Image Enhancement (VELIE). VELIE achieves state-of-the-art performance on various driving datasets with a processing time of only 0.19 s, significantly enhancing high-dimensional environmental perception tasks in low-light conditions.
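The "gamma transformation" that VELIE integrates into its U-Net is, at its core, the standard power-law intensity mapping; a minimal sketch of that mapping alone (how VELIE folds it into the network is not specified in the abstract):

```python
import numpy as np

def gamma_transform(img_u8, gamma=2.2):
    """Standard power-law (gamma) mapping on an 8-bit image.

    Applying x ** (1 / gamma) with gamma > 1 lifts dark intensities,
    which is why gamma curves are a common low-light processing step.
    """
    x = img_u8.astype(np.float64) / 255.0
    return np.clip(x ** (1.0 / gamma) * 255.0, 0.0, 255.0).astype(np.uint8)
```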
... For uniformity in comparison, the image dimensions were either kept fixed at the default size provided in the database or cropped to M = 512 × 512 rows and columns. Three benchmarking real-noisy image datasets were considered in this article for testing and validation of the proposed denoising algorithm: the PolyU Real-noisy Image Database [58], the Cross Channel (CC) image database [59], and the Renoir Image Dataset [60]. Although denoising is the simplest inverse problem and a plethora of research has been carried out in this area, it remains an open problem in image processing [61], [33]. ...
... This is because artifacts sometimes increase the perceived realism of the denoised image, as structures in the image are recognized as details. We conducted a few experiments using the Renoir image dataset [60] and selected flat/homogeneous regions with almost constant pixel intensity, as shown in Fig. 11. The selected image patches highlight the denoising performance of each approach while exposing the notable artifacts that manifest as method noise. ...
Article
Full-text available
Naïve simulated additive white Gaussian noise (AWGN) may not fully characterize the complexity of real-world noisy images. Owing to optimal sparsity in image representation, we propose a curvelet-based model for denoising real-world RGB images. Initially, the image is decomposed into three curvelet scales, namely: the approximation scale (which retains low-frequency information), the coarser scale, and the finest scale (which preserves high-frequency components). Coefficients in the approximation and finest scales are estimated using an NLM filter, while a scale-dependent threshold is adopted for signal estimation in the coarser scale. The reconstructed image in the spatial domain is further processed using a Guided Image Filter (GIF) to suppress the ringing artifacts due to curvelet thresholding. The proposed approach, known as the CTuNLM method, is extended for color image denoising using the uncorrelated YUV color space. Extensive experiments on multi-channel real noisy images are conducted in comparison with eight state-of-the-art methods. With four encouraging qualitative and quantitative measures, including PSNR and SSIM, we found that the CTuNLM method achieves better denoising performance in terms of noise reduction and detail preservation. We further examined the potential of the proposed approach by focusing only on the Finest scale curvelet Coefficients (FC). Features like small details, edges, and textures always add up to improve the overall denoising performance while minimizing spurious details. We studied "The Curious Case of the Finest Scale" and constructed "Deep Curvelet-Net", an encoder-decoder-based CNN architecture, as a pilot work. The encoder uses multiscale spatial characteristics from noisy FC, while the decoder processes denoised FC under the supervision of the encoder's multiscale spatial attention map. The "Deep Curvelet-Net" links encoder multiscale feature modeling with decoder spatial attention supervision to learn the most essential features for denoising. The CNN-based architecture only estimates FC, while all other CTuNLM stages are left unchanged to produce the denoised output. The results presented in this article validate the design of the proposed CNN architecture in the curvelet domain and motivated us to search beyond classical thresholding and/or filtering approaches.
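The YUV extension mentioned in the abstract can be illustrated independently of the curvelet machinery: convert to an approximately decorrelated luma/chroma space, denoise each channel separately, and convert back. In the sketch below, OpenCV's non-local means is only a stand-in for the full CTuNLM pipeline, which is not reproduced here:

```python
import cv2

def denoise_in_yuv(bgr, h=10):
    """Channel-wise denoising in YUV space, then conversion back to BGR.

    YUV approximately decorrelates luminance from chrominance, so the
    channels can be denoised independently; NLM stands in for the
    curvelet + GIF stages described in the abstract.
    """
    yuv = cv2.cvtColor(bgr, cv2.COLOR_BGR2YUV)
    channels = [cv2.fastNlMeansDenoising(c, None, h) for c in cv2.split(yuv)]
    return cv2.cvtColor(cv2.merge(channels), cv2.COLOR_YUV2BGR)
```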
... Over the past decade, various visual enhancement techniques (Guo et al., 2016, 2020; Ying et al., 2017; Zhang et al., 2019; Li et al., 2022) have been proposed to improve the visibility of degraded images and videos, ranging from dehazing and de-raining to illumination enhancement. Given the effectiveness of deep neural networks in related tasks such as image reconstruction, deep-learning-based illumination enhancement methods have also been developed with the introduction of various illumination enhancement datasets (e.g., SID (Chen et al., 2018), RENOIR (Anaya and Barbu, 2018), and the LOL dataset). The results are reportedly promising from a human vision viewpoint, given their capability to improve the visual quality of low-illumination images and videos. ...
Article
Full-text available
While action recognition (AR) has gained large improvements with the introduction of large-scale video datasets and the development of deep neural networks, AR models robust to challenging environments in real-world scenarios are still under-explored. We focus on the task of action recognition in dark environments, which can be applied to fields such as surveillance and autonomous driving at night. Intuitively, current deep networks along with visual enhancement techniques should be able to handle AR in dark environments; however, it is observed that this is not always the case in practice. To dive deeper into exploring solutions for AR in dark environments, we launched the UG2+ Challenge Track 2 (UG2-2) in IEEE CVPR 2021, with the goal of evaluating and advancing the robustness of AR models in dark environments. The challenge builds and expands on top of the novel ARID dataset, the first dataset for the task of dark-video AR, and guides models to tackle the task in both fully and semi-supervised manners. Baseline results utilizing current AR models and enhancement methods are reported, justifying the challenging nature of this task with substantial room for improvement. Thanks to the active participation of the research community, notable advances have been made in participants' solutions, and analysis of these solutions has helped better identify possible directions for tackling the challenge of AR in dark environments.
... However, real-world noise is more sophisticated than AWGN [39], and the polarimetric noise images in the target domain carry real-world noise. Two datasets of real-world noisy images, i.e., SIDD [40] and RENOIR [41], are adopted in the source domain. ...
Article
Although deep learning-based methods have achieved great success in various polarimetric imaging tasks, their performance and generalization ability depend strongly on massive training data, which is a critical limitation for practical applications. In this paper, for the first time to our knowledge, we present a deep transfer learning-based solution for polarimetric image denoising. This solution performs transfer learning by fine-tuning a denoising model pre-trained on a large-scale color image dataset using a small-scale polarimetric dataset. The experimental results show that, based on a small-scale dataset, the proposed network can achieve almost the same denoising performance as with a large-scale dataset. The polarization parameters, i.e., the degree of polarization and the angle of polarization, can be reconstructed simultaneously. In addition, a series of experiments demonstrates the generalization ability of the method for different materials and noise levels.
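The fine-tuning recipe described here (start from weights learned on a large color-image dataset, then adapt on a small polarimetric set) follows the usual transfer-learning pattern. A minimal PyTorch sketch; the tiny network, checkpoint path, and placeholder data are illustrative stand-ins, as the paper's architecture and training details are not given in this abstract:

```python
import torch
from torch import nn, optim

# Illustrative stand-in for a denoiser pre-trained on color images.
model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
# model.load_state_dict(torch.load("color_pretrained.pth"))  # hypothetical checkpoint

optimizer = optim.Adam(model.parameters(), lr=1e-5)  # small LR: adapt, don't overwrite
criterion = nn.MSELoss()

# Placeholder (noisy, clean) polarimetric pairs; real data would come from a DataLoader.
polar_pairs = [(torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64))]
for noisy, clean in polar_pairs:
    optimizer.zero_grad()
    loss = criterion(model(noisy), clean)
    loss.backward()
    optimizer.step()
```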
... In order to solve the above unsettled problems, and considering the urgency of implementing 3D reconstruction in the LLL environment, in this paper we propose an embeddable converged front- and back-end network (EC-FBNet) to realize 3D reconstruction of objects in the LLL environment. The prerequisite for accomplishing this is an available dataset; at present there exist several image datasets for photon-poor environments [16,17], but none of them contain accessible 3D data for training. Given this situation, we created a real LLL-environment dataset, titled 3LNet-12; this dataset consists of photon counting (PHC) images captured by a multi-pixel photon counter (MPPC) in the LLL environment, together with the corresponding 3D coordinate data. ...
... By implementing different combinations of the above-proposed initiatives for the FCE, we compared recognition accuracy on low-light images across three dim-environment datasets (3LNet-12, ExDark [16], RENOIR [17]). Meanwhile, we selected several image characteristic extraction networks (e.g., DenseNet [55], VGG [56], and SqueezeNet [57]) for comparison; the experimental results are shown in Table 4. On the 3LNet-12 dataset, it can be observed that applying the ISM initiative increases the recognition accuracy of the FCE by 3.52% compared to the baseline. ...
... Similarly, on the ExDark [16] and RENOIR [17] datasets, applying these three initiatives simultaneously to the FCE improves low-light image recognition accuracy by 7.19% and 8.57%, respectively, compared to the baseline. The experimental results indicate that when the ISM, SC, and AMS initiatives are implemented in the FCE, the network can extract diverse semantic information from PHC images, while the FCE acquires characteristic information from various receptive fields. ...
Article
Full-text available
The implementation of 3D reconstruction for targets in the low-light-level (LLL) environment is an immediate requirement in military, aerospace, and other fields related to this environment. However, in such a photon-deficient environment, the amount of available information is extremely limited, which makes the 3D reconstruction task challenging. To address this issue, an embeddable converged front- and back-end network (EC-FBNet) is proposed in this paper. It extracts sparse information from the LLL environment by aggregating multi-layer semantics and then, according to the similarity of features among object parts, calculates the global topology of the 3D model. For training, the EC-FBNet adopts a two-stage integrated training modality. We additionally construct an embedded global inferential attention module (GIAM) to distribute the association weights among the points in the model and thus reason out the global topology of the 3D model. In order to acquire realistic images in the LLL environment, this study leverages the multi-pixel photon counter (MPPC) detector to capture stable photon counting images in this environment, then packages them into a dataset for training the network. In experiments, the proposed approach not only achieves results superior to state-of-the-art approaches, but is also competitive in the quality of the reconstructed model. We believe that this approach can be a useful tool for the field of 3D reconstruction in the LLL environment.
... RENOIR [114]: It is a dataset of color images corrupted by natural noise due to low-light conditions, together with spatially and intensity-aligned low noise images of the same scenes. ...
Preprint
Full-text available
The advent of deep learning has brought a revolutionary transformation to image denoising techniques. However, the persistent challenge of acquiring noise-clean pairs for supervised methods in real-world scenarios remains formidable, necessitating the exploration of more practical self-supervised image denoising. This paper focuses on self-supervised image denoising methods that offer effective solutions to address this challenge. Our comprehensive review thoroughly analyzes the latest advancements in self-supervised image denoising approaches, categorizing them into three distinct classes: General methods, Blind Spot Network (BSN)-based methods, and Transformer-based methods. For each class, we provide a concise theoretical analysis along with their practical applications. To assess the effectiveness of these methods, we present both quantitative and qualitative experimental results on various datasets, utilizing classical algorithms as benchmarks. Additionally, we critically discuss the current limitations of these methods and propose promising directions for future research. By offering a detailed overview of recent developments in self-supervised image denoising, this review serves as an invaluable resource for researchers and practitioners in the field, facilitating a deeper understanding of this emerging domain and inspiring further advancements.
... When forming an image, we want the brightness to be uniform in all parts of the image except those that carry the image content [5]. However, reality often differs from this ideal state, and factors not required to form the image also produce variations in brightness. ...
Article
Full-text available
Currently, image-denoising algorithms based on convolutional neural networks (CNNs) have been widely used and have achieved good results. Compared with traditional image-denoising methods, they have powerful learning ability and efficient algorithms. This paper summarizes traditional denoising methods and CNN-based image denoising methods, and introduces the basics of image denoising in detail, which is helpful for readers who are starting with image denoising. In addition, this paper summarizes some commonly used datasets in the field of image processing, which makes it easier to denoise images. Finally, some suggestions for improving the performance of CNN image denoising are presented, and possible future research directions are discussed.
... Gaussian noise; however, that practice fails to generalize to real noisy images [27]. Later, datasets of real noisy images with their clean counterparts were collected (SIDD [1], RENOIR [2]) and are commonly used for denoising evaluation. As shown in [34], learning the noise distribution of real images via a GAN, which is then used to synthesize noise for a denoising network, significantly improves performance. ...
... Accounting for the more complex nature of real camera noise, we propose a diffusion formulation that unifies realistic image noise with that of the diffusion process. The reverse denoising process starts from complete noise and iterates for 1000 time-steps. ...
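For reference, the reverse process iterated here is, in the standard AWGN-based DDPM formulation that SVNR sets out to generalize, the familiar per-step update (standard notation, not SVNR's spatially-variant version):

```latex
x_{t-1} = \frac{1}{\sqrt{\alpha_t}}
          \left( x_t - \frac{1 - \alpha_t}{\sqrt{1 - \bar{\alpha}_t}}
                 \, \epsilon_\theta(x_t, t) \right)
          + \sigma_t z,
\qquad z \sim \mathcal{N}(0, I), \quad t = T, \dots, 1
```

with z = 0 at the final step; T = 1000 matches the "1000 time-steps" quoted above.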
Preprint
Full-text available
Denoising diffusion models have recently shown impressive results in generative tasks. By learning powerful priors from huge collections of training images, such models are able to gradually modify complete noise to a clean natural image via a sequence of small denoising steps, seemingly making them well-suited for single image denoising. However, effectively applying denoising diffusion models to removal of realistic noise is more challenging than it may seem, since their formulation is based on additive white Gaussian noise, unlike noise in real-world images. In this work, we present SVNR, a novel formulation of denoising diffusion that assumes a more realistic, spatially-variant noise model. SVNR enables using the noisy input image as the starting point for the denoising diffusion process, in addition to conditioning the process on it. To this end, we adapt the diffusion process to allow each pixel to have its own time embedding, and propose training and inference schemes that support spatially-varying time maps. Our formulation also accounts for the correlation that exists between the condition image and the samples along the modified diffusion process. In our experiments we demonstrate the advantages of our approach over a strong diffusion model baseline, as well as over a state-of-the-art single image denoising method.
... HIGH-ENERGY visibility images contain abundant information about the target scene, which is crucial for most visually-based tasks, including object detection [30], [23], image classification [1], and image denoising [2], among other classic upstream visual tasks. In the process of image acquisition, there are often many uncontrollable physical factors that degrade the quality of the captured image and consequently debilitate the performance of feature extraction for many upstream visual tasks. ...
Preprint
Full-text available
Images captured under low-light conditions present unpleasing artifacts, which debilitate the performance of feature extraction for many upstream visual tasks. Low-light image enhancement aims at improving brightness and contrast, and further reducing the noise that corrupts visual quality. Recently, many image restoration methods based on the Swin Transformer have been proposed and achieve impressive performance. However, on the one hand, trivially employing the Swin Transformer for low-light image enhancement exposes artifacts such as over-exposure, brightness imbalance, and noise corruption. On the other hand, it is impractical to capture pairs of low-light images and corresponding ground truth, i.e., well-exposed images of the same visual scene. In this paper, we propose a dual-branch network based on the Swin Transformer, guided by a signal-to-noise-ratio prior map that provides spatially-varying information for low-light image enhancement. Moreover, we leverage unsupervised learning to construct an optimization objective based on the Retinex model to guide the training of the proposed network. Experimental results demonstrate that the proposed model is competitive with the baseline models.
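One common way to build such a signal-to-noise-ratio prior map (an assumed recipe for illustration; the paper's exact construction may differ) is to take a blurred copy of the input as the signal estimate and the residual as the noise estimate:

```python
import cv2
import numpy as np

def snr_prior_map(bgr, ksize=5, eps=1e-6):
    """Spatially-varying SNR estimate: blurred image as signal, residual as noise.

    Bright, smooth regions get high SNR; dark, noisy regions get low SNR,
    so the map can indicate where enhancement should rely more on global
    context than on corrupted local structure.
    """
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY).astype(np.float64) / 255.0
    signal = cv2.blur(gray, (ksize, ksize))
    noise = np.abs(gray - signal)
    return signal / (noise + eps)
```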