International Journal of Computer Vision
https://doi.org/10.1007/s11263-023-01853-3
Underwater Camera: Improving Visual Perception Via Adaptive Dark
Pixel Prior and Color Correction
Jingchun Zhou1·Qian Liu1·Qiuping Jiang2·Wenqi Ren3·Kin-Man Lam4·Weishi Zhang1
Received: 8 February 2023 / Accepted: 12 July 2023
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023
Abstract
We present a novel method for underwater image restoration, which combines a Comprehensive Imaging Formation Model
with prior knowledge and unsupervised techniques. Our approach has two main components: depth map estimation using a
Channel Intensity Prior (CIP) and backscatter elimination through Adaptive Dark Pixels (ADP). The CIP effectively mitigates
issues caused by solid-colored objects and highlighted regions in underwater scenarios. The ADP, utilizing a dynamic depth
conversion, addresses issues associated with narrow depth ranges and backscatter. Furthermore, an unsupervised method is
employed to enhance the accuracy of monocular depth estimation and reduce artificial illumination influence. The final output
is refined via color compensation and a blue-green channel color balance factor, delivering artifact-free images. Experimental
results show that our approach outperforms state-of-the-art methods, demonstrating its efficacy in dealing with uneven lighting
and diverse underwater environments.
Keywords Underwater camera imaging · Underwater image · Image restoration · Image enhancement · Light scattering
Communicated by Chongyi Li.
Kin-Man Lam
enkmlam@polyu.edu.hk
Weishi Zhang
teesiv@dlmu.edu.cn
Jingchun Zhou
zhoujingchun03@qq.com
Qian Liu
qianliu@dlmu.edu.cn
Qiuping Jiang
jiangqiuping@nbu.edu.cn
Wenqi Ren
rwq.renwenqi@gmail.com
1School of Information Science and Technology, Dalian
Maritime University, No. 1 Lingshui Road, Dalian 116026,
Liaoning, China
2School of Information Science and Engineering, Ningbo
University, No. 818 Fenghua Road, Ningbo 315211,
Zhejiang, China
3School of Cyber Science and Technology, Sun Yat-Sen
University, Shenzhen Campus, Shenzhen 518107,
Guangdong, China
4Department of Electronic and Information Engineering, Hong
Kong Polytechnic University, Hong Kong 999077, China
1 Introduction
Underwater imaging is one of the critical technologies for
studying and exploring the underwater world. This technique
gathers information and images of the underwater environ-
ment and objects utilizing high-tech equipment and sensors,
such as laser radar, sonar, and imaging sensors (Kang et
al., 2023; Pan et al., 2022; Zhang et al., 2022). Underwa-
ter cameras can directly capture images of the underwater
environment and marine life, providing critical observational
data and evidence for research and exploration in ocean
energy development and marine life monitoring. Addition-
ally, underwater cameras can support the work of underwater
robots and divers, improving their operational efficiency
and safety (Jiang et al., 2022). However, underwater image
restoration poses more significant challenges than terrestrial
image restoration, due to selective absorption and scattering
caused by diverse aquatic media, inadequate lighting, and
inferior underwater imaging equipment (Ren et al., 2020; Qi
et al., 2022; Li et al., 2022; Liu et al., 2021; Jiang et al.,
2022). Specifically, color distortion in underwater images is
often caused by the selective absorption of light by water.
Furthermore, light scattering contributes to the degradation
of image clarity, resulting in low contrast and blurry details
(Zhuang et al., 2022). While the addition of artificial illumi-
nation has made the underwater environment more complex,
the reconstruction of color loss in underwater images remains
an essential and valuable area of research that has received
significant attention (Liu et al., 2022a,b; Qi et al., 2021;
Ren et al., 2021; Yuan et al., 2021). High-quality underwa-
ter images are valuable for various tasks, including target
detection (Zang et al., 2013), recognition, and segmentation.
To overcome these challenges, IFM-based methods are
used to reduce backscattering and improve color distortion
and contrast in underwater imaging. The depth map of a scene
is crucial for IFM, and traditional depth estimation methods
rely on hand-crafted priors, which can lead to errors. Deep
learning-based depth estimation methods are more accurate
and robust, but require a large dataset for network training,
which is difficult to obtain underwater.
Therefore, we propose a depth estimation method that
combines prior and unsupervised methods based on Com-
prehensive Imaging Formation Model (CIFM) to reconstruct
underwater images. Extensive experiments conducted on
multiple underwater databases have demonstrated that our
method can produce enhanced results with superior visual
quality compared to other relevant techniques. The major
contributions of this paper can be summarized as follows:
(1) We propose a novel restoration strategy based on
a CIFM, which involves three stages: monocular depth
estimation, backscatter removal, and color correction. This
approach leverages absolute depth map estimations and an
adaptive dark pixel prior for efficient and dynamic backscat-
ter elimination across varying depths, followed by a color
correction to rectify color bias and enhance image bright-
ness.
(2) We design the CIP to estimate the depth map, consid-
ering the underwater light attenuation rate. We integrate the
CIP depth map with an unsupervised estimate to generate a
fused depth map (CIP+), which effectively overcomes prior
failure and unsupervised errors.
(3) We construct the ADP to calculate the minimum and
maximum distances in the dynamic depth transformation
based on varying image degradation levels and NIQE met-
rics. ADP not only accelerates algorithmic operations but
also minimizes backscatter fitting errors by calculating the
sum channel and strategically selecting dark pixels for different
depth intervals.
(4) We develop a color compensation strategy that
improves the precision of the attenuation coefficient fit-
ting by defining the minimum distance between consecutive
data points. Concurrently, we devise a color balance proce-
dure that accounts for the pixel intensity distribution within
the blue and green channels to establish the color bal-
ance factor. Our approach adeptly circumvents the issue of
over-amplification artifacts commonly associated with the
low-intensity red channel in underwater imaging.
The remainder of this paper is organized as follows. In
Sect. 2, we introduce two imaging models and provide a suc-
cinct recap of previous work in the domain of underwater
image enhancement. Our proposed method is presented in
detail in Sect. 3. Subsequently, Sect. 4 presents an exten-
sive series of experiments validating the effectiveness of our
approach. Finally, we conclude and consolidate our findings
in Sect. 5, where we also discuss potential directions for
future work.
2 Background
In this section, we will first provide an overview of two under-
water imaging models. Subsequently, we review research
related to underwater image enhancement, including physi-
cal model-based methods, non-physical model-based meth-
ods, and deep learning-based methods.
2.1 Underwater Image Formation Model
In the Jaffe–McGlamery imaging model (Jaffe, 1990), depicted
in Fig. 1, the imaging process is represented as a combination
of three components: direct transmission, forward scattering,
and backscattering. Forward scattering is typically disregarded,
allowing the imaging model to be simplified as follows:

I_c = J_c exp(−β_c z) + A_c (1 − exp(−β_c z)), c ∈ {R, G, B}   (1)

where I and J are the degraded and clear images, respectively,
A is the global background light, β is the light attenuation
coefficient, and z is the distance from the camera to the scene.
exp(−β z) is the medium transmission map indicating the portion
of J that reaches the camera, so J exp(−β z) is the directly
transmitted component, while forward scattering is the main
cause of blur and fog effects. A(1 − exp(−β z)) is the backward
scattering, which causes contrast degradation and color bias in
underwater images.
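The simplified model of Eq. (1) can be sketched as a forward synthesis in a few lines of numpy; the attenuation coefficients and background light below are illustrative values chosen only to mimic the fast red-channel decay typical of water, not parameters from the paper:

```python
import numpy as np

def degrade(J, z, beta, A):
    """Synthesize a degraded underwater image via Eq. (1).

    J    : clear image, float array of shape (H, W, 3) in [0, 1]
    z    : scene depth in meters, shape (H, W)
    beta : per-channel attenuation coefficients, shape (3,)
    A    : per-channel global background light, shape (3,)
    """
    t = np.exp(-beta * z[..., None])   # medium transmission map
    return J * t + A * (1.0 - t)       # direct transmission + backscatter

# Toy example: a uniform gray scene receding from 1 m to 10 m.
J = np.full((4, 4, 3), 0.6)
z = np.linspace(1.0, 10.0, 16).reshape(4, 4)
beta = np.array([0.40, 0.10, 0.06])   # red attenuates fastest underwater
A = np.array([0.05, 0.35, 0.45])      # greenish-blue background light
I = degrade(J, z, beta, A)
```

As z grows, the observed color converges to the background light A, which is exactly the backscatter-dominated regime the later sections aim to invert.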
Akkaynak et al. (2017) noted the limitation of the
Jaffe–McGlamery imaging model in effectively portraying the
multifaceted nature of underwater imaging: it fails to account
for the different parameter dependencies of the direct-signal
attenuation coefficient and the backscattering coefficient,
instead merely assuming them to be identical. To delve deeper
into the imaging process, they conducted in-situ experiments
(Akkaynak & Treibitz, 2018) in two different types of optical
water bodies and analyzed the functional relationships and
parameter dependencies. Their work exposed the inaccuracies
originating from this oversimplification, prompting them to
design a revised, more robust underwater imaging model.
Fig. 1 Underwater imaging model and light absorption schematic
I_c = J_c exp(−β_c^D(ν_D) · z) + A_c^∞ (1 − exp(−β_c^B(ν_B) · z))   (2)

where I, J, A, z, and c are the same as in the Jaffe–McGlamery
imaging model, and the vectors ν_D and ν_B denote the parameter
dependence of the attenuation coefficient β_c^D in direct
transmission and the scattering coefficient β_c^B in backward
scattering, respectively, as follows:

ν_D = {z, ξ, H, R_s, β},  ν_B = {H, R_s, γ, β}   (3)

where ξ, H, and R_s denote the scene reflectance, irradiance,
and sensor spectral response parameters, respectively, β and γ
denote the beam attenuation and scattering coefficients,
respectively, and Δz denotes the change in imaging distance.
Based on the wavelength λ of visible light and the global
background light A^∞(λ), β_c^D and β_c^B can be further
expressed as:

β_c^D = ln [ ∫ R_s(λ) ξ(λ) H(λ) exp(−β(λ) z) dλ
           / ∫ R_s(λ) ξ(λ) H(λ) exp(−β(λ)(z + Δz)) dλ ] / Δz   (4)

β_c^B = −ln [ 1 − ∫ R_s(λ) A^∞(λ)(1 − exp(−β(λ) z)) dλ
            / ∫ R_s(λ) A^∞(λ) dλ ] / z   (5)
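Equation (4) can be made concrete with a small numerical sketch: the effective per-channel coefficient is a spectrally weighted quantity that drifts with imaging distance z, unlike the single constant assumed by the Jaffe–McGlamery model. All curves below (sensor response, reflectance, irradiance, beam attenuation) are illustrative stand-ins, not measured data:

```python
import numpy as np

lam = np.linspace(400.0, 700.0, 301)                  # wavelength grid, nm
dlam = lam[1] - lam[0]
Rs = np.exp(-((lam - 600.0) ** 2) / (2 * 40.0 ** 2))  # assumed red response
xi = np.full_like(lam, 0.5)                           # assumed reflectance
H = np.ones_like(lam)                                 # assumed irradiance
beta = 0.2 + 0.4 * (lam - 400.0) / 300.0              # assumed attenuation

def beta_D(z, dz):
    """Eq. (4): effective direct-signal attenuation coefficient for one
    channel at range z, via simple Riemann-sum integration over lambda."""
    num = np.sum(Rs * xi * H * np.exp(-beta * z)) * dlam
    den = np.sum(Rs * xi * H * np.exp(-beta * (z + dz))) * dlam
    return np.log(num / den) / dz

# The coefficient decreases with range: at larger z, the surviving light
# concentrates in the less-attenuated part of the spectrum.
b_near, b_far = beta_D(1.0, 0.1), beta_D(5.0, 0.1)
```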
However, compared to the Jaffe–McGlamery imaging model,
using the CIFM to invert the underwater degradation process is
limited to certain scenarios, owing to its reliance on precise
depth information and a series of manually measured optical
parameters.
2.2 Related Work
Recently, various techniques have been devised to enhance
the clarity of underwater images. These underwater image
enhancement (UIE) methods can be broadly categorized into
three groups: physical model-based, non-physical model-
based, and deep learning-based methods.
Physical model-based methods These methods are rooted in
a physical imaging model for underwater environments and
employ specific prior constraints to determine the background
light and transmission maps, thereby inverting the degradation
process and producing high-quality images.
Several prior-based depth estimation methods have been
proposed for underwater imaging. Carlevaris-Bianco et al.
(2010) introduced the Maximum Intensity Prior (MIP), which estimates
depth using the significant differences in light attenuation
across the three color channels in water. Drews et al. (2013)
designed the Underwater Dark Channel Prior (UDCP), which
is based on the selective absorption of light by water, exclud-
ing the red channel. Peng and Cosman (2017) proposed
the Image Blurriness and Light Absorption (IBLA) method,
which relies on image blurriness and light absorption priors
to estimate the scene depth and restore degraded underwater
images. Recently, Berman et al. (2017) suggested the Haze-Line
prior to deal with wavelength-dependent attenuation in
underwater images. Song et al. (2018) proposed a
Table 1 Underwater image depth estimation priors

Method | Prior on I_c | Formula
Carlevaris-Bianco et al. (2010) | D_mip | min_{c∈{r}, y∈Ω(x)} I_c(y) − max_{c∈{g,b}, y∈Ω(x)} I_c(y)
He et al. (2010) | I_dark^rgb(x) | min_{c∈{r,g,b}, y∈Ω(x)} I_c(y)
Drews et al. (2013) | I_dark^gb(x) | min_{c∈{g,b}, y∈Ω(x)} I_c(y)
Galdran et al. (2015) | I_dark^rgb(x) | min_{y∈Ω(x)} {1 − I_r(y), I_g(y), I_b(y)}
Dong and Wen (2016) | I_dark^rgb(x) | min_{c∈{r,g,b}, y∈Ω(x)} {1 − I_c(y)}
Peng and Cosman (2017) | I_r(x), P_blr, D_mip | max_{y∈Ω(x)} I_r(y); C_r max_{y∈Ω(x)} (1/n) Σ_{i=1}^{n} |I_g(y) − G_{r_i,r_i}(y)|; min_{c∈{r}, y∈Ω(x)} I_c(y) − max_{c∈{g,b}, y∈Ω(x)} I_c(y)
Song et al. (2018) | I^rgb(x) | max_{c∈{g,b}, y∈Ω(x)} I_c(y) − min_{c∈{r}, y∈Ω(x)} I_c(y)
Zhou et al. (2022) | L(x), D_mip | (1/2)(max_{c∈{r,g,b}, y∈Ω(x)} I_c(y) + min_{c∈{r,g,b}, y∈Ω(x)} I_c(y)); max_{c∈{g,b}, y∈Ω(x)} I_c(y) − min_{c∈{r}, y∈Ω(x)} I_c(y)
quick depth estimation model based on the Underwater Light
Attenuation Prior (ULAP). The model’s coefficients were
trained using supervised linear regression with a learning-
based approach. Furthermore, Akkaynak and Treibitz (2019)
improved the underwater imaging model, called Sea-Thru,
by considering underwater specificities. Based on CIFM
(Akkaynak & Treibitz, 2019), Zhou et al. (2021,2022)
proposed a new underwater unsupervised depth estimation
method and a backward scattering-based color compensa-
tion method, respectively. Table 1lists several prior-based
depth estimation methods. These methods based on physi-
cal models are usually efficient, yet they heavily depend on
manually designed prior knowledge.
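As a concrete reference for two of the patch-minimum entries in Table 1, the dark-channel-style priors can be sketched with local minimum filters over the patch Ω(x); the 15×15 window below is an assumed patch size, not one specified by the table:

```python
import numpy as np

def local_min(img, size=15):
    """Minimum over a size x size patch Omega(x) centered at each pixel."""
    pad = size // 2
    p = np.pad(img, pad, mode="edge")
    return np.lib.stride_tricks.sliding_window_view(p, (size, size)).min(axis=(2, 3))

def dcp(I, size=15):
    # Dark Channel Prior (He et al., 2010): min over {r,g,b} and Omega(x).
    return local_min(I.min(axis=2), size)

def udcp(I, size=15):
    # Underwater DCP (Drews et al., 2013): min over {g,b} only, since the
    # red channel is unreliable underwater.
    return local_min(I[..., 1:].min(axis=2), size)

rng = np.random.default_rng(0)
I = rng.random((32, 32, 3))   # synthetic RGB image in [0, 1]
d1, d2 = dcp(I), udcp(I)
```

Because DCP takes the minimum over a superset of channels, its dark channel can never exceed UDCP's at any pixel.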
Non-physical model-based methods These methods enhance
image quality without relying on physical models, directly
manipulating pixel values to produce more visually appealing
underwater images. Popular techniques
include image fusion (Ancuti et al., 2012,2017), histogram
stretching (Hitam et al., 2013), and Retinex-based methods
(Zhuang & Ding, 2020; Zhuang et al., 2021). For example,
Ancuti et al. (2012) suggested a fusion approach that com-
bined different feature images into a single image through
weight assignment. Following this, Ancuti et al. (2017)
advanced their approach by creating a multiscale fusion
technique that combined the white balance method’s color
correction and the histogram method’s contrast-boosted ver-
sion, resulting in favorable outcomes for underwater images
that experience substantial red channel reduction. Hitam et
al. (2013) developed a hybrid Contrast-Limited Adaptive
Histogram Equalization (CLAHE) method, which carries out
CLAHE operations on both the RGB and HSV color models and
then merges the results using the Euclidean norm, thereby
improving image contrast in small areas.
Zhuang et al. (2021) proposed a Bayesian Retinex algorithm,
which simplifies the complex underwater image enhance-
ment process by dividing it into two simpler denoising sub-
problems using multi-order gradient priors on reflectance and
illumination. However, these image enhancement techniques
lack the consideration of the fundamental principles of under-
water imaging, which can result in over-enhancement and
overexposure in the output images.
Deep learning-based methods The trend of deep learning
in the field of UIE has emerged due to its exceptional capabil-
ity in robust and powerful feature learning. In Li et al. (2020), an
underwater image enhancement network (UWCNN) based
on CNN was developed, utilizing synthetic underwater
images for training. The network aims to restore clear under-
water images through an end-to-end approach that considers
the optical properties of various underwater environments.
Nonetheless, the UWCNN lacks the capability to determine
the appropriate water type automatically. An unsupervised
color correction network named WaterGAN was presented
by Li et al. (2017). It integrated a generative adversarial net-
work (GAN) with a physical underwater imaging model,
generating a dataset of improved underwater images and
accompanying depth information. Li et al. (2019) proposed
a gated fusion network, known as WaterNet, and created an
underwater image enhancement benchmark dataset (UIEBD)
that includes a variety of scenes. They also developed cor-
responding high-quality reference images. Li et al. (2021)
developed a Ucolor network by taking inspiration from a
physical underwater imaging model. The network incor-
porates multicolor spatial embedding and media transport
guidance to improve its response to areas with degraded
quality. The proposed Lightweight Adaptive Feature Fusion
Network (LAFFNet) in Yang et al. (2021) incorporates mul-
tiple adaptive feature fusion modules from the codec model
to generate multi-scale feature mappings and utilizes chan-
nel attention to merge these features dynamically. Tang et
al. (2022) introduced a new search space that incorporates
transformers and employs neural structure search to find the
optimal U-Net structure for enhancing underwater images.
This yields an effective and lightweight
deep network. Fu et al. (2022) proposed a new
probabilistic network, PUIE, to enhance the distribution of
degraded underwater images and mitigate bias in reference
map markers. However, deep learning-based UIE methods
face a common challenge: the need for large, high-quality
public training datasets.

Fig. 2 The flowchart of the proposed approach. Our methodology com-
prises three steps: depth estimation, backscatter removal, and color
reconstruction. Specifically, the CIP+ depth map is derived by fus-
ing the Channel Intensity Prior (CIP) depth map, which accounts for
distinct light attenuation laws, with the MONO2 depth map obtained
from an unsupervised approach. Backscatter is then removed using an
Adaptive Dark Pixel (ADP) technique, dynamically adapted according
to varying degrees of image degradation. Finally, the image's color and
luminance are reconstructed via color compensation and balancing
3 Proposed Method
In this study, we propose a novel approach for underwater
image restoration leveraging the Comprehensive Imaging
Formation Model (CIFM). The proposed method encom-
passes the development of an enhanced Channel Intensity
Prior (CIP+) for depth estimation, the deployment of
Adaptive Dark Pixels (ADP) for backscatter removal, and
advanced techniques for color reconstruction. The overall
process is illustrated in Fig. 2, which will be detailed in the
following sections.
3.1 Simplified Model
For many years, IFM-based methods have been popular for
recovering underwater images. However, unlike in the
atmosphere, their results are inconsistent and depend on the
underwater environment, making them unreliable and unstable.
This is because underwater imaging depends on both wavelength
and scene. The IFM assumes that the attenuation and scattering
coefficients are equal, whereas the improved underwater imaging
model (CIFM) explicitly considers their differences (Zhou et
al., 2021). However, the CIFM is difficult to apply to
underwater image recovery, owing to its numerous parameters,
which are hard to estimate. To tackle this issue, our method
simplifies the
improved underwater imaging model. Based on Akkaynak
and Treibitz (2018), we understand that there is one attenua-
tion coefficient value for each color and distance in the scene,
the imaging distance has the greatest effect on the attenuation
coefficient, and there is only one scattering coefficient
value for the whole scene. In addition, the backward scattering
increases exponentially with the imaging distance, i.e., in a
fixed scene (water type), the value of the scattering coefficient
also depends on the imaging distance. Therefore, we ignore other
parameters with a smaller effect and focus only on the effect of
imaging distance on the attenuation and scattering coefficients.
The simplified improved imaging model obtained is as follows:

I_c = J_c e^{−β_c^D · z} + A_c (1 − e^{−β_c^B · z})   (6)

where I_c, J_c, A_c, z, β_c^D, and β_c^B are consistent with the
revised model. Compared with the IFM, the simplified and improved
imaging model considers the various functional dependencies
between the direct transmission and backward scattering
components, leading to a more precise representation of
underwater imaging.

Fig. 3 Depth estimation for underwater images: a DCP (He et al., 2010),
b UDCP (Drews et al., 2013), c MIP (Carlevaris-Bianco et al., 2010),
d IRC (inverted R-channel intensity) (Galdran et al., 2015), and e our
depth estimation method. The top row shows the estimated results for
the natural light scene, while the bottom row demonstrates estimates
for the artificial light scene. The darker areas represent regions further
from the camera. The images are sourced from the UIEBD dataset
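Given a depth map and fitted coefficients, the simplified model of Eq. (6) inverts in closed form: subtract the backscatter term, then divide out the direct-transmission decay. A minimal numpy sketch, with purely illustrative coefficient values (in practice they come from the fitting stages described later):

```python
import numpy as np

def restore(I, z, beta_D, beta_B, A):
    """Invert Eq. (6): J = (I - A (1 - e^{-beta_B z})) e^{beta_D z}."""
    B = A * (1.0 - np.exp(-beta_B * z[..., None]))       # backscatter term
    return np.clip((I - B) * np.exp(beta_D * z[..., None]), 0.0, 1.0)

# Round trip on synthetic data: degrade with Eq. (6), then restore.
J = np.full((2, 2, 3), 0.5)
z = np.array([[1.0, 2.0], [3.0, 4.0]])
beta_D = np.array([0.30, 0.10, 0.08])   # illustrative, per channel
beta_B = np.array([0.20, 0.15, 0.12])   # illustrative, per channel
A = np.array([0.10, 0.30, 0.40])
I = J * np.exp(-beta_D * z[..., None]) + A * (1 - np.exp(-beta_B * z[..., None]))
J_hat = restore(I, z, beta_D, beta_B, A)
```

Note that the exponential amplification e^{β_D z} also magnifies any error in the estimated backscatter, which is why the depth and backscatter stages below matter so much.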
3.2 Depth Estimation
The simplified and improved imaging model is contingent on
the absolute depth of the scene, making depth map estimation
a crucial step of the processing procedure. The
sea-thru method (Akkaynak & Treibitz, 2019) employs the
Structure From Motion (SFM) method to obtain the depth
map in meters. However, the SFM approach requires multi-
ple images of the same scene, which becomes a limitation
given the variable and challenging underwater conditions.
This constraint underscores the importance of monocular
depth estimation in enhancing the flexibility of underwater
imaging applications.
Traditional depth estimation methods using DCP and MIP
can obtain accurate results under ideal lighting conditions
for monocular depth estimation. However, when underwater
lighting conditions are not ideal, these prior assumptions are
violated, causing decreased accuracy in depth estimation
and recovery outcomes.
The first row in Fig. 3 presents a natural illumination
image of the shallow water area. Regarding the DCP, the
fish and coral in the foreground exhibit dark pixels, resulting
in the dark channels having low values. Hence, they are accu-
rately identified to be near. Conversely, the background lacks
extremely dark pixels, leading the dark channel to exhibit
high values, and these areas are inferred to be relatively
distant. For the MIP, the closer sites yield more significant
Dmip values than the farther sites, thereby providing accurate
depth estimation. However, the far end of this image exhibits
substantially greater brightness than the near point, i.e., the
distant R-channel intensity is larger, leading to an error in the
IRC depth estimation. The second row shows an underwa-
ter image captured under artificial lighting. The traditional
depth estimation methods yield unsatisfactory results. The
DCP incorrectly classifies the bright fish in the foreground as
distant, and the depth span is not obvious. The RGB channel
values are similar across the image, leading to incorrect MIP
and IRC estimates. Unlike these methods, our method accurately
delineates the foreground and the background under different
lighting conditions, achieving more precise depth variations.
Deep learning-based techniques leverage the remarkable
feature extraction capacity of neural networks, resulting in
increased robustness and precision. However, obtaining the
pixel-level depth datasets necessary for training supervised
depth estimation methods is challenging (Bhoi, 2019). In
contrast, instead of minimizing the error in the classical depth
map, the unsupervised depth estimation method monodepth2
(Godard et al., 2019) network estimates the pose between
two images, computes the depth map, and then uses the esti-
mated pose and depth map to compute the reprojection of the
first image on the second image. The loss to be minimized
is the reconstruction error of that estimate. This enables the
network to receive training solely from stereo images,
eliminating the limitations posed by the dataset, and producing
precise depth map estimates.

Fig. 4 Monodepth2 performs depth estimation of underwater images and
the results are used directly in the recovery stage of our method. The top
row showcases a successful outcome, whereas the bottom row demonstrates
a failure case. The source of these images is the UIEBD
Nevertheless, compared to atmospheric images, underwa-
ter images often exhibit significant color cast due to selective
light absorption by water. Monodepth2 is inefficient in deal-
ing with heavily skewed underwater images, leading to a
failure in image recovery. The top row in Fig. 4 shows an image
suitable for the monodepth2 method, which allows for precise
depth map estimation and optimal recovery outcomes. The second
row shows a classical deep-sea image, with severely attenuated
red and blue channels and a greenish hue; the monodepth2 method
is unsuitable for depth estimation and restoration of such
images.
In order to make Monodepth2 work in underwater scenar-
ios, we present CIP+, a solution that merges unsupervised
and prior techniques to calculate the scene depth. Our
proposal starts with the channel intensity prior (CIP). In
underwater environments, red light with the longest wave-
length decays quickly, followed by blue-green light. This
means that the farther an object is, the lower its red-channel
intensity and the larger the intensity difference between the
blue-green channels and the red channel. We then
elaborate on how CIP and unsupervised methods are com-
bined using color deviation factors.
The definition of the red channel map R is as follows:
R(x) = min_{y∈Ω(x)} {1 − I_r(y)}   (7)

where Ω(x) is a square local patch centered at x, and I_c is the
observed intensity in color channel c of the input image at
pixel x. Based on the CIP, red light, with the longest
wavelength, decays quickest as distance increases. As a result,
the farther a region is from the camera, the smaller the
red-channel proportion of the image. Hence, we can calculate a
depth estimate directly from the red channel map, denoted d_r:

d_r = N_s(R)   (8)

where N_s is a normalization function, defined as follows:

N_s(ν) = (ν − min ν) / (max ν − min ν)   (9)

where ν is a vector.
The chromatic aberration map M is defined as:

M(x) = max_{y∈Ω(x)} {I_g(y), I_b(y)} − max_{y∈Ω(x)} I_r(y)   (10)

According to the CIP, the greater the intensity difference
between the blue-green channels and the red channel, the
greater the distance. The corresponding depth estimate,
denoted d_m, is:

d_m = N_s(M)   (11)
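Equations (7)-(11) reduce to local min/max filtering plus normalization; a numpy-only sketch (the 5×5 patch follows the window used in Algorithm 1, and the small epsilon guarding a flat map is our addition):

```python
import numpy as np

def local_filter(img, size, fn):
    # Apply fn (np.min or np.max) over a size x size patch Omega(x).
    pad = size // 2
    p = np.pad(img, pad, mode="edge")
    w = np.lib.stride_tricks.sliding_window_view(p, (size, size))
    return fn(w, axis=(2, 3))

def normalize(v):
    """N_s of Eq. (9); epsilon added to guard against a constant map."""
    return (v - v.min()) / (v.max() - v.min() + 1e-12)

def cip_maps(I, size=5):
    """Depth cues of Eqs. (7)-(11) for a float RGB image in [0, 1]."""
    R = local_filter(1.0 - I[..., 0], size, np.min)        # Eq. (7)
    d_r = normalize(R)                                     # Eq. (8)
    M = local_filter(np.maximum(I[..., 1], I[..., 2]), size, np.max) \
        - local_filter(I[..., 0], size, np.max)            # Eq. (10)
    d_m = normalize(M)                                     # Eq. (11)
    return d_r, d_m

rng = np.random.default_rng(1)
I = rng.random((24, 24, 3))   # synthetic stand-in for an underwater frame
d_r, d_m = cip_maps(I)
```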
Combining Eqs. (8) and (11), the CIP depth is obtained as
follows:

d_cip = α d_m + (1 − α) d_r   (12)

where α = S(Sum(I_gray > 127.5) / Size(I_gray), 0.2); Sum(x)
and Size(y) count the number of pixels that satisfy condition
x and the total number of pixels in y, respectively. The
sigmoid function S(a, δ) is defined as follows:

S(a, δ) = 1 / (1 + e^{−s(a−δ)})   (13)
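The gate of Eqs. (12)-(13) can be sketched directly; the slope s of the sigmoid is left unspecified in the text, so the value below is an assumption chosen to make the gate switch sharply around the threshold:

```python
import numpy as np

def S(a, delta, s=32.0):
    """Sigmoid of Eq. (13); slope s is an assumed value."""
    return 1.0 / (1.0 + np.exp(-s * (a - delta)))

def alpha_weight(I_gray):
    """alpha of Eq. (12): fraction of bright pixels (> 127.5) of an
    8-bit grayscale image, gated through S(., 0.2)."""
    bright_ratio = np.mean(I_gray > 127.5)
    return S(bright_ratio, 0.2)

dark = np.full((10, 10), 30, dtype=np.uint8)     # dim image -> alpha near 0
bright = np.full((10, 10), 200, dtype=np.uint8)  # bright image -> alpha near 1
```

For the dim image, alpha_weight collapses toward 0 so d_r dominates; for the bright image it saturates toward 1 so d_m dominates, matching the discussion below.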
The value of α hinges upon the global illumination of the
image. When the percentage of pixels in the grayscale image
I_gray with values greater than 127.5 is significantly below
0.2, α is set to 0. This implies that the majority of pixels
in the image exhibit low intensity, the overall brightness of
the image is dim, and the intensity variation between the
three channels is negligible, making d_m inapplicable.
Consequently, the depth can only be accurately represented by
d_r. Conversely, when the proportion of pixels with values
larger than 127.5 in the grayscale map I_gray significantly
surpasses 0.2, α is set to 1. The image is brighter, the
background light becomes relatively brighter, and the intensity
contributed by the background light becomes more significant
at pixels farther away; as a result, more distant pixels may
exhibit larger values in the red channel and be incorrectly
assumed to be closer. Thus, for brighter images, d_m is
utilized to represent the scene depth. Between these two
extremes, the depth is obtained by a weighted combination of
the two estimates. The values of 127.5 and 0.2 were derived
experimentally, and we recommend variations within [110, 140]
and [0.15, 0.35], respectively. Changing 127.5 and 0.2 to
values beyond these ranges will invariably result in α being
fixed at either 1 or 0.
The unsupervised depth estimate is obtained directly with
the monodepth2 approach and is denoted as d_mono. Combining
the prior and unsupervised estimations, the CIP+ depth of
the underwater scene is described as follows:

d_cip+ = β d_cip + (1 − β) d_mono   (14)

where β = S(k, 2) and k is the image color bias factor. When
k is significantly larger than 2, β = 1; this implies a heavy
color cast, the unsupervised method is infeasible, and the
depth is represented by the prior estimate d_cip. Conversely,
if k is substantially less than 2, β = 0; this indicates no
color cast, and a more accurate depth estimate is given by
d_mono. Between these two cases, the depth map is obtained
through a weighted combination of both methods. Through
experimental validation, 2 was established as the optimal
threshold, and we recommend an operational range of [1.5, 3.5].
Changing 2 to a smaller or larger value will result in β
always being fixed at 1 or 0, undermining the adaptivity of
the method.
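The fusion of Eq. (14) is a one-line blend once β is gated; as before, the sigmoid slope s is an assumed value since the text does not fix it:

```python
import numpy as np

def S(a, delta, s=32.0):
    # Sigmoid of Eq. (13); slope s assumed.
    return 1.0 / (1.0 + np.exp(-s * (a - delta)))

def fuse(d_cip, d_mono, k):
    """Eq. (14): weight the prior depth against the unsupervised depth
    by the color-bias factor k (threshold 2 per the text)."""
    beta = S(k, 2.0)
    return beta * d_cip + (1.0 - beta) * d_mono

d_cip = np.full((4, 4), 0.8)    # toy prior depth
d_mono = np.full((4, 4), 0.2)   # toy monodepth2 depth
heavy_cast = fuse(d_cip, d_mono, 3.5)  # strong cast -> trust the prior
no_cast = fuse(d_cip, d_mono, 0.5)     # little cast -> trust monodepth2
```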
Determining the color bias coefficient k is a critical aspect
of image enhancement. To address this, we adopt the equivalent
circle-based chromaticity detection method described in Xu et
al. (2008). Traditional methods exhibit certain limitations,
as they merely rely on the average image chromaticity or the
extreme luminance chromaticity to measure the degree of color
cast. The working principle of this method is that if the
chromaticity distribution in the two-dimensional histogram on
the a-b chromaticity plane has a single peak or is highly
concentrated, the chromaticity average plays a crucial role in
assessing the level of color deviation: a larger chromaticity
average generally indicates a more severe color bias.
Leveraging the principle of the equivalent circle, we derive
the color bias coefficient, denoted as k:

k = D / M   (15)

where D = sqrt(d_a^2 + d_b^2) is the average image chromaticity
and M = sqrt(m_a^2 + m_b^2) is the chromaticity center distance,
with:

d_a = Σ_{i=1}^{W} Σ_{j=1}^{V} a / (W V)   (16)

d_b = Σ_{i=1}^{W} Σ_{j=1}^{V} b / (W V)   (17)

m_a = Σ_{i=1}^{W} Σ_{j=1}^{V} (a − d_a)^2 / (W V)   (18)

m_b = Σ_{i=1}^{W} Σ_{j=1}^{V} (b − d_b)^2 / (W V)   (19)
where W and V are the width and height of the image in pixels,
respectively. In the a-b chromaticity plane, the coordinate
center of the equivalent circle is (d_a, d_b). The color balance
of the image is established by the location of the equivalent
circle in the coordinate system: if d_a > 0, the overall image
hue is red, otherwise it is green; if d_b > 0, the overall image
hue is yellow, otherwise it is blue. As the value of the color
bias factor k increases, the color bias becomes more severe.
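Equations (15)-(19) amount to comparing the mean chromaticity against its spread; a sketch on raw a-b channel arrays (assumed centered so that a = b = 0 is neutral, e.g. after subtracting 128 from an 8-bit Lab conversion):

```python
import numpy as np

def color_bias_factor(a, b):
    """k = D / M per Eqs. (15)-(19). a, b: chromaticity channels."""
    d_a, d_b = a.mean(), b.mean()          # Eqs. (16)-(17)
    m_a = np.mean((a - d_a) ** 2)          # Eq. (18)
    m_b = np.mean((b - d_b) ** 2)          # Eq. (19)
    D = np.sqrt(d_a ** 2 + d_b ** 2)       # average chromaticity
    M = np.sqrt(m_a ** 2 + m_b ** 2) + 1e-12  # center distance (eps added)
    return D / M

rng = np.random.default_rng(2)
# Neutral image: chromaticity scattered around zero.
neutral = color_bias_factor(rng.normal(0, 5, (32, 32)),
                            rng.normal(0, 5, (32, 32)))
# Greenish cast: a-channel mean pushed strongly negative.
greenish = color_bias_factor(rng.normal(-30, 5, (32, 32)),
                             rng.normal(0, 5, (32, 32)))
```

The squared-deviation form of m_a and m_b follows Eqs. (18)-(19) as printed; a concentrated distribution far from the origin drives k up, flagging a cast.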
Details of the Depth-Estimation Algorithm are outlined in
Algorithm 1.
3.3 Backscatter Estimation
The depth map obtained in the preceding section is a relative
rather than an absolute depth: its values are dimensionless and
describe only the ordering of objects in the scene, not distance
in meters. To address this issue, we propose adaptive dark
pixels, which dynamically convert the relative depth to absolute
depth and effectively remove backscatter.
Algorithm 1 Depth-Estimate
Require: I_c
Ensure: d_cip+
  for c ∈ (R, G, B) do
    I_c ⇐ I_c / 255
  end for
  R ⇐ getMin(1 − I_r, 5×5)
  d_r ⇐ N_s(R)
  M ⇐ getGBMax(I_c, 5×5) − getRMax(I_r, 5×5)
  d_m ⇐ N_s(M)
  d_mono ⇐ monodepth2(I_c)
  I_gray ⇐ BGR2GRAY(I_c)
  α ⇐ S[Sum(I_gray > 127.5) / Size(I_gray), 0.2]
  I_lab ⇐ BGR2LAB(I_c)
  β ⇐ S[sqrt(d_a^2 + d_b^2) / sqrt(m_a^2 + m_b^2), 2]
  d_cip ⇐ α d_m + (1 − α) d_r
  d_cip+ ⇐ β d_cip + (1 − β) d_mono
First, we categorize underwater images into two groups
according to the background: images with seawater as the
background and images with other backgrounds. For the former
category, the theoretical maximum distance is ∞, but visibility
decreases rapidly with increasing distance. Therefore, we
define a maximum visibility d_max with a pre-set default value
of 12 meters. For images where other elements form the
background, this default visibility limit is reduced to 8
meters. It is important to note that the values 12 and 8 are
determined empirically based on the underwater camera's
visibility, and we recommend a range of [8, 15] for them. Any
alteration to these numbers, either smaller or larger, could
change the resulting absolute depths and potentially increase
the backward scattering fitting error.
Moreover, the limited field of view of camera lenses often means that elements situated at shallow depths are not discernible in the captured image. To resolve this concern, we add the nearest distance d_min, in meters, to each depth within the scene. By estimating the maximum difference between the pixel intensity at the maximum depth and the observed intensity I_c in the input image, the estimated d_min ∈ [0, 1] can be efficiently computed:

d_min = 1 − max_{x, c∈{r,g,b}} |θ − I_c(x)| / max(θ, 255 − θ)    (20)
where θ = I_c(argmax_x d(x)) represents the pixel intensity at the maximum depth, i.e., the global background light. Consequently, when the global background light contributes a significant portion of the pixel intensity of the nearest pixel point, the gap between the closest and farthest pixel points decreases, resulting in an increase in d_min.
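The computation of Eq. (20) can be sketched as follows; reading θ per channel at the deepest pixel is our interpretation of the "global background light":

```python
import numpy as np

def nearest_distance(I, depth):
    """Estimate d_min in [0, 1] following Eq. (20).

    I: H x W x 3 image with values in [0, 255]; depth: H x W relative depth map.
    """
    # theta: intensity at the deepest pixel, taken per channel
    # (an assumed per-channel reading of Eq. 20)
    idx = np.unravel_index(np.argmax(depth), depth.shape)
    theta = I[idx].astype(float)                  # shape (3,)
    gap = np.abs(theta - I.astype(float))         # |theta - I_c(x)|
    norm = np.maximum(theta, 255.0 - theta)       # max(theta, 255 - theta)
    return 1.0 - np.max(gap / norm)
```

When every pixel matches the background light, the gap term vanishes and d_min reaches 1; a single pixel far from θ drives d_min toward 0.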
We employ a linear conversion to map the relative depth values x in the original depth map to absolute depth values y:

y = [(d_max − d_min)/(d′_max − d′_min)] · (x − d′_min) + d_min    (21)

where d′_max and d′_min represent the highest and lowest values in the relative depth map, and d_max and d_min are introduced to adjust the relative depth.
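Eq. (21) is a standard min-max linear remapping; a minimal sketch, with d′_max and d′_min taken from the relative map itself:

```python
import numpy as np

def to_absolute_depth(d_rel, d_min, d_max):
    """Linearly map relative depths to absolute metres per Eq. (21)."""
    r_min, r_max = d_rel.min(), d_rel.max()   # d'_min, d'_max
    scale = (d_max - d_min) / (r_max - r_min) # d_f in Algorithm 2
    return scale * (d_rel - r_min) + d_min
```

The map's extremes land exactly on d_min and d_max, and interior values are interpolated linearly.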
Finally, the optimal recovery map is governed by the d_max value, as determined by the NIQE index. Unlike other current no-reference image quality assessment (IQA) algorithms, which require prior knowledge of image distortion and training on subjective human evaluations, NIQE implements "quality-aware" statistical feature sets by means of a simple and effective statistical model of natural scenes in the spatial domain, thereby requiring only measurable deviations from the statistical patterns observed in natural images. Thus, NIQE does not require contact with distorted images and avoids the instability of subjective factors. As indicated by Mittal et al. (2012), NIQE outperforms the full-reference peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) metrics and matches the performance of the top-performing no-reference, opinion-aware, and distortion-aware IQA algorithms. More details of the depth-conversion process can be found in Algorithm 2.
Algorithm 2 Depth-Convert
Require: I_c, d_cip+
Ensure: d_absolute
for c ∈ {R, G, B} do
    M_L ← I_c{argmax[d_cip+(x)]}
    d_min ← 1 − max[|M_L − I_c| / max(M_L, 255 − M_L)]
end for
d′_max ← max(d_cip+)
d′_min ← min(d_cip+)
for d_max ∈ (8, 12) do
    d_f ← (d_max − d_min)/(d′_max − d′_min)
    d_absolute ← d_f · (d_cip+ − d′_min) + d_min
end for
In addressing backscatter removal, we employ the principle of the dark pixel prior (Akkaynak & Treibitz, 2019). This is grounded in the assumption that in any given underwater scenario, there exist regions that exhibit zero reflectance (ξ = 0), indicating an absence of any reflected color light. Such regions are typically attributable to black objects or the shadows cast by various objects.
Unlike the dark pixel prior, our approach posits that a dark pixel should be the pixel with the smallest summation of the R, G, and B channels. This is underpinned by the recognition that the intensity of a black pixel is just B_c, while that of other pixels is dictated by B_c + D_c. Given
the same depth, it is inherently evident that the former will
invariably be smaller than the latter.
Subsequently, the pixel points in the degraded image are separated into T groups based on the depth map. The total sum of RGB values is computed for each group of pixel points, and the first Y pixel points with the lowest values are selected as the initial estimates for backscatter. T and Y are calculated as follows:

T = d_max − 2    (22)

Y = min(N_i · T / 10000, N)    (23)

where the dynamic value of T is adopted to avoid violating the dark pixel prior, which could occur due to the small depth span of individual intervals in close scenes where the overall brightness of the image is high. N_i represents the total number of pixels in a particular group. Given that the backscatter estimation requires only a small number of backscattered pixels, N is set to 500. As can be observed in the second row of Fig. 2, the selected black pixels are denoted by red dots.
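A sketch of the dark-pixel selection of Eqs. (22)-(23); the equal-width depth binning and the floor of one sample per bin are our assumptions:

```python
import numpy as np

def select_dark_pixels(I, depth, d_max):
    """Per depth bin, pick the pixels with the smallest R+G+B sum as initial
    backscatter samples (a sketch of the ADP selection, Eqs. 22-23)."""
    T = int(d_max) - 2                                   # Eq. (22)
    rgb_sum = I.reshape(-1, 3).sum(axis=1).astype(float)
    z = depth.ravel()
    edges = np.linspace(z.min(), z.max(), T + 1)         # equal-width bins (assumed)
    picked = []
    for i in range(T):
        in_bin = np.where((z >= edges[i]) & (z < edges[i + 1]))[0]
        if in_bin.size == 0:
            continue
        # Eq. (23): Y = min(N_i * T / 10000, 500); at least one sample assumed
        Y = min(max(int(in_bin.size * T / 10000), 1), 500)
        picked.append(in_bin[np.argsort(rgb_sum[in_bin])[:Y]])
    return np.concatenate(picked) if picked else np.array([], dtype=np.int64)
```

The returned flat indices pair each dark pixel's observed intensity with its depth, ready for the curve fit of Eq. (24).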
Upon obtaining the initial estimate of the backscatter B_c and its corresponding depth value z, we ascertain the values of J_c, β^D_c, A_c, and β^B_c in Eq. (24). The correlation between the tri-channel backscatter values and the depth values facilitates the derivation.

B_c = J_c(x) e^(−β^D_c z(x)) + A_c (1 − e^(−β^B_c z(x)))    (24)

where J_c, A_c ∈ [0, 1] and β^D_c, β^B_c ∈ [0, 10]. In the fitting procedure, we noticed irregularity and discreteness in the shallow-depth data, which affected the fitting. To address this issue, we set a minimum threshold for the color depth utilized in the estimation process, defaulting to 0.1% of the depth values. Consequently, we can calculate the direct reflection D_c of the scene as detailed below:

D_c = I_c − B_c    (25)
Details of backscatter estimation are displayed in Algo-
rithm 3.
3.4 Color Reconstruction
The process of removing backscatter from the raw image merely resolves the haze effect attributable to scattering; it does not correct the color distortion induced by light absorption, as illustrated in Fig. 5. Consequently, both color compensation and color balance are necessary to reconstruct the image's color and luminance and yield a more natural recovery.
Algorithm 3 Backscatter-Estimation
Require: I_c, d_absolute, d_max
Ensure: D_c
(w, h) ← size(I_r)
Sum ← zeros(w, h)
for c ∈ {R, G, B} do
    Sum ← Sum + I_c
end for
start ← min(d_absolute)
T ← d_max − 2
scope ← [max(d_absolute) − min(d_absolute)] / T
end ← start + scope
for i ∈ 1 → T do
    N_i ← find(d_absolute > start & d_absolute < end)
    start ← end
    end ← end + scope
    if N_i · 0.001 > 500 then
        Y ← 500
    else
        Y ← N_i · 0.001
    end if
    [index_x, index_y] ← sort(Sum)
    for j ∈ 1 → Y do
        B ← I_c(index_x(j), index_y(j))
        D ← d_absolute(index_x(j), index_y(j))
    end for
    parameter ← lsqcurvefit(f, D, B)
    Backscatter ← f(d_absolute, parameter)
    D_c ← I_c − Backscatter
end for
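Algorithm 3's lsqcurvefit call can be mirrored in Python with scipy.optimize.curve_fit (assumed available), fitting Eq. (24) to the dark-pixel samples under the bounds stated in the text:

```python
import numpy as np
from scipy.optimize import curve_fit

def backscatter_model(z, J, beta_D, A, beta_B):
    # Eq. (24): residual direct signal + saturating backscatter
    return J * np.exp(-beta_D * z) + A * (1.0 - np.exp(-beta_B * z))

def fit_backscatter(z_dark, B_dark):
    """Fit Eq. (24) to dark-pixel samples, mirroring MATLAB's lsqcurvefit,
    with J, A in [0, 1] and beta_D, beta_B in [0, 10] as stated in the text.
    The initial guess p0 is our assumption."""
    p0 = [0.1, 1.0, 0.5, 1.0]
    bounds = ([0, 0, 0, 0], [1, 10, 1, 10])
    params, _ = curve_fit(backscatter_model, z_dark, B_dark, p0=p0, bounds=bounds)
    return params
```

Evaluating `backscatter_model` over the full absolute depth map then gives the per-channel backscatter to subtract, as in Eq. (25).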
Building on the work of Akkaynak and Treibitz (2019), we model the color compensation factor β^D_c(z) as the summation of two exponentials:

β^D_c(z) = a·e^(bz) + c·e^(dz)    (26)

A preliminary estimate of β^D_c(z) can be obtained from the scene light source map H_c (Ebner & Hansen, 2013), as follows:

−(log H_c)/z = β^D_c(z)    (27)
Next, employing the already established range map z, we refine the estimate of β^D_c(z) by minimizing ‖z − ẑ‖ over β^D_c(z), with ẑ given by Eq. (28):

ẑ = −(log H_c)/β^D_c(z)    (28)

In the refinement stage, we set the minimum distance between consecutive depth inputs to at least 1% of the total depth range. This setting balances the distribution of pixels at varying depths within the input dataset, allowing for an accurate estimation of the fitted exponential trend, rather than being dominated by dense clusters of data points at the
minimum and maximum depths.

Fig. 5 Backscatter removal process: (a) original image from UIEBD; (b) result of scattering removal by our method; (c) corresponding RGB-channel backscatter fit curves, with the horizontal axis representing the imaging distance in meters and the vertical axis the color value
J_c = D_c · e^(β^D_c(z)·z)    (29)
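Eqs. (26) and (29) together amount to multiplying the direct signal by a depth-dependent gain; a hedged sketch, with illustrative (not fitted) coefficients a, b, c, d:

```python
import numpy as np

def compensate_attenuation(D, z, beta_of_z):
    """Recover J_c = D_c * exp(beta_D_c(z) * z), Eq. (29).

    D: direct signal after backscatter removal, in [0, 1];
    z: absolute depth map in metres;
    beta_of_z: callable giving the depth-dependent attenuation coefficient,
    e.g. the two-exponential form of Eq. (26).
    """
    beta = beta_of_z(z)
    return np.clip(D * np.exp(beta * z), 0.0, 1.0)

# Two-exponential coefficient model of Eq. (26); a, b, c, d here are
# illustrative placeholders, not values from the paper.
def beta_example(z, a=0.8, b=-0.5, c=0.2, d=-0.05):
    return a * np.exp(b * z) + c * np.exp(d * z)
```

Because β^D_c(z)·z is non-negative, the compensation only brightens the attenuated signal; the clip guards against over-amplification at large depths.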
After image reconstruction using Eq. (29), we enhance the visual appeal by calculating the color balance factor, grounded in the CIP prior. This factor considers the pixel intensity distribution in the blue and green channels, avoiding the red artifacts caused by over-compensation of the red channel in extreme cases.

w_g = avg(max_10%(I_g))    (30)

w_b = avg(max_10%(I_b))    (31)

where avg(max_10%(I_c)) is the average intensity of the top 10% of pixels in channel I_c. Further, the green channel color balance factor W_g and the blue channel color balance factor W_b can be calculated as:

W_g = w_g / (2·(w_b + w_g))    (32)

W_b = w_b / (2·(w_b + w_g))    (33)
The green and blue channels, I_g and I_b, are updated as follows:

I_g = W_g · I_g    (34)

I_b = W_b · I_b    (35)
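Read literally, Eqs. (30)-(35) give the following balance step (note that W_g + W_b = 1/2 under Eqs. (32)-(33); this sketch reproduces the stated formulas rather than a conventional gray-world gain):

```python
import numpy as np

def balance_blue_green(img):
    """Apply the blue-green balance of Eqs. (30)-(35).

    img: H x W x 3 float array in [0, 1], channel order (R, G, B) assumed.
    """
    def top10_mean(ch):
        flat = np.sort(ch.ravel())
        k = max(1, flat.size // 10)
        return flat[-k:].mean()              # avg of the top 10% intensities

    w_g = top10_mean(img[..., 1])            # Eq. (30)
    w_b = top10_mean(img[..., 2])            # Eq. (31)
    W_g = w_g / (2.0 * (w_b + w_g))          # Eq. (32)
    W_b = w_b / (2.0 * (w_b + w_g))          # Eq. (33)
    out = img.copy()
    out[..., 1] *= W_g                       # Eq. (34)
    out[..., 2] *= W_b                       # Eq. (35)
    return out
```

The red channel is left untouched, so the relative weighting between blue and green (rather than absolute brightness) is what this step adjusts.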
4 Experiment and Analysis
In this section, we evaluate the effectiveness of our method
against several traditional and deep learning methods. We
also discuss the impact of critical components of our method
through detailed enhancement and ablation studies. Finally,
we further analyze the time complexity of our method.
4.1 Experimental Settings
Comparison Methods To evaluate the efficacy of our methodology, we conducted a comparative study with nine other techniques for underwater image enhancement. These included five physical-model-based methods: IBLA (TIP'17) (Peng & Cosman, 2017), GDCP (TIP'18) (Peng et al., 2018), ULAP (RCM'18) (Song et al., 2018), SMBL (TB'20) (Song et al., 2020), and L2UWE (CVPR'20) (Marques & Albu, 2020), as well as four deep learning-based methods: WaterNet (TIP'19) (Li et al., 2019), UWCNN
(PR'20) (Li et al., 2020), Ucolor (TIP'21) (Li et al., 2021), and PUIE (ECCV'22) (Fu et al., 2022).

Fig. 6 Visual comparisons on images of different color bias: (a) original image from UIEBD; (b)–(k) display the results acquired by IBLA, GDCP, ULAP, WaterNet, SMBL, UWCNN, L2UWE, Ucolor, PUIE, and our method, respectively. The best UCIQE scores in each case are highlighted in red, and the second-best scores are denoted in blue
Benchmark Datasets We evaluated our method using sev-
eral datasets, including the UIEBD (Li et al., 2019), MABLs
(Song et al., 2020), UCCS (Liu et al., 2020), U-45 (Li et
al., 2019) and EUVP (Islam et al., 2020) datasets. The UIEB
dataset (Li et al., 2019) includes 890 image pairs captured in
real underwater environments, while the UCCS dataset (Liu
et al., 2020) is divided into subsets of different water hues to
test different color correction techniques. MABLs (Song et al., 2020) comes with manual annotations of the background light in images. U-45 (Li et al., 2019) is a commonly used dataset
for underwater image testing, and EUVP (Islam et al., 2020)
encompasses a diverse range of underwater objects.
Evaluation Metrics To assess the image quality, we
employed four metrics: Underwater Color Image Quality
Evaluation (UCIQE) (Yang & Sowmya, 2015), Contrast-
changed Image Quality Measure (CEIQ) Yan et al. (2019),
Naturalness Image Quality Evaluator (NIQE) (Mittal et al.,
2012), and Information Entropy (IE) (Zhang et al., 2019).
UCIQE evaluates image quality through chromaticity, sat-
uration, and contrast, with a higher score indicating better
quality. CEIQ assesses the overall quality using five contrast-
related features, with higher scores indicating higher quality.
NIQE gauges image quality by comparing it to a model
derived from natural scenes, with lower scores implying
better quality. Finally, IE represents the average amount of
information in the image, and a higher score means more
information and richer color.
4.2 Subjective Assessment
The performance of various color correction methods was evaluated using the UIEB dataset, as depicted in Fig. 6. In instances with a strong green color cast, the outcomes produced by IBLA (Peng & Cosman, 2017), GDCP (Peng et al., 2018), ULAP (Song et al., 2018), and L2UWE (Marques & Albu, 2020) fall short of expectations. This is due to the near-zero values of the red and blue channels, resulting in a prior failure. SMBL (Song et al., 2020) and L2UWE (Marques & Albu, 2020) enhance the clarity of low-visibility underwater images, but are not fully successful in eliminating the haze. In highly scattering images, the values in the RGB channels tend to be similar, which prevents the ULAP prior from working effectively and results in an overcompensation of the red channel. WaterNet (Li et al., 2019), UWCNN (Li et al., 2020), Ucolor (Li et al., 2021), and PUIE (Fu et al., 2022) correct color distortion; still, their loss functions do not prioritize luminance information, leading to inadequate contrast improvement in low-visibility images and to local darkness. From the RGB histogram in Fig. 6k, it can be seen that our method effectively removes color bias and enhances contrast, thereby effectively resolving artificial artifacts. The UCIQE values confirm the superior visual quality of the results.
Fig. 7 Visual comparisons on images of various illumination conditions: (a) original image from MABLs, EUVP, and U-45; (b)–(k) illustrate the results obtained by IBLA, GDCP, ULAP, WaterNet, SMBL, UWCNN, L2UWE, Ucolor, PUIE, and our method, respectively. The best UCIQE scores in each case are highlighted in red, and the second-best scores are denoted in blue
Fig. 8 Visual comparison results of high-scattering and highly color-distorted images: (a) original image from MABLs; (b)–(k) showcase the results obtained by IBLA, GDCP, ULAP, WaterNet, SMBL, UWCNN, L2UWE, Ucolor, PUIE, and our method, respectively. The best UCIQE scores in each case are highlighted in red, and the second-best scores are denoted in blue
To effectively handle diverse underwater lighting conditions and address the problem of non-uniform illumination caused by artificial light sources, we evaluated the enhancement results of images with different illumination conditions on the MABLs, U-45, and EUVP datasets, as depicted in Fig. 7. Existing methods such as IBLA (Peng & Cosman, 2017), GDCP (Peng et al., 2018), SMBL (Song et al., 2020), and L2UWE (Marques & Albu, 2020) produced overexposed images when enhancing artificially illuminated scenes. Although methods like WaterNet (Li et al., 2019), UWCNN (Li et al., 2020), Ucolor (Li et al., 2021), and PUIE (Fu et al., 2022) performed well on artificially illuminated images, they introduced local darkness in low-illumination images, with WaterNet performing the worst in this regard. Meanwhile, IBLA, SMBL, and L2UWE still suffer from overexposure. In contrast, our approach surpasses the compared methods in enhancing contrast and preserving details, avoiding over- or under-enhancement, and preventing the creation of dark regions. The UCIQE scores demonstrate that our approach effectively enhances contrast and removes haze under various lighting conditions.
To evaluate the effectiveness and robustness of various techniques, we conducted image enhancement experiments on images from MABLs with high backscatter and color bias, as shown in Fig. 8. It is evident that several compared methods encounter difficulties when applied to challenging underwater images. GDCP (Peng et al., 2018) induces undesirable color distortions. While L2UWE boosts texture and edge sharpness, it also blurs the image's details. IBLA, WaterNet, UWCNN, and Ucolor tend to introduce localized color bias without effectively correcting the overall darkness of the image. SMBL (Song et al., 2020) effectively improves the color of the image, but does not eliminate the fog effect. In contrast, our method successfully eliminates unnatural colors while improving visibility, rendering more details and vibrant colors. The UCIQE values also show that our method is effective and robust.
Table 2 Quantitative evaluation of various techniques on the UIEBD, MABLs, U-45, and UCCS datasets
Datasets Indexes IBLA GDCP ULAP WaterNet SMBL UWCNN L2UWE Ucolor PUIE Ours
UIEBD UCIQE 0.5941 0.5827 0.6034 0.5837 0.6031 0.5364 0.5371 0.5715 0.5659 0.6214
CEIQ 3.3118 3.2605 3.3862 3.1614 3.3323 3.1605 3.3176 3.2609 3.3186 3.4019
NIQE 3.2226 3.4754 3.2046 3.1625 3.1986 3.3233 3.2634 3.3454 3.2053 3.1466
IE 7.2962 7.2250 7.2071 7.1125 7.3107 7.0533 7.3021 7.2164 7.3078 7.4304
MABLs UCIQE 0.5737 0.5727 0.5677 0.5845 0.5846 0.5136 0.5220 0.5542 0.5519 0.6177
CEIQ 3.1511 3.1836 3.0552 3.1595 3.1874 3.0564 3.2114 3.1807 3.2481 3.3220
NIQE 3.6640 4.1634 3.6809 3.8062 3.6710 3.6593 4.2178 3.4600 3.7372 3.5594
IE 7.0392 7.1375 6.8782 7.1206 7.0908 6.8962 7.1757 7.0913 7.1951 7.3149
U-45 UCIQE 0.5922 0.5937 0.5960 0.5989 0.5836 0.5461 0.5503 0.5730 0.5631 0.6365
CEIQ 2.9121 3.1914 3.1912 3.1863 3.2491 3.2126 3.1984 3.2826 3.2957 3.4089
NIQE 5.1003 4.1202 3.7641 4.5488 3.7272 4.0803 3.8118 4.4583 3.8646 3.9783
IE 6.6261 7.1924 7.1842 7.1991 7.2534 7.1709 7.1926 7.2951 7.3142 7.4761
UCCS UCIQE 0.5462 0.5636 0.5175 0.5620 0.5679 0.4947 0.4920 0.5473 0.5337 0.5906
CEIQ 3.0100 3.0470 2.9056 3.2210 3.1846 3.1712 3.2439 3.2845 3.2421 3.4109
NIQE 3.6123 4.1728 3.6624 3.9630 3.6612 3.9591 3.8705 3.7834 3.8896 3.4745
IE 6.8711 6.9760 6.7098 7.2134 7.1663 7.1045 7.2795 7.2701 7.2363 7.4844
The top performer is indicated in bold, and the second best in italics for each case
4.3 Objective Assessment
To validate the earlier subjective observations, we employed an objective evaluation technique to conduct a more comprehensive assessment of the quality of the restored images. Table 2 presents the average scores of the four no-reference quality metrics (UCIQE, CEIQ, NIQE, and IE) for the various methods, applied to the UIEBD, MABLs, U-45, and UCCS datasets.
It can be observed that deep learning-based methods such as WaterNet, UWCNN, Ucolor, and PUIE perform well on the four test datasets, exhibiting lower UCIQE but favorable CEIQ, NIQE, and IE scores. The convolutional capabilities of these methods allow them to correct color distortion effectively. However, they are less effective than traditional physical-model-based methods at enhancing contrast and increasing color vividness.
The physics-based methods, including IBLA, GDCP, ULAP, SMBL, and L2UWE, demonstrate lower CEIQ, NIQE, and IE scores on the four test datasets. The root cause is that these methods rely on physics-based atmospheric imaging models, which fall short of accurately describing the degradation of underwater image quality. Moreover, to achieve optimal performance, such methods require accurate prior information, a requirement they often fail to meet in underwater imaging scenarios.
Thanks to the differentiation of attenuation and scattering coefficients with CIFM and the dynamic removal of backscatter based on image type by ADP, our method achieves the highest UCIQE, CEIQ, and IE scores on all four test datasets. Moreover, our method outperforms other state-of-the-art methods in terms of the NIQE score. In conclusion, our approach's superiority in color correction is demonstrated by both qualitative and quantitative evaluations.
4.4 Comparisons of Detail Enhancement
Precise fine structural details are essential for generating
high-quality underwater images. To evaluate the effective-
ness of various enhancement techniques in enhancing the
detailed portions of the images, we conducted a comparison
by localized zoom, as illustrated by the red and blue boxes
in Fig. 9. From a global perspective, our method effectively
removes color distortion and improves contrast. On a local
scale, it enhances image structure details and significantly
enhances clarity and information.
4.5 Ablation Study
To validate the efficacy of the core components in our method, i.e., the CIP+ and ADP modules, we conducted an extensive series of ablation studies on the UIEB dataset. The tested
variants include (a) the original image, (b) our method with-
out channel intensity prior depth estimation (-w/o CIP), (c)
our method without self-supervised depth estimation (-w/o
MONO), (d) our method with only red channel prior depth
estimates (-o/y R), (e) our method with only chromatic aber-
ration prior depth estimates (-o/y M), (f) our method without
adaptive dark pixel (-w/o ADP), and (g) our method (full model).

Fig. 9 Detail enhancement visual effects of different methods: (a) original image from MABLs; (b)–(k) results obtained by IBLA, GDCP, ULAP, WaterNet, SMBL, UWCNN, L2UWE, Ucolor, PUIE, and our method, respectively

Fig. 10 Qualitative ablation results for each key component of our method: (a) original image from UIEBD; (b)–(g) results obtained by -w/o CIP, -w/o MONO, -o/y R, -o/y M, -w/o ADP, and our method (full model), respectively
To assess the effectiveness of the CIP+ module, we performed a comprehensive ablation study. As depicted in Fig. 10(b)–(e), the backscatter was successfully removed in certain areas and the contrast was improved. However, inaccurate depth estimation introduced artifacts and over-enhancement in some regions. The depth estimation results for each variant from (b) to (e) are shown in Fig. 11. Both the MONO2 and M depth maps exhibited inaccuracies of varying extents, primarily due to color bias and luminance loss. Although the R depth map appeared accurate, it incorporated excessive image detail and lacked smoothness. In contrast, the CIP+ model provides a more accurate and smoother depth estimate. The objective ablation results are reported in Table 3. All metrics of the incomplete CIP+ variants experienced varying degrees of decline, attributable to depth estimation errors. Therefore, the CIP+ module is the superior choice.
To explore the effectiveness of the ADP module, we eliminated it to obtain a variant of the method. As shown in Fig. 10, variant (f) successfully removes color bias and enhances texture detail, but also causes local darkness due to the fitting error. Meanwhile, Fig. 10(g) includes all the critical components and achieves the best visual outcome. The fitting results with and without the ADP module are displayed in Fig. 12; it is evident that the fitting loss is smaller with the addition of the ADP module in various scenarios. To further validate our ablation study, we employed the full-reference metrics SSIM (Structural Similarity Index) and PSNR (Peak Signal-to-Noise Ratio). The detailed ablation results are presented in Table 3. All indicators of the method without the ADP module show noticeable declines across the performance metrics. This is attributed to the ADP module's ability to dynamically select dark pixel points based on the degradation level of each image, thereby minimizing fitting errors. As a result, the ADP module proves to be a crucial component of our method.
4.6 Running Time Comparisons
To evaluate the computational efficiency of our method, we
created an underwater image dataset composed of 100 images
Fig. 11 Qualitative ablation results for each key component of our depth estimation method: (a) original image from UIEBD; (b)–(f) results obtained by MONO, M, R, CIP, and CIP+, respectively. In (g), the x-axis is the depth and the y-axis is the chromaticity, reflecting the depth values represented by the different colors in the depth maps
Table 3 Ablation study on the UIEB dataset
-w/o CIP -w/o MONO -o/y R -o/y M -w/o ADP Ours
UCIQE 0.6192 0.6096 0.6009 0.6100 0.6189 0.6214
CEIQ 3.3982 3.3587 3.3026 3.3379 3.3750 3.4019
IE 7.4164 7.3681 7.2503 7.3475 7.3983 7.4304
PSNR 16.517 16.803 16.660 16.420 16.473 17.374
SSIM 0.8038 0.8039 0.8048 0.7929 0.7980 0.8178
The top performer is indicated in bold, and the second best in italics for each case
for each of the following sizes: 256×256, 512×512, and 1024×1024. We ran the tests on a PC with an Intel Xeon Silver 4215R CPU @ 3.20 GHz and an NVIDIA Tesla V100 PCIE 32 GB GPU. The traditional methods were run using MATLAB R2019a, while Python and PyTorch were employed for executing the deep learning-based methods.
Table 4 indicates that deep learning methods generally run faster than traditional ones, owing to their offline training and GPU utilization. Conversely, traditional restoration methods, such as IBLA and ours, consume a considerable amount of time calculating the background light and transmission map. This leads to an extended runtime, driven primarily by the need for repetitive transmission map estimations. Although our approach does not outperform the other methods in terms of processing speed, it effectively removes blur and color bias while addressing the issue of artificial light.
5 Conclusion
In this paper, we propose a novel method for artificial light removal by combining an adaptive dark pixel prior and a color correction technique within the CIFM framework. We adopt the CIP+ depth estimation technique based on the law of light attenuation and unsupervised methods, considering the degree of image degradation. Additionally, we employ ADP to remove backscatter effectively. Our method demonstrates robust performance across various underwater environments and illumination conditions, yielding visually pleasing images. Objective experiments show that, in terms of UCIQE/CEIQ, our method outperforms GDCP by 6.64% and 6.79%, and the recent data-driven PUIE approach by 11.36% and 3.35%, evidencing significant improvements in
color recovery and detail enhancement.

Fig. 12 Qualitative ablation results for the key component of our scattering fit: (a) original image from UIEBD; (b) the backward scattering coefficient fit of -w/o ADP; (c) the backward scattering coefficient fit of our method

Table 4 Average runtime of different underwater image enhancement techniques
Methods IBLA GDCP ULAP WaterNet SMBL UWCNN L2UWE Ucolor PUIE Ours
Time(s) 256×256 9.134 0.163 0.560 0.091 3.070 0.050 4.829 0.576 0.018 15.114
512×512 38.955 0.558 1.593 0.144 13.299 0.061 17.362 0.934 0.035 43.031
1024×1024 165.312 2.062 6.492 0.334 56.268 0.087 68.312 2.656 0.129 135.559
The top performer is indicated in bold, and the second best in italics for each case

The extensive experiments clearly show the efficacy of the proposed method in
enhancing details and restoring the natural color, highlighting
its potential for underwater image restoration. Nevertheless, our approach requires a longer runtime for accurate depth estimation than deep learning-based methods. In future work, we aim to accelerate it and optimize the fitting procedure.
Funding This work was supported in part by the National Natural
Science Foundation of China (No. 61702074), the Liaoning Provin-
cial Natural Science Foundation of China (No. 20170520196), and
the Fundamental Research Funds for the Central Universities (Nos.
3132019205 and 3132019354).
Data Availability Data underlying the results presented in our work are available in UIEBD, U45, UCCS, MABLs, and EUVP.
Declarations
Conflict of interest The authors declare no conflict of interest.
References
Akkaynak, D., & Treibitz, T. (2018). A revised underwater image for-
mation model. In Proceedings of the IEEE conference on computer
vision and pattern recognition (pp. 6723–6732).
Akkaynak, D., & Treibitz, T. (2019) Sea-thru: A method for removing
water from underwater images. In Proceedings of the IEEE/CVF
conference on computer vision and pattern recognition (pp. 1682–
1691).
Akkaynak, D., Treibitz, T., Shlesinger, T., Loya, Y., Tamir, R., & Iluz,
D. (2017). What is the space of attenuation coefficients in under-
water computer vision? In Proceedings of the IEEE conference on
computer vision and pattern recognition (pp. 4931–4940).
Ancuti, C. O., Ancuti, C., De Vleeschouwer, C., & Bekaert, P. (2017).
Color balance and fusion for underwater image enhancement.
IEEE Transactions on image processing, 27(1), 379–393.
Ancuti, C., Ancuti, C. O., Haber, T., & Bekaert, P. (2012). Enhancing
underwater images and videos by fusion. In 2012 IEEE conference
on computer vision and pattern recognition (pp. 81–88). IEEE.
Berman, D., Treibitz, T., & Avidan, S. (2017). Diving into haze-lines:
Color restoration of underwater images. In Proc. British machine
vision conference (BMVC) (Vol 1).
Bhoi, A. (2019). Monocular depth estimation: A survey.
arXiv:1901.09402.
Carlevaris-Bianco, N., Mohan, A., & Eustice, R.M. (2010) Initial results
in underwater single image dehazing. In OCEANS 2010 MTS/IEEE Seattle (pp. 1–8). IEEE.
Dong, X., & Wen, J. (2016). Low lighting image enhancement using
local maximum color value prior. Frontiers of Computer Science,
10(1), 147–156.
Drews, P., Nascimento, E., Moraes, F., Botelho, S., & Campos, M.
(2013). Transmission estimation in underwater single images. In
Proceedings of the IEEE international conference on computer
vision workshops (pp. 825–830).
Ebner, M., & Hansen, J. (2013). Depth map color constancy. Bio-
Algorithms and Med-Systems, 9(4), 167–177.
Fu, Z., Wang, W., Huang, Y., Ding, X., & Ma, K.-K. (2022) Uncertainty
inspired underwater image enhancement. In European conference
on computer vision (pp. 465–482). Springer.
Galdran, A., Pardo, D., Picón, A., & Alvarez-Gila, A. (2015). Automatic
red-channel underwater image restoration. Journal of Visual Com-
munication and Image Representation, 26, 132–145.
Godard, C., Mac Aodha, O., Firman, M., & Brostow, G. J. (2019).
Digging into self-supervised monocular depth estimation. In Pro-
ceedings of the IEEE/CVF international conference on computer
vision (pp. 3828–3838).
He, K., Sun, J., & Tang, X. (2010). Single image haze removal using
dark channel prior. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 33(12), 2341–2353.
Hitam, M. S., Awalludin, E. A., Yussof, W. N. J. H. W., & Bachok, Z.
(2013) Mixture contrast limited adaptive histogram equalization
for underwater image enhancement. In 2013 international con-
ference on computer applications technology (ICCAT) (pp. 1–5).
IEEE.
Islam, M. J., Xia, Y., & Sattar, J. (2020). Fast underwater image
enhancement for improved visual perception. IEEE Robotics and
Automation Letters, 5(2), 3227–3234.
Jaffe, J. S. (1990). Computer modeling and the design of optimal under-
water imaging systems. IEEE Journal of Oceanic Engineering,
15(2), 101–111.
Jiang, Q., Gu, Y., Li, C., Cong, R., & Shao, F. (2022). Underwater
image enhancement quality evaluation: Benchmark dataset and
objective metric. IEEE Transactions on Circuits and Systems for
Video Technology, 32(9), 5959–5974.
Jiang, Z., Li, Z., Yang, S., Fan, X., & Liu, R. (2022). Target ori-
ented perceptual adversarial fusion network for underwater image
enhancement. IEEE Transactions on Circuits and Systems for
Video Technology, 32(10), 6584–6598.
Kang, Y., Jiang, Q., Li, C., Ren, W., Liu, H., & Wang, P. (2023). A
perception-aware decomposition and fusion framework for under-
water image enhancement. IEEE Transactions on Circuits and
Systems for Video Technology, 33(3), 988–1002.
Li, C., Anwar, S., Hou, J., Cong, R., Guo, C., & Ren, W. (2021).
Underwater image enhancement via medium transmission-guided
multi-color space embedding. IEEE Transactions on Image Pro-
cessing, 30, 4985–5000.
Li, C., Anwar, S., & Porikli, F. (2020). Underwater scene prior inspired
deep underwater image and video enhancement. Pattern Recogni-
tion, 98, 107038.
Li, C., Guo, C., Ren, W., Cong, R., Hou, J., Kwong, S., & Tao, D.
(2019). An underwater image enhancement benchmark dataset and
beyond. IEEE Transactions on Image Processing, 29, 4376–4389.
Li, H., Li, J., & Wang, W. (2019). A fusion adversarial under-
water image enhancement network with a public test dataset.
arXiv:1906.06819.
Li, J., Skinner, K. A., Eustice, R. M., & Johnson-Roberson, M. (2017).
Watergan: Unsupervised generative network to enable real-time
color correction of monocular underwater images. IEEE Robotics
and Automation letters, 3(1), 387–394.
Li, K., Wu, L., Qi, Q., Liu, W., Gao, X., Zhou, L., & Song, D. (2022).
Beyond single reference for training: Underwater image enhance-
ment via comparative learning. IEEE Transactions on Circuits and
Systems for Video Technology 1–1.
Liu, J., Fan, X., Jiang, J., Liu, R., & Luo, Z. (2021). Learning a deep
multi-scale feature ensemble and an edge-attention guidance for
image fusion. IEEE Transactions on Circuits and Systems for Video
Technology, 32(1), 105–119.
Liu, J., Shang, J., Liu, R., & Fan, X. (2022). Attention-guided global-
local adversarial learning for detail-preserving multi-exposure
image fusion. IEEE Transactions on Circuits and Systems for Video
Technology, 32(8), 5026–5040.
Liu, R., Fan, X., Zhu, M., Hou, M., & Luo, Z. (2020). Real-world
underwater enhancement: Challenges, benchmarks, and solutions
under natural light. IEEE Transactions on Circuits and Systems
for Video Technology, 30(12), 4861–4875.
Liu, R., Jiang, Z., Yang, S., & Fan, X. (2022). Twin adversarial con-
trastive learning for underwater image enhancement and beyond.
IEEE Transactions on Image Processing, 31, 4922–4936.
Marques, T. P., & Albu, A. B. (2020). L2uwe: A framework for the
efficient enhancement of low-light underwater images using local
contrast and multi-scale fusion. In Proceedings of the IEEE/CVF
conference on computer vision and pattern recognition workshops
(pp 538–539).
Mittal, A., Soundararajan, R., & Bovik, A. C. (2012). Making a “com-
pletely blind” image quality analyzer. IEEE Signal Processing
Letters, 20(3), 209–212.
Pan, J., Sun, D., Zhang, J., Tang, J., Yang, J., Tai, Y.-W., & Yang, M.-H.
(2022). Dual convolutional neural networks for low-level vision.
International Journal of Computer Vision, 130(6), 1440–1458.
Peng, Y.-T., Cao, K., & Cosman, P. C. (2018). Generalization of the dark
channel prior for single image restoration. IEEE Transactions on
Image Processing, 27(6), 2856–2868.
Peng, Y.-T., & Cosman, P. C. (2017). Underwater image restoration
based on image blurriness and light absorption. IEEE Transactions
on Image Processing, 26(4), 1579–1594.
Qi, Q., Li, K., Zheng, H., Gao, X., Hou, G., & Sun, K. (2022). Sguie-net:
Semantic attention guided underwater image enhancement with
multi-scale perception. IEEE Transactions on Image Processing,
31, 6816–6830.
Qi, Q., Zhang, Y., Tian, F., Wu, Q. J., Li, K., Luan, X., & Song, D.
(2021). Underwater image co-enhancement with correlation fea-
ture matching and joint learning. IEEE Transactions on Circuits
and Systems for Video Technology, 32(3), 1133–1147.
Ren, W., Pan, J., Zhang, H., Cao, X., & Yang, M.-H. (2020). Sin-
gle image dehazing via multi-scale convolutional neural networks
with holistic edges. International Journal of Computer Vision, 128,
240–259.
Ren, W., Zhang, J., Pan, J., Liu, S., Ren, J. S., Du, J., Cao, X., & Yang,
M.-H. (2021). Deblurring dynamic scenes via spatially varying
recurrent neural networks. IEEE Transactions on Pattern Analysis
and Machine Intelligence, 44(8), 3974–3987.
Song, W., Wang, Y., Huang, D., Liotta, A., & Perra, C. (2020). Enhance-
ment of underwater images with statistical model of background
light and optimization of transmission map. IEEE Transactions on
Broadcasting, 66(1), 153–169.
Song, W., Wang, Y., Huang, D., & Tjondronegoro, D. (2018). A rapid
scene depth estimation model based on underwater light attenuation
prior for underwater image restoration. In Pacific Rim conference
on multimedia (pp. 678–688). Springer.
Tang, Y., Iwaguchi, T., Kawasaki, H., Sagawa, R., & Furukawa, R.
(2022). Autoenhancer: Transformer on u-net architecture search
for underwater image enhancement. In Proceedings of the Asian
conference on computer vision (pp. 1403–1420).
Xu, X., Cai, Y., Liu, C., Jia, K., & Shen, L. (2008). Color cast detection
and color correction methods based on image analysis. Measure-
ment & Control Technology, 27(5), 10–12.
Yan, J., Li, J., & Fu, X. (2019). No-reference quality assess-
ment of contrast-distorted images using contrast enhancement.
arXiv:1904.08879.
Yang, H.-H., Huang, K.-C., & Chen, W.-T. (2021). Laffnet: A lightweight
adaptive feature fusion network for underwater image enhance-
ment. In 2021 IEEE international conference on robotics and
automation (ICRA) (pp. 685–692). IEEE.
Yang, M., & Sowmya, A. (2015). An underwater color image qual-
ity evaluation metric. IEEE Transactions on Image Processing,
24(12), 6062–6071.
Yuan, J., Cai, Z., & Cao, W. (2021). Tebcf: Real-world underwater
image texture enhancement model based on blurriness and color
fusion. IEEE Transactions on Geoscience and Remote Sensing,
60, 1–15.
Zang, Y., Zhou, K., Huang, C., & Loy, C. C. (2023). Semi-supervised
and long-tailed object detection with cascadematch. International
Journal of Computer Vision, 131, 1–15.
Zhang, K., Ren, W., Luo, W., Lai, W.-S., Stenger, B., Yang, M.-H.,
& Li, H. (2022). Deep image deblurring: A survey. International
Journal of Computer Vision, 130(9), 2103–2130.
Zhang, W., Dong, L., Pan, X., Zhou, J., Qin, L., & Xu, W. (2019). Single
image defogging based on multi-channel convolutional MSRCR.
IEEE Access, 7, 72492–72504.
Zhou, J., Wang, Y., & Zhang, W. (2022). Underwater image restoration
via information distribution and light scattering prior. Computers
and Electrical Engineering, 100, 107908.
Zhou, J., Yang, T., Chu, W., & Zhang, W. (2022). Underwater image
restoration via backscatter pixel prior and color compensation.
Engineering Applications of Artificial Intelligence, 111, 104785.
Zhou, J., Yang, T., Ren, W., Zhang, D., & Zhang, W. (2021). Underwater
image restoration via depth map and illumination estimation based
on a single image. Optics Express, 29(19), 29864–29886.
Zhuang, P., & Ding, X. (2020). Underwater image enhancement using
an edge-preserving filtering retinex algorithm. Multimedia Tools
and Applications, 79, 17257–17277.
Zhuang, P., Li, C., & Wu, J. (2021). Bayesian retinex underwater image
enhancement. Engineering Applications of Artificial Intelligence,
101, 104171.
Zhuang, P., Wu, J., Porikli, F., & Li, C. (2022). Underwater image
enhancement with hyper-Laplacian reflectance priors. IEEE Trans-
actions on Image Processing, 31, 5442–5455.
Publisher’s Note Springer Nature remains neutral with regard to juris-
dictional claims in published maps and institutional affiliations.