International Journal of Computer Vision
https://doi.org/10.1007/s11263-023-01853-3
Underwater Camera: Improving Visual Perception Via Adaptive Dark
Pixel Prior and Color Correction
Jingchun Zhou1·Qian Liu1·Qiuping Jiang2·Wenqi Ren3·Kin-Man Lam4·Weishi Zhang1
Received: 8 February 2023 / Accepted: 12 July 2023
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023
Abstract
We present a novel method for underwater image restoration, which combines a Comprehensive Imaging Formation Model
with prior knowledge and unsupervised techniques. Our approach has two main components: depth map estimation using a
Channel Intensity Prior (CIP) and backscatter elimination through Adaptive Dark Pixels (ADP). The CIP effectively mitigates
issues caused by solid-colored objects and highlighted regions in underwater scenarios. The ADP, utilizing a dynamic depth
conversion, addresses issues associated with narrow depth ranges and backscatter. Furthermore, an unsupervised method is
employed to enhance the accuracy of monocular depth estimation and reduce artificial illumination influence. The final output
is refined via color compensation and a blue-green channel color balance factor, delivering artifact-free images. Experimental
results show that our approach outperforms state-of-the-art methods, demonstrating its efficacy in dealing with uneven lighting
and diverse underwater environments.
Keywords Underwater camera imaging · Underwater image · Image restoration · Image enhancement · Light scattering
Communicated by Chongyi Li.
Kin-Man Lam
enkmlam@polyu.edu.hk
Weishi Zhang
teesiv@dlmu.edu.cn
Jingchun Zhou
zhoujingchun03@qq.com
Qian Liu
qianliu@dlmu.edu.cn
Qiuping Jiang
jiangqiuping@nbu.edu.cn
Wenqi Ren
rwq.renwenqi@gmail.com
1School of Information Science and Technology, Dalian
Maritime University, No. 1 Lingshui Road, Dalian 116026,
Liaoning, China
2School of Information Science and Engineering, Ningbo
University, No. 818 Fenghua Road, Ningbo 315211,
Zhejiang, China
3School of Cyber Science and Technology, Sun Yat-Sen
University, Shenzhen Campus, Shenzhen 518107,
Guangdong, China
4Department of Electronic and Information Engineering, Hong
Kong Polytechnic University, Hong Kong 999077, China
1 Introduction
Underwater imaging is one of the critical technologies for
studying and exploring the underwater world. This technique
gathers information and images of the underwater environ-
ment and objects utilizing high-tech equipment and sensors,
such as laser radar, sonar, and imaging sensors (Kang et
al., 2023; Pan et al., 2022; Zhang et al., 2022). Underwa-
ter cameras can directly capture images of the underwater
environment and marine life, providing critical observational
data and evidence for research and exploration in ocean
energy development and marine life monitoring. Addition-
ally, underwater cameras can support the work of underwater
robots and divers, improving their operational efficiency
and safety (Jiang et al., 2022). However, underwater image
restoration poses more significant challenges than terrestrial
image restoration, due to selective absorption and scattering
caused by diverse aquatic media, inadequate lighting, and
inferior underwater imaging equipment (Ren et al., 2020; Qi
et al., 2022; Li et al., 2022; Liu et al., 2021; Jiang et al.,
2022). Specifically, color distortion in underwater images is
often caused by the selective absorption of light by water.
Furthermore, light scattering contributes to the degradation
of image clarity, resulting in low contrast and blurry details
(Zhuang et al., 2022). While the addition of artificial illumi-
nation has made the underwater environment more complex,
the reconstruction of color loss in underwater images remains
an essential and valuable area of research that has received
significant attention (Liu et al., 2022a,b; Qi et al., 2021;
Ren et al., 2021; Yuan et al., 2021). High-quality underwa-
ter images are valuable for various tasks, including target
detection (Zang et al., 2013), recognition, and segmentation.
To overcome these challenges, IFM-based methods are
used to reduce backscattering and improve color distortion
and contrast in underwater imaging. The depth map of a scene
is crucial for IFM, and traditional depth estimation methods
rely on hand-crafted priors, which can lead to errors. Deep
learning-based depth estimation methods are more accurate
and robust, but require a large dataset for network training,
which is difficult to obtain underwater.
Therefore, we propose a depth estimation method that
combines prior and unsupervised methods based on Com-
prehensive Imaging Formation Model (CIFM) to reconstruct
underwater images. Extensive experiments conducted on
multiple underwater databases have demonstrated that our
method can produce enhanced results with superior visual
quality compared to other relevant techniques. The major
contributions of this paper can be summarized as follows:
(1) We propose a novel restoration strategy based on
a CIFM, which involves three stages: monocular depth
estimation, backscatter removal, and color correction. This
approach leverages absolute depth map estimations and an
adaptive dark pixel prior for efficient and dynamic backscat-
ter elimination across varying depths, followed by a color
correction to rectify color bias and enhance image bright-
ness.
(2) We design the CIP to estimate the depth map, consid-
ering the underwater light attenuation rate. We integrate the
CIP depth map with an unsupervised estimate to generate a
fused depth map (CIP+), which effectively overcomes prior
failure and unsupervised errors.
(3) We construct the ADP to calculate the minimum and
maximum distances in the dynamic depth transformation
based on varying image degradation levels and NIQE met-
rics. ADP not only accelerates algorithmic operations but
also minimizes backscatter fitting errors by calculating the
sum channel and strategically selecting dark pixels for different
depth intervals.
(4) We develop a color compensation strategy that
improves the precision of the attenuation coefficient fit-
ting by defining the minimum distance between consecutive
data points. Concurrently, we devise a color balance proce-
dure that accounts for the pixel intensity distribution within
the blue and green channels to establish the color bal-
ance factor. Our approach adeptly circumvents the issue of
over-amplification artifacts commonly associated with the
low-intensity red channel in underwater imaging.
The remainder of this paper is organized as follows. In
Sect. 2, we introduce two imaging models and provide a suc-
cinct recap of previous work in the domain of underwater
image enhancement. Our proposed method is presented in
detail in Sect. 3. Subsequently, Sect. 4 presents an exten-
sive series of experiments validating the effectiveness of our
approach. Finally, we conclude and consolidate our findings
in Sect. 5, where we also discuss potential directions for
future work.
2 Background
In this section, we will first provide an overview of two under-
water imaging models. Subsequently, we review research
related to underwater image enhancement, including physi-
cal model-based methods, non-physical model-based meth-
ods, and deep learning-based methods.
2.1 Underwater Image Formation Model
In the Jaffe–McGlamery imaging model (Jaffe, 1990), depicted
in Fig. 1, the imaging process is represented as a combination
of three components: direct transmission, forward scattering,
and backscattering. Forward scattering is typically disregarded,
allowing the imaging model to be simplified as follows:

I_c = J_c exp(−β_c z) + A_c (1 − exp(−β_c z)), c ∈ {R, G, B}   (1)

where I and J are the degraded and clear images, respectively,
A is the global background light, β is the light attenuation
coefficient, and z is the distance from the camera to the scene.
exp(−β z) is the medium transmission map indicating the portion
of J that reaches the camera, so J exp(−β z) is the directly
transmitted component, while forward scattering is the main
cause of blur and fog effects. A(1 − exp(−β z)) is the backward
scattering, which causes contrast degradation and color bias in
underwater images.
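The simplified model of Eq. (1) can be sketched as a forward synthesis in a few lines of numpy; the attenuation coefficients and background light below are illustrative values chosen only to mimic the fast red-channel decay typical of water, not parameters from the paper:

```python
import numpy as np

def degrade(J, z, beta, A):
    """Synthesize a degraded underwater image via Eq. (1).

    J    : clear image, float array of shape (H, W, 3) in [0, 1]
    z    : scene depth in meters, shape (H, W)
    beta : per-channel attenuation coefficients, shape (3,)
    A    : per-channel global background light, shape (3,)
    """
    t = np.exp(-beta * z[..., None])   # medium transmission map
    return J * t + A * (1.0 - t)       # direct transmission + backscatter

# Toy example: a uniform gray scene receding from 1 m to 10 m.
J = np.full((4, 4, 3), 0.6)
z = np.linspace(1.0, 10.0, 16).reshape(4, 4)
beta = np.array([0.40, 0.10, 0.06])   # red attenuates fastest underwater
A = np.array([0.05, 0.35, 0.45])      # greenish-blue background light
I = degrade(J, z, beta, A)
```

As z grows, the observed color converges to the background light A, which is exactly the backscatter-dominated regime the later sections aim to invert.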
Akkaynak et al. (2017) noted the limitation of the
Jaffe–McGlamery imaging model in effectively portraying the
multifaceted nature of underwater imaging: it fails to account
for the different parameter dependencies of the direct-signal
attenuation coefficient and the backscattering coefficient,
instead merely assuming them to be identical. To delve deeper
into the imaging process, they conducted in-situ experiments
(Akkaynak & Treibitz, 2018) in two different types of optical
water bodies and analyzed the functional relationships and
parameter dependencies. Their work exposed the inaccuracies
originating from this oversimplification, prompting them to
design a revised, more robust underwater imaging model.
Fig. 1 Underwater imaging model and light absorption schematic
I_c = J_c exp(−β_c^D(ν_D) · z) + A_c^∞ (1 − exp(−β_c^B(ν_B) · z))   (2)

where I, J, A, z, and c are the same as in the Jaffe–McGlamery
imaging model, and the vectors ν_D and ν_B denote the parameter
dependence of the attenuation coefficient β_c^D in direct
transmission and the scattering coefficient β_c^B in backward
scattering, respectively, as follows:

ν_D = {z, ξ, H, R_s, β},  ν_B = {H, R_s, γ, β}   (3)

where ξ, H, and R_s denote the scene reflectance, irradiance,
and sensor spectral response parameters, respectively, β and γ
denote the beam attenuation and scattering coefficients,
respectively, and Δz denotes the change in imaging distance.
Based on the wavelength λ of visible light and the global
background light A^∞(λ), β_c^D and β_c^B can be further
expressed as:

β_c^D = ln [ ∫ R_s(λ) ξ(λ) H(λ) exp(−β(λ) z) dλ
           / ∫ R_s(λ) ξ(λ) H(λ) exp(−β(λ)(z + Δz)) dλ ] / Δz   (4)

β_c^B = −ln [ 1 − ∫ R_s(λ) A^∞(λ)(1 − exp(−β(λ) z)) dλ
            / ∫ R_s(λ) A^∞(λ) dλ ] / z   (5)
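Equation (4) can be made concrete with a small numerical sketch: the effective per-channel coefficient is a spectrally weighted quantity that drifts with imaging distance z, unlike the single constant assumed by the Jaffe–McGlamery model. All curves below (sensor response, reflectance, irradiance, beam attenuation) are illustrative stand-ins, not measured data:

```python
import numpy as np

lam = np.linspace(400.0, 700.0, 301)                  # wavelength grid, nm
dlam = lam[1] - lam[0]
Rs = np.exp(-((lam - 600.0) ** 2) / (2 * 40.0 ** 2))  # assumed red response
xi = np.full_like(lam, 0.5)                           # assumed reflectance
H = np.ones_like(lam)                                 # assumed irradiance
beta = 0.2 + 0.4 * (lam - 400.0) / 300.0              # assumed attenuation

def beta_D(z, dz):
    """Eq. (4): effective direct-signal attenuation coefficient for one
    channel at range z, via simple Riemann-sum integration over lambda."""
    num = np.sum(Rs * xi * H * np.exp(-beta * z)) * dlam
    den = np.sum(Rs * xi * H * np.exp(-beta * (z + dz))) * dlam
    return np.log(num / den) / dz

# The coefficient decreases with range: at larger z, the surviving light
# concentrates in the less-attenuated part of the spectrum.
b_near, b_far = beta_D(1.0, 0.1), beta_D(5.0, 0.1)
```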
However, compared to the Jaffe–McGlamery imaging model,
using the CIFM to invert the underwater degradation process is
limited to certain scenarios, owing to its reliance on precise
depth information and a series of manually measured optical
parameters.
2.2 Related Work
Recently, various techniques have been devised to enhance
the clarity of underwater images. These underwater image
enhancement (UIE) methods can be broadly categorized into
three groups: physical model-based, non-physical model-
based, and deep learning-based methods.
Physical model-based methods These methods are rooted in
a physical imaging model for underwater environments and
employ specific prior constraints to determine the background
light and transmission maps, thereby inverting the degradation
process and producing high-quality images.
Several prior-based depth estimation methods have been
proposed for underwater imaging. Carlevaris-Bianco et al.
(2010) introduced the Maximum Intensity Prior (MIP), which estimates
depth using the significant differences in light attenuation
across the three color channels in water. Drews et al. (2013)
designed the Underwater Dark Channel Prior (UDCP), which
is based on the selective absorption of light by water, exclud-
ing the red channel. Peng and Cosman (2017) proposed
the Image Blurriness and Light Absorption (IBLA) method,
which relies on image blurriness and light absorption priors
to estimate the scene depth and restore degraded underwater
images. Recently, Berman et al. (2017) suggested the Haze-Line
prior to deal with wavelength-dependent attenuation in
underwater images. Song et al. (2018) proposed a
Table 1 Underwater image depth estimation priors

Method | Prior on I_c | Formula
Carlevaris-Bianco et al. (2010) | D_mip | min_{c∈{r}, y∈Ω(x)} I_c(y) − max_{c∈{g,b}, y∈Ω(x)} I_c(y)
He et al. (2010) | I_dark^rgb(x) | min_{c∈{r,g,b}, y∈Ω(x)} I_c(y)
Drews et al. (2013) | I_dark^gb(x) | min_{c∈{g,b}, y∈Ω(x)} I_c(y)
Galdran et al. (2015) | I_dark^rgb(x) | min_{y∈Ω(x)} {1 − I_r(y), I_g(y), I_b(y)}
Dong and Wen (2016) | I_dark^rgb(x) | min_{c∈{r,g,b}, y∈Ω(x)} {1 − I_c(y)}
Peng and Cosman (2017) | I_r(x), P_blr, D_mip | max_{y∈Ω(x)} I_r(y); C_r max_{y∈Ω(x)} (1/n) Σ_{i=1}^{n} |I_g(y) − G_{r_i,r_i}(y)|; min_{c∈{r}, y∈Ω(x)} I_c(y) − max_{c∈{g,b}, y∈Ω(x)} I_c(y)
Song et al. (2018) | I^rgb(x) | max_{c∈{g,b}, y∈Ω(x)} I_c(y) − min_{c∈{r}, y∈Ω(x)} I_c(y)
Zhou et al. (2022) | L(x), D_mip | (1/2)(max_{c∈{r,g,b}, y∈Ω(x)} I_c(y) + min_{c∈{r,g,b}, y∈Ω(x)} I_c(y)); max_{c∈{g,b}, y∈Ω(x)} I_c(y) − min_{c∈{r}, y∈Ω(x)} I_c(y)
quick depth estimation model based on the Underwater Light
Attenuation Prior (ULAP). The model’s coefficients were
trained using supervised linear regression with a learning-
based approach. Furthermore, Akkaynak and Treibitz (2019)
improved the underwater imaging model, called Sea-Thru,
by considering underwater specificities. Based on CIFM
(Akkaynak & Treibitz, 2019), Zhou et al. (2021,2022)
proposed a new underwater unsupervised depth estimation
method and a backward scattering-based color compensa-
tion method, respectively. Table 1lists several prior-based
depth estimation methods. These methods based on physi-
cal models are usually efficient, yet they heavily depend on
manually designed prior knowledge.
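As a concrete reference for two of the patch-minimum entries in Table 1, the dark-channel-style priors can be sketched with local minimum filters over the patch Ω(x); the 15×15 window below is an assumed patch size, not one specified by the table:

```python
import numpy as np

def local_min(img, size=15):
    """Minimum over a size x size patch Omega(x) centered at each pixel."""
    pad = size // 2
    p = np.pad(img, pad, mode="edge")
    return np.lib.stride_tricks.sliding_window_view(p, (size, size)).min(axis=(2, 3))

def dcp(I, size=15):
    # Dark Channel Prior (He et al., 2010): min over {r,g,b} and Omega(x).
    return local_min(I.min(axis=2), size)

def udcp(I, size=15):
    # Underwater DCP (Drews et al., 2013): min over {g,b} only, since the
    # red channel is unreliable underwater.
    return local_min(I[..., 1:].min(axis=2), size)

rng = np.random.default_rng(0)
I = rng.random((32, 32, 3))   # synthetic RGB image in [0, 1]
d1, d2 = dcp(I), udcp(I)
```

Because DCP takes the minimum over a superset of channels, its dark channel can never exceed UDCP's at any pixel.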
Non-physical model-based methods These methods enhance
image quality without relying on physical models, directly
manipulating pixel values to produce more visually appealing
underwater images. Popular techniques
include image fusion (Ancuti et al., 2012,2017), histogram
stretching (Hitam et al., 2013), and Retinex-based methods
(Zhuang & Ding, 2020; Zhuang et al., 2021). For example,
Ancuti et al. (2012) suggested a fusion approach that com-
bined different feature images into a single image through
weight assignment. Following this, Ancuti et al. (2017)
advanced their approach by creating a multiscale fusion
technique that combined the white balance method’s color
correction and the histogram method’s contrast-boosted ver-
sion, resulting in favorable outcomes for underwater images
that experience substantial red channel reduction. Hitam et
al. (2013) developed a hybrid Contrast-Limited Adaptive
Histogram Equalization (CLAHE) method, which carries out
CLAHE operations on both the RGB and HSV color models and
then merges the results using the Euclidean norm, thereby
improving image contrast in small areas.
Zhuang et al. (2021) proposed a Bayesian Retinex algorithm,
which simplifies the complex underwater image enhance-
ment process by dividing it into two simpler denoising sub-
problems using multi-order gradient priors on reflectance and
illumination. However, these image enhancement techniques
lack the consideration of the fundamental principles of under-
water imaging, which can result in over-enhancement and
overexposure in the output images.
Deep learning-based methods The trend of deep learning
in the field of UIE has emerged due to its exceptional capabil-
ity in robust and powerful feature learning. In Li et al. (2020), an
underwater image enhancement network (UWCNN) based
on CNN was developed, utilizing synthetic underwater
images for training. The network aims to restore clear under-
water images through an end-to-end approach that considers
the optical properties of various underwater environments.
Nonetheless, the UWCNN lacks the capability to determine
the appropriate water type automatically. An unsupervised
color correction network named WaterGAN was presented
by Li et al. (2017). It integrated a generative adversarial net-
work (GAN) with a physical underwater imaging model,
generating a dataset of improved underwater images and
accompanying depth information. Li et al. (2019) proposed
a gated fusion network, known as WaterNet, and created an
underwater image enhancement benchmark dataset (UIEBD)
that includes a variety of scenes. They also developed cor-
responding high-quality reference images. Li et al. (2021)
developed a Ucolor network by taking inspiration from a
physical underwater imaging model. The network incor-
porates multicolor spatial embedding and media transport
guidance to improve its response to areas with degraded
quality. The proposed Lightweight Adaptive Feature Fusion
Network (LAFFNet) in Yang et al. (2021) incorporates mul-
tiple adaptive feature fusion modules from the codec model
to generate multi-scale feature mappings and utilizes chan-
nel attention to merge these features dynamically. Tang et
al. (2022) introduced a new search space that incorporates
transformers and employs neural structure search to find the
optimal U-Net structure for enhancing underwater images.
This yields an effective and lightweight
deep network. Fu et al. (2022) proposed a new
probabilistic network, PUIE, to enhance the distribution of
degraded underwater images and mitigate bias in reference
map markers. However, deep learning-based UIE methods
face a common challenge: the need for large, high-quality
public training datasets.

Fig. 2 The flowchart of the proposed approach. Our methodology com-
prises three steps: depth estimation, backscatter removal, and color
reconstruction. Specifically, the CIP+ depth map is derived by fus-
ing the Channel Intensity Prior (CIP) depth map, which accounts for
distinct light attenuation laws, with the MONO2 depth map obtained
from an unsupervised approach. Backscatter is then removed using an
Adaptive Dark Pixel (ADP) technique, dynamically adapted according
to varying degrees of image degradation. Finally, the image's color and
luminance are reconstructed via color compensation and balancing
3 Proposed Method
In this study, we propose a novel approach for underwater
image restoration leveraging the Comprehensive Imaging
Formation Model (CIFM). The proposed method encom-
passes the development of an enhanced Channel Intensity
Prior (CIP+) for depth estimation, the deployment of
Adaptive Dark Pixels (ADP) for backscatter removal, and
advanced techniques for color reconstruction. The overall
process is illustrated in Fig. 2, which will be detailed in the
following sections.
3.1 Simplified Model
For many years, IFM-based methods have been popular for
recovering underwater images. However, unlike in the
atmosphere, their results are inconsistent and depend on the
underwater environment, making them unreliable and unstable.
This is because underwater imaging depends on both wavelength
and scene. The IFM assumes that the attenuation and scattering
coefficients are equal, whereas the improved underwater imaging
model (CIFM) explicitly considers their differences (Zhou et
al., 2021). However, the CIFM is difficult to apply to
underwater image recovery, owing to its numerous parameters,
which are hard to estimate. To tackle this issue, our method
simplifies the
improved underwater imaging model. Based on Akkaynak
and Treibitz (2018), we understand that there is one attenua-
tion coefficient value for each color and distance in the scene,
the imaging distance has the greatest effect on the attenuation
coefficient, and there is only one scattering coefficient
value for the whole scene. In addition, the backward scattering
increases exponentially with the imaging distance, i.e., in a
fixed scene (water type), the value of the scattering coefficient
also depends on the imaging distance. Therefore, we ignore other
parameters with a smaller effect and focus only on the effect of
imaging distance on the attenuation and scattering coefficients.
The simplified improved imaging model obtained is as follows:

I_c = J_c e^{−β_c^D · z} + A_c (1 − e^{−β_c^B · z})   (6)

where I_c, J_c, A_c, z, β_c^D, and β_c^B are consistent with the
revised model. Compared with the IFM, the simplified and improved
imaging model considers the various functional dependencies
between the direct transmission and backward scattering
components, leading to a more precise representation of
underwater imaging.

Fig. 3 Depth estimation for underwater images: a DCP (He et al., 2010),
b UDCP (Drews et al., 2013), c MIP (Carlevaris-Bianco et al., 2010),
d IRC (inverted R-channel intensity) (Galdran et al., 2015), and e our
depth estimation method. The top row shows the estimated results for
the natural light scene, while the bottom row demonstrates estimates
for the artificial light scene. The darker areas represent regions further
from the camera. The images are sourced from the UIEBD dataset
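Given a depth map and fitted coefficients, the simplified model of Eq. (6) inverts in closed form: subtract the backscatter term, then divide out the direct-transmission decay. A minimal numpy sketch, with purely illustrative coefficient values (in practice they come from the fitting stages described later):

```python
import numpy as np

def restore(I, z, beta_D, beta_B, A):
    """Invert Eq. (6): J = (I - A (1 - e^{-beta_B z})) e^{beta_D z}."""
    B = A * (1.0 - np.exp(-beta_B * z[..., None]))       # backscatter term
    return np.clip((I - B) * np.exp(beta_D * z[..., None]), 0.0, 1.0)

# Round trip on synthetic data: degrade with Eq. (6), then restore.
J = np.full((2, 2, 3), 0.5)
z = np.array([[1.0, 2.0], [3.0, 4.0]])
beta_D = np.array([0.30, 0.10, 0.08])   # illustrative, per channel
beta_B = np.array([0.20, 0.15, 0.12])   # illustrative, per channel
A = np.array([0.10, 0.30, 0.40])
I = J * np.exp(-beta_D * z[..., None]) + A * (1 - np.exp(-beta_B * z[..., None]))
J_hat = restore(I, z, beta_D, beta_B, A)
```

Note that the exponential amplification e^{β_D z} also magnifies any error in the estimated backscatter, which is why the depth and backscatter stages below matter so much.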
3.2 Depth Estimation
The simplified and improved imaging model is contingent on
the absolute depth of the scene, making depth map estimation
a crucial step of the processing procedure. The
sea-thru method (Akkaynak & Treibitz, 2019) employs the
Structure From Motion (SFM) method to obtain the depth
map in meters. However, the SFM approach requires multi-
ple images of the same scene, which becomes a limitation
given the variable and challenging underwater conditions.
This constraint underscores the importance of monocular
depth estimation in enhancing the flexibility of underwater
imaging applications.
Traditional depth estimation methods using DCP and MIP
can obtain accurate results under ideal lighting conditions
for monocular depth estimation. However, when underwater
lighting conditions are not ideal, these prior assumptions are
violated, causing decreased accuracy in depth estimation
and recovery outcomes.
The first row in Fig. 3 presents a natural illumination
image of the shallow water area. Regarding the DCP, the
fish and coral in the foreground exhibit dark pixels, resulting
in the dark channels having low values. Hence, they are accu-
rately identified to be near. Conversely, the background lacks
extremely dark pixels, leading the dark channel to exhibit
high values, and these areas are inferred to be relatively
distant. For the MIP, the closer sites yield more significant
Dmip values than the farther sites, thereby providing accurate
depth estimation. However, the far end of this image exhibits
substantially greater brightness than the near point, i.e., the
distant R-channel intensity is larger, leading to an error in the
IRC depth estimation. The second row shows an underwa-
ter image captured under artificial lighting. The traditional
depth estimation methods yield unsatisfactory results. The
DCP incorrectly classifies the bright fish in the foreground as
distant, and the depth span is not obvious. The RGB channel
values are similar across the image, leading to incorrect MIP
and IRC estimates. Unlike these methods, our method accurately
delineates the foreground and the background under different
lighting conditions, achieving more precise depth variations.
Deep learning-based techniques leverage the remarkable
feature extraction capacity of neural networks, resulting in
increased robustness and precision. However, obtaining the
pixel-level depth datasets necessary for training supervised
depth estimation methods is challenging (Bhoi, 2019). In
contrast, instead of minimizing the error in the classical depth
map, the unsupervised depth estimation method monodepth2
(Godard et al., 2019) network estimates the pose between
two images, computes the depth map, and then uses the esti-
mated pose and depth map to compute the reprojection of the
first image on the second image. The loss to be minimized
is the reconstruction error of that estimate. This enables the
network to receive training solely from stereo images,
eliminating the limitations posed by the dataset, and producing
precise depth map estimates.

Fig. 4 Monodepth2 performs depth estimation of underwater images and
the results are used directly in the recovery stage of our method. The top
row showcases a successful outcome, whereas the bottom row demonstrates
a failure case. The source of these images is the UIEBD
Nevertheless, compared to atmospheric images, underwa-
ter images often exhibit significant color cast due to selective
light absorption by water. Monodepth2 is inefficient in deal-
ing with heavily skewed underwater images, leading to a
failure in image recovery. The top row in Fig. 4 shows an image
suitable for the monodepth2 method, which allows for precise
depth map estimation and optimal recovery outcomes. The second
row shows a classical deep-sea image, with severely attenuated
red and blue channels and a greenish hue; the monodepth2 method
is unsuitable for depth estimation and restoration of such
images.
In order to make Monodepth2 work in underwater scenar-
ios, we present CIP+, a solution that merges unsupervised
and prior techniques to calculate the scene depth. Our
proposal starts with the channel intensity prior (CIP). In
underwater environments, red light with the longest wave-
length decays quickly, followed by blue-green light. This
means that the farther an object is, the lower its red-channel
intensity and the larger the intensity difference between the
blue-green channels and the red channel. We then
elaborate on how CIP and unsupervised methods are com-
bined using color deviation factors.
The definition of the red channel map R is as follows:
R(x) = min_{y∈Ω(x)} {1 − I_r(y)}   (7)

where Ω(x) is a square local patch centered at x, and I_c is the
observed intensity in color channel c of the input image at
pixel x. Based on the CIP, red light, with the longest
wavelength, decays quickest as distance increases. As a result,
the farther a region is from the camera, the smaller the
red-channel proportion of the image. Hence, we can calculate a
depth estimate directly from the red channel map, denoted d_r:

d_r = N_s(R)   (8)

where N_s is a normalization function, defined as follows:

N_s(ν) = (ν − min ν) / (max ν − min ν)   (9)

where ν is a vector.
The chromatic aberration map M is defined as:

M(x) = max_{y∈Ω(x)} {I_g(y), I_b(y)} − max_{y∈Ω(x)} I_r(y)   (10)

According to the CIP, the greater the intensity difference
between the blue-green channels and the red channel, the
greater the distance. The corresponding depth estimate,
denoted d_m, is:

d_m = N_s(M)   (11)
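Equations (7)-(11) reduce to local min/max filtering plus normalization; a numpy-only sketch (the 5×5 patch follows the window used in Algorithm 1, and the small epsilon guarding a flat map is our addition):

```python
import numpy as np

def local_filter(img, size, fn):
    # Apply fn (np.min or np.max) over a size x size patch Omega(x).
    pad = size // 2
    p = np.pad(img, pad, mode="edge")
    w = np.lib.stride_tricks.sliding_window_view(p, (size, size))
    return fn(w, axis=(2, 3))

def normalize(v):
    """N_s of Eq. (9); epsilon added to guard against a constant map."""
    return (v - v.min()) / (v.max() - v.min() + 1e-12)

def cip_maps(I, size=5):
    """Depth cues of Eqs. (7)-(11) for a float RGB image in [0, 1]."""
    R = local_filter(1.0 - I[..., 0], size, np.min)        # Eq. (7)
    d_r = normalize(R)                                     # Eq. (8)
    M = local_filter(np.maximum(I[..., 1], I[..., 2]), size, np.max) \
        - local_filter(I[..., 0], size, np.max)            # Eq. (10)
    d_m = normalize(M)                                     # Eq. (11)
    return d_r, d_m

rng = np.random.default_rng(1)
I = rng.random((24, 24, 3))   # synthetic stand-in for an underwater frame
d_r, d_m = cip_maps(I)
```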
Combining Eqs. (8) and (11), the CIP depth is obtained as
follows:

d_cip = α d_m + (1 − α) d_r   (12)

where α = S(Sum(I_gray > 127.5) / Size(I_gray), 0.2); Sum(x)
and Size(y) count the number of pixels that satisfy condition
x and the total number of pixels in y, respectively. The
sigmoid function S(a, δ) is defined as follows:

S(a, δ) = 1 / (1 + e^{−s(a−δ)})   (13)
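The gate of Eqs. (12)-(13) can be sketched directly; the slope s of the sigmoid is left unspecified in the text, so the value below is an assumption chosen to make the gate switch sharply around the threshold:

```python
import numpy as np

def S(a, delta, s=32.0):
    """Sigmoid of Eq. (13); slope s is an assumed value."""
    return 1.0 / (1.0 + np.exp(-s * (a - delta)))

def alpha_weight(I_gray):
    """alpha of Eq. (12): fraction of bright pixels (> 127.5) of an
    8-bit grayscale image, gated through S(., 0.2)."""
    bright_ratio = np.mean(I_gray > 127.5)
    return S(bright_ratio, 0.2)

dark = np.full((10, 10), 30, dtype=np.uint8)     # dim image -> alpha near 0
bright = np.full((10, 10), 200, dtype=np.uint8)  # bright image -> alpha near 1
```

For the dim image, alpha_weight collapses toward 0 so d_r dominates; for the bright image it saturates toward 1 so d_m dominates, matching the discussion below.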
The value of α hinges upon the global illumination of the
image. When the percentage of pixels in the grayscale image
I_gray with values greater than 127.5 is significantly below
0.2, α is set to 0. This implies that the majority of pixels
in the image exhibit low intensity, the overall brightness of
the image is dim, and the intensity variation between the
three channels is negligible, making d_m inapplicable.
Consequently, the depth can only be accurately represented by
d_r. Conversely, when the proportion of pixels with values
larger than 127.5 in the grayscale map I_gray significantly
surpasses 0.2, α is set to 1. The image is brighter, the
background light becomes relatively brighter, and the intensity
contributed by the background light becomes more significant
at pixels farther away; as a result, more distant pixels may
exhibit larger values in the red channel and be incorrectly
assumed to be closer. Thus, for brighter images, d_m is
utilized to represent the scene depth. Between these two
extremes, the depth is obtained by a weighted combination of
the two estimates. The values of 127.5 and 0.2 were derived
experimentally, and we recommend variations within [110, 140]
and [0.15, 0.35], respectively. Changing 127.5 and 0.2 to
values beyond these ranges will invariably result in α being
fixed at either 1 or 0.
The unsupervised depth estimate is obtained directly with
the monodepth2 approach and is denoted as d_mono. Combining
the prior and unsupervised estimations, the CIP+ depth of
the underwater scene is described as follows:

d_cip+ = β d_cip + (1 − β) d_mono   (14)

where β = S(k, 2) and k is the image color bias factor. When
k is significantly larger than 2, β = 1; this implies a heavy
color cast, the unsupervised method is infeasible, and the
depth is represented by the prior estimate d_cip. Conversely,
if k is substantially less than 2, β = 0; this indicates no
color cast, and a more accurate depth estimate is given by
d_mono. Between these two cases, the depth map is obtained
through a weighted combination of both methods. Through
experimental validation, 2 was established as the optimal
threshold, and we recommend an operational range of [1.5, 3.5].
Changing 2 to a smaller or larger value will result in β
always being fixed at 1 or 0, undermining the adaptivity of
the method.
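The fusion of Eq. (14) is a one-line blend once β is gated; as before, the sigmoid slope s is an assumed value since the text does not fix it:

```python
import numpy as np

def S(a, delta, s=32.0):
    # Sigmoid of Eq. (13); slope s assumed.
    return 1.0 / (1.0 + np.exp(-s * (a - delta)))

def fuse(d_cip, d_mono, k):
    """Eq. (14): weight the prior depth against the unsupervised depth
    by the color-bias factor k (threshold 2 per the text)."""
    beta = S(k, 2.0)
    return beta * d_cip + (1.0 - beta) * d_mono

d_cip = np.full((4, 4), 0.8)    # toy prior depth
d_mono = np.full((4, 4), 0.2)   # toy monodepth2 depth
heavy_cast = fuse(d_cip, d_mono, 3.5)  # strong cast -> trust the prior
no_cast = fuse(d_cip, d_mono, 0.5)     # little cast -> trust monodepth2
```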
Determining the color bias coefficient k is a critical aspect
of image enhancement. To address this, we adopt the equivalent
circle-based chromaticity detection method described in Xu et
al. (2008). Traditional methods exhibit certain limitations,
as they merely rely on the average image chromaticity or the
extreme luminance chromaticity to measure the degree of color
cast. The working principle of this method is that if the
chromaticity distribution in the two-dimensional histogram on
the a-b chromaticity plane has a single peak or is highly
concentrated, the chromaticity average plays a crucial role in
assessing the level of color deviation: a larger chromaticity
average generally indicates a more severe color bias.
Leveraging the principle of the equivalent circle, we derive
the color bias coefficient, denoted as k:

k = D / M   (15)

where D = sqrt(d_a^2 + d_b^2) is the average image chromaticity
and M = sqrt(m_a^2 + m_b^2) is the chromaticity center distance,
with:

d_a = Σ_{i=1}^{W} Σ_{j=1}^{V} a / (W V)   (16)

d_b = Σ_{i=1}^{W} Σ_{j=1}^{V} b / (W V)   (17)

m_a = Σ_{i=1}^{W} Σ_{j=1}^{V} (a − d_a)^2 / (W V)   (18)

m_b = Σ_{i=1}^{W} Σ_{j=1}^{V} (b − d_b)^2 / (W V)   (19)
where W and V are the width and height of the image in pixels,
respectively. In the a-b chromaticity plane, the coordinate
center of the equivalent circle is (d_a, d_b). The color balance
of the image is established by the location of the equivalent
circle in the coordinate system: if d_a > 0, the overall image
hue is red, otherwise it is green; if d_b > 0, the overall image
hue is yellow, otherwise it is blue. As the value of the color
bias factor k increases, the color bias becomes more severe.
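Equations (15)-(19) amount to comparing the mean chromaticity against its spread; a sketch on raw a-b channel arrays (assumed centered so that a = b = 0 is neutral, e.g. after subtracting 128 from an 8-bit Lab conversion):

```python
import numpy as np

def color_bias_factor(a, b):
    """k = D / M per Eqs. (15)-(19). a, b: chromaticity channels."""
    d_a, d_b = a.mean(), b.mean()          # Eqs. (16)-(17)
    m_a = np.mean((a - d_a) ** 2)          # Eq. (18)
    m_b = np.mean((b - d_b) ** 2)          # Eq. (19)
    D = np.sqrt(d_a ** 2 + d_b ** 2)       # average chromaticity
    M = np.sqrt(m_a ** 2 + m_b ** 2) + 1e-12  # center distance (eps added)
    return D / M

rng = np.random.default_rng(2)
# Neutral image: chromaticity scattered around zero.
neutral = color_bias_factor(rng.normal(0, 5, (32, 32)),
                            rng.normal(0, 5, (32, 32)))
# Greenish cast: a-channel mean pushed strongly negative.
greenish = color_bias_factor(rng.normal(-30, 5, (32, 32)),
                             rng.normal(0, 5, (32, 32)))
```

The squared-deviation form of m_a and m_b follows Eqs. (18)-(19) as printed; a concentrated distribution far from the origin drives k up, flagging a cast.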
Details of the Depth-Estimation Algorithm are outlined in
Algorithm 1.
3.3 Backscatter Estimation
The depth map obtained in the preceding section is a relative
rather than an absolute depth: its values are dimensionless and
describe only the ordering of objects in the scene, not distance
in meters. To address this issue, we propose adaptive dark
pixels, which dynamically convert the relative depth to absolute
depth and effectively remove backscatter.
Algorithm 1 Depth-Estimate
Require: I_c
Ensure: d_cip+
  for c ∈ (R, G, B) do
    I_c ⇐ I_c / 255
  end for
  R ⇐ getMin(1 − I_r, 5×5)
  d_r ⇐ N_s(R)
  M ⇐ getGBMax(I_c, 5×5) − getRMax(I_r, 5×5)
  d_m ⇐ N_s(M)
  d_mono ⇐ monodepth2(I_c)
  I_gray ⇐ BGR2GRAY(I_c)
  α ⇐ S[Sum(I_gray > 127.5) / Size(I_gray), 0.2]
  I_lab ⇐ BGR2LAB(I_c)
  β ⇐ S[sqrt(d_a^2 + d_b^2) / sqrt(m_a^2 + m_b^2), 2]
  d_cip ⇐ α d_m + (1 − α) d_r
  d_cip+ ⇐ β d_cip + (1 − β) d_mono
First, we categorize underwater images into two groups
according to the background: images with seawater as the
background and images with other backgrounds. For the former
category, the theoretical maximum distance is ∞, but visibility
decreases rapidly with increasing distance. Therefore, we
define a maximum visibility d_max with a pre-set default value
of 12 meters. For images where other elements form the
background, this default visibility limit is reduced to 8
meters. It is important to note that the values 12 and 8 are
determined empirically based on the underwater camera's
visibility, and we recommend a range of [8, 15] for them. Any
alteration to these numbers, either smaller or larger, could
change the resulting absolute depths and potentially increase
the backward scattering fitting error.
Moreover, the limited field of view of camera lenses often means that elements situated at shallow depths are not discernible in the captured image. To resolve this concern, we add the nearest distance d_min, in meters, to each depth within the scene. By estimating the maximum difference between the pixel intensity at the maximum depth and the observed intensity I_c in the input image, the estimated d_min ∈ [0, 1] can be efficiently computed:

d_min = 1 − max_{x, c∈{r,g,b}} |θ − I_c(x)| / max(θ, 255 − θ)    (20)
where θ = I_c(argmax_x d(x)) represents the pixel intensity at the maximum depth, i.e., the global background light. Consequently, when the global background light contributes a significant portion of the pixel intensity of the nearest pixel point, the gap between the closest and farthest pixel points decreases, resulting in an increase in d_min.
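The computation of Eq. (20) can be sketched as follows; reading θ per channel at the deepest pixel is our interpretation of the "global background light":

```python
import numpy as np

def nearest_distance(I, depth):
    """Estimate d_min in [0, 1] following Eq. (20).

    I: H x W x 3 image with values in [0, 255]; depth: H x W relative depth map.
    """
    # theta: intensity at the deepest pixel, taken per channel
    # (an assumed per-channel reading of Eq. 20)
    idx = np.unravel_index(np.argmax(depth), depth.shape)
    theta = I[idx].astype(float)                  # shape (3,)
    gap = np.abs(theta - I.astype(float))         # |theta - I_c(x)|
    norm = np.maximum(theta, 255.0 - theta)       # max(theta, 255 - theta)
    return 1.0 - np.max(gap / norm)
```

When every pixel matches the background light, the gap term vanishes and d_min reaches 1; a single pixel far from θ drives d_min toward 0.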
We employ a linear conversion to map the relative depth values x in the original depth map to absolute depth values y:

y = [(d_max − d_min)/(d′_max − d′_min)] · (x − d′_min) + d_min    (21)

where d′_max and d′_min represent the highest and lowest values in the relative depth map, and d_max and d_min are introduced to adjust the relative depth.
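Eq. (21) is a standard min-max linear remapping; a minimal sketch, with d′_max and d′_min taken from the relative map itself:

```python
import numpy as np

def to_absolute_depth(d_rel, d_min, d_max):
    """Linearly map relative depths to absolute metres per Eq. (21)."""
    r_min, r_max = d_rel.min(), d_rel.max()   # d'_min, d'_max
    scale = (d_max - d_min) / (r_max - r_min) # d_f in Algorithm 2
    return scale * (d_rel - r_min) + d_min
```

The map's extremes land exactly on d_min and d_max, and interior values are interpolated linearly.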
Finally, the optimal recovery map is governed by the d_max value, as determined by the NIQE index. Unlike other current no-reference image quality assessment (IQA) algorithms, which require prior knowledge of image distortion and training on subjective human evaluations, NIQE implements "quality-aware" statistical feature sets by means of a simple and effective statistical model of natural scenes in the spatial domain, thereby requiring only measurable deviations from the statistical patterns observed in natural images. Thus, NIQE does not require contact with distorted images and avoids the instability of subjective factors. As indicated by Mittal et al. (2012), NIQE outperforms the full-reference peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) metrics and matches the performance of the top-performing no-reference, opinion-aware, and distortion-aware IQA algorithms. More details of the depth-conversion process can be found in Algorithm 2.
Algorithm 2 Depth-Convert
Require: I_c, d_cip+
Ensure: d_absolute
for c ∈ {R, G, B} do
    M_L ← I_c{argmax[d_cip+(x)]}
    d_min ← 1 − max[|M_L − I_c| / max(M_L, 255 − M_L)]
end for
d′_max ← max(d_cip+)
d′_min ← min(d_cip+)
for d_max ∈ (8, 12) do
    d_f ← (d_max − d_min)/(d′_max − d′_min)
    d_absolute ← d_f · (d_cip+ − d′_min) + d_min
end for
In addressing backscatter removal, we employ the principle of the dark pixel prior (Akkaynak & Treibitz, 2019). This is grounded in the assumption that in any given underwater scenario, there exist regions that exhibit zero reflectance (ξ = 0), indicating an absence of any reflected color light. Such regions are typically attributable to black objects or the shadows cast by various objects.
Unlike the dark pixel prior, our approach posits that a dark pixel should be the pixel with the smallest summation of the R, G, and B channels. This is underpinned by the recognition that the intensity of a black pixel is just B_c, while that of other pixels is dictated by B_c + D_c. Given
the same depth, it is inherently evident that the former will
invariably be smaller than the latter.
Subsequently, the pixel points in the degraded image are separated into T groups based on the depth map. The total sum of RGB values is computed for each group of pixel points, and the first Y pixel points with the lowest values are selected as the initial estimates for backscatter. T and Y are calculated as follows:

T = d_max − 2    (22)

Y = min(N_i · T / 10000, N)    (23)

where the dynamic value of T is adopted to avoid violating the dark pixel prior, which could occur due to the small depth span of individual intervals in close scenes where the overall brightness of the image is high. N_i represents the total number of pixels in a particular group. Given that the backscatter estimation requires only a small number of backscattered pixels, N is set to 500. As can be observed in the second row of Fig. 2, the selected black pixels are denoted by red dots.
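A sketch of the dark-pixel selection of Eqs. (22)-(23); the equal-width depth binning and the floor of one sample per bin are our assumptions:

```python
import numpy as np

def select_dark_pixels(I, depth, d_max):
    """Per depth bin, pick the pixels with the smallest R+G+B sum as initial
    backscatter samples (a sketch of the ADP selection, Eqs. 22-23)."""
    T = int(d_max) - 2                                   # Eq. (22)
    rgb_sum = I.reshape(-1, 3).sum(axis=1).astype(float)
    z = depth.ravel()
    edges = np.linspace(z.min(), z.max(), T + 1)         # equal-width bins (assumed)
    picked = []
    for i in range(T):
        in_bin = np.where((z >= edges[i]) & (z < edges[i + 1]))[0]
        if in_bin.size == 0:
            continue
        # Eq. (23): Y = min(N_i * T / 10000, 500); at least one sample assumed
        Y = min(max(int(in_bin.size * T / 10000), 1), 500)
        picked.append(in_bin[np.argsort(rgb_sum[in_bin])[:Y]])
    return np.concatenate(picked) if picked else np.array([], dtype=np.int64)
```

The returned flat indices pair each dark pixel's observed intensity with its depth, ready for the curve fit of Eq. (24).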
Upon obtaining the initial estimate of the backscatter B_c and its corresponding depth value z, we ascertain the values of J_c, β^D_c, A_c, and β^B_c in Eq. (24). The correlation between the tri-channel backscatter values and the depth values facilitates the derivation.

B_c = J_c(x) e^(−β^D_c z(x)) + A_c (1 − e^(−β^B_c z(x)))    (24)

where J_c, A_c ∈ [0, 1] and β^D_c, β^B_c ∈ [0, 10]. In the fitting procedure, we noticed irregularity and discreteness in the shallow-depth data, which affected the fitting. To address this issue, we set a minimum threshold for the color depth utilized in the estimation process, defaulting to 0.1% of the depth values. Consequently, we can calculate the direct reflection D_c of the scene as detailed below:

D_c = I_c − B_c    (25)
Details of backscatter estimation are displayed in Algo-
rithm 3.
3.4 Color Reconstruction
The process of removing backscatter from the raw image merely resolves the haze effect attributable to scattering; it does not correct the color distortion induced by light absorption, as illustrated in Fig. 5. Consequently, both color compensation and color balance are necessary to reconstruct the image's color and luminance and yield a more natural recovery.
Algorithm 3 Backscatter-Estimation
Require: I_c, d_absolute, d_max
Ensure: D_c
(w, h) ← size(I_r)
Sum ← zeros(w, h)
for c ∈ {R, G, B} do
    Sum ← Sum + I_c
end for
start ← min(d_absolute)
T ← d_max − 2
scope ← [max(d_absolute) − min(d_absolute)] / T
end ← start + scope
for i ∈ 1 → T do
    N_i ← find(d_absolute > start & d_absolute < end)
    start ← end
    end ← end + scope
    if N_i · 0.001 > 500 then
        Y ← 500
    else
        Y ← N_i · 0.001
    end if
    [index_x, index_y] ← sort(Sum)
    for j ∈ 1 → Y do
        B ← I_c(index_x(j), index_y(j))
        D ← d_absolute(index_x(j), index_y(j))
    end for
    parameter ← lsqcurvefit(f, D, B)
    Backscatter ← f(d_absolute, parameter)
    D_c ← I_c − Backscatter
end for
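Algorithm 3's lsqcurvefit call can be mirrored in Python with scipy.optimize.curve_fit (assumed available), fitting Eq. (24) to the dark-pixel samples under the bounds stated in the text:

```python
import numpy as np
from scipy.optimize import curve_fit

def backscatter_model(z, J, beta_D, A, beta_B):
    # Eq. (24): residual direct signal + saturating backscatter
    return J * np.exp(-beta_D * z) + A * (1.0 - np.exp(-beta_B * z))

def fit_backscatter(z_dark, B_dark):
    """Fit Eq. (24) to dark-pixel samples, mirroring MATLAB's lsqcurvefit,
    with J, A in [0, 1] and beta_D, beta_B in [0, 10] as stated in the text.
    The initial guess p0 is our assumption."""
    p0 = [0.1, 1.0, 0.5, 1.0]
    bounds = ([0, 0, 0, 0], [1, 10, 1, 10])
    params, _ = curve_fit(backscatter_model, z_dark, B_dark, p0=p0, bounds=bounds)
    return params
```

Evaluating `backscatter_model` over the full absolute depth map then gives the per-channel backscatter to subtract, as in Eq. (25).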
Building on the work of Akkaynak and Treibitz (2019), we model the color compensation factor β^D_c(z) as the summation of two exponentials:

β^D_c(z) = a·e^(bz) + c·e^(dz)    (26)

A preliminary estimate of β^D_c(z) can be obtained from the scene light source map H_c (Ebner & Hansen, 2013), as follows:

−(log H_c)/z = β^D_c(z)    (27)
Next, employing the already established range map z, we refine the estimate of β^D_c(z) by minimizing ‖z − ẑ‖ over β^D_c(z), with ẑ given by Eq. (28):

ẑ = −(log H_c)/β^D_c(z)    (28)

In the refinement stage, we set the minimum distance between consecutive depth inputs to at least 1% of the total depth range. This setting balances the distribution of pixels at varying depths within the input dataset, allowing for an accurate estimation of the fitted exponential trend, rather than being dominated by dense clusters of data points at the
minimum and maximum depths.

Fig. 5 Backscatter removal process: (a) original image from UIEBD; (b) result of scattering removal by our method; (c) corresponding RGB-channel backscatter fit curves, with the horizontal axis representing the imaging distance in meters and the vertical axis the color value
J_c = D_c · e^(β^D_c(z)·z)    (29)
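Eqs. (26) and (29) together amount to multiplying the direct signal by a depth-dependent gain; a hedged sketch, with illustrative (not fitted) coefficients a, b, c, d:

```python
import numpy as np

def compensate_attenuation(D, z, beta_of_z):
    """Recover J_c = D_c * exp(beta_D_c(z) * z), Eq. (29).

    D: direct signal after backscatter removal, in [0, 1];
    z: absolute depth map in metres;
    beta_of_z: callable giving the depth-dependent attenuation coefficient,
    e.g. the two-exponential form of Eq. (26).
    """
    beta = beta_of_z(z)
    return np.clip(D * np.exp(beta * z), 0.0, 1.0)

# Two-exponential coefficient model of Eq. (26); a, b, c, d here are
# illustrative placeholders, not values from the paper.
def beta_example(z, a=0.8, b=-0.5, c=0.2, d=-0.05):
    return a * np.exp(b * z) + c * np.exp(d * z)
```

Because β^D_c(z)·z is non-negative, the compensation only brightens the attenuated signal; the clip guards against over-amplification at large depths.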
After image reconstruction using Eq. (29), we enhance the visual appeal by calculating the color balance factor, grounded in the CIP prior. This factor considers the pixel intensity distribution in the blue and green channels, avoiding the red artifacts caused by over-compensation of the red channel in extreme cases.

w_g = avg(max_10%(I_g))    (30)

w_b = avg(max_10%(I_b))    (31)

where avg(max_10%(I_c)) is the average intensity of the top 10% of pixels in channel I_c. Further, the green channel color balance factor W_g and the blue channel color balance factor W_b can be calculated as:

W_g = w_g / (2·(w_b + w_g))    (32)

W_b = w_b / (2·(w_b + w_g))    (33)
The green and blue channels, I_g and I_b, are updated as follows:

I_g = W_g · I_g    (34)

I_b = W_b · I_b    (35)
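Read literally, Eqs. (30)-(35) give the following balance step (note that W_g + W_b = 1/2 under Eqs. (32)-(33); this sketch reproduces the stated formulas rather than a conventional gray-world gain):

```python
import numpy as np

def balance_blue_green(img):
    """Apply the blue-green balance of Eqs. (30)-(35).

    img: H x W x 3 float array in [0, 1], channel order (R, G, B) assumed.
    """
    def top10_mean(ch):
        flat = np.sort(ch.ravel())
        k = max(1, flat.size // 10)
        return flat[-k:].mean()              # avg of the top 10% intensities

    w_g = top10_mean(img[..., 1])            # Eq. (30)
    w_b = top10_mean(img[..., 2])            # Eq. (31)
    W_g = w_g / (2.0 * (w_b + w_g))          # Eq. (32)
    W_b = w_b / (2.0 * (w_b + w_g))          # Eq. (33)
    out = img.copy()
    out[..., 1] *= W_g                       # Eq. (34)
    out[..., 2] *= W_b                       # Eq. (35)
    return out
```

The red channel is left untouched, so the relative weighting between blue and green (rather than absolute brightness) is what this step adjusts.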
4 Experiment and Analysis
In this section, we evaluate the effectiveness of our method
against several traditional and deep learning methods. We
also discuss the impact of critical components of our method
through detailed enhancement and ablation studies. Finally,
we further analyze the time complexity of our method.
4.1 Experimental Settings
Comparison Methods To evaluate the efficacy of our methodology, we conducted a comparative study with nine other techniques for underwater image enhancement. These included five physical-model-based methods: IBLA (TIP'17) (Peng & Cosman, 2017), GDCP (TIP'18) (Peng et al., 2018), ULAP (RCM'18) (Song et al., 2018), SMBL (TB'20) (Song et al., 2020), and L2UWE (CVPR'20) (Marques & Albu, 2020), as well as four deep learning-based methods: WaterNet (TIP'19) (Li et al., 2019), UWCNN
(PR'20) (Li et al., 2020), Ucolor (TIP'21) (Li et al., 2021), and PUIE (ECCV'22) (Fu et al., 2022).

Fig. 6 Visual comparisons on images of different color bias: (a) original image from UIEBD; (b)–(k) display the results acquired by IBLA, GDCP, ULAP, WaterNet, SMBL, UWCNN, L2UWE, Ucolor, PUIE, and our method, respectively. The best UCIQE scores in each case are highlighted in red, and the second-best scores are denoted in blue
Benchmark Datasets We evaluated our method using sev-
eral datasets, including the UIEBD (Li et al., 2019), MABLs
(Song et al., 2020), UCCS (Liu et al., 2020), U-45 (Li et
al., 2019) and EUVP (Islam et al., 2020) datasets. The UIEB
dataset (Li et al., 2019) includes 890 image pairs captured in
real underwater environments, while the UCCS dataset (Liu
et al., 2020) is divided into subsets of different water hues to
test different color correction techniques. MABLs (Song et al., 2020) comes with manual annotations of the background light in images. U-45 (Li et al., 2019) is a commonly used dataset
for underwater image testing, and EUVP (Islam et al., 2020)
encompasses a diverse range of underwater objects.
Evaluation Metrics To assess the image quality, we
employed four metrics: Underwater Color Image Quality
Evaluation (UCIQE) (Yang & Sowmya, 2015), Contrast-
changed Image Quality Measure (CEIQ) Yan et al. (2019),
Naturalness Image Quality Evaluator (NIQE) (Mittal et al.,
2012), and Information Entropy (IE) (Zhang et al., 2019).
UCIQE evaluates image quality through chromaticity, sat-
uration, and contrast, with a higher score indicating better
quality. CEIQ assesses the overall quality using five contrast-
related features, with higher scores indicating higher quality.
NIQE gauges image quality by comparing it to a model
derived from natural scenes, with lower scores implying
better quality. Finally, IE represents the average amount of
information in the image, and a higher score means more
information and richer color.
4.2 Subjective Assessment
The performance of various color correction methods was evaluated using the UIEB dataset, as depicted in Fig. 6. In instances with a strong green color cast, the outcomes produced by IBLA (Peng & Cosman, 2017), GDCP (Peng et al., 2018), ULAP (Song et al., 2018), and L2UWE (Marques & Albu, 2020) fall short of expectations. This is due to the near-zero values of the red and blue channels, resulting in a prior failure. SMBL (Song et al., 2020) and L2UWE (Marques & Albu, 2020) enhance the clarity of low-visibility underwater images, but are not fully successful in eliminating the haze. In highly scattering images, the values in the RGB channels tend to be similar, which prevents the ULAP prior from working effectively and results in an overcompensation of the red channel. WaterNet (Li et al., 2019), UWCNN (Li et al., 2020), Ucolor (Li et al., 2021), and PUIE (Fu et al., 2022) correct color distortion; still, their loss functions do not prioritize luminance information, leading to inadequate contrast improvement in low-visibility images and to local darkness. From the RGB histogram in Fig. 6k, it can be seen that our method effectively removes color bias and enhances contrast, thereby effectively resolving artificial artifacts. The UCIQE values confirm the superior visual quality of the results.
Fig. 7 Visual comparisons on images of various illumination conditions: (a) original image from MABLs, EUVP, and U-45; (b)–(k) illustrate the results obtained by IBLA, GDCP, ULAP, WaterNet, SMBL, UWCNN, L2UWE, Ucolor, PUIE, and our method, respectively. The best UCIQE scores in each case are highlighted in red, and the second-best scores are denoted in blue
Fig. 8 Visual comparison results of high-scattering and highly color-distorted images: (a) original image from MABLs; (b)–(k) showcase the results obtained by IBLA, GDCP, ULAP, WaterNet, SMBL, UWCNN, L2UWE, Ucolor, PUIE, and our method, respectively. The best UCIQE scores in each case are highlighted in red, and the second-best scores are denoted in blue
To effectively handle diverse underwater lighting conditions and address the problem of non-uniform illumination caused by artificial light sources, we evaluated the enhancement results of images with different illumination conditions on the MABLs, U-45, and EUVP datasets, as depicted in Fig. 7. Existing methods such as IBLA (Peng & Cosman, 2017), GDCP (Peng et al., 2018), SMBL (Song et al., 2020), and L2UWE (Marques & Albu, 2020) produced overexposed images when enhancing artificially illuminated scenes. Although methods like WaterNet (Li et al., 2019), UWCNN (Li et al., 2020), Ucolor (Li et al., 2021), and PUIE (Fu et al., 2022) performed well on artificially illuminated images, they introduced local darkness in low-illumination images, with WaterNet performing the worst in this regard. Meanwhile, IBLA, SMBL, and L2UWE still suffer from overexposure. In contrast, our approach surpasses the compared methods in enhancing contrast and preserving details, avoiding over- or under-enhancement, and preventing the creation of dark regions. The UCIQE scores demonstrate that our approach effectively enhances contrast and removes haze under various lighting conditions.
To evaluate the effectiveness and robustness of various techniques, we conducted image enhancement experiments on images from MABLs with high backscatter and color bias, as shown in Fig. 8. It is evident that several compared methods encounter difficulties when applied to challenging underwater images. GDCP (Peng et al., 2018) induces undesirable color distortions. While L2UWE boosts texture and edge sharpness, it also blurs the image's details. IBLA, WaterNet, UWCNN, and Ucolor tend to introduce localized color bias without effectively correcting the overall darkness of the image. SMBL (Song et al., 2020) effectively improves the color of the image, but does not eliminate the fog effect. In contrast, our method successfully eliminates unnatural colors while improving visibility, rendering more details and vibrant colors. The UCIQE values also show that our method is effective and robust.
Table 2 Quantitative evaluation of various techniques on the UIEBD, MABLs, U-45, and UCCS datasets
Datasets Indexes IBLA GDCP ULAP WaterNet SMBL UWCNN L2UWE Ucolor PUIE Ours
UIEBD UCIQE 0.5941 0.5827 0.6034 0.5837 0.6031 0.5364 0.5371 0.5715 0.5659 0.6214
CEIQ 3.3118 3.2605 3.3862 3.1614 3.3323 3.1605 3.3176 3.2609 3.3186 3.4019
NIQE 3.2226 3.4754 3.2046 3.1625 3.1986 3.3233 3.2634 3.3454 3.2053 3.1466
IE 7.2962 7.2250 7.2071 7.1125 7.3107 7.0533 7.3021 7.2164 7.3078 7.4304
MABLs UCIQE 0.5737 0.5727 0.5677 0.5845 0.5846 0.5136 0.5220 0.5542 0.5519 0.6177
CEIQ 3.1511 3.1836 3.0552 3.1595 3.1874 3.0564 3.2114 3.1807 3.2481 3.3220
NIQE 3.6640 4.1634 3.6809 3.8062 3.6710 3.6593 4.2178 3.4600 3.7372 3.5594
IE 7.0392 7.1375 6.8782 7.1206 7.0908 6.8962 7.1757 7.0913 7.1951 7.3149
U-45 UCIQE 0.5922 0.5937 0.5960 0.5989 0.5836 0.5461 0.5503 0.5730 0.5631 0.6365
CEIQ 2.9121 3.1914 3.1912 3.1863 3.2491 3.2126 3.1984 3.2826 3.2957 3.4089
NIQE 5.1003 4.1202 3.7641 4.5488 3.7272 4.0803 3.8118 4.4583 3.8646 3.9783
IE 6.6261 7.1924 7.1842 7.1991 7.2534 7.1709 7.1926 7.2951 7.3142 7.4761
UCCS UCIQE 0.5462 0.5636 0.5175 0.5620 0.5679 0.4947 0.4920 0.5473 0.5337 0.5906
CEIQ 3.0100 3.0470 2.9056 3.2210 3.1846 3.1712 3.2439 3.2845 3.2421 3.4109
NIQE 3.6123 4.1728 3.6624 3.9630 3.6612 3.9591 3.8705 3.7834 3.8896 3.4745
IE 6.8711 6.9760 6.7098 7.2134 7.1663 7.1045 7.2795 7.2701 7.2363 7.4844
The top performer is indicated in bold, and the second best in italics for each case
4.3 Objective Assessment
To validate the earlier subjective observations, we employed an objective evaluation technique to conduct a more comprehensive assessment of the quality of the restored images. Table 2 presents the average scores of the four no-reference quality metrics (UCIQE, CEIQ, NIQE, and IE) for the various methods, applied to the UIEBD, MABLs, U-45, and UCCS datasets.
It can be observed that deep learning-based methods such as WaterNet, UWCNN, Ucolor, and PUIE perform well on the four test datasets, exhibiting lower UCIQE but favorable CEIQ, NIQE, and IE scores. The convolutional capabilities of these methods allow them to correct color distortion effectively. However, they are less effective than traditional physical-model-based methods at enhancing contrast and increasing color vividness.
The physics-based methods, including IBLA, GDCP, ULAP, SMBL, and L2UWE, demonstrate lower CEIQ, NIQE, and IE scores on the four test datasets. The root cause is that these methods rely on physics-based atmospheric imaging models, which fall short of accurately describing the degradation of underwater image quality. Moreover, to achieve optimal performance, such methods require accurate prior information, a requirement they often fail to meet in underwater imaging scenarios.
Thanks to the differentiation of attenuation and scattering coefficients with CIFM and the dynamic removal of backscatter based on image type by ADP, our method achieves the highest UCIQE, CEIQ, and IE scores on all four test datasets. Moreover, our method outperforms other state-of-the-art methods in terms of the NIQE score. In conclusion, our approach's superiority in color correction is demonstrated by both qualitative and quantitative evaluations.
4.4 Comparisons of Detail Enhancement
Precise fine structural details are essential for generating
high-quality underwater images. To evaluate the effective-
ness of various enhancement techniques in enhancing the
detailed portions of the images, we conducted a comparison
by localized zoom, as illustrated by the red and blue boxes
in Fig. 9. From a global perspective, our method effectively
removes color distortion and improves contrast. On a local
scale, it enhances image structure details and significantly
enhances clarity and information.
4.5 Ablation Study
To validate the efficacy of the core components in our method, i.e., the CIP+ and ADP modules, we conducted an extensive series of ablation studies on the UIEB dataset. The tested
variants include (a) the original image, (b) our method with-
out channel intensity prior depth estimation (-w/o CIP), (c)
our method without self-supervised depth estimation (-w/o
MONO), (d) our method with only red channel prior depth
estimates (-o/y R), (e) our method with only chromatic aber-
ration prior depth estimates (-o/y M), (f) our method without
adaptive dark pixel (-w/o ADP), and (g) our method (full model).

Fig. 9 Detail enhancement visual effects of different methods: (a) original image from MABLs; (b)–(k) results obtained by IBLA, GDCP, ULAP, WaterNet, SMBL, UWCNN, L2UWE, Ucolor, PUIE, and our method, respectively

Fig. 10 Qualitative ablation results for each key component of our method: (a) original image from UIEBD; (b)–(g) results obtained by -w/o CIP, -w/o MONO, -o/y R, -o/y M, -w/o ADP, and our method (full model), respectively
To assess the effectiveness of the CIP+ module, we performed a comprehensive ablation study. As depicted in Fig. 10(b)–(e), the backscatter was successfully removed in certain areas and the contrast was improved. However, inaccurate depth estimation introduced artifacts and over-enhancement in some regions. The depth estimation results for each variant from (b) to (e) are shown in Fig. 11. Both the MONO2 and M depth maps exhibited inaccuracies of varying extents, primarily due to color bias and luminance loss. Although the R depth map appeared accurate, it incorporated excessive image detail and lacked smoothness. In contrast, the CIP+ model provides a more accurate and smoother depth estimate. The objective ablation results are reported in Table 3. All metrics of the incomplete CIP+ variants experienced varying degrees of decline, attributable to depth estimation errors. Therefore, the CIP+ module is the superior choice.
To explore the effectiveness of the ADP module, we eliminated it to obtain a variant of the method. As shown in Fig. 10, variant (f) successfully removes color bias and enhances texture detail, but also causes local darkness due to the fitting error. Meanwhile, Fig. 10(g) includes all the critical components and achieves the best visual outcome. The fitting results with and without the ADP module are displayed in Fig. 12; it is evident that the fitting loss is smaller with the addition of the ADP module in various scenarios. To further validate our ablation study, we employed the full-reference metrics SSIM (Structural Similarity Index) and PSNR (Peak Signal-to-Noise Ratio). The detailed ablation results are presented in Table 3. All indicators of the method without the ADP module show noticeable declines across the performance metrics. This is attributed to the ADP module's ability to dynamically select dark pixel points based on the degradation level of each image, thereby minimizing fitting errors. As a result, the ADP module proves to be a crucial component of our method.
4.6 Running Time Comparisons
To evaluate the computational efficiency of our method, we
created an underwater image dataset composed of 100 images
Fig. 11 Qualitative ablation results for each key component of our depth estimation method: (a) original image from UIEBD; (b)–(f) results obtained by MONO, M, R, CIP, and CIP+, respectively. In (g), the x-axis is the depth and the y-axis is the chromaticity, reflecting the depth values represented by the different colors in the depth maps
Table 3 Ablation study on the UIEB dataset
-w/o CIP -w/o MONO -o/y R -o/y M -w/o ADP Ours
UCIQE 0.6192 0.6096 0.6009 0.6100 0.6189 0.6214
CEIQ 3.3982 3.3587 3.3026 3.3379 3.3750 3.4019
IE 7.4164 7.3681 7.2503 7.3475 7.3983 7.4304
PSNR 16.517 16.803 16.660 16.420 16.473 17.374
SSIM 0.8038 0.8039 0.8048 0.7929 0.7980 0.8178
The top performer is indicated in bold, and the second best in italics for each case
for each of the following sizes: 256×256, 512×512, and 1024×1024. We ran the tests on a PC with an Intel Xeon Silver 4215R CPU @ 3.20 GHz and an NVIDIA Tesla V100 PCIE 32 GB GPU. The traditional methods were run using MATLAB R2019a, while Python and PyTorch were employed for executing the deep learning-based methods.
Table 4 indicates that deep learning methods generally run faster than traditional ones, owing to their offline training and GPU utilization. Conversely, traditional restoration methods, such as IBLA and ours, consume a considerable amount of time calculating the background light and transmission map. This leads to an extended runtime, driven primarily by the need for repetitive transmission map estimations. Although our approach does not outperform the other methods in terms of processing speed, it effectively removes blur and color bias while addressing the issue of artificial light.
5 Conclusion
In this paper, we propose a novel method for artificial light removal by combining an adaptive dark pixel prior and a color correction technique within the CIFM framework. We adopt the CIP+ depth estimation technique based on the law of light attenuation and unsupervised methods, considering the degree of image degradation. Additionally, we employ ADP to remove backscatter effectively. Our method demonstrates robust performance across various underwater environments and illumination conditions, yielding visually pleasing images. Objective experiments show that, in terms of UCIQE/CEIQ, our method outperforms GDCP by 6.64% and 6.79%, and the recent data-driven PUIE approach by 11.36% and 3.35%, evidencing significant improvements in
color recovery and detail enhancement.

Fig. 12 Qualitative ablation results for the key component of our scattering fit: (a) original image from UIEBD; (b) the backward scattering coefficient fit of -w/o ADP; (c) the backward scattering coefficient fit of our method

Table 4 Average runtime of different underwater image enhancement techniques
Methods IBLA GDCP ULAP WaterNet SMBL UWCNN L2UWE Ucolor PUIE Ours
Time(s) 256×256 9.134 0.163 0.560 0.091 3.070 0.050 4.829 0.576 0.018 15.114
512×512 38.955 0.558 1.593 0.144 13.299 0.061 17.362 0.934 0.035 43.031
1024×1024 165.312 2.062 6.492 0.334 56.268 0.087 68.312 2.656 0.129 135.559
The top performer is indicated in bold, and the second best in italics for each case

The extensive experiments clearly show the efficacy of the proposed method in
enhancing details and restoring the natural color, highlighting
its potential for underwater image restoration. Nevertheless, our approach requires a longer runtime for accurate depth estimation than deep learning-based methods. In future work, we aim to accelerate it and optimize the fitting procedure.
Funding This work was supported in part by the National Natural
Science Foundation of China (No. 61702074), the Liaoning Provin-
cial Natural Science Foundation of China (No. 20170520196), and
the Fundamental Research Funds for the Central Universities (Nos.
3132019205 and 3132019354).
Data Availability Data underlying the results presented in our work are available in UIEBD, U45, UCCS, MABLs, and EUVP.
Declarations
Conflict of interest The authors declare no conflict of interest.
References
Akkaynak, D., & Treibitz, T. (2018). A revised underwater image for-
mation model. In Proceedings of the IEEE conference on computer
vision and pattern recognition (pp. 6723–6732).
Akkaynak, D., & Treibitz, T. (2019) Sea-thru: A method for removing
water from underwater images. In Proceedings of the IEEE/CVF
conference on computer vision and pattern recognition (pp. 1682–
1691).
Akkaynak, D., Treibitz, T., Shlesinger, T., Loya, Y., Tamir, R., & Iluz,
D. (2017). What is the space of attenuation coefficients in under-
water computer vision? In Proceedings of the IEEE conference on
computer vision and pattern recognition (pp. 4931–4940).
Ancuti, C. O., Ancuti, C., De Vleeschouwer, C., & Bekaert, P. (2017).
Color balance and fusion for underwater image enhancement.
IEEE Transactions on image processing, 27(1), 379–393.
Ancuti, C., Ancuti, C. O., Haber, T., & Bekaert, P. (2012). Enhancing
underwater images and videos by fusion. In 2012 IEEE conference
on computer vision and pattern recognition (pp. 81–88). IEEE.
Berman, D., Treibitz, T., & Avidan, S. (2017). Diving into haze-lines:
Color restoration of underwater images. In Proc. British machine
vision conference (BMVC) (Vol 1).
Bhoi, A. (2019). Monocular depth estimation: A survey.
arXiv:1901.09402.
Carlevaris-Bianco, N., Mohan, A., & Eustice, R.M. (2010) Initial results
in underwater single image dehazing. In OCEANS 2010 MTS/IEEE Seattle (pp. 1–8). IEEE.
Dong, X., & Wen, J. (2016). Low lighting image enhancement using
local maximum color value prior. Frontiers of Computer Science,
10(1), 147–156.
Drews, P., Nascimento, E., Moraes, F., Botelho, S., & Campos, M.
(2013). Transmission estimation in underwater single images. In
Proceedings of the IEEE international conference on computer
vision workshops (pp. 825–830).
Ebner, M., & Hansen, J. (2013). Depth map color constancy. Bio-
Algorithms and Med-Systems, 9(4), 167–177.
Fu, Z., Wang, W., Huang, Y., Ding, X., & Ma, K.-K. (2022) Uncertainty
inspired underwater image enhancement. In European conference
on computer vision (pp. 465–482). Springer.
Galdran, A., Pardo, D., Picón, A., & Alvarez-Gila, A. (2015). Automatic
red-channel underwater image restoration. Journal of Visual Com-
munication and Image Representation, 26, 132–145.
Godard, C., Mac Aodha, O., Firman, M., & Brostow, G. J. (2019).
Digging into self-supervised monocular depth estimation. In Pro-
ceedings of the IEEE/CVF international conference on computer
vision (pp. 3828–3838).
He, K., Sun, J., & Tang, X. (2010). Single image haze removal using
dark channel prior. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 33(12), 2341–2353.
Hitam, M. S., Awalludin, E. A., Yussof, W. N. J. H. W., & Bachok, Z.
(2013) Mixture contrast limited adaptive histogram equalization
for underwater image enhancement. In 2013 international con-
ference on computer applications technology (ICCAT) (pp. 1–5).
IEEE.
Islam, M. J., Xia, Y., & Sattar, J. (2020). Fast underwater image
enhancement for improved visual perception. IEEE Robotics and
Automation Letters, 5(2), 3227–3234.
Jaffe, J. S. (1990). Computer modeling and the design of optimal under-
water imaging systems. IEEE Journal of Oceanic Engineering,
15(2), 101–111.
Jiang, Q., Gu, Y., Li, C., Cong, R., & Shao, F. (2022). Underwater
image enhancement quality evaluation: Benchmark dataset and
objective metric. IEEE Transactions on Circuits and Systems for
Video Technology, 32(9), 5959–5974.
Jiang, Z., Li, Z., Yang, S., Fan, X., & Liu, R. (2022). Target ori-
ented perceptual adversarial fusion network for underwater image
enhancement. IEEE Transactions on Circuits and Systems for
Video Technology, 32(10), 6584–6598.
Kang, Y., Jiang, Q., Li, C., Ren, W., Liu, H., & Wang, P. (2023). A
perception-aware decomposition and fusion framework for under-
water image enhancement. IEEE Transactions on Circuits and
Systems for Video Technology, 33(3), 988–1002.
Li, C., Anwar, S., Hou, J., Cong, R., Guo, C., & Ren, W. (2021).
Underwater image enhancement via medium transmission-guided
multi-color space embedding. IEEE Transactions on Image Pro-
cessing, 30, 4985–5000.
Li, C., Anwar, S., & Porikli, F. (2020). Underwater scene prior inspired
deep underwater image and video enhancement. Pattern Recogni-
tion, 98, 107038.
Li, C., Guo, C., Ren, W., Cong, R., Hou, J., Kwong, S., & Tao, D.
(2019). An underwater image enhancement benchmark dataset and
beyond. IEEE Transactions on Image Processing, 29, 4376–4389.
Li, H., Li, J., & Wang, W. (2019). A fusion adversarial under-
water image enhancement network with a public test dataset.
arXiv:1906.06819.
Li, J., Skinner, K. A., Eustice, R. M., & Johnson-Roberson, M. (2017).
Watergan: Unsupervised generative network to enable real-time
color correction of monocular underwater images. IEEE Robotics
and Automation letters, 3(1), 387–394.
Li, K., Wu, L., Qi, Q., Liu, W., Gao, X., Zhou, L., & Song, D. (2022).
Beyond single reference for training: Underwater image enhance-
ment via comparative learning. IEEE Transactions on Circuits and
Systems for Video Technology 1–1.
Liu, J., Fan, X., Jiang, J., Liu, R., & Luo, Z. (2021). Learning a deep
multi-scale feature ensemble and an edge-attention guidance for
image fusion. IEEE Transactions on Circuits and Systems for Video
Technology, 32(1), 105–119.
Liu, J., Shang, J., Liu, R., & Fan, X. (2022). Attention-guided global-
local adversarial learning for detail-preserving multi-exposure
image fusion. IEEE Transactions on Circuits and Systems for Video
Technology, 32(8), 5026–5040.
Liu, R., Fan, X., Zhu, M., Hou, M., & Luo, Z. (2020). Real-world
underwater enhancement: Challenges, benchmarks, and solutions
under natural light. IEEE Transactions on Circuits and Systems
for Video Technology, 30(12), 4861–4875.
Liu, R., Jiang, Z., Yang, S., & Fan, X. (2022). Twin adversarial con-
trastive learning for underwater image enhancement and beyond.
IEEE Transactions on Image Processing, 31, 4922–4936.
Marques, T. P., & Albu, A. B. (2020). L2uwe: A framework for the
efficient enhancement of low-light underwater images using local
contrast and multi-scale fusion. In Proceedings of the IEEE/CVF
conference on computer vision and pattern recognition workshops
(pp 538–539).
Mittal, A., Soundararajan, R., & Bovik, A. C. (2012). Making a “com-
pletely blind” image quality analyzer. IEEE Signal Processing
Letters, 20(3), 209–212.
Pan, J., Sun, D., Zhang, J., Tang, J., Yang, J., Tai, Y.-W., & Yang, M.-H.
(2022). Dual convolutional neural networks for low-level vision.
International Journal of Computer Vision, 130(6), 1440–1458.
Peng, Y.-T., Cao, K., & Cosman, P. C. (2018). Generalization of the dark
channel prior for single image restoration. IEEE Transactions on
Image Processing, 27(6), 2856–2868.
Peng, Y.-T., & Cosman, P. C. (2017). Underwater image restoration
based on image blurriness and light absorption. IEEE Transactions
on Image Processing, 26(4), 1579–1594.
Qi, Q., Li, K., Zheng, H., Gao, X., Hou, G., & Sun, K. (2022). Sguie-net:
Semantic attention guided underwater image enhancement with
multi-scale perception. IEEE Transactions on Image Processing,
31, 6816–6830.
Qi, Q., Zhang, Y., Tian, F., Wu, Q. J., Li, K., Luan, X., & Song, D.
(2021). Underwater image co-enhancement with correlation fea-
ture matching and joint learning. IEEE Transactions on Circuits
and Systems for Video Technology, 32(3), 1133–1147.
Ren, W., Pan, J., Zhang, H., Cao, X., & Yang, M.-H. (2020). Sin-
gle image dehazing via multi-scale convolutional neural networks
with holistic edges. International Journal of Computer Vision, 128,
240–259.
Ren, W., Zhang, J., Pan, J., Liu, S., Ren, J. S., Du, J., Cao, X., & Yang,
M.-H. (2021). Deblurring dynamic scenes via spatially varying
recurrent neural networks. IEEE Transactions on Pattern Analysis
and Machine Intelligence, 44(8), 3974–3987.
Song, W., Wang, Y., Huang, D., Liotta, A., & Perra, C. (2020). Enhance-
ment of underwater images with statistical model of background
light and optimization of transmission map. IEEE Transactions on
Broadcasting, 66(1), 153–169.
Song, W., Wang, Y., Huang, D., & Tjondronegoro, D. (2018). A rapid
scene depth estimation model based on underwater light attenuation
prior for underwater image restoration. In Pacific Rim conference
on multimedia (pp. 678–688). Springer.
Tang, Y., Iwaguchi, T., Kawasaki, H., Sagawa, R., & Furukawa, R.
(2022). Autoenhancer: Transformer on u-net architecture search
for underwater image enhancement. In Proceedings of the Asian
conference on computer vision (pp. 1403–1420).
Xu, X., Cai, Y., Liu, C., Jia, K., & Shen, L. (2008). Color cast detection
and color correction methods based on image analysis. Measure-
ment & Control Technology, 27(5), 10–12.
Yan, J., Li, J., & Fu, X. (2019). No-reference quality assess-
ment of contrast-distorted images using contrast enhancement.
arXiv:1904.08879.
Yang, H.-H., Huang, K.-C., & Chen, W.-T. (2021). Laffnet: A lightweight
adaptive feature fusion network for underwater image enhance-
ment. In 2021 IEEE international conference on robotics and
automation (ICRA) (pp. 685–692). IEEE.
Yang, M., & Sowmya, A. (2015). An underwater color image qual-
ity evaluation metric. IEEE Transactions on Image Processing,
24(12), 6062–6071.
Yuan, J., Cai, Z., & Cao, W. (2021). Tebcf: Real-world underwater
image texture enhancement model based on blurriness and color
fusion. IEEE Transactions on Geoscience and Remote Sensing,
60, 1–15.
Zang, Y., Zhou, K., Huang, C., & Loy, C. C. (2023). Semi-supervised
and long-tailed object detection with cascadematch. International
Journal of Computer Vision, 131, 1–15.
Zhang, K., Ren, W., Luo, W., Lai, W.-S., Stenger, B., Yang, M.-H.,
& Li, H. (2022). Deep image deblurring: A survey. International
Journal of Computer Vision, 130(9), 2103–2130.
Zhang, W., Dong, L., Pan, X., Zhou, J., Qin, L., & Xu, W. (2019). Single
image defogging based on multi-channel convolutional MSRCR.
IEEE Access, 7, 72492–72504.
Zhou, J., Wang, Y., & Zhang, W. (2022). Underwater image restoration
via information distribution and light scattering prior. Computers
and Electrical Engineering, 100, 107908.
Zhou, J., Yang, T., Chu, W., & Zhang, W. (2022). Underwater image
restoration via backscatter pixel prior and color compensation.
Engineering Applications of Artificial Intelligence, 111, 104785.
Zhou, J., Yang, T., Ren, W., Zhang, D., & Zhang, W. (2021). Underwater
image restoration via depth map and illumination estimation based
on a single image. Optics Express, 29(19), 29864–29886.
Zhuang, P., & Ding, X. (2020). Underwater image enhancement using
an edge-preserving filtering retinex algorithm. Multimedia Tools
and Applications, 79, 17257–17277.
Zhuang, P., Li, C., & Wu, J. (2021). Bayesian retinex underwater image
enhancement. Engineering Applications of Artificial Intelligence,
101, 104171.
Zhuang, P., Wu, J., Porikli, F., & Li, C. (2022). Underwater image
enhancement with hyper-Laplacian reflectance priors. IEEE Trans-
actions on Image Processing, 31, 5442–5455.
Publisher’s Note Springer Nature remains neutral with regard to juris-
dictional claims in published maps and institutional affiliations.