Prediction of effluent concentration in a wastewater treatment plant using machine learning models

Abstract

With the growing amount of food waste, integrated food waste and wastewater treatment has been regarded as one of the efficient treatment methods. However, loading food waste into a conventional wastewater treatment process may lead to a high concentration of total nitrogen (T-N), which impacts the effluent water quality. The objective of this study was to establish two machine learning models, artificial neural networks (ANNs) and support vector machines (SVMs), to predict the 1-day interval T-N concentration of effluent from a wastewater treatment plant in Ulsan, Korea. Daily water quality data and meteorological data were used, and the performance of both models was evaluated in terms of the coefficient of determination (R^2), Nash-Sutcliffe efficiency (NSE), and relative efficiency criteria (d_rel). Additionally, Latin-Hypercube one-factor-at-a-time (LH-OAT) and a pattern search algorithm were applied for sensitivity analysis and model parameter optimization, respectively. Results showed that both models could be effectively applied to the 1-day interval prediction of the T-N concentration of effluent. The SVM model showed higher prediction accuracy in the training stage and a similar result in the validation stage. However, the sensitivity analysis demonstrated that the ANN model was the superior model for 1-day interval T-N concentration prediction in terms of the cause-and-effect relationship between T-N concentration and the modeling input values for integrated food waste and wastewater treatment. This study suggests an efficient and robust nonlinear time-series modeling method for early prediction of the water quality of an integrated food waste and wastewater treatment process. Copyright © 2015. Published by Elsevier B.V.
Hong Guo^1, Kwanho Jeong^1, Jiyeon Lim^2, Jeongwon Jo^2, Young Mo Kim^1, Jong-pyo Park^3, Joon Ha Kim^1, Kyung Hwa Cho^2,*
1. School of Environmental Science and Engineering, Gwangju Institute of Science and Technology (GIST), 261 Cheomdan-gwagiro, Buk-gu,
Gwangju 500-712, Republic of Korea
2. School of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology, Ulsan 689-798, Republic of Korea
3. HECOREA. INC, 405, Woori Venture Town II, 70, Seonyu-ro, Yeongdeungpo-gu, Seoul, Republic of Korea
ARTICLE INFO
Article history:
Received 30 June 2014
Revised 11 December 2014
Accepted 22 January 2015
Available online 20 April 2015
1001-0742/© 2015 The Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences. Published by Elsevier B.V.
JOURNAL OF ENVIRONMENTAL SCIENCES 32 (2015) 90–101
http://dx.doi.org/10.1016/j.jes.2015.01.007
* Corresponding author. E-mail: khcho@unist.ac.kr (Kyung Hwa Cho).
Keywords:
Artificial neural network
Support vector machine
Effluent concentration
Prediction accuracy
Sensitivity analysis
Introduction
Following the restrictive landfill legislation passed by the
European Union (EU) in 1999, many developed countries have
implemented various policies and technical developments for
reducing the quantity of biodegradable waste sent to landfill (Burnley
et al., 2011; García et al., 2005). The South Korean government
also prohibited the landfill of municipal solid sludge (MSS)
and food waste (FW) in the early 21st century (S. Cheon et al.,
2013). However, this strict regulation led to the dumping of
both the sludge and FW water (i.e., leachate) at sea, which
consequently led to the prohibition of its disposal in the ocean
by the Marine Environment Management Act (Behera et al.,
2010; S. Cheon et al., 2013). At this time, basic environmental
treatment facilities such as wastewater treatment plants
(WWTPs) appear to be an alternative inland treatment option to
resolve the problem. In inland treatment at a wastewater
treatment plant, over 80% of FW, which is the recyclable organic
fraction of municipal solid waste (MSW), is dehydrated, and
the remaining waste goes through recycling processes such
as composting, feed production, and anaerobic digestion to generate
biomass energy (Chelliapan et al., 2012; Li et al., 2012). In
particular, methane gas, one of the biogases, can be utilized as
a biomass energy source (Lee et al., 2009). However, a large
amount of FW leachate inevitably arises in all recycling
processes because of the high moisture content of FW
(Han et al., 2012), resulting in a significant burden on
the wastewater treatment systems.
According to several research reports (Kim et al., 2008;
Sosnowski et al., 2003), the water treatment process can be
more effective by using FW leachate in WWTP. This is because
the FW leachate contains a large amount of acid fermentation
liquid (AFL) which can be utilized as an organic carbon source
for removing nitrogen and phosphorus in advanced waste-
water treatment (AWT) processes (Han et al., 2012; Lee et al.,
2003). The digestion process with only sewage sludge could be
less effective due to the low carbon/nitrogen (C/N) ratio and
low level of biodegradable organic compounds. FW leachate
contains a high amount of solid contents as well as a high C/N
ratio, while containing a low amount of the nutrient-type
elements (Mata-Alvarez, 2003). Therefore, the combined
treatment of sewage sludge and FW improves the removal
efficiency of nitrogen and phosphorus in AWT, enhancing the
stability of the digestion process. Furthermore, higher pro-
duction of methane gas is an additional benefit from the
co-digestion with FW leachate (Cecchi et al., 1988; Hamzawi et al.,
1998; Mata-Alvarez et al., 1990; Poggi-Varaldo and Oleszkiewicz,
1992; Schmit and Ellis, 2001). Owing to these advantages, the
anaerobic digestion of sewage sludge with FW has become
increasingly common in WWTPs in Korea. However, this process also faces
critical issues associated with the side effects of
co-digestion. One of the issues is that the influent water quality is
degraded by mixing with the FW leachate returned from the
anaerobic co-digestion process, which tends to increase the mixed liquor
suspended solids (MLSS) and causes a large amount of scum in
the activated sludge reactor (Kim and Shin, 2009; Mahmoud et al.,
2003). In addition, a sudden increase in FW leachate could cause
an unstable digestion process and lower the effluent
water quality of WWTPs (S. Cheon et al., 2013).
Generally, the water quality of a WWTP is sensitive to parameters
such as pH, temperature, concentrations of specific substrates,
and contaminants. This is because wastewater is treated by the
metabolic processes of microorganisms. However, biological
treatment still exhibits time-varying and highly nonlinear
characteristics affected by various known and unknown parameters
(Hamed et al., 2004; Hong et al., 2003; Mjalli et al., 2007). Due
to these complicated features, many previous studies evaluated
and diagnosed the performance of WWTPs by using mathematical
models for process simulation and control (Gernaey et al.,
2004; Hamed et al., 2004; Hong et al., 2003; Iacopozzi et al., 2007;
Mjalli et al., 2007; Rivas et al., 2008; Wintgens et al., 2003).
Among these, machine learning models have proved to be useful
tools because they achieve relatively high accuracy when dealing with
complicated systems. Furthermore, a key advantage of these
models for evaluating WWTP performance is that, once trained
and validated, they can predict output values directly from
input values. Artificial neural networks (ANNs)
and support vector machines (SVMs) are representative machine
learning techniques (Dreyfus, 2005; Shon and Moon, 2007). The
performance of these two machine learning models has been widely
discussed (Hamed et al., 2004; Palani et al., 2008; Singh et
al., 2009; Yoon et al., 2011). However, purely black-box modeling is
of limited use for process control, and the
cause-and-effect relationship between input and output
values has yet to be elucidated for process control.
In this study, two machine learning models were
developed to predict the effluent T-N concentration of the
integrated food waste and wastewater treatment plant in
Ulsan Metropolitan City, Korea. Moreover, through sensitivity
analysis between the input values and output values, the
cause-and-effect relationship was elucidated to support
future process control and the selection of the preferred machine
learning model for integrated food waste and wastewater
treatment. The objectives of this study were: a) to develop a
reliable 1-day interval early T-N concentration prediction
model by a parameter optimization method; b) to evaluate
the constructed models by sensitivity analysis in order to find a
reasonable, cause-and-effect based model as a future decision-making tool; and
c) to propose an early warning prediction tool to avoid the
impact of FW leachate loading on the integrated food waste
and wastewater treatment.
1. Method and materials
1.1. Field sampling
We collected water samples in an attempt to investigate the
effect of FW leachate on Yong-yeon (YY) WWTP in Ulsan. The
samples were collected from 6 different spots, including
influent, flow-distribution tank, aeration tank, effluent, FW
leachate, and pre-treated FW leachate (Fig. 1). The collected
samples were delivered to a laboratory at the Ulsan National
Institute of Science and Technology (UNIST) and were
analyzed in terms of total suspended solids (TSS), chemical
oxygen demand (COD), total nitrogen (T-N), and total phos-
phorus (T-P); water temperature and pH were measured in-situ
at the sampling stations.
1.2. Sample analysis
TSS of a water sample was measured by filtering a 20 mL
sample through pre-weighed 47 mm Glass-Fiber paper (with
1.2 μm pore size), then weighing the filter again after drying to
remove all water in the sample. COD, T-P, and T-N were
measured through absorptiometric analysis. COD and T-P were
measured for 4 sampling locations: influent, flow-distribution
tank, aeration tank, and effluent. T-N was measured for 6
sampling locations including 2 additional stations (i.e., pre- and
post-aerobic transamination of FW leachate). The absorbance
of samples, which were mixed with the proper reagents, was
measured over the 200-900 nm wavelength range and the target
components were quantified. Distilled water was used as the
reference solution. For COD quantification, 0.5 mL of each water
sample was put into a sulfuric acid solution, and then
0.6 mL of standard potassium permanganate (KMnO4) solution
(0.005 mol/L) was added. The mixed solution was heated for
15 min at 100°C. After the reaction, the oxygen demand was
determined from the amount of potassium permanganate consumed.
For T-P measurement, the water sample was pre-treated
by adding potassium persulfate to 5 mL of water sample
and heating for 30 min at 120°C. After heating the pre-treated
water sample, 2 mL of a mixture of ammonium molybdate and
ascorbic acid was added to the sample. The reference solution
was read at a wavelength of 880 nm, and the T-P of the
water sample was measured by quantifying the amount of
reduced phosphate. For quantifying T-N, the samples from pre-
and post-aerobic treatment of FW leachate had to be diluted to
a 1/25 ratio due to their high concentration levels. The water
samples were pre-treated by adding alkaline potassium
persulfate to 0.5 mL of water sample, which was then heated
for 30 min at 120°C. After adding hydrochloric acid to bring the pH to 2
to 3, T-N was finally measured from the absorbance at a wavelength
of 220 nm. In this procedure, the T-N of the water sample was
measured by oxidizing nitrogenous compounds to nitrate ions
and was calculated from the difference in light intensity between the
reference and the sample.
1.3. Modeling approaches
1.3.1. Artificial neural networks (ANNs)
As the name implies, an artificial neural network is a flexible,
data-based mathematical structure modeled on a neural network.
It is a very powerful computational technique for
modeling complex non-linear relationships and analyzing
the explicit form of the relations between variables. It was
first introduced in the early 1940s and developed further
with the back-propagation (BP) algorithm in 1988 (Gallant, 1993;
McCulloch and Pitts, 1943; Rumelhart et al., 1988; Smith, 1993).
Artificial neural networks are regarded as standard
data-based nonlinear estimation tools and are widely applied
for prediction and forecasting in environment-related
fields, including water treatment (Liong et al., 2001;
Muttil and Chau, 2006), oceanography (Lee, 2004), and ecological
science (Trichakis et al., 2011). Data-based
modeling of water quality (Ahmed and Sarma, 2007; Cho et al.,
2011; Karul et al., 2000; Lee et al., 2010; Lek and Guégan, 1999;
Rogers and Dowla, 1994; Yan et al., 2010) has also been applied
successfully over the past 20 years.
A common ANN structure, called a multilayer perceptron
network, consists of three distinct layers (input, hidden,
and output) with linked nodes and functions. After data are
introduced into the ANN model, the network processes them through
neurons, which are non-linear algebraic functions (i.e., transfer
functions) (Dreyfus et al., 2002). The signal passes from one
neuron to another through the weights and transfer
function (Govindaraju, 2000), and the back-propagation algorithm
can effectively train the network for nonlinear
problems by adjusting the weights in an attempt
to minimize the objective function
(Rumelhart and Mcclelland, 1986). The mathematical expression
of the ANN model in this study is as follows (Khalil et al.,
2005):
$$y_i = f\left(\sum_{j=1}^{N} W_{ij} X_j + b_i\right) \qquad (1)$$
where $X_j$ is the jth nodal value in the previous layer, $y_i$ is
the ith nodal value in the current layer, and $N$ is the number of nodes in the previous layer.
Fig. 1 Schematic diagram of wastewater treatment plants (WWTPs); the dashed box indicates the system boundaries for the
machine learning model development in this study. Sampling points are numbered (1)–(6). TN conc: total nitrogen
concentration.
Following Eq. (1), the current nodal value is calculated by multiplying each nodal value of the previous layer by its weighting factor ($W_{ij}$), adding the
bias of the ith node ($b_i$), and applying the activation
function $f$. A feed-forward artificial neural network with three layers (input, hidden, and
output) was built for predicting effluent concentrations in the YY WWTP from 8
input variables, $X_i$ ($i = 1, \ldots, 8$): month, volumetric flow rate of
inflow, pH, temperature, chemical oxygen demand, suspended
solid, T-N of inflow, and T-N of pre-treated FW leachate (Fig. 2). Fewer
hidden nodes are usually preferable because they give better generalization
and help to avoid over-fitting. However,
too few nodes also impair the training and validation performance of the
network (Ahmed and Sarma, 2007;
Palani et al., 2008). The optimal values of the number of hidden
nodes, the learning rate, and the momentum were
determined by the pattern search algorithm. In addition, we
tested the logistic sigmoid function, the tangent sigmoid function,
and the linear function as candidate transfer functions, and
the best-fitting model was obtained with the tangent sigmoid transfer
function.
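As an illustration of Eq. (1), the following minimal sketch propagates one input vector through a network with 8 inputs, one hidden layer, and one output, using the hyperbolic tangent as the transfer function in both layers. It is written in Python with NumPy rather than the MATLAB environment used in the study; the function name `layer_forward`, the random weights, and the random input are hypothetical placeholders, not the trained model.

```python
import numpy as np

def layer_forward(x, W, b, f=np.tanh):
    """One layer of Eq. (1): y_i = f(sum_j W_ij * x_j + b_i)."""
    return f(W @ x + b)

# Assumed sizes: 8 input variables (as in the text) and 11 hidden nodes
# (the value reported for the optimized ANN). Weights are random placeholders.
rng = np.random.default_rng(0)
n_inputs, n_hidden = 8, 11
W1 = rng.normal(scale=0.5, size=(n_hidden, n_inputs))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(1, n_hidden))
b2 = np.zeros(1)

# A hypothetical input vector already scaled to [-1, 1].
x = rng.uniform(-1.0, 1.0, size=n_inputs)

hidden = layer_forward(x, W1, b1)   # hidden-layer nodal values
y = layer_forward(hidden, W2, b2)   # scaled prediction of effluent T-N
print(hidden.shape, y.shape)
```

Because the output layer also uses the hyperbolic tangent, the prediction stays within the interval from −1 to 1 and has to be mapped back to concentration units afterwards.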
1.3.2. Support vector machines (SVM)
Support vector machines (SVMs) are data-based machine
learning models based on structural risk minimization
(SRM) (Vapnik, 1995, 1999). SRM minimizes the
empirical error and the model complexity simultaneously, which
can improve the generalization ability
in classification and regression problems (Yoon et al., 2011).
SVMs have been widely verified in numerous environmental
research areas. Dibike et al. (2001) applied various kernel
functions of SVM to predict rainfall, and Khalil et al. (2005)
used SVM to study an agriculture-dominated watershed
by analyzing the spatial distribution features of groundwater.
SVMs have also been widely applied in other fields, including the
prediction of stream flow, the water level of lakes, and soil moisture
(Gill et al., 2006; Khalil et al., 2006; Khan and Coulibaly, 2006;
Liong and Sivapragasam, 2002).
SVM models could be classified into two types: linear
support vector regression and nonlinear support vector
regression. The nonlinear support vector regression mathe-
matical model was used for model development in this study.
Mathematically, it can be described as follows:
$$f(X_i) = \sum_{i=1}^{N} W_i \, \varphi(X_i) + b \qquad (2)$$
where $W_i$ and $b$ are the parameters of the linear support
vector regression function and $\varphi(X_i)$ is the nonlinear mapping
function. To avoid computing the nonlinear mapping
function explicitly, the kernel function $K(x_i, x_j) = (\varphi(x_i) \cdot \varphi(x_j))$ is
applied to evaluate the inner products in the feature space
(Yu et al., 2006). We tested several kernels
(linear, polynomial, sigmoid, and radial basis function)
and found that the radial basis function led to the
best-fitting model for effluent water quality prediction in
this study. In addition, the optimal values of the key model parameters,
namely the cost constant (C), the radius of the
insensitive tube (ε), and the kernel scale parameter
(σ), were determined by the optimization
algorithm.
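The kernel idea described above can be sketched as follows. This is an illustration only: it assumes the common Gaussian form of the radial basis function, K(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²)), which may differ from the exact parameterization of σ used in the study, and it writes the regression function in the usual kernel expansion over support vectors. The names `rbf_kernel` and `svr_predict` and all numbers are hypothetical.

```python
import numpy as np

def rbf_kernel(x_i, x_j, sigma):
    """Gaussian RBF kernel; the 2*sigma**2 scaling is an assumed convention."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def svr_predict(x, support_vectors, coeffs, b, sigma):
    """Kernel expansion f(x) = sum_i coeff_i * K(sv_i, x) + b."""
    return sum(c * rbf_kernel(sv, x, sigma)
               for sv, c in zip(support_vectors, coeffs)) + b

# Hypothetical support vectors and coefficients, for illustration only.
svs = [np.array([0.0, 0.0]), np.array([1.0, 0.0])]
coeffs = [2.0, -1.0]
print(svr_predict(np.array([0.0, 0.0]), svs, coeffs, b=0.5, sigma=1.0))
```

The kernel equals 1 when both arguments coincide and decays with distance, so each support vector mainly influences predictions in its own neighborhood; σ controls how quickly that influence fades.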
1.4. Model construction
1.4.1. Input data preparation
Fig. 2 Illustration of the general conceptual model structure for artificial neural networks (ANNs) and support vector machines
(SVMs).
MATLAB was used to build the ANN and SVM models for predicting
effluent T-N in the WWTP. As shown in the workflow of the ANN and
SVM models in Fig. 3, the total dataset was divided
into two groups: a training data set and a validation data set. For
the model input data, we used the data from January to
August for training the models and the data from
September to October for validation, following optimization
work on the training and validation periods. All data
were normalized to the range from −1 to 1. After that, the
normalized data were used as input and output data for the
ANN and SVM models. The optimal parameters of
the two models were each determined by applying a global
optimization algorithm. After determining the
model parameters, the T-N concentration of the effluent
was predicted by the ANN and SVM models, and the predicted
values were compared with the measured values to evaluate the
prediction accuracy. Other pollutants such as COD, BOD
(biochemical oxygen demand), and T-P were not considered as
output values for model development because only the
concentration of T-N in the effluent was regarded as an indicator
of the effect of the FW leachate on the wastewater treatment in
this study.
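The normalization step can be sketched as a min-max scaling, assuming the intended target range is −1 to 1 and that each variable is scaled with its own minimum and maximum. The function names and the sample values below are hypothetical and are not taken from the measured data set.

```python
import numpy as np

def scale_to_unit_range(values, lo=None, hi=None):
    """Min-max scale values to [-1, 1]; returns the scaled data and the bounds used."""
    values = np.asarray(values, dtype=float)
    lo = values.min() if lo is None else lo
    hi = values.max() if hi is None else hi
    scaled = 2.0 * (values - lo) / (hi - lo) - 1.0
    return scaled, lo, hi

def unscale(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return (np.asarray(scaled, dtype=float) + 1.0) * (hi - lo) / 2.0 + lo

# Hypothetical effluent T-N values in mg/L.
tn = [10.0, 14.0, 20.0]
tn_scaled, tn_lo, tn_hi = scale_to_unit_range(tn)
print(tn_scaled)                        # [-1.  -0.2  1. ]
print(unscale(tn_scaled, tn_lo, tn_hi))  # [10. 14. 20.]
```

Keeping the bounds returned by the scaling function makes it possible to apply the same transformation to the validation period and to convert model outputs back to mg/L.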
1.4.2. Model parameter optimization
For both ANN and SVM, the model parameters greatly influence
the learning and prediction accuracy of the output values. Palani
et al. (2008) found that an insufficient number of nodes would
lead to impaired performance of the network. Normally, the
optimum values of the parameters are determined by trial and
error or are based on previous research. In this study, we used the
pattern search algorithm (Lewis and Torczon, 2002) to determine
the optimum values of the parameters of the ANN and SVM models,
as shown in Table 2. The initial ranges of each parameter were
selected based on previous research (Cho et al., 2009; Wang et al.,
2003).
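To make the idea of a pattern search concrete, the sketch below implements a basic bounded compass search, one simple member of the pattern search family. It is not the MATLAB routine used in the study; the function name `pattern_search`, the step-halving rule, and the quadratic test objective are assumptions chosen for illustration. In a real application the objective would be a model error measure evaluated for a candidate parameter set.

```python
def pattern_search(objective, x0, step, lower, upper, tol=1e-3, max_iter=1000):
    """Minimize objective by polling +/- step along each coordinate.

    The step is halved whenever no poll point improves the current best,
    and the search stops once every step is smaller than tol.
    """
    x = list(x0)
    fx = objective(x)
    step = list(step)
    for _ in range(max_iter):
        improved = False
        for k in range(len(x)):
            for direction in (1.0, -1.0):
                trial = list(x)
                # Keep the trial point inside the parameter bounds.
                trial[k] = min(max(x[k] + direction * step[k], lower[k]), upper[k])
                f_trial = objective(trial)
                if f_trial < fx:
                    x, fx = trial, f_trial
                    improved = True
                    break
        if not improved:
            step = [s / 2.0 for s in step]
            if max(step) < tol:
                break
    return x, fx

# Hypothetical objective with its minimum at (0.3, 0.7).
best, f_best = pattern_search(
    lambda v: (v[0] - 0.3) ** 2 + (v[1] - 0.7) ** 2,
    x0=[0.5, 0.5], step=[0.25, 0.25], lower=[0.0, 0.0], upper=[1.0, 1.0],
)
print(best, f_best)
```

The method needs only objective values, not gradients, which is convenient when the objective is the validation error of a trained network or SVM.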
1.4.3. Assessment of model performance
To judge the performance of each machine learning model
(ANN and SVM), the selection of suitable criteria is critical.
Krause et al. (2005) found that
no single efficiency criterion can fully
explain model performance, since each has
its pros and cons. We therefore applied three criteria for training and validation: the coefficient of
determination ($R^2$), Nash–Sutcliffe efficiency (NSE), and relative
efficiency criteria ($d_{rel}$), which are
among the most frequently applied in the water science field (Krause et al.,
2005).
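The coefficient of determination is defined and
calculated as follows:
$$R^2 = \frac{\left[\sum_i \left(Q_m(i) - \bar{Q}_m\right)\left(Q_o(i) - \bar{Q}_o\right)\right]^2}{\sum_i \left(Q_m(i) - \bar{Q}_m\right)^2 \, \sum_i \left(Q_o(i) - \bar{Q}_o\right)^2} \qquad (3)$$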
The value of the coefficient ranges from 0 to
1 (no correlation to a perfect fit), and it indicates how much of the
dispersion of the measured values is explained by the
model prediction.
Fig. 3 Logical flow of the two machine learning modeling studies (artificial neural networks (ANNs) and support vector machines
(SVMs)). SS: suspended solids; T-N: total nitrogen; tansig/tansig: the transfer function of ANNs; RBF: the kernel function of
SVMs.
Nash–Sutcliffe efficiency was developed in the 1970s and
has been widely applied to assess hydrological models. It is
very sensitive to extreme values and might give
suboptimal results for datasets that contain extreme
data. It is calculated as:
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{N} \left(Q_o(i) - Q_m(i)\right)^2}{\sum_{i=1}^{N} \left(Q_o(i) - \bar{Q}_o\right)^2} \qquad (4)$$
where $Q_o(i)$ and $Q_m(i)$ are the measured value and the modeled
value at the ith observation, $\bar{Q}_o$ is the average measured value,
and $N$ is the total number of samples.
Both the coefficient
of determination and the Nash–Sutcliffe efficiency quantify the
difference between the measured and modeled values in
absolute terms. However, with absolute differences, over- or
under-prediction of high values has a greater influence than that of low values. To counteract this
problem, we additionally applied the relative efficiency criteria
($d_{rel}$), which significantly reduces the influence of the absolute differences between
the measured and modeled values during high values.
$$d_{rel} = 1 - \frac{\sum_i \left(\dfrac{Q_o(i) - Q_m(i)}{Q_o(i)}\right)^2}{\sum_i \left(\dfrac{\left|Q_m(i) - \bar{Q}_o\right| + \left|Q_o(i) - \bar{Q}_o\right|}{\bar{Q}_o}\right)^2} \qquad (5)$$
The relative efficiency criteria also range
from 0 to 1.
Besides applying the aforementioned criteria, fitness of the
constructed models was checked through the residual anal-
ysis (Krause et al., 2005).
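The three criteria can be computed as in the sketch below, which assumes the standard definitions given by Krause et al. (2005) for the coefficient of determination, the Nash–Sutcliffe efficiency, and the relative index of agreement. The function names and the sample values are hypothetical and are not taken from the study data.

```python
import numpy as np

def r_squared(obs, mod):
    """Squared Pearson correlation between measured and modeled values."""
    obs, mod = np.asarray(obs, dtype=float), np.asarray(mod, dtype=float)
    d_obs, d_mod = obs - obs.mean(), mod - mod.mean()
    return float(np.sum(d_mod * d_obs) ** 2 / (np.sum(d_mod ** 2) * np.sum(d_obs ** 2)))

def nse(obs, mod):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    obs, mod = np.asarray(obs, dtype=float), np.asarray(mod, dtype=float)
    return float(1.0 - np.sum((obs - mod) ** 2) / np.sum((obs - obs.mean()) ** 2))

def d_rel(obs, mod):
    """Relative index of agreement; observed values must be non-zero."""
    obs, mod = np.asarray(obs, dtype=float), np.asarray(mod, dtype=float)
    obs_mean = obs.mean()
    num = np.sum(((obs - mod) / obs) ** 2)
    den = np.sum(((np.abs(mod - obs_mean) + np.abs(obs - obs_mean)) / obs_mean) ** 2)
    return float(1.0 - num / den)

# Hypothetical measured and modeled effluent T-N values in mg/L.
measured = [10.0, 12.0, 15.0, 20.0]
modeled = [11.0, 12.5, 14.0, 18.0]
print(r_squared(measured, modeled), nse(measured, modeled), d_rel(measured, modeled))
```

A model that reproduces the measurements exactly gives 1 for all three criteria, while a model that always predicts the measured mean gives an NSE of 0. A prediction that is a perfect linear function of the measurements still gives an R² of 1 even when it is biased, which is one reason for reporting more than one criterion.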
1.5. Sensitivity analysis
Latin Hypercube One-factor-At-a-Time (LH-OAT) sensitivity
analysis was applied to the input parameters that may
influence the prediction of the T-N concentration of
the effluent. LH-OAT, a sensitivity analysis method that
ranks parameters by sensitivity, combines
the One-factor-At-a-Time (OAT) and Latin Hypercube (LH)
sampling methods. Under LH-OAT sensitivity analysis, all
parameters are sampled with the precision of the OAT
method, so that any change in the output value can be clearly
attributed to the changed input. LH-OAT is also a
very efficient method: for m intervals in the LH method and p parameters, a total
of m × (p + 1) model runs are required (van Griensven et al., 2006).
For each input parameter, the boundaries were set to the
minimum and maximum values.
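The LH-OAT procedure can be sketched as below. This is a simplified illustration under stated assumptions: each parameter is perturbed by a fixed fraction of its range (rather than of its value), perturbations are applied one at a time from each Latin Hypercube point, and the partial effect is the absolute percentage change of the output relative to the mean of the two outputs, divided by the fraction. The function name `lh_oat` and the linear test model are hypothetical; the model output is assumed to be positive, as for a concentration.

```python
import numpy as np

def lh_oat(model, lower, upper, m=10, fraction=0.05, seed=0):
    """Return the mean partial effect per parameter and the number of model runs."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, dtype=float), np.asarray(upper, dtype=float)
    p = lower.size
    # Latin Hypercube: each parameter range is split into m strata and every
    # stratum is sampled exactly once, in an independently shuffled order.
    strata = np.column_stack([rng.permutation(m) for _ in range(p)])
    points = lower + (strata + rng.uniform(size=(m, p))) / m * (upper - lower)

    effects = np.zeros((m, p))
    n_runs = 0
    for j, base in enumerate(points):
        y_base = model(base)
        n_runs += 1
        for i in range(p):
            delta = fraction * (upper[i] - lower[i])
            trial = base.copy()
            # Step upward unless that would leave the parameter range.
            trial[i] = base[i] + delta if base[i] + delta <= upper[i] else base[i] - delta
            y_trial = model(trial)
            n_runs += 1
            effects[j, i] = abs(100.0 * (y_trial - y_base) / ((y_trial + y_base) / 2.0) / fraction)
    return effects.mean(axis=0), n_runs

# Hypothetical model in which the first input matters far more than the second.
final_effect, runs = lh_oat(lambda x: 10.0 + 5.0 * x[0] + 0.1 * x[1], [0.0, 0.0], [1.0, 1.0])
print(final_effect, runs)
```

With m = 10 intervals and p = 2 parameters the sketch performs 10 × (2 + 1) = 30 model runs, matching the m × (p + 1) count given in the text, and the parameters can then be ranked by their mean effect.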
2. Results and discussions
2.1. Water quality monitoring
The daily data for the three water quality parameters (T-N, T-P,
and TSS) measured over 10 months at the 6 monitoring
stations (influent, flow-distribution tank, aeration tank, effluent,
FW leachate, and pre-treated FW leachate) are summarized in
Table 1. T-N and TSS are two important parameters for
the assessment of water quality. The
measured T-N and TSS data, together with month, volumetric flow rate of
the inflow, pH, temperature, and COD, were used as the input
parameters for machine learning model construction. All of
these input parameters were examined through the sensitivity
analysis to demonstrate their relationship with the T-N
concentration of the effluent and were finally selected for model
development.
The box plots in Fig. 4 show the statistical analysis of the measured
water quality variables from Table 1. The T-N
concentrations in the flow-distribution tank and the aeration tank
were 38.884 ± 20.508 mg/L and 47.569 ± 20.933 mg/L, respectively.
After the food sludge treatment, the pre-treated FW leachate
is recycled to the flow-distribution tank. We measured the
water quality parameters (T-N, T-P, and TSS) of the influent to
each process and found that the water quality of the aeration tank is
related to the recycling of the pre-treated FW leachate.
According to Table 1, from the flow-distribution tank to the aeration tank the T-N concentration increased by 8.685 mg/L
on average, and T-P and TSS increased by 3.858 mg/L and
107.978 mg/L, respectively. Since the
T-N concentration of the pre-treated FW leachate exceeds that of the flow-distribution
tank by more than one order of magnitude, the
aforementioned results are plausible, and we observe
that the integrated treatment of food waste can greatly affect
the water quality of the WWTP in Ulsan.
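The increases quoted above follow directly from the average values in Table 1, taking the difference between the aeration-sedimentation and flow-distribution columns; the last line compares the average T-N of the pre-treated FW leachate with that of the flow-distribution tank:

```latex
\begin{align*}
\Delta\,\text{T-N} &= 47.569 - 38.884 = 8.685\ \text{mg/L} \\
\Delta\,\text{T-P} &= 6.559 - 2.701 = 3.858\ \text{mg/L} \\
\Delta\,\text{TSS} &= 288.306 - 180.328 = 107.978\ \text{mg/L} \\
\frac{\text{T-N}_{\text{pretreatment}}}{\text{T-N}_{\text{flow-distribution}}} &= \frac{1662.938}{38.884} \approx 42.8
\end{align*}
```

A ratio of about 43 corresponds to more than one order of magnitude.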
Table 1 Measured data (10 months) of the water quality variables for each process in the Ulsan wastewater treatment plant.
Variable     Statistic  Influent  Flow-distribution  Aeration-sedimentation  Effluent  Supernatant  Pretreatment
T-N (mg/L)   Average    39.074    38.884             47.569                  14.639    2578.980     1662.938
             Std.       13.842    20.508             20.933                  9.522     1138.601     1055.886
             Median     37.390    35.180             42.185                  13.618    2221.800     1323.800
T-P (mg/L)   Average    5.544     2.701              6.559                   1.083     –            –
             Std.       1.925     0.570              1.893                   0.852     –            –
             Median     5.515     2.800              6.333                   0.978     –            –
TSS (mg/L)   Average    266.532   180.328            288.306                 97.581    –            –
             Std.       108.449   88.755             97.375                  62.020    –            –
             Median     250.000   175.000            287.500                 100.000   –            –
T-N: total nitrogen; T-P: total phosphorus; TSS: total suspended solids; Std.: standard deviation; –: not measured.
2.2. Training and validation of ANN and SVM models
Different ANN and SVM models were built and tested
in order to determine the optimum model for the prediction of
effluent T-N concentration in this study. For the ANN model,
the hyperbolic tangent transfer function (a nonlinear transfer
function) was determined to be the optimal function for both the
hidden and output layers. For SVM, the RBF kernel
function was used in the transformation layer.
Additionally, the selection of an appropriate number of nodes
in the hidden layer is critical,
because an excessive number
of nodes can lead to over-fitting. In this study, we applied the pattern search
algorithm to find the optimum parameters, including the
number of hidden-layer nodes, for the ANN and SVM models.
Table 2 shows the optimum parameters for ANN and SVM
obtained from the pattern search algorithm.
2.3. Parameter sensitivity analysis
Table 3 summarizes the sensitivity ranking of
the input parameters with respect to the T-N concentration of the
effluent. It shows the importance of the spatial and temporal
variables to the model predictions. In the ANN case,
temperature was the most important parameter, followed by
the T-N of the inflow and pH. On the other hand,
month, COD, and SS were the three most important parameters
for the SVM model.
For a biological water treatment plant, temperature, pH,
and organic carbon are the three most important operational
conditions for bacterial growth, which can affect the T-N removal
efficiency of biological water treatment. Hence, it is reasonable that
temperature and pH were identified among the
most significant parameters for predicting the T-N concentration
of the effluent by the ANN model in this study.
Additionally, the T-N concentration of the inflow was also an
important input parameter, since it directly determines the
amount of T-N entering the wastewater treatment. Hence,
the ANN model appears to be a more reasonable
model than SVM when the
characteristics of the biological treatment process are considered. For
process control, the ANN model, with its more physically reasonable relations,
could be the more reliable model for avoiding
the impact of high T-N concentrations
on the system by adjusting the most physically related
parameters.
Machine learning models do not have to represent
physical meaning through their input and output variables, but
the result of the sensitivity analysis still shows that
the highest ranking parameter of ANN was temperature, whereas
for SVM the highest ranking parameter was
month, which does not appear to be related to any physical
meaning. Considering the relationship between temperature
and month, and the additional effect of ionic strength
and flocs in different seasons (or months), the result
of ANN was more acceptable (Zita and Hermansson, 1994).
Additionally, the final effect values in Table 3 show
that there was little difference among the variables in SVM. In
terms of process control, ANN showed more reliable
and reasonable results than the SVM model.
Fig. 4 Basic statistics analysis of the measured water quality data. T-N: total nitrogen (log scale); T-P: total phosphorus; TSS:
total suspended solids; a: influent; b: flow-distribution; c: aeration-sedimentation; d: effluent; e: supernatant; f: pretreatment.
Table 2 Comparison of the optimized artificial neural networks (ANN) and support vector machines (SVM) performances
for prediction of total nitrogen (T-N) concentration of the effluent from the wastewater treatment plant in Ulsan.
| Site | Model | Model parameters | R² (Tr) | R² (Vl) | NSE (Tr) | NSE (Vl) | d_rel (Tr) | d_rel (Vl) |
|---|---|---|---|---|---|---|---|---|
| Ulsan wastewater treatment plant | ANNs (Tansig/Tansig) | lr: 0.50; mo: 0.742; #N: 11 | 0.55 | 0.47 | 0.56 | 0.46 | 0.80 | 0.76 |
| Ulsan wastewater treatment plant | SVMs (RBF) | C: 50.005; ε: 0.001; σ: 4.693 | 1.00 | 0.46 | 1.00 | 0.45 | 0.99 | 0.77 |
lr: the learning rate; mo: momentum; #N: number of hidden neurons; C: the cost constant; ε: the radius of the insensitive tube; σ: the parameter of
the kernel function; R²: the coefficient of determination; NSE: Nash–Sutcliffe model efficiency; d_rel: relative efficiency criteria; Tr: the training
step; Vl: the validation step; Tansig/Tansig: the transfer function of ANNs selected in this study; RBF: the kernel function of SVMs selected in
this study.
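For readers less familiar with the SVM parameters listed in Table 2, the standard support vector regression formulation can be written as follows. This is a textbook sketch rather than the exact formulation used in the study: in particular, writing the RBF kernel with a width σ (instead of γ = 1/(2σ²)) is an assumption about how σ is defined here.

```latex
% RBF (Gaussian) kernel, assuming sigma is the kernel width
K(\mathbf{x}_i, \mathbf{x}_j)
  = \exp\!\left(-\frac{\lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{2}}{2\sigma^{2}}\right)

% epsilon-insensitive loss: residuals inside the tube of radius epsilon cost nothing
L_{\varepsilon}\bigl(y, f(\mathbf{x})\bigr)
  = \max\bigl(0,\; \lvert y - f(\mathbf{x}) \rvert - \varepsilon\bigr)

% primal objective: C trades model flatness against tube violations
\min_{\mathbf{w},\, b,\, \xi,\, \xi^{*}} \;
  \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \bigl(\xi_i + \xi_i^{*}\bigr)
\quad \text{s.t.} \quad
\begin{cases}
  y_i - f(\mathbf{x}_i) \le \varepsilon + \xi_i \\
  f(\mathbf{x}_i) - y_i \le \varepsilon + \xi_i^{*} \\
  \xi_i,\ \xi_i^{*} \ge 0
\end{cases}
```

Under this formulation, a small ε leaves little room for training residuals to go unpenalized, and a larger C penalizes the remaining violations more heavily.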
96 JOURNAL OF ENVIRONMENTAL SCIENCES 32 (2015) 90–101
2.4. Model test
Measured T-N concentrations of the effluent from the WWTP
in Ulsan were compared with the values modeled by the
machine learning models (ANN and SVM), using both
regression plots and residual values, to check the models'
performance. Fig. 5 shows the regression plots for the
training and validation steps of both the ANN and the SVM.
Both the ANN and the SVM model gave a good fit for the
modeled data. As Table 2 shows, for the training and
validation steps of the ANN model, the coefficient of
determination (R²) was 0.55 and 0.47, the Nash–Sutcliffe
efficiency (NSE) was 0.56 and 0.46, and the relative
efficiency criterion was 0.80 and 0.76, respectively. For the
SVM model, R² was 1.00 and 0.46 for training and validation,
NSE was 1.00 and 0.45, and the relative efficiency criterion
was 0.99 and 0.77 (Table 2).
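The three criteria reported in Table 2 can be computed from paired measured and modeled values as in the minimal sketch below. R² is taken as the squared Pearson correlation and NSE follows its usual definition. The form of d_rel is an assumption: it is written here as the relative index of agreement described by Krause et al. (2005), since the formula is not restated in this section. The function names and the sample values are illustrative only and are not data from the study.

```python
def _mean(values):
    return sum(values) / len(values)


def r_squared(obs, pred):
    """Coefficient of determination as the squared Pearson correlation."""
    mo, mp = _mean(obs), _mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    mo = _mean(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mo) ** 2 for o in obs)
    return 1.0 - sse / sst


def d_rel(obs, pred):
    """Relative index of agreement (assumed form; observations must be non-zero)."""
    mo = _mean(obs)
    num = sum(((o - p) / o) ** 2 for o, p in zip(obs, pred))
    den = sum(((abs(p - mo) + abs(o - mo)) / mo) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


# Illustrative values only, not measurements from the Ulsan plant.
measured = [10.0, 12.0, 14.0, 16.0]
modeled = [11.0, 12.0, 15.0, 16.0]
print(r_squared(measured, modeled), nse(measured, modeled), d_rel(measured, modeled))
```

Because the relative criterion divides each error by the observed value, it weights errors at low concentrations more heavily than NSE does, which may be one reason the d_rel values in Table 2 differ from the R² and NSE values.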
Table 3 Sensitivity rank of input variables in artificial neural networks (ANNs) and support vector machines (SVMs) using
the Latin Hypercube One factor At a Time (LH-OAT) sensitivity analysis for the Ulsan wastewater treatment plant.
| Rank | ANN variable | ANN final effect | SVM variable | SVM final effect |
|---|---|---|---|---|
| 1 | Temperature | 38.59 | Month | 1.45 |
| 2 | Total nitrogen of inflow | 33.37 | Chemical oxygen demand | 1.34 |
| 3 | pH | 32.60 | Suspended solid | 1.33 |
| 4 | Volumetric flow rate of inflow | 30.58 | pH | 1.29 |
| 5 | Suspended solid | 26.89 | Temperature | 1.28 |
| 6 | Total nitrogen of food waste leachate | 22.31 | Total nitrogen of inflow | 1.24 |
| 7 | Month | 23.58 | Volumetric flow rate of inflow | 1.22 |
| 8 | Chemical oxygen demand | 17.64 | Total nitrogen of food waste leachate | 1.17 |
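The "final effect" values in Table 3 come from the LH-OAT method. A minimal sketch of the idea is given below, assuming the partial-effect definition of van Griensven et al. (2006): a Latin hypercube sample is drawn over the input ranges, each input is perturbed by a small fraction at every sample point, and the absolute percentage change in the model output is averaged over the points. This is a simplified variant in which every perturbation restarts from the sample point. The function names, the input bounds and the toy model are hypothetical and are not the study's trained ANN or SVM.

```python
import random


def latin_hypercube(n_points, bounds, rng):
    """Draw one value per stratum for each variable, then shuffle each column."""
    columns = []
    for low, high in bounds:
        width = (high - low) / n_points
        column = [low + (k + rng.random()) * width for k in range(n_points)]
        rng.shuffle(column)
        columns.append(column)
    return [list(point) for point in zip(*columns)]


def lh_oat(model, bounds, n_points=50, fraction=0.05, seed=0):
    """Return the mean absolute partial effect (in percent) for each input."""
    rng = random.Random(seed)
    totals = [0.0] * len(bounds)
    for base in latin_hypercube(n_points, bounds, rng):
        y_base = model(base)
        for i in range(len(bounds)):
            moved = list(base)
            moved[i] = base[i] * (1.0 + fraction)  # change one factor at a time
            y_moved = model(moved)
            y_mid = (y_base + y_moved) / 2.0
            if y_mid != 0.0:
                totals[i] += abs(100.0 * (y_moved - y_base) / y_mid / fraction)
    return [total / n_points for total in totals]


def toy_model(x):
    """Hypothetical response that depends on the first input only."""
    return 2.0 + 3.0 * x[0]


# Hypothetical ranges for two inputs, e.g. temperature and month.
effects = lh_oat(toy_model, [(1.0, 10.0), (1.0, 12.0)])
ranking = sorted(range(len(effects)), key=lambda i: effects[i], reverse=True)
print(effects, ranking)
```

Sorting the averaged effects in descending order gives a ranking of the kind shown in Table 3. An input that the model ignores receives an effect of zero.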
Fig. 5 Comparison of the modeled and measured total nitrogen (T-N) concentration of effluent from the Ulsan waste water
treatment plant training and validation tests using artificial neural network (ANN) and support vector machine (SVM) model.
Given the low values of the model performance criteria, we
additionally checked the fit of the machine learning models
through an analysis of residuals (Fig. 6). The residual plots
for the training and validation steps of both the ANN and the
SVM show that the residuals are randomly distributed and
independent of the modeled T-N concentrations. This is
supported by a further correlation analysis (R² for ANN:
training = 5.509e-7, validation = 3.306e-6; R² for SVM:
training = 1.1e-6, validation = 2.01e-5) in Fig. 7.
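The residual check described above can be sketched as follows: residuals are taken as measured minus modeled values, and their squared correlation with the modeled values is computed. A value close to zero suggests no linear dependence of the residuals on the predictions. The function name and the sample values are illustrative assumptions, not the study's code or data.

```python
def residual_r_squared(measured, modeled):
    """Squared Pearson correlation between residuals and modeled values."""
    residuals = [m - p for m, p in zip(measured, modeled)]
    n = len(residuals)
    mean_r = sum(residuals) / n
    mean_p = sum(modeled) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(residuals, modeled))
    var_r = sum((r - mean_r) ** 2 for r in residuals)
    var_p = sum((p - mean_p) ** 2 for p in modeled)
    if var_r == 0.0 or var_p == 0.0:
        return 0.0  # constant residuals or predictions carry no linear trend
    return cov ** 2 / (var_r * var_p)


# Illustrative values only: residuals of +1, -1, -1, +1 show no trend.
modeled_values = [1.0, 2.0, 3.0, 4.0]
measured_values = [2.0, 1.0, 2.0, 5.0]
print(residual_r_squared(measured_values, modeled_values))
```

A near-zero value only rules out a linear trend; a residual plot such as the one the paper reports is still needed to spot curvature or changing spread.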
In this study, the low model performance criteria were
likely due to noise in the data and the short period covered
by the input data. Nevertheless, the ANN and SVM models
could give acceptable accuracy for the future prediction of
effluent T-N concentration.
2.5. Comparison of models
Fig. 4 shows the measured and predicted T-N concentrations
by the ANN and SVM models with the optimum parameters.
The prediction accuracy of the SVM was higher than that of
the ANN during the training step (R² of 1.00 versus 0.55;
Table 2), whereas the accuracy of the two models was almost
identical during the validation step. Consequently, the SVM
showed the higher prediction performance.
ANNs have recently been applied to predict water quality
variables. Soyupak et al. (2003) studied the prediction of
dissolved oxygen concentration in three separate reservoirs;
the correlation coefficient in the evaluation was greater than
0.95. SVM has also been used to predict water quality. Singh
et al. (2009) computed the dissolved oxygen (DO) and BOD
concentrations in a polluted river flowing through the
northern alluvial Gangetic plains in India. Root-mean-square
error (RMSE) values between the predicted and observed DO
were 0.7 and 0.74 for the training and validation steps,
respectively, while those for BOD were 0.85 and 0.85 (Singh
et al., 2009). In wastewater treatment, Oliveira-Esquerre et al.
(2002) applied ANN to predict the biochemical oxygen
demand of the effluent of a biological wastewater treatment
plant, with an average R² of 0.76. Additionally, Mjalli et al.
(2007) used ANN to study the operating characteristics of a
wastewater treatment plant and to predict the BOD, COD, and
TSS for the Doha West WWTP.
A relatively low value of the model performance criteria
(0.4–1.0 for R², 0.4–1.0 for NSE, and 0.76–0.99 for the relative
efficiency criteria) for the output variable, the T-N
concentration of pre-treated FW leachate, was observed in
this study. One reason might be the limited ability of the
models to interpret and predict a highly nonlinear
relationship.
Fig. 6 Plot of the modeled versus measured total nitrogen (T-N) concentration of effluent from the Ulsan waste water treatment
plant training and validation tests.
Balabin and Lomakina (2011) found that stronger nonlinear
interferences could lead to low accuracy for both the ANN
and SVM models. The limited number of input variables and
data noise could also explain the lower values of the
coefficient of determination. Hamed et al. (2004) applied ANN
to the modeling of wastewater treatment; their data set was
collected over about 10 months, and low coefficients of
determination (<0.5 on average) were also observed, similar
to our study. Future studies are therefore needed to develop
higher-performance models for more effective water quality
prediction than the current results.
Machine learning models (ANN and SVM) were developed
to predict the T-N concentration in the effluent of the
Ulsan wastewater treatment plant using water quality and
meteorological data. The results showed that machine
learning models could be applied to model the complex
wastewater treatment process, which also treated, in
parallel, FW leachate with a high T-N concentration. The
T-N concentration of the effluent was successfully predicted
by the machine learning models, and the two models (ANN
and SVM) could also be applied to 1) estimate the T-N
concentration of the effluent when real-time monitoring or
sampling is not possible, and 2) estimate the range of the
output parameters to avoid exceeding the water quality
regulations.
It should be noted that the measured data were recorded daily.
Therefore, the current machine learning models (ANN and SVM)
were applied only to the prediction of daily water quality
changes over a very short period (10 months). Recalibration and
revalidation of the models with a larger data set would be
required in future studies for more accurate prediction.
Additionally, other input parameters may be considered in
future modeling work. Nevertheless, the models constructed in
this study could still be used effectively to predict the
effluent T-N concentration.
3. Conclusions
The aim of this study was to develop two reliable machine
learning models (ANN and SVM) for early prediction of the
1-day interval T-N concentration of the effluent, in order to
avoid the impact of high T-N loading from FW leachate on the
wastewater treatment. Both daily water quality data and
meteorological data were used as input parameters, and a
pattern search algorithm was used to optimize the model
parameters. In addition, a sensitivity analysis using the
LH-OAT method was conducted to determine the effect of each
input parameter.
Fig. 7 Residuals plots versus modeled (predicted) total nitrogen (T-N) concentration of effluent from the Ulsan waste water
treatment plant training and validation tests.
The present study shows that: 1) the optimum ANN and SVM
models were reliable for predicting the trends of water
quality at the wastewater treatment plant of Ulsan; 2) based
only on the model performance assessment from prediction
accuracy, the SVM model performed better than the ANN
model; and 3) from the sensitivity analysis, however, more
physically meaningful cause-and-effect relationships between
the T-N concentration of the effluent and the input
parameters could be elucidated from the ANN than from the
SVM. Thus, the ANN model could be more reasonable and
reliable than the SVM for building decision-making models
and for process control in integrated food waste and
wastewater treatment. This study showed that machine
learning models could be a reliable method of water quality
prediction for early-warning water quality control in
wastewater treatment. In future work, longer-term sampling
of the input values is suggested to improve the accuracy of
the ANN and SVM models.
Acknowledgments
This research was supported by a grant (12-TI-C04) from the
Advanced Water Management Research Program funded by the
Ministry of Land, Infrastructure and Transport of the Korean
government.
REFERENCES
Ahmed, J.A., Sarma, A.K., 2007. Artificial neural network model
for synthetic streamflow generation. Water Resour. Manag.
21 (6), 1015–1029.
Balabin, R.M., Lomakina, E.I., 2011. Support vector machine
regression (SVR/LS-SVM) – an alternative to neural networks
(ANN) for analytical chemistry? Comparison of nonlinear
methods on near infrared (NIR) spectroscopy data. Analyst 136
(8), 1703–1712.
Behera, S.K., Park, J.M., Kim, K.H., Park, H.-S., 2010. Methane
production from food waste leachate in laboratory-scale
simulated landfill. Waste Manag. 30 (8-9), 1502–1508.
Burnley, S., Phillips, R., Coleman, T., Rampling, T., 2011. Energy
implications of the thermal recovery of biodegradable
municipal waste materials in the United Kingdom. Waste
Manag. 31 (9-10), 1949–1959.
Cecchi, F., Traverso, P.G., Perin, G., Vallini, G., 1988. Comparison of
codigestion performance of two differently collected organic
fractions of municipal solid waste with sewage sludges.
Environ. Technol. 9 (5), 391–400.
Chelliapan, S., Mahat, S.B., Din, M.F.M., Yuzir, A., Othman, N.,
2012. Anaerobic digestion of paper mill wastewater. Iranica
J. Energy Environ. 3, 85–90.
Cho, K.H., Kang, J.-H., Ki, S.J., Park, Y., Cha, S.M., Kim, J.H., 2009.
Determination of the optimal parameters in regression
models for the prediction of chlorophyll-a: a case study of the
Yeongsan Reservoir, Korea. Sci. Total Environ. 407 (8),
2536–2545.
Cho, K.H., Sthiannopkao, S., Pachepsky, Y.A., Kim, K.-W., Kim, J.H.,
2011. Prediction of contamination potential of groundwater
arsenic in Cambodia, Laos, and Thailand using artificial neural
network. Water Res. 45 (17), 5535–5544.
Dibike, Y.B., Velickov, S., Solomatine, D., Abbott, M.B., 2001. Model
induction with support vector machines: introduction and
applications. J. Comput. Civ. Eng. 15 (3), 208–216.
Dreyfus, G., 2005. Neural Networks: Methodology and
Applications. Springer, Heidelberg.
Dreyfus, G., Martinez, J.-M., Samuelides, A., Gordon, M.B., Badran,
F., Thiria, S., et al., 2002. Réseaux de Neurones de Méthodologie
et Applications. Eyrolles.
Gallant, S.I., 1993. Neural Network Learning and Expert Systems.
MIT Press, London.
García, A.J., Esteban, M.B., Márquez, M.C., Ramos, P., 2005.
Biodegradable municipal solid waste: characterization and
potential use as animal feedstuffs. Waste Manag. 25 (8),
780–787.
Gernaey, K.V., van Loosdrecht, M.C.M., Henze, M., Lind, M.,
Jørgensen, S.B., 2004. Activated sludge wastewater treatment
plant modelling and simulation: state of the art. Environ.
Model. Software 19 (9), 763–783.
Gill, M.K., Asefa, T., Kemblowski, M.W., McKee, M., 2006. Soil
moisture prediction using support vector machines. J. Am.
Water Resour. Assoc. 42 (4), 1033–1046.
Govindaraju, R.S., 2000. Artificial neural networks in hydrology. II:
hydrologic applications. J. Hydrol. Eng. 5 (2), 124–137.
Hamed, M.M., Khalafallah, M.G., Hassanien, E.A., 2004. Prediction
of wastewater treatment plant performance using artificial
neural networks. Environ. Model. Software 19 (10), 919–928.
Hamzawi, N., Kennedy, K.J., McLean, D.D., 1998. Technical
feasibility of anaerobic co-digestion of sewage sludge and
municipal solid waste. Environ. Technol. 19 (10), 993–1003.
Han, M.J., Behera, S.K., Park, H.S., 2012. Anaerobic codigestion of
food waste leachate and piggery wastewater for methane
production: statistical optimization of key process parameters.
J. Chem. Technol. Biotechnol. 87 (11), 1541–1550.
Hong, Y.-S.T., Rosen, M.R., Bhamidimarri, R., 2003. Analysis of a
municipal wastewater treatment plant using a neural
network-based pattern analysis. Water Res. 37 (7), 1608–1618.
Iacopozzi, I., Innocenti, V., Marsili-Libelli, S., Giusti, E., 2007. A
modified activated sludge model no. 3 (ASM3) with two-step
nitrification–denitrification. Environ. Model. Software 22 (6),
847–861.
Karul, C., Soyupak, S., Çilesiz, A.F., Akbay, N., Germen, E., 2000.
Case studies on the use of neural networks in eutrophication
modeling. Ecol. Model. 134 (2-3), 145–152.
Khalil, A., Almasri, M.N., McKee, M., Kaluarachchi, J.J., 2005.
Applicability of statistical learning algorithms in groundwater
quality modeling. Water Resour. Res. 41, W05010.
http://dx.doi.org/10.1029/2004WR003608.
Khalil, A.F., McKee, M., Kemblowski, M., Asefa, T., Bastidas, L.,
2006. Multiobjective analysis of chaotic dynamic systems
with sparse learning machines. Adv. Water Resour. 29 (1),
72–88.
Khan, M.S., Coulibaly, P., 2006. Application of support vector
machine in lake water level prediction. J. Hydrol. Eng. 11 (3),
199–205.
Kim, S.H., Shin, H.S., 2009. Acidogenesis of lipids-containing
wastewater in anaerobic sequencing batch reactor. J. Korean
Soc. Environ. Eng. 31 (12), 1075–1080.
Kim, J.K., Han, G.H., Oh, B.R., Chun, Y.N., Eom, C.-Y., Kim, S.W.,
2008. Volumetric scale-up of a three stage fermentation
system for food waste treatment. Bioresour. Technol. 99 (10),
4394–4399.
Krause, P., Boyle, D.P., Bäse, F., 2005. Comparison of different
efficiency criteria for hydrological model assessment. Adv.
Geosci. 5, 89–97.
Lee, T.L., 2004. Back-propagation neural network for long-term
tidal predictions. Ocean Eng. 31 (2), 225–238.
Lee, C.Y., Shin, H.S., Chae, S.R., Nam, S.Y., Paik, B.C., 2003. Nutrient
removal using anaerobically fermented leachate of food waste
in the BNR process. Water Sci. Technol. 47 (1), 159–165.
Lee, D.H., Behera, S.K., Kim, J.W., Park, H.-S., 2009. Methane
production potential of leachate generated from Korean food
waste recycling facilities: a lab-scale study. Waste Manag. 29
(2), 876–882.
Lee, E., Seong, C., Kim, H., Park, S., Kang, M., 2010. Predicting the
impacts of climate change on nonpoint source pollutant loads
from agricultural small watershed using artificial neural
network. J. Environ. Sci. 22 (6), 840–845.
Lek, S., Guégan, J.-F., 1999. Artificial neural networks as a tool in
ecological modelling, an introduction. Ecol. Model. 120 (2-3),
65–73.
Lewis, R.M., Torczon, V., 2002. A globally convergent augmented
Lagrangian pattern search algorithm for optimization with
general constraints and simple bounds. SIAM J. Optim. 12 (4),
1075–1089.
Li, X.M., Cheng, K.Y., Selvam, A., Wong, J.W., 2012. Bioelectricity
production from acidic food waste leachate using microbial
fuel cells: effect of microbial inocula. Process Biochem. 48 (2),
283–288.
Liong, S.Y., Sivapragasam, C., 2002. Flood stage forecasting with
support vector machines. J. Am. Water Resour. Assoc. 38 (1),
173–186.
Liong, S.Y., Khu, S.T., Chan, W.T., 2001. Derivation of Pareto front
with genetic algorithm and neural network. J. Hydrol. Eng. 6 (1),
52–61.
Mahmoud, N., Zeeman, G., Gijzen, H., Lettinga, G., 2003. Solids
removal in upflow anaerobic reactors, a review. Bioresour.
Technol. 90 (1), 1–9.
Mata-Alvarez, J., 2003. Biomethanization of the Organic Fraction
of Municipal Solid wastes. IWA Publishing, London.
Mata-Alvarez, J., Cecchi, F., Pavan, P., Llabres, P., 1990. The
performances of digesters treating the organic fraction of
municipal solid wastes differently sorted. Biol. Wastes 33 (3),
181–199.
McCulloch, W.S., Pitts, W., 1943. A logical calculus of the ideas
immanent in nervous activity. Bull. Math. Biophys. 5 (4),
115–133.
Mjalli, F.S., Al-Asheh, S., Alfadala, H.E., 2007. Use of artificial
neural network black-box modeling for the prediction of
wastewater treatment plants performance. J. Environ. Manage.
83 (3), 329–338.
Muttil, N., Chau, K.W., 2006. Neural network and genetic
programming for modelling coastal algal blooms. Int.
J. Environ. Pollut. 28, 223–238.
Oliveira-Esquerre, K.P., Mori, M., Bruns, R.E., 2002. Simulation of
an industrial wastewater treatment plant using artificial
neural networks and principal components analysis. Braz.
J. Chem. Eng. 19 (4), 365–370.
Palani, S., Liong, S.-Y., Tkalich, P., 2008. An ANN application for
water quality forecasting. Mar. Pollut. Bull. 56 (9), 1586–1597.
Poggi-Varaldo, H.M., Oleszkiewicz, J.A., 1992. Anaerobic
co-composting of municipal solid waste and waste sludge at
high total solids levels. Environ. Technol. 13 (5), 409–421.
Rivas, A., Irizar, I., Ayesa, E., 2008. Model-based optimisation of
wastewater treatment plants design. Environ. Model. Software
23 (4), 435–450.
Rogers, L.L., Dowla, F.U., 1994. Optimization of groundwater
remediation using artificial neural networks with parallel
solute transport modeling. Water Resour. Res. 30 (2), 457–481.
Rumelhart, D.E., Mcclelland, J.L., 1986. Parallel Distributed
Processing: Explorations in the Microstructure of Cognition.
MIT Press, Cambridge, Mass.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1988. Learning
internal representations by error propagation. In: Collins, A.,
Smith, E.E. (Eds.), Readings in Cognitive Science. Morgan
Kaufmann, pp. 399–421.
S. Cheon, J.H., Bae, Y., Park, S., Lim, J., Ha, C., Choi, Y., Lim, H., 2013.
Examination of Inlet Conditions for Effective Anaerobic
Digestion of Food Waste Leachate in Bio-reactor. SUDOKWON
Landfill Site Management Corp., Incheon, South Korea
(Available at: http://webbook.me.go.kr/DLi-File/094/003/002/
5561166.PDF).
Schmit, K.H., Ellis, T.G., 2001. Comparison of temperature-phased
and two-phase anaerobic co-digestion of primary sludge and
municipal solid waste. Water Environ. Res. 73 (3), 314–321.
Shon, T., Moon, J., 2007. A hybrid machine learning approach to
network anomaly detection. Inform. Sci. 177 (18), 3799–3821.
Singh, K.P., Basant, A., Malik, A., Jain, G., 2009. Artificial neural
network modeling of the river water quality – a case study.
Ecol. Model. 220 (6), 888–895.
Smith, M., 1993. Neural Networks for Statistical Modeling.
International Thomson Computer Press.
Sosnowski, P., Wieczorek, A., Ledakowicz, S., 2003. Anaerobic
co-digestion of sewage sludge and organic fraction of
municipal solid wastes. Adv. Environ. Res. 7 (3), 609–616.
Soyupak, S., Karaer, F., Gürbüz, H., Kivrak, E., Sentürk, E., Yazici,
A., 2003. A neural network-based approach for calculating
dissolved oxygen profiles in reservoirs. Neural Comput. Appl.
12 (3-4), 166–172.
Trichakis, I.C., Nikolos, I.K., Karatzas, G.P., 2011. Artificial Neural
Network (ANN) based modeling for Karstic groundwater level
simulation. Water Resour. Manag. 25 (4), 1143–1152.
van Griensven, A., Meixner, T., Grunwald, S., Bishop, T., Diluzio,
M., Srinivasan, R., 2006. A global sensitivity analysis tool for
the parameters of multi-variable catchment models. J. Hydrol.
324 (1-4), 10–23.
Vapnik, V., 1995. The Nature of Statistical Learning Theory.
Springer, New York, USA.
Vapnik, V.N., 1999. An overview of statistical learning theory. IEEE
Trans. Neural Netw. 10 (5), 988–999.
Wang, W.J., Xu, Z.B., Lu, W.Z., Zhang, X.Y., 2003. Determination of
the spread parameter in the Gaussian kernel for classification
and regression. Neurocomputing 55 (3-4), 643–663.
Wintgens, T., Rosen, J., Melin, T., Brepols, C., Drensla, K.,
Engelhardt, N., 2003. Modelling of a membrane bioreactor
system for municipal wastewater treatment. J. Membr. Sci. 216
(1-2), 55–65.
Yan, H., Zou, Z.H., Wang, H.W., 2010. Adaptive neuro fuzzy
inference system for classification of water quality status.
J. Environ. Sci. 22 (12), 1891–1896.
Yoon, H., Jun, S.-C., Hyun, Y., Bae, G.-O., Lee, K.-K., 2011. A
comparative study of artificial neural networks and support
vector machines for predicting groundwater levels in a coastal
aquifer. J. Hydrol. 396 (1-2), 128–138.
Yu, P.-S., Chen, S.-T., Chang, I.F., 2006. Support vector regression
for real-time flood stage forecasting. J. Hydrol. 328 (3-4),
704–716.
Zita, A., Hermansson, M., 1994. Effects of ionic strength on
bacterial adhesion and stability of flocs in a wastewater
activated sludge system. Appl. Environ. Microbiol. 60 (9),
3041–3048.