
Prediction of effluent concentration in a wastewater treatment plant using machine learning models

April 2015
Journal of Environmental Sciences 32


DOI:10.1016/j.jes.2015.01.007

Authors:

Hong Guo

Gwangju Institute of Science and Technology

Kwanho Jeong

Chosun University

Jiyeon Lim

Ulsan National Institute of Science and Technology





Hong Guo, Kwanho Jeong, Jiyeon Lim, Jeongwon Jo, Young Mo Kim, Jong-pyo Park, Joon Ha Kim, Kyung Hwa Cho⁎

1. School of Environmental Science and Engineering, Gwangju Institute of Science and Technology (GIST), 261 Cheomdan-gwagiro, Buk-gu, Gwangju 500-712, Republic of Korea

2. School of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology, Ulsan 689-798, Republic of Korea

3. HECOREA. INC, 405, Woori Venture Town II, 70, Seonyu-ro, Yeongdeungpo-gu, Seoul, Republic of Korea

ARTICLE INFO ABSTRACT

Article history:

Received 30 June 2014

Revised 11 December 2014

Accepted 22 January 2015

Available online 20 April 2015

With the growing amount of food waste, integrated food waste and wastewater treatment has been regarded as one of the efficient methods. However, loading food waste into the conventional wastewater treatment process may raise the total nitrogen (T-N) concentration and thereby degrade the effluent water quality. The objective of this study is to establish two machine learning models, artificial neural networks (ANNs) and support vector machines (SVMs), to predict the T-N concentration of effluent at a 1-day interval for a wastewater treatment plant in Ulsan, Korea. Daily water quality data and meteorological data were used, and the performance of both models was evaluated in terms of the coefficient of determination (R²), Nash–Sutcliffe efficiency (NSE), and relative efficiency criteria (d_rel). Additionally, Latin-Hypercube one-factor-at-a-time (LH-OAT) and a pattern search algorithm were applied to sensitivity analysis and model parameter optimization, respectively. Results showed that both models could be effectively applied to the 1-day interval prediction of the T-N concentration of effluent. The SVM model showed higher prediction accuracy in the training stage and a similar result in the validation stage. However, the sensitivity analysis demonstrated that the ANN model was the superior model for 1-day interval T-N concentration prediction in terms of the cause-and-effect relationship between the T-N concentration and the modeling input values for integrated food waste and wastewater treatment. This study suggests an efficient and robust nonlinear time-series modeling method for early prediction of the water quality of the integrated food waste and wastewater treatment process.

Published by Elsevier B.V.

Keywords:

Artificial neural network

Support vector machine

Effluent concentration

Prediction accuracy

Sensitivity analysis

Introduction

Following the restrictive landfill legislation passed by the European Union (EU) in 1999, many developed countries have implemented various policies and technical developments to reduce the quantity of biodegradable waste sent to landfill (Burnley et al., 2011; García et al., 2005). The South Korean government also prohibited the landfill of municipal solid sludge (MSS) and food waste (FW) in the early 21st century (S. Cheon et al., 2013). However, this strict regulation led to the dumping of both the sludge and FW water (i.e., leachate) at sea, which in turn led to the prohibition of its disposal in the ocean by the Marine Environment Management Act (Behera et al., 2010; S. Cheon et al., 2013). Basic environmental treatment facilities such as wastewater treatment plants (WWTPs) therefore appear to be an alternative inland treatment option. In inland treatment at a WWTP, over 80% of FW, which is the recyclable organic fraction of municipal solid waste (MSW), is dehydrated, and the remaining waste goes through recycling processes such as composting, feed production, and anaerobic digestion to generate biomass energy (Chelliapan et al., 2012; Li et al., 2012). In particular, methane, one of the biogases, can be utilized as a biomass energy source (Lee et al., 2009). However, a large amount of FW leachate is inevitably generated in all recycling processes because of the high moisture content of FW (Han et al., 2012), resulting in a significant burden on the wastewater treatment systems.

JOURNAL OF ENVIRONMENTAL SCIENCES 32 (2015) 90–101

⁎ Corresponding author. E-mail: khcho@unist.ac.kr (Kyung Hwa Cho).

According to several research reports (Kim et al., 2008;

Sosnowski et al., 2003), the water treatment process can be

more effective by using FW leachate in WWTP. This is because

the FW leachate contains a large amount of acid fermentation

liquid (AFL) which can be utilized as an organic carbon source

for removing nitrogen and phosphorus in advanced waste-

water treatment (AWT) processes (Han et al., 2012; Lee et al.,

2003). The digestion process with only sewage sludge could be

less effective due to the low carbon/nitrogen (C/N) ratio and

low level of biodegradable organic compounds. FW leachate

contains a high amount of solid contents as well as a high C/N

ratio, while containing a low amount of the nutrient-type

elements (Mata-Alvarez, 2003). Therefore, the combined

treatment of sewage sludge and FW improves the removal

efficiency of nitrogen and phosphorus in AWT, enhancing the

stability of the digestion process. Furthermore, higher pro-

duction of methane gas is an additional benefit from the

co-digestion with FW leachate (Cecchi et al., 1988; Hamzawi et al.,

1998; Mata-Alvarez et al., 1990; Poggi-Varaldo and Oleszkiewicz,

1992; Schmit and Ellis, 2001). Owing to these advantages, the anaerobic co-digestion of sewage sludge with FW has been increasingly adopted in WWTPs in Korea. However, this process also faces critical issues associated with the side effects of co-digestion. One issue is that the influent water quality is degraded by mixing with the FW leachate returned from the anaerobic co-digestion process, which tends to increase the mixed liquor suspended solids (MLSS) and causes a large amount of scum in the activated sludge reactor (Kim and Shin, 2009; Mahmoud et al., 2003). In addition, a sudden increase of the FW leachate could destabilize the digestion process and lower the effluent water quality of WWTPs (S. Cheon et al., 2013).

Generally, water quality of a WWTP is sensitive to parameters

such as pH, temperature, concentrations of specific substrates,

and contaminants. This is because wastewater is treated by the

metabolism processes of microorganisms. However, biological

treatment still exhibits time-varying and highly nonlinear

characteristics affected by various known and unknown param-

eters (Hamed et al., 2004; Hong et al., 2003; Mjalli et al., 2007). Due

to these complicated features, many previous studies evaluated

and diagnosed the performance of WWTP by using a mathemat-

ical model for the process simulation and control (Gernaey et al.,

2004; Hamed et al., 2004; Hong et al., 2003; Iacopozzi et al., 2007;

Mjalli et al., 2007; Rivas et al., 2008; Wintgens et al., 2003).

Among these, machine learning models have proved to be useful tools because they achieve relatively high accuracy when dealing with complicated systems. Furthermore, a key advantage of these models for evaluating WWTP performance is that, once trained and validated, they can predict output values directly from input values. Artificial neural networks (ANNs) and support vector machines (SVMs) are representative machine learning techniques (Dreyfus, 2005; Shon and Moon, 2007). The performance of these two machine learning models has been widely discussed (Hamed et al., 2004; Palani et al., 2008; Singh et al., 2009; Yoon et al., 2011). However, black-box modeling alone is of limited use for process control, and the cause-and-effect relationship between input and output values has yet to be elucidated.

In this study, two machine learning models were developed to predict the effluent T-N concentration of the integrated food waste and wastewater treatment plant in Ulsan Metropolitan City, Korea. Moreover, a sensitivity analysis between input and output values was used to elucidate the cause-and-effect relationship, both for future process control and for selecting the preferable machine learning model for integrated food waste and wastewater treatment. The objectives of this study are: a) to develop a reliable 1-day interval early prediction model for T-N concentration using a parameter optimization method; b) to evaluate the constructed models by sensitivity analysis in order to find a reasonable, cause-and-effect based model as a future decision-making tool; and c) to propose an early warning prediction tool to avoid the impact of FW leachate loading on the integrated food waste and wastewater treatment.

1. Method and materials

1.1. Field sampling

We collected water samples in an attempt to investigate the

effect of FW leachate on Yong-yeon (YY) WWTP in Ulsan. The

samples were collected from 6 different spots, including

influent, flow-distribution tank, aeration tank, effluent, FW

leachate, and pre-treated FW leachate (Fig. 1). The collected

samples were delivered to a laboratory at the Ulsan National

Institute of Science and Technology (UNIST) and were

analyzed in terms of total suspended solids (TSS), chemical

oxygen demand (COD), total nitrogen (T-N), and total phos-

phorus (T-P); water temperature and pH were measured in-situ

at the sampling stations.

1.2. Sample analysis

TSS of a water sample was measured by filtering a 20 mL

sample through pre-weighed 47 mm Glass-Fiber paper (with

1.2 μm pore size), then weighing the filter again after drying to

remove all water in the sample. COD, T-P, and T-N were

measured through absorptiometric analysis. COD and T-P were

measured for 4 sampling locations: influent, flow-distribution

tank, aeration tank, and effluent. T-N was measured for 6

sampling locations including 2 additional stations (i.e., pre- and post-aerobic transamination of FW leachate). The absorbance of the samples, which were mixed with the proper reagents, was measured in the 200–900 nm wavelength range to quantify the target components. Distilled water was used for the reference solution. For COD quantification, 0.5 mL of each water sample was put into a sulfuric acid solution, and then 0.6 mL of standard potassium permanganate (KMnO₄) solution (0.005 mol/L) was added. The mixed solution was heated for

15 min at 100°C. After the reaction, the oxygen demand was

measured by the amount of consumed potassium permanga-

nate. For T-P measurement, the water sample was pre-treated

by adding potassium persulfate to 5 mL of water sample

and heating for 30 min at 120°C. After heating the pre-treated

water sample, a mixture of 2 mL of ammonium molybdate with

ascorbic acid was put into the sample. The reference solution

was observed under the 880 nm wavelength and the T-P of the

water sample was measured by quantifying the amount of

reduced phosphate. For quantifying T-N, the samples from pre-

and post-aerobic treatment of FW leachate had to be diluted to

1/25 ratio due to their high concentration levels. The water

samples were pre-treated by adding alkaline potassium persulfate to 0.5 mL of water sample, which was then heated for 30 min at 120°C. After adding hydrochloric acid to adjust the pH to 2–3, T-N was finally measured by absorbance at a wavelength of 220 nm. Consequently, the T-N of the water sample was

measured by oxidizing nitrogenous compounds to nitrate ions

and calculated by the difference of light intensity between the

reference and sample.

1.3. Modeling approaches

1.3.1. Artificial neural networks (ANNs)

As the name implies, an artificial neural network is a flexible, data-based mathematical structure modeled on a neural network. It is a powerful computational technique for modeling complex nonlinear relationships and analyzing the form of the relations between variables. It was first introduced in the early 1940s and was developed further with the back-propagation (BP) algorithm in 1988 (Gallant, 1993; McCulloch and Pitts, 1943; Rumelhart et al., 1988; Smith, 1993). Artificial neural networks are regarded as standard data-based nonlinear estimation tools and are widely applied for prediction and forecasting in environment-related fields, including water treatment (Liong et al., 2001; Muttil and Chau, 2006), oceanography (Lee, 2004), and ecological science (Trichakis et al., 2011). Data-based modeling of water quality (Ahmed and Sarma, 2007; Cho et al., 2011; Karul et al., 2000; Lee et al., 2010; Lek and Guégan, 1999; Rogers and Dowla, 1994; Yan et al., 2010) has also been applied successfully over the past 20 years.

A common ANN structure, called a multilayer perceptron network, consists of three distinct layers (input, hidden, and output) with linked nodes and functions. After data are introduced into the ANN model, the network utilizes neurons, which are nonlinear algebraic functions (i.e., transfer functions) (Dreyfus et al., 2002). The signal passes from one neuron to another through the weights and the transfer function (Govindaraju, 2000), and the back-propagation algorithm can effectively train the network for nonlinear problems by adjusting the weights so as to minimize the objective function (Rumelhart and McClelland, 1986). The mathematical expression of the ANN model in this study is as follows (Khalil et al., 2005):

$$y_i = f\left(\sum_{j=1}^{n} W_{ij} X_j + b_i\right) \qquad (1)$$

where $X_j$ is the jth nodal value in the previous layer, $y_i$ is the ith nodal value in the current layer, and $n$ is the number of nodes in the previous layer.

By multiplying the previous nodal values by the weighting factors ($W_{ij}$), adding the bias of the ith node ($b_i$), and applying the activation function $f$, the current nodal value is calculated according to Eq. (1). A feed-forward artificial neural network with three layers (input, hidden, and output) was built to predict effluent concentrations in the YY WWTP from 8 input variables, $X_i$ (i = 1, …, 8): month, volumetric flow rate of inflow, pH, temperature, chemical oxygen demand, suspended solids, T-N of inflow, and T-N of pre-treated FW leachate (Fig. 2). Fewer hidden nodes are usually preferable because of their better generalization capability, which helps avoid over-fitting. However, too few nodes also impair the performance of network training and validation (Ahmed and Sarma, 2007; Palani et al., 2008). The optimal values of the number of hidden nodes, learning rate, and momentum were determined by a pattern search algorithm. In addition, we tested the logistic sigmoid, tangent sigmoid, and linear functions as candidate transfer functions, and the best-fitting model was obtained with the tangent sigmoid transfer function.

Fig. 1 – Schematic diagram of the wastewater treatment plant (WWTP). The dashed box indicates the system boundaries for the machine learning model development in this study. Sampling points are numbered (1)–(6). TN conc: total nitrogen concentration.
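As an illustration of Eq. (1), the following minimal Python sketch evaluates a feed-forward pass through one hidden layer and one output node with the tangent sigmoid (tanh) transfer function. It is not the MATLAB implementation used in this study: the weights are random placeholders rather than trained values, and the function names `layer` and `forward` are hypothetical. The 8 inputs and 11 hidden nodes follow the input list above and the optimized node number reported in Table 2.

```python
import math
import random


def layer(x, W, b, f=math.tanh):
    """Eq. (1): y_i = f(sum_j W[i][j] * x[j] + b[i]) for every node i of a layer."""
    return [f(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
            for row, b_i in zip(W, b)]


def forward(x, W1, b1, W2, b2):
    """Feed-forward pass: input layer -> hidden layer -> output layer."""
    hidden = layer(x, W1, b1)
    return layer(hidden, W2, b2)


# Placeholder (untrained) weights; in practice they come from back-propagation.
random.seed(0)
n_in, n_hidden = 8, 11
W1 = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
W2 = [[random.uniform(-0.5, 0.5) for _ in range(n_hidden)]]
b2 = [0.0]

# One input vector already normalized to [-1, 1] (made-up values).
x = [random.uniform(-1.0, 1.0) for _ in range(n_in)]
y = forward(x, W1, b1, W2, b2)  # single normalized output, here effluent T-N
```

Because the output layer also uses tanh, the prediction stays within (−1, 1) and has to be mapped back to mg/L with the inverse of the input normalization.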

1.3.2. Support vector machines (SVM)

Support vector machines (SVMs) are a data-based machine

learning model, which is based on structural risk minimiza-

tion (SRM) (Vapnik, 1995, 1999). The SRM minimizes the

empirical error and model complexity simultaneously. It

could contribute to the improvement of generalization ability

of the classification or regression problems (Yoon et al., 2011).

SVMs have been widely verified in numerous environmental research areas. Dibike et al. (2001) applied various kernel functions of SVM to predict rainfall, and Khalil et al. (2005) used SVM to characterize an agriculture-dominated watershed by analyzing the spatial distribution features of groundwater. SVMs have also been widely applied in other fields, including the prediction of stream flow, lake water level, and soil moisture (Gill et al., 2006; Khalil et al., 2006; Khan and Coulibaly, 2006; Liong and Sivapragasam, 2002).

SVM models could be classified into two types: linear

support vector regression and nonlinear support vector

regression. The nonlinear support vector regression mathe-

matical model was used for model development in this study.

Mathematically, it can be described as follows:

$$f(X) = \sum_{i=1}^{n} W_i \varphi(X_i) + b \qquad (2)$$

where $W_i$ (i = 1, …, n) and $b$ are the parameters of the linear support vector regression function and $\varphi(X_i)$ is the nonlinear mapping function. To simplify the calculation of the nonlinear mapping function, the kernel function $K(x_i, x_j) = \langle \phi(x_i) \cdot \phi(x_j) \rangle$ is applied to evaluate the inner products in the feature space (Yu et al., 2006). We tested several kernels (linear, polynomial, sigmoid, and radial basis function) and found that the radial basis function led to the best-fitting model for effluent water quality prediction in this study. In addition, the optimal values of the key model parameters, namely the cost constant (C), the radius of the insensitive tube (ε), and the scale parameter of the kernel function (σ), were determined by the optimization algorithm.
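The roles of the kernel and of the parameters ε and σ can be sketched in a few lines of Python. This is an illustrative sketch only: the Gaussian form exp(−‖x − z‖² / (2σ²)) is one common parameterization of the radial basis function and is assumed here, since the text does not state which form the MATLAB toolbox used. The support vectors, dual coefficients, and bias below are made-up placeholders, and the function names are hypothetical.

```python
import math


def rbf_kernel(x, z, sigma):
    """Radial basis function kernel K(x, z); assumed Gaussian parameterization."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def eps_insensitive_loss(y_true, y_pred, eps):
    """Errors inside the insensitive tube of radius eps cost nothing."""
    return max(0.0, abs(y_true - y_pred) - eps)


def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Kernel form of the regression function: sum_i a_i * K(x_i, x) + b."""
    return sum(a * rbf_kernel(sv, x, sigma)
               for sv, a in zip(support_vectors, dual_coefs)) + b


# Placeholder model (NOT fitted values); sigma and eps are taken from Table 2.
sigma, eps = 4.693, 0.001
support_vectors = [[-0.5, 0.2], [0.1, -0.3], [0.8, 0.6]]
dual_coefs = [0.7, -1.2, 0.4]
bias = 0.05
prediction = svr_predict([0.0, 0.0], support_vectors, dual_coefs, bias, sigma)
```

The cost constant C does not appear in the prediction itself; it bounds the magnitude of the dual coefficients during training and so controls the trade-off between flatness and training error.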

1.4. Model construction

1.4.1. Input data preparation

MATLAB was used to build the ANN and SVM models for predicting effluent T-N in the WWTP. As shown in the model architectures in Fig. 3, the total dataset was divided into two groups: a training dataset and a validation dataset. The data from January to August were used for training the models and the data from September to October were used for validation; these periods were chosen after preliminary tests of different training and validation periods. All data were normalized to the range from −1 to 1, and the normalized data were then used as input and output data for the ANN and SVM models. The optimal model parameters of the two models were each determined by applying a global optimization algorithm. After the model parameters were determined, the T-N concentration of the effluent was predicted by the ANN and SVM models, and the predicted values were compared with the measured values to evaluate the prediction accuracy. Other pollutants such as COD, BOD (biochemical oxygen demand), and T-P were not considered as output values for model development because only the T-N concentration in the effluent was regarded as an indicator of the effect of the FW leachate on the wastewater treatment in this study.

Fig. 2 – Illustration of the general conceptual model structure for artificial neural networks (ANNs) and support vector machines (SVMs).
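The chronological split and the normalization to [−1, 1] described in Section 1.4.1 can be sketched as follows. The records are made-up (month, T-N) pairs, not measured data, and the helper names are hypothetical. One assumption goes beyond the text: the scaling bounds are taken from the training period only, which keeps validation data out of the fitting step; the study itself only states that all data were normalized.

```python
def fit_minmax(values):
    """Return the (min, max) bounds used for scaling."""
    return min(values), max(values)


def to_unit_range(v, lo, hi):
    """Map v linearly from [lo, hi] to [-1, 1]."""
    return 2.0 * (v - lo) / (hi - lo) - 1.0


def from_unit_range(s, lo, hi):
    """Inverse mapping from [-1, 1] back to the original units."""
    return (s + 1.0) * (hi - lo) / 2.0 + lo


# Made-up (month, effluent T-N in mg/L) records for illustration only.
records = [(1, 12.0), (3, 20.0), (8, 16.0), (9, 14.0), (10, 18.0)]

# Chronological split: January-August for training, September-October for validation.
train = [r for r in records if 1 <= r[0] <= 8]
valid = [r for r in records if 9 <= r[0] <= 10]

# Assumption: bounds are fitted on the training period only.
lo, hi = fit_minmax([tn for _, tn in train])
train_scaled = [to_unit_range(tn, lo, hi) for _, tn in train]
valid_scaled = [to_unit_range(tn, lo, hi) for _, tn in valid]
```

With bounds fitted on the training period, validation values outside the training range would fall slightly outside [−1, 1]; fitting on the full dataset avoids that at the cost of using validation information during preprocessing.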

1.4.2. Model parameter optimization

For both ANN and SVM, the model parameters greatly influence the learning and prediction accuracy of the output values. Palani et al. (2008) found that an insufficient number of nodes leads to impaired network performance. Normally, the optimum values of the parameters are determined by trial and error or are based on previous research. In this study, we used the pattern search algorithm (Lewis and Torczon, 2002) to determine the optimum values of the parameters of the ANN and SVM models, as shown in Table 2. The initial ranges of each parameter were selected based on previous research (Cho et al., 2009; Wang et al., 2003).
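The idea behind a pattern search can be illustrated with a basic bounded compass search: poll one step in each coordinate direction, accept any improvement, and halve the step when no poll improves the objective. The sketch below is a generic derivative-free search written for illustration; it is not MATLAB's pattern search routine and not necessarily the variant used in this study. The toy objective stands in for a model validation error, and all names are hypothetical.

```python
def pattern_search(objective, x0, lower, upper, step=0.5, tol=1e-6, max_iter=10_000):
    """Bounded compass (coordinate) search that minimizes `objective`."""
    x = list(x0)
    fx = objective(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for direction in (1.0, -1.0):
                trial = list(x)
                # Poll one step along coordinate i, clipped to the parameter bounds.
                trial[i] = min(upper[i], max(lower[i], x[i] + direction * step))
                f_trial = objective(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step *= 0.5  # contract the pattern when no poll point is better
    return x, fx


def toy_error(params):
    """Stand-in for a validation error with its minimum at (0.3, -0.2)."""
    return (params[0] - 0.3) ** 2 + (params[1] + 0.2) ** 2


best, f_best = pattern_search(toy_error, [0.9, 0.9], [-1.0, -1.0], [1.0, 1.0])
```

In a real tuning run, `objective` would train the ANN or SVM with the candidate parameters (for example learning rate and momentum, or C, ε, and σ) and return an error measure, so each poll costs one model fit.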

1.4.3. Assessment of model performance

Selecting suitable criteria is critical for judging the performance of each machine learning model (ANN and SVM). Krause et al. (2005) found that no single efficiency criterion can fully explain model performance, since each has its own pros and cons. We therefore applied three criteria for the training and validation, namely the coefficient of determination (R²), Nash–Sutcliffe efficiency (NSE), and relative efficiency criteria (d_rel), which are among the most frequently applied in the water science field (Krause et al., 2005).

The coefficient of determination is defined and calculated as follows:

$$R^2 = \left( \frac{\sum_{i=1}^{N} \left(Q_m(i) - \overline{Q_m}\right)\left(Q_o(i) - \overline{Q_o}\right)}{\sqrt{\sum_{i=1}^{N} \left(Q_m(i) - \overline{Q_m}\right)^2}\,\sqrt{\sum_{i=1}^{N} \left(Q_o(i) - \overline{Q_o}\right)^2}} \right)^2 \qquad (3)$$

The value of the coefficient ranges from 0 to 1 (from no correlation to a perfect fit), and it indicates how much of the dispersion of the measured values is explained by the model prediction.

Fig. 3 – Logical flow of the two machine learning modeling studies (artificial neural networks (ANNs) and support vector machines (SVMs)). SS: suspended solids; T-N: total nitrogen; tansig/tansig: the transfer function of ANNs; RBF: the kernel function of SVMs.

Nash–Sutcliffe efficiency was developed in the 1970s and has been widely applied to assess hydrological models. It is very sensitive to extreme values and might give suboptimal results for datasets that contain extreme data. It is calculated as:

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{N} \left(Q_o(i) - Q_m(i)\right)^2}{\sum_{i=1}^{N} \left(Q_o(i) - \overline{Q_o}\right)^2} \qquad (4)$$

where $Q_o(i)$ and $Q_m(i)$ are the measured value and the modeled value at the ith observation, respectively, $\overline{Q_o}$ is the average measured value, and $N$ is the total number of samples.

Both the coefficient of determination and the Nash–Sutcliffe efficiency quantify the difference between the measured and modeled values in terms of absolute values. As a result, they can be dominated by over- or under-prediction at high values, while errors at low values carry little weight. To counteract this problem, we additionally applied the relative efficiency criteria (d_rel), which significantly reduce the influence of the absolute differences between the measured and modeled values at high values.

$$d_{rel} = 1 - \frac{\sum_{i=1}^{N} \left( \frac{Q_o(i) - Q_m(i)}{Q_o(i)} \right)^2}{\sum_{i=1}^{N} \left( \frac{\left|Q_m(i) - \overline{Q_o}\right| + \left|Q_o(i) - \overline{Q_o}\right|}{\overline{Q_o}} \right)^2} \qquad (5)$$

The relative efficiency criteria also range from 0 to 1.

In addition to the aforementioned criteria, the fitness of the constructed models was checked through residual analysis (Krause et al., 2005).
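The three criteria can be computed directly from paired measured and modeled series. The sketch below is a plain Python illustration, not the code used in this study; the function names and the sample values are hypothetical, and the relative criterion follows the relative index of agreement as defined by Krause et al. (2005), which is assumed to be the form intended in Eq. (5).

```python
import math


def _mean(values):
    return sum(values) / len(values)


def r_squared(obs, mod):
    """Coefficient of determination, Eq. (3): squared Pearson correlation."""
    mo, mm = _mean(obs), _mean(mod)
    cov = sum((m - mm) * (o - mo) for o, m in zip(obs, mod))
    ss_m = sum((m - mm) ** 2 for m in mod)
    ss_o = sum((o - mo) ** 2 for o in obs)
    return (cov / (math.sqrt(ss_m) * math.sqrt(ss_o))) ** 2


def nse(obs, mod):
    """Nash-Sutcliffe efficiency, Eq. (4)."""
    mo = _mean(obs)
    return 1.0 - (sum((o - m) ** 2 for o, m in zip(obs, mod))
                  / sum((o - mo) ** 2 for o in obs))


def d_rel(obs, mod):
    """Relative efficiency criterion, Eq. (5); assumes all observations are nonzero."""
    mo = _mean(obs)
    num = sum(((o - m) / o) ** 2 for o, m in zip(obs, mod))
    den = sum(((abs(m - mo) + abs(o - mo)) / mo) ** 2 for o, m in zip(obs, mod))
    return 1.0 - num / den


# Made-up effluent T-N values (mg/L) for illustration only.
measured = [10.0, 12.0, 14.0, 16.0]
modeled = [11.0, 12.5, 13.0, 15.0]
scores = {"R2": r_squared(measured, modeled),
          "NSE": nse(measured, modeled),
          "d_rel": d_rel(measured, modeled)}
```

A model that always predicts the mean of the measured values gives NSE = 0, which is why NSE is often read as the skill relative to that baseline.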

1.5. Sensitivity analysis

Latin Hypercube One-factor-At-a-Time (LH-OAT) sensitivity analysis was applied to the input parameters that may influence the prediction of the T-N concentration of the effluent. LH-OAT, which provides a ranking of parameter sensitivity, combines the One-factor-At-a-Time (OAT) and Latin Hypercube (LH) sampling methods. In LH-OAT, the parameters are changed one at a time around each LH sample point, so that any change of the output value can be clearly attributed to the changed input. LH-OAT is also efficient: for m intervals in the LH method and p parameters, a total of m × (p + 1) model runs are required (van Griensven et al., 2006). For each input parameter, the bounds were set to its minimum and maximum values.
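The procedure can be sketched in Python as follows. This is a simplified illustration rather than the implementation used in this study: each parameter is perturbed from the same LH base point (instead of cumulatively along an OAT path), the perturbation is a fixed fraction of the parameter range, and the relative partial effect is averaged over the m LH points to give a final effect. The toy model and all names are hypothetical.

```python
import random


def lh_oat(model, lower, upper, m=10, frac=0.05, seed=0):
    """Simplified LH-OAT: returns (final effects, ranking, number of model runs)."""
    rng = random.Random(seed)
    p = len(lower)
    # Latin Hypercube: one random point in each of m intervals per parameter,
    # with the interval order shuffled independently for every parameter.
    strata = []
    for _ in range(p):
        cells = [(k + rng.random()) / m for k in range(m)]
        rng.shuffle(cells)
        strata.append(cells)

    effects = [0.0] * p
    runs = 0
    for k in range(m):
        base = [lower[j] + strata[j][k] * (upper[j] - lower[j]) for j in range(p)]
        y_base = model(base)
        runs += 1
        for j in range(p):
            delta = frac * (upper[j] - lower[j])
            pert = list(base)
            # Change only parameter j; step inward if the bound would be exceeded.
            pert[j] = base[j] + delta if base[j] + delta <= upper[j] else base[j] - delta
            y_pert = model(pert)
            runs += 1
            scale = (abs(y_pert) + abs(y_base)) / 2.0
            partial = 0.0 if scale == 0.0 else 100.0 * abs(y_pert - y_base) / scale / frac
            effects[j] += partial / m  # average partial effect over the LH points
    ranking = sorted(range(p), key=lambda j: effects[j], reverse=True)
    return effects, ranking, runs


def toy_model(x):
    """Stand-in for a trained model: strong, weak, and no dependence on 3 inputs."""
    return 10.0 + 5.0 * x[0] + 0.1 * x[1] + 0.0 * x[2]


effects, ranking, runs = lh_oat(toy_model, [0.0, 0.0, 0.0], [1.0, 1.0, 1.0], m=10)
```

Each LH point costs one base run plus one run per parameter, which reproduces the m × (p + 1) run count stated above (40 runs for m = 10 and p = 3 in this toy case).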

2. Results and discussions

2.1. Water quality monitoring

The daily data for the three water quality parameters (T-N, T-P, and TSS) measured over 10 months at all 6 monitoring stations (influent, flow-distribution tank, aeration tank, effluent, FW leachate, and pre-treated FW leachate) are summarized in Table 1. T-N and TSS are two important parameters for the assessment of water quality. The measured T-N and TSS data, together with the month, volumetric flow rate of the inflow, pH, temperature, and COD, were used as the input parameters for machine learning model construction. The relationship of each of these input parameters with the T-N concentration of the effluent was examined through the sensitivity analysis, and all of them were finally selected for model development.

The box plots in Fig. 4 summarize the statistics of the measured water quality variables in Table 1. The T-N concentrations in the flow-distribution tank and the aeration tank are 38.884 ± 20.508 mg/L and 47.569 ± 20.933 mg/L, respectively. After the food sludge treatment, the pre-treated FW leachate is recycled to the flow-distribution tank. We measured the water quality parameters (T-N, T-P, and TSS) of the influent to each process and found that the water quality of the aeration tank is related to the recycling of the pre-treated FW leachate. According to Table 1, the T-N concentration increased by 8.685 mg/L on average from the flow-distribution tank to the aeration tank, and T-P and TSS increased by 3.858 mg/L and 107.978 mg/L, respectively. Since the T-N concentration of the pre-treated FW leachate is more than one order of magnitude higher than that of the flow-distribution tank, these results are plausible, and we could observe that the integrated treatment of food waste could greatly affect the water quality of the WWTP in Ulsan.
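The increases quoted above follow directly from the average values in Table 1. The short Python check below reproduces them as the difference between the aeration-sedimentation and flow-distribution averages and computes the ratio of the pre-treated FW leachate T-N to the flow-distribution T-N; the variable names are illustrative only.

```python
# Average concentrations (mg/L) taken from Table 1.
flow_distribution = {"T-N": 38.884, "T-P": 2.701, "TSS": 180.328}
aeration = {"T-N": 47.569, "T-P": 6.559, "TSS": 288.306}
pretreated_leachate_tn = 1662.938

# Average increase from the flow-distribution tank to the aeration tank.
increase = {name: round(aeration[name] - flow_distribution[name], 3)
            for name in aeration}

# Ratio of the pre-treated FW leachate T-N to the flow-distribution T-N.
ratio = pretreated_leachate_tn / flow_distribution["T-N"]
```

The ratio is well above 10, consistent with the statement that the two T-N concentrations differ by more than one order of magnitude.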

Table 1 – Ten months of measured data of the water quality variables for each process in the Ulsan wastewater treatment plant.

Variable    Statistic  Influent  Flow-distribution  Aeration-sedimentation  Effluent  Supernatant  Pretreatment
T-N (mg/L)  Average    39.074    38.884             47.569                  14.639    2578.980     1662.938
T-N (mg/L)  Std.       13.842    20.508             20.933                  9.522     1138.601     1055.886
T-N (mg/L)  Median     37.390    35.180             42.185                  13.618    2221.800     1323.800
T-P (mg/L)  Average    5.544     2.701              6.559                   1.083     –            –
T-P (mg/L)  Std.       1.925     0.570              1.893                   0.852     –            –
T-P (mg/L)  Median     5.515     2.800              6.333                   0.978     –            –
TSS (mg/L)  Average    266.532   180.328            288.306                 97.581    –            –
TSS (mg/L)  Std.       108.449   88.755             97.375                  62.020    –            –
TSS (mg/L)  Median     250.000   175.000            287.500                 100.000   –            –

T-N: total nitrogen; T-P: total phosphorus; TSS: total suspended solids.

2.2. Training and validation of ANN and SVM models

Different ANN and SVM models were built and tested in order to determine the optimum model for predicting the effluent T-N concentration in this study. For the ANN model, the hyperbolic tangent transfer function (a nonlinear transfer function) was determined to be the optimal function for both the hidden and output layers. For the SVM, the RBF kernel function was used in the transformation layer.

Additionally, the selection of appropriate model parameters is critical; for example, using an excessive number of hidden nodes can result in over-fitting. In this study, we applied the pattern search algorithm to find the optimum number of hidden nodes for the ANN and the optimum parameters for the SVM. Table 2 shows the optimum parameters for ANN and SVM obtained from the pattern search algorithm.

2.3. Parameter sensitivity analysis

Table 3 summarizes the sensitivity ranking of the input parameters with respect to the T-N concentration of the effluent. It shows the importance of the spatial and temporal variables to the model predictions. For the ANN, temperature was the most important parameter, followed by the T-N of the inflow and pH. On the other hand, the month, COD, and SS were the three most important parameters for the SVM model.

For a biological water treatment plant, temperature, pH, and organic carbon are the three most important operational conditions for bacterial growth, which affects the T-N removal efficiency of biological treatment. Hence, it is reasonable that temperature and pH were identified as the most significant parameters for predicting the T-N concentration of the effluent. Additionally, the T-N concentration of the inflow was also an important input parameter, as it directly determines the amount of T-N entering the wastewater treatment. The ANN model is therefore more reasonable than the SVM model when the characteristics of the biological treatment process are considered. For process control, the ANN model, which reflects more reasonable physical relationships, could be more reliable than the SVM model for avoiding the impact of high T-N concentrations on the system by adjusting the most physically related parameters.

Machine learning models do not have to represent all the physical meaning of the input and output variables, but the sensitivity analysis still shows that the highest-ranking parameter of the ANN was temperature. For the SVM, however, the highest-ranking parameter was the month, which does not appear to have a direct physical meaning. Considering the relationship between temperature and month, and the additional effect of season (or month) on ionic strength and flocs, the result of the ANN was more acceptable (Zita and Hermansson, 1994). Additionally, the final-effect values in Table 3 show that there was little difference among the variables in the SVM. In terms of process control, the ANN showed more reliable and reasonable results than the SVM model.

Fig. 4 – Basic statistical analysis of the measured water quality data. T-N: total nitrogen (log scale); T-P: total phosphorus; TSS: total suspended solids; a: influent; b: flow-distribution; c: aeration-sedimentation; d: effluent; e: supernatant; f: pretreatment.

Table 2 – Comparison of the performance of the optimized artificial neural networks (ANNs) and support vector machines (SVMs) for prediction of the total nitrogen (T-N) concentration of the effluent from the wastewater treatment plant in Ulsan.

Site                              Model                 Model parameters               R² (Tr)  R² (Vl)  NSE (Tr)  NSE (Vl)  d_rel (Tr)  d_rel (Vl)
Ulsan wastewater treatment plant  ANNs (Tansig/Tansig)  lr: 0.50; mo: 0.742; #N: 11    0.55     0.47     0.56      0.46      0.80        0.76
Ulsan wastewater treatment plant  SVMs (RBF)            C: 50.005; ε: 0.001; σ: 4.693  1.00     0.46     1.00      0.45      0.99        0.77

lr: the learning rate; mo: momentum; #N: number of hidden neurons; C: the cost constant; ε: the radius of the insensitive tube; σ: the parameter of the kernel function; R²: the coefficient of determination; NSE: Nash–Sutcliffe model efficiency; d_rel: relative efficiency criteria; Tr: the training step; Vl: the validation step; Tansig/Tansig: the transfer function of ANNs selected in this study; RBF: the kernel function of SVMs selected in this study.

2.4. Model test

Measured T-N concentrations of the effluent from the WWTP in Ulsan were compared with the values modeled by the machine learning models (ANN and SVM), using both regression plots and residuals, to check the models' performance. Fig. 5 shows the regression plots for the training and validation steps of both the ANN and the SVM. Both the ANN and the SVM models produced a reasonable fit to the measured data. As Table 2 shows, for the training and validation steps of the ANN model, the coefficient of determination (R²) was 0.55 and 0.47, the Nash–Sutcliffe efficiency (NSE) was 0.56 and 0.46, and the relative efficiency criterion (drel) was 0.80 and 0.76, respectively. For the SVM model, R² was 1.00 and 0.46, NSE was 1.00 and 0.45, and drel was 0.99 and 0.77 for training and validation, respectively (Table 2).
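The three criteria reported above can be computed as in the following sketch. It assumes that R² is the squared Pearson correlation between measured and modeled values and that drel is the relative index of agreement described by Krause et al. (2005); the function names and the sample values are hypothetical and are not data from this study.

```python
import numpy as np

def r_squared(obs, mod):
    """Coefficient of determination as the squared Pearson correlation."""
    obs, mod = np.asarray(obs, dtype=float), np.asarray(mod, dtype=float)
    return np.corrcoef(obs, mod)[0, 1] ** 2

def nse(obs, mod):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    obs, mod = np.asarray(obs, dtype=float), np.asarray(mod, dtype=float)
    return 1.0 - np.sum((obs - mod) ** 2) / np.sum((obs - obs.mean()) ** 2)

def d_rel(obs, mod):
    """Relative index of agreement (assumed form, after Krause et al., 2005)."""
    obs, mod = np.asarray(obs, dtype=float), np.asarray(mod, dtype=float)
    obar = obs.mean()
    num = np.sum(((obs - mod) / obs) ** 2)
    den = np.sum(((np.abs(mod - obar) + np.abs(obs - obar)) / obar) ** 2)
    return 1.0 - num / den

# Hypothetical effluent T-N values (mg/L), for illustration only.
measured = [10.0, 12.0, 9.0, 15.0, 11.0]
modeled = [11.0, 11.5, 9.5, 13.0, 12.0]
print(r_squared(measured, modeled), nse(measured, modeled), d_rel(measured, modeled))
```

Because drel works with relative deviations, it is less dominated by peak values than NSE, which may be one reason the drel values in Table 2 are higher than the corresponding R² and NSE values.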

Table 3 – Sensitivity rank of input variables in artificial neural networks (ANNs) and support vector machines (SVMs) using the Latin Hypercube One factor At a Time (LH-OAT) sensitivity analysis for the Ulsan wastewater treatment plant.

| Rank | ANN variable | ANN final effect | SVM variable | SVM final effect |
| --- | --- | --- | --- | --- |
| 1 | Temperature | 38.59 | Month | 1.45 |
| 2 | Total nitrogen of inflow | 33.37 | Chemical oxygen demand | 1.34 |
| 3 | pH | 32.60 | Suspended solid | 1.33 |
| 4 | Volumetric flow rate of inflow | 30.58 | pH | 1.29 |
| 5 | Suspended solid | 26.89 | Temperature | 1.28 |
| 6 | Total nitrogen of food waste leachate | 22.31 | Total nitrogen of inflow | 1.24 |
| 7 | Month | 23.58 | Volumetric flow rate of inflow | 1.22 |
| 8 | Chemical oxygen demand | 17.64 | Total nitrogen of food waste leachate | 1.17 |

Fig. 5 – Comparison of the modeled and measured total nitrogen (T-N) concentration of effluent from the Ulsan waste water treatment plant training and validation tests using artificial neural network (ANN) and support vector machine (SVM) model.


Given the low values of the model performance criteria, we additionally checked the fit of the machine learning models by analyzing the residuals (Fig. 6). For both the ANN and the SVM, in both training and validation, the residuals were randomly distributed and independent of the modeled T-N concentrations. This was further supported by the correlation analysis (R² for ANN: training = 5.509e−7, validation = 3.306e−6; R² for SVM: training = 1.1e−6, validation = 2.01e−5) in Fig. 7.
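A residual check of this kind can be sketched as follows, assuming the reported values are squared correlations between the residuals and the modeled concentrations. The function name `residual_r2` and the synthetic data are hypothetical and only illustrate the calculation.

```python
import numpy as np

def residual_r2(measured, modeled):
    """Return the residuals and their squared correlation with the modeled values."""
    measured = np.asarray(measured, dtype=float)
    modeled = np.asarray(modeled, dtype=float)
    residuals = measured - modeled
    r = np.corrcoef(modeled, residuals)[0, 1]
    return residuals, r ** 2

# Synthetic example: measured values scatter randomly around the modeled ones,
# so the residuals should carry no trend with the modeled concentration.
rng = np.random.default_rng(42)
modeled_tn = rng.uniform(5.0, 20.0, size=300)
measured_tn = modeled_tn + rng.normal(0.0, 1.0, size=300)
_, r2_random = residual_r2(measured_tn, modeled_tn)
print(f"R2 between residuals and modeled values: {r2_random:.2e}")
```

A value close to zero indicates no linear trend left in the residuals; it does not by itself show that the model is accurate, only that its errors are not systematically related to the predicted level.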

In this study, the low values of the model performance criteria were likely due to noise in the data and the short period covered by the input data. Nevertheless, the ANN and SVM models could give acceptable accuracy for the future prediction of the effluent T-N concentration.

2.5. Comparison of models

Fig. 4 shows the measured and predicted T-N concentrations obtained by the ANN and SVM models with the optimum parameters. The prediction accuracy of the SVM was higher than that of the ANN during the training step (R² of 1.00 versus 0.55), whereas the accuracy of the two models was almost identical during the validation step (R² of 0.46 versus 0.47). Consequently, judged by prediction accuracy alone, we observed a higher prediction performance of the SVM than of the ANN.
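The difference between training and validation scores can be made explicit with simple arithmetic on the values in Table 2. The sketch below is only a reading aid; the dictionary layout and the function name `generalization_gap` are hypothetical.

```python
# (training, validation) scores copied from Table 2.
table2 = {
    "ANN": {"R2": (0.55, 0.47), "NSE": (0.56, 0.46), "drel": (0.80, 0.76)},
    "SVM": {"R2": (1.00, 0.46), "NSE": (1.00, 0.45), "drel": (0.99, 0.77)},
}

def generalization_gap(scores):
    """Training score minus validation score for each criterion."""
    return {name: round(train - valid, 2) for name, (train, valid) in scores.items()}

gaps = {model: generalization_gap(scores) for model, scores in table2.items()}
for model, gap in gaps.items():
    print(model, gap)
```

A larger gap between training and validation scores is often read as a sign that a model fits the training data more closely than it generalizes; that reading is an interpretation of the table, not a result reported by the study.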

Recently, ANNs have been applied to predict water quality variables. Soyupak et al. (2003) studied the prediction of dissolved oxygen concentration in three separate reservoirs; the reported correlation coefficient was greater than 0.95 for predicting dissolved oxygen concentration. SVM has also been used to predict water quality. Singh et al. (2009) computed the dissolved oxygen (DO) and BOD concentrations in a polluted river flowing through the northern alluvial Gangetic plains in India. Root-mean-square error (RMSE) values for the predicted and observed values of DO were 0.7 and 0.74 for the training and validation steps, respectively, while the corresponding values for BOD were 0.85 and 0.85 (Singh et al., 2009). For waste water treatment plants, Oliveira-Esquerre et al. (2002) applied ANN to predict the biochemical oxygen demand of the effluent of a biological wastewater treatment plant, with an average R² of 0.76. Additionally, Mjalli et al. (2007) used ANN to study the operating characteristics of a wastewater treatment plant and to predict the BOD, COD, and TSS for the Doha West WWTP.

A relatively low value of the model performance criteria (0.4–1.0 for R², 0.4–1.0 for NSE, and 0.76–0.99 for relative efficiency criteria) for the output variables for the T-N concentration of pre-treated FW leachate was observed in this study. One reason might be the limited ability of the models to interpret and predict highly nonlinear relationships. Balabin and Lomakina (2011) found that stronger nonlinear interferences could lead to low accuracy for both the ANN and SVM models. In addition, the limited number of input variables and data noise could also explain the lower values of the coefficient of determination. Hamed et al. (2004) applied ANN to the modeling of waste water treatment; their data set was collected over about 10 months, and low values of the coefficient of determination (<0.5 on average) were also observed, similar to our study. Future studies are therefore needed to develop higher-performance models for more effective water quality prediction.

Fig. 6 – Plot of the modeled versus measured total nitrogen (T-N) concentration of effluent from the Ulsan waste water treatment plant training and validation tests.

Machine learning models (ANN and SVM) were developed for the prediction of the T-N concentration in effluent from the Ulsan wastewater treatment plant using water quality and meteorological data. The results showed that machine learning models can be applied to model a complex wastewater treatment process that also treats, in parallel, FW leachate with a high T-N concentration. The T-N concentration of the effluent was successfully predicted by the machine learning models, and the two models (ANN and SVM) could also be applied to 1) estimate the T-N concentration of effluent when real-time monitoring or sampling is not possible, and 2) estimate the range of the output parameters to avoid exceeding the water quality regulations.

It should be noted that the measured data were recorded daily. Therefore, the current machine learning models (ANN and SVM) were applied only to the prediction of daily water quality changes over a very short period (10 months). Recalibration and revalidation of the models with a larger data set would be required in future studies for more accurate prediction. Additionally, other input parameters may also be considered in future modeling work. Nevertheless, the models constructed in this study could still be effectively used for the prediction of the effluent T-N concentration.

3. Conclusions

The aim of this study was to develop two reliable machine learning models (ANN and SVM) for the early, 1-day interval prediction of the T-N concentration of effluent, in order to avoid the impact of high T-N loading from FW leachate on the waste water treatment. Both daily water quality data and meteorological data were used as input parameters, and a pattern search algorithm was used to optimize the model parameters. In addition, a sensitivity analysis using the LH-OAT method was conducted to determine the effect of each input parameter. The present study shows that: 1) the optimized ANN and SVM models could reliably predict the trends of water quality at the wastewater treatment plant of Ulsan; 2) based only on prediction accuracy, the SVM model performed better than the ANN model; and 3) however, the sensitivity analysis showed that the ANN revealed more physically meaningful cause-and-effect relationships between the T-N concentration of effluent and the other input parameters than the SVM. Thus, the ANN model could be a more reasonable and reliable model than the SVM for building decision-making models and for process control of the integrated food waste and waste water treatment. This study showed that machine learning models could be a reliable method for water quality prediction as an early warning for water quality control in waste water treatment. In future work, longer-term sampling of the input values is suggested to improve the accuracy of the ANN and SVM models.

Fig. 7 – Residuals plots versus modeled (predicted) total nitrogen (T-N) concentration of effluent from the Ulsan waste water treatment plant training and validation tests.

Acknowledgments

This research was supported by a grant (12-TI-C04) from the Advanced Water Management Research Program funded by the Ministry of Land, Infrastructure and Transport of the Korean government.

REFERENCES

Ahmed, J.A., Sarma, A.K., 2007. Artificial neural network model

for synthetic streamflow generation. Water Resour. Manag.

21 (6), 1015–1029.

Balabin, R.M., Lomakina, E.I., 2011. Support vector machine

regression (SVR/LS-SVM)—an alternative to neural networks

(ANN) for analytical chemistry? Comparison of nonlinear

methods on near infrared (NIR) spectroscopy data. Analyst 136

(8), 1703–1712.

Behera, S.K., Park, J.M., Kim, K.H., Park, H.-S., 2010. Methane

production from food waste leachate in laboratory-scale

simulated landfill. Waste Manag. 30 (8-9), 1502–1508.

Burnley, S., Phillips, R., Coleman, T., Rampling, T., 2011. Energy

implications of the thermal recovery of biodegradable

municipal waste materials in the United Kingdom. Waste

Manag. 31 (9-10), 1949–1959.

Cecchi, F., Traverso, P.G., Perin, G., Vallini, G., 1988. Comparison of

co‐digestion performance of two differently collected organic

fractions of municipal solid waste with sewage sludges.

Environ. Technol. 9 (5), 391–400.

Chelliapan, S., Mahat, S.B., Din, M.F.M., Yuzir, A., Othman, N.,

2012. Anaerobic digestion of paper mill wastewater. Iranica

J. Energy Environ. 3, 85–90.

Cho, K.H., Kang, J.-H., Ki, S.J., Park, Y., Cha, S.M., Kim, J.H., 2009.

Determination of the optimal parameters in regression

models for the prediction of chlorophyll-a: a case study of the

Yeongsan Reservoir, Korea. Sci. Total Environ. 407 (8),

2536–2545.

Cho, K.H., Sthiannopkao, S., Pachepsky, Y.A., Kim, K.-W., Kim, J.H.,

2011. Prediction of contamination potential of groundwater

arsenic in Cambodia, Laos, and Thailand using artificial neural

network. Water Res. 45 (17), 5535–5544.

Dibike, Y.B., Velickov, S., Solomatine, D., Abbott, M.B., 2001. Model

induction with support vector machines: introduction and

applications. J. Comput. Civ. Eng. 15 (3), 208–216.

Dreyfus, G., 2005. Neural Networks: Methodology and

Applications. Springer, Heidelberg.

Dreyfus, G., Martinez, J.-M., Samuelides, A., Gordon, M.B., Badran,

F., Thiria, S., et al., 2002. Réseaux de Neurones: Méthodologie

et Applications. Eyrolles.

Gallant, S.I., 1993. Neural Network Learning and Expert Systems.

MIT Press, London.

García, A.J., Esteban, M.B., Márquez, M.C., Ramos, P., 2005.

Biodegradable municipal solid waste: characterization and

potential use as animal feedstuffs. Waste Manag. 25 (8),

780–787.

Gernaey, K.V., van Loosdrecht, M.C.M., Henze, M., Lind, M.,

Jørgensen, S.B., 2004. Activated sludge wastewater treatment

plant modelling and simulation: state of the art. Environ.

Model. Software 19 (9), 763–783.

Gill, M.K., Asefa, T., Kemblowski, M.W., McKee, M., 2006. Soil

moisture prediction using support vector machines. J. Am.

Water Resour. Assoc. 42 (4), 1033–1046.

Govindaraju, R.S., 2000. Artificial neural networks in hydrology. II:

hydrologic applications. J. Hydrol. Eng. 5 (2), 124–137.

Hamed, M.M., Khalafallah, M.G., Hassanien, E.A., 2004. Prediction

of wastewater treatment plant performance using artificial

neural networks. Environ. Model. Software 19 (10), 919–928.

Hamzawi, N., Kennedy, K.J., McLean, D.D., 1998. Technical

feasibility of anaerobic co-digestion of sewage sludge and

municipal solid waste. Environ. Technol. 19 (10), 993–1003.

Han, M.J., Behera, S.K., Park, H.S., 2012. Anaerobic co‐digestion of

food waste leachate and piggery wastewater for methane

production: statistical optimization of key process parameters.

J. Chem. Technol. Biotechnol. 87 (11), 1541–1550.

Hong, Y.-S.T., Rosen, M.R., Bhamidimarri, R., 2003. Analysis of a

municipal wastewater treatment plant using a neural

network-based pattern analysis. Water Res. 37 (7), 1608–1618.

Iacopozzi, I., Innocenti, V., Marsili-Libelli, S., Giusti, E., 2007. A

modified activated sludge model no. 3 (ASM3) with two-step

nitrification–denitrification. Environ. Model. Software 22 (6),

847–861.

Karul, C., Soyupak, S., Çilesiz, A.F., Akbay, N., Germen, E., 2000.

Case studies on the use of neural networks in eutrophication

modeling. Ecol. Model. 134 (2-3), 145–152.

Khalil, A., Almasri, M.N., McKee, M., Kaluarachchi, J.J., 2005.

Applicability of statistical learning algorithms in groundwater

quality modeling. Water Resour. Res. 41, W05010. http://dx.doi.org/10.1029/2004WR003608.

Khalil, A.F., McKee, M., Kemblowski, M., Asefa, T., Bastidas, L.,

2006. Multiobjective analysis of chaotic dynamic systems

with sparse learning machines. Adv. Water Resour. 29 (1),

72–88.

Khan, M.S., Coulibaly, P., 2006. Application of support vector

machine in lake water level prediction. J. Hydrol. Eng. 11 (3),

199–205.

Kim, S.H., Shin, H.S., 2009. Acidogenesis of lipids-containing

wastewater in anaerobic sequencing batch reactor. J. Korean

Soc. Environ. Eng. 31 (12), 1075–1080.

Kim, J.K., Han, G.H., Oh, B.R., Chun, Y.N., Eom, C.-Y., Kim, S.W.,

2008. Volumetric scale-up of a three stage fermentation

system for food waste treatment. Bioresour. Technol. 99 (10),

4394–4399.

Krause, P., Boyle, D.P., Bäse, F., 2005. Comparison of different

efficiency criteria for hydrological model assessment. Adv.

Geosci. 5, 89–97.

Lee, T.L., 2004. Back-propagation neural network for long-term

tidal predictions. Ocean Eng. 31 (2), 225–238.

Lee, C.Y., Shin, H.S., Chae, S.R., Nam, S.Y., Paik, B.C., 2003. Nutrient

removal using anaerobically fermented leachate of food waste

in the BNR process. Water Sci. Technol. 47 (1), 159–165.


Lee, D.H., Behera, S.K., Kim, J.W., Park, H.-S., 2009. Methane

production potential of leachate generated from Korean food

waste recycling facilities: a lab-scale study. Waste Manag. 29

(2), 876–882.

Lee, E., Seong, C., Kim, H., Park, S., Kang, M., 2010. Predicting the

impacts of climate change on nonpoint source pollutant loads

from agricultural small watershed using artificial neural

network. J. Environ. Sci. 22 (6), 840–845.

Lek, S., Guégan, J.-F., 1999. Artificial neural networks as a tool in

ecological modelling, an introduction. Ecol. Model. 120 (2-3),

65–73.

Lewis, R.M., Torczon, V., 2002. A globally convergent augmented

Lagrangian pattern search algorithm for optimization with

general constraints and simple bounds. SIAM J. Optim. 12 (4),

1075–1089.

Li, X.M., Cheng, K.Y., Selvam, A., Wong, J.W., 2012. Bioelectricity

production from acidic food waste leachate using microbial

fuel cells: effect of microbial inocula. Process Biochem. 48 (2),

283–288.

Liong, S.Y., Sivapragasam, C., 2002. Flood stage forecasting with

support vector machines. J. Am. Water Resour. Assoc. 38 (1),

173–186.

Liong, S.Y., Khu, S.T., Chan, W.T., 2001. Derivation of Pareto front

with genetic algorithm and neural network. J. Hydrol. Eng. 6 (1),

52–61.

Mahmoud, N., Zeeman, G., Gijzen, H., Lettinga, G., 2003. Solids

removal in upflow anaerobic reactors, a review. Bioresour.

Technol. 90 (1), 1–9.

Mata-Alvarez, J., 2003. Biomethanization of the Organic Fraction

of Municipal Solid Wastes. IWA Publishing, London.

Mata-Alvarez, J., Cecchi, F., Pavan, P., Llabres, P., 1990. The

performances of digesters treating the organic fraction of

municipal solid wastes differently sorted. Biol. Wastes 33 (3),

181–199.

McCulloch, W.S., Pitts, W., 1943. A logical calculus of the ideas

immanent in nervous activity. Bull. Math. Biophys. 5 (4),

115–133.

Mjalli, F.S., Al-Asheh, S., Alfadala, H.E., 2007. Use of artificial

neural network black-box modeling for the prediction of

wastewater treatment plants performance. J. Environ. Manage.

83 (3), 329–338.

Muttil, N., Chau, K.W., 2006. Neural network and genetic

programming for modelling coastal algal blooms. Int.

J. Environ. Pollut. 28, 223–238.

Oliveira-Esquerre, K.P., Mori, M., Bruns, R.E., 2002. Simulation of

an industrial wastewater treatment plant using artificial

neural networks and principal components analysis. Braz.

J. Chem. Eng. 19 (4), 365–370.

Palani, S., Liong, S.-Y., Tkalich, P., 2008. An ANN application for

water quality forecasting. Mar. Pollut. Bull. 56 (9), 1586–1597.

Poggi-Varaldo, H.M., Oleszkiewicz, J.A., 1992. Anaerobic

co-composting of municipal solid waste and waste sludge at

high total solids levels. Environ. Technol. 13 (5), 409–421.

Rivas, A., Irizar, I., Ayesa, E., 2008. Model-based optimisation of

wastewater treatment plants design. Environ. Model. Software

23 (4), 435–450.

Rogers, L.L., Dowla, F.U., 1994. Optimization of groundwater

remediation using artificial neural networks with parallel

solute transport modeling. Water Resour. Res. 30 (2), 457–481.

Rumelhart, D.E., McClelland, J.L., 1986. Parallel Distributed

Processing: Explorations in the Microstructure of Cognition.

MIT Press, Cambridge, Mass.

Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1988. Learning

internal representations by error propagation. In: Collins, A.,

Smith, E.E. (Eds.), Readings in Cognitive Science. Morgan

Kaufmann, pp. 399–421.

S. Cheon, J.H., Bae, Y., Park, S., Lim, J., Ha, C., Choi, Y., Lim, H., 2013.

Examination of Inlet Conditions for Effective Anaerobic

Digestion of Food Waste Leachate in Bio-reactor. SUDOKWON

Landfill Site Management Corp., Incheon, South Korea

(Available at: http://webbook.me.go.kr/DLi-File/094/003/002/5561166.PDF).

Schmit, K.H., Ellis, T.G., 2001. Comparison of temperature-phased

and two-phase anaerobic co-digestion of primary sludge and

municipal solid waste. Water Environ. Res. 73 (3), 314–321.

Shon, T., Moon, J., 2007. A hybrid machine learning approach to

network anomaly detection. Inform. Sci. 177 (18), 3799–3821.

Singh, K.P., Basant, A., Malik, A., Jain, G., 2009. Artificial neural

network modeling of the river water quality—a case study.

Ecol. Model. 220 (6), 888–895.

Smith, M., 1993. Neural Networks for Statistical Modeling.

International Thomson Computer Press.

Sosnowski, P., Wieczorek, A., Ledakowicz, S., 2003. Anaerobic

co-digestion of sewage sludge and organic fraction of

municipal solid wastes. Adv. Environ. Res. 7 (3), 609–616.

Soyupak, S., Karaer, F., Gürbüz, H., Kivrak, E., Sentürk, E., Yazici,

A., 2003. A neural network-based approach for calculating

dissolved oxygen profiles in reservoirs. Neural Comput. Appl.

12 (3-4), 166–172.

Trichakis, I.C., Nikolos, I.K., Karatzas, G.P., 2011. Artificial Neural

Network (ANN) based modeling for Karstic groundwater level

simulation. Water Resour. Manag. 25 (4), 1143–1152.

van Griensven, A., Meixner, T., Grunwald, S., Bishop, T., Diluzio,

M., Srinivasan, R., 2006. A global sensitivity analysis tool for

the parameters of multi-variable catchment models. J. Hydrol.

324 (1-4), 10–23.

Vapnik, V., 1995. The Nature of Statistical Learning Theory.

Springer, New York, USA.

Vapnik, V.N., 1999. An overview of statistical learning theory. IEEE

Trans. Neural Netw. 10 (5), 988–999.

Wang, W.J., Xu, Z.B., Lu, W.Z., Zhang, X.Y., 2003. Determination of

the spread parameter in the Gaussian kernel for classification

and regression. Neurocomputing 55 (3-4), 643–663.

Wintgens, T., Rosen, J., Melin, T., Brepols, C., Drensla, K.,

Engelhardt, N., 2003. Modelling of a membrane bioreactor

system for municipal wastewater treatment. J. Membr. Sci. 216

(1-2), 55–65.

Yan, H., Zou, Z.H., Wang, H.W., 2010. Adaptive neuro fuzzy

inference system for classification of water quality status.

J. Environ. Sci. 22 (12), 1891–1896.

Yoon, H., Jun, S.-C., Hyun, Y., Bae, G.-O., Lee, K.-K., 2011. A

comparative study of artificial neural networks and support

vector machines for predicting groundwater levels in a coastal

aquifer. J. Hydrol. 396 (1-2), 128–138.

Yu, P.-S., Chen, S.-T., Chang, I.F., 2006. Support vector regression

for real-time flood stage forecasting. J. Hydrol. 328 (3-4),

704–716.

Zita, A., Hermansson, M., 1994. Effects of ionic strength on

bacterial adhesion and stability of flocs in a wastewater

activated sludge system. Appl. Environ. Microbiol. 60 (9),

3041–3048.

