IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 69, NO. 11, NOVEMBER 2020 9243
A Feature Extraction Method Using Linear Model
Identification of Voltammetric Electronic Tongue
Sanjeev Kumar, Student Member, IEEE, and Arunangshu Ghosh, Member, IEEE
Abstract—A novel feature extraction technique for a voltammetric
electronic tongue is presented. It applies system identification,
followed by the synthesis of an equivalent circuit for black tea,
and uses the result to predict the total theaflavin (TF) content
of the tea. The equivalent circuit parameters for different tea samples
are estimated using the current response data obtained from the
voltammetric electronic tongue, on which system identification
procedure is applied. These identified circuit parameters are
then treated as the features of tea samples. The efficacy of the
features is corroborated by developing prediction models for
TF and comparing the prediction results with reference to TF
content in tea. Various regression models such as principal com-
ponent regression, partial least-squares regression, independent
component regression, multilayer feedforward neural network
regression, support vector regression, and extreme learning
machine (ELM)-based regression models have been evaluated.
The proposed feature extraction method achieves better prediction
accuracy than the discrete wavelet transform (DWT), a well-established
feature extraction method, and the neighborhood components analysis
(NCA) for regression, a feature selection method introduced here
for the first time for signal processing of electronic tongue data.
A significant reduction in the number of features has been
obtained in this work over existing feature extraction techniques.
Index Terms—Electronic tongue, equivalent circuits, machine
learning, prediction methods, system identification.
I. INTRODUCTION
A SYSTEM is characterized by its structure and
parameters. In the language of control engineering,
a system is represented by a transfer function whose structure
and parameters decide its characteristic behavior. An equiva-
lent circuit of a system is often considered as a counterpart
of the transfer function, and it is a theoretical electrical
network with linear passive elements which may be powered
by a source. The equivalent circuit of a system represents an
analogy between its own voltage–current relationship and the
input–output relationship of the system itself. Equivalent
circuits reveal important information about a system and can,
therefore, be utilized for system characterization. With this
motivation, in this work, the parameters of the equivalent
circuit of the electronic tongue have been treated as
characteristic features of the analyte under observation.
Manuscript received June 9, 2019; revised February 27, 2020; accepted
May 3, 2020. Date of publication May 14, 2020; date of current version
October 9, 2020. This work was supported by the Science and Engineering
Research Board, Department of Science and Technology, Government of
India, under Grant ECR/2016/001813. The Associate Editor coordinating the
review process was John Lataire. (Corresponding author: Sanjeev Kumar.)
The authors are with the Department of Electrical Engineering,
National Institute of Technology Patna, Patna 800005, India (e-mail:
sanjnitp@gmail.com; arunansghu.ghosh@nitp.ac.in).
Color versions of one or more of the figures in this article are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIM.2020.2994604
This work mainly focuses on equivalent circuit-based fea-
ture extraction for different black tea samples, in which
the system identification is applied to the experimental
input–output data set obtained from a laboratory-made elec-
tronic tongue. An electronic tongue is an array of nonspecific
electrochemical sensors that respond to most of the com-
ponents present in a liquid sample rather than a particular
component [1]. A time-varying voltage pulse is applied to
the analyte to obtain a current response profile, indicating
the sample’s electrical, chemical, and physical properties. The
relationship between the input and output behaviors of an
electronic tongue is typically very complex when time-domain
or frequency-domain analysis is considered. On the other hand,
system identification is a technique capable of estimating a
system model from the applied input and the measured output
data from the system, thus embedding system behavior in
terms of few model parameters. Based on the prior knowledge
of the system, a model structure is selected, and a set of algo-
rithms is used to minimize the error between the model output
and the actual measured response of a system. System
identification has been used in various engineering applications in
the past. The method of estimating the equivalent circuit for tea
samples using system identification has been recently proposed
by Kumar et al. [2]. Clustering-based system identification of
an electromagnetic actuator has been reported in [3], where
modeling, identification, and control were performed. The
discrete-time model for a cooling process has been identified
in [4]. In [5], a high precision motion control was achieved
in a moving system. Many system identification methods have
been proposed in [6]–[8].
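The idea of selecting a model structure and minimizing the error between the model output and the measured response can be illustrated with a minimal sketch. It assumes a hypothetical first-order discrete-time structure y[k] = a·y[k−1] + b·u[k−1] and ordinary least squares; it is not the model or the algorithm used by the authors, and the function names are illustrative only.

```python
def simulate_first_order(a, b, u):
    """Simulate y[k] = a*y[k-1] + b*u[k-1] starting from y[0] = 0."""
    y = [0.0]
    for k in range(1, len(u)):
        y.append(a * y[k - 1] + b * u[k - 1])
    return y


def identify_first_order(u, y):
    """Least-squares estimate of (a, b) from input-output data.

    Solves the 2x2 normal equations built from the regressors
    [y[k-1], u[k-1]] and the target y[k].
    """
    s_yy = s_yu = s_uu = r_y = r_u = 0.0
    for k in range(1, len(y)):
        y_prev, u_prev = y[k - 1], u[k - 1]
        s_yy += y_prev * y_prev
        s_yu += y_prev * u_prev
        s_uu += u_prev * u_prev
        r_y += y_prev * y[k]
        r_u += u_prev * y[k]
    det = s_yy * s_uu - s_yu * s_yu  # nonzero when the input is exciting enough
    a_hat = (r_y * s_uu - r_u * s_yu) / det
    b_hat = (s_yy * r_u - s_yu * r_y) / det
    return a_hat, b_hat


if __name__ == "__main__":
    # Hypothetical pulse-like input: the level changes every 40 samples.
    u = [0.1 * ((k // 40) % 5 - 2) for k in range(400)]
    y = simulate_first_order(0.9, 0.5, u)
    print(identify_first_order(u, y))
```

With noise-free data the estimates coincide with the parameters used for the simulation; with measured data the residual that remains is the model error discussed later in the article.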
The electronic tongue used in this work was earlier
employed to estimate the quality measures of black
tea [9], [10]. Tea constitutes a major share of beverages con-
sumed across the world, and the presence of compounds such
as polyphenol, amino acids, caffeine, theaflavins (TFs), and
thearubigins in it gives it a unique taste [10]. The TF is related
positively to the brightness of tea, and it is responsible for its
astringent taste [11]. The presence of such a large number of
compounds in tea makes their estimation a very complex task.
The conventional methods for tea quality estimation include
gas chromatography, high-performance liquid chromatography
(HPLC), and capillary electrophoresis [12]–[14]. However,
they are time-consuming and quite costly as well. The use
of an electronic tongue for tea quality estimation is economical
and less time-consuming. The quality-determining compounds
such as TF and thearubigin content in tea samples were
predicted using an electronic tongue in [11]. Plenty of
applications are reported in the literature in which an electronic
tongue has been used as an instrument for quality estimation
[11], [15] and classification [1], [16], [17] of tea.
0018-9456 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: National Institute of Technology Patna. Downloaded on October 09,2020 at 07:08:38 UTC from IEEE Xplore. Restrictions apply.
A voltammetric electronic tongue is basically a three-
electrode electrochemical system that includes an array of
working electrodes (WEs), a reference electrode (RE), and a
counter electrode (CE). The RE, the CE, and one of the WEs
together form a voltammetric electrochemical channel or system.
Although an electronic tongue may be based on any of the
standard electrochemical measurement techniques available
in [17]–[21], the device used in this work is based on pulse
voltammetry, in which a series of pulse potentials are applied
to the electrochemical cell. The current produced in response
to the applied voltage carries characteristic information of the
liquid analyte. Feature extraction techniques such as discrete
wavelet transform (DWT) [1], [15] and others [9], [10] are
used to obtain meaningful information called “features” from
the current waveforms. The clustering analysis is done on
these features using methods such as principal component
analysis (PCA) and linear discriminant analysis (LDA) [1].
Regression techniques such as principal component regres-
sion (PCR), independent component regression (ICR) [22],
partial least-squares regression (PLSR) [9], [11], support vec-
tor regression (SVR) [11], and neural network-based pattern
recognition models are used to perform the estimations using
an electronic tongue in correlation with the results obtained
from the conventional laboratory methods [12]–[14].
As per the literature in electrochemistry, the electrical model
for a typical three-electrode electrochemical cell is given by
the Randles model, which is a series combination of a resistor
and a parallel RC circuit [23], [27]. Literature works [24]–[26]
show contribution toward equivalent circuit estimation for
systems involving electrode–electrolyte interactions. In some
previous works [2], [27], based on the Randles model, equiv-
alent circuits were obtained for electronic tongue with tea
liquor as an analyte, where the analysis was confined only to
the identification of circuit parameters. This work has novelty
over these previous works as the identified circuit parameters
have been utilized here for the first time to estimate the TF
content in tea liquor. This article aims to validate the
idea of using the equivalent circuit parameters as features by
developing regression models to estimate the TF content in
tea. In this work, PCR, ICR, PLSR, SVR, extreme learning
machine (ELM), and neural network regression (NNR) models
have been used to predict the TF content in black tea. All these
regression models have also been implemented to predict the
TF values using DWT-based conventional feature extraction
technique for comparing it with the results of the proposed
feature set. Besides, neighborhood components analysis
(NCA), a feature selection technique [29] so far not used in
the area of electronic tongue, is also implemented here to
compare its prediction results with those of the proposed method.
The experimental setup of electronic tongue is explained
in Section II. Section III discusses the estimation of equivalent
circuit parameters and feature extraction. In Section IV,
the cluster analysis and prediction results of TF content in
various tea samples using the proposed feature extraction
method are discussed, followed by the conclusion in Section V.
Fig. 1. (a) Experimental setup of electronic tongue showing its major
components. (b) Insight of the electrode array used in the experiment.
II. EXPERIMENTATION
A. Electronic Tongue Setup
The electronic tongue setup used here is the same as that
used earlier in [2], [9], and [10]. It consists of five WEs
of noble metals, namely, gold, iridium, palladium, platinum,
and rhodium. The noble metals are used as WEs on account of
their electrochemical inertness and physical stability. Stainless
steel (SS316) CE and a standard Ag/AgCl RE are the other
two electrodes of the electrochemical cell of the system. The
cell is coupled via an electronic interface (potentiostat) to the
software interface in a computer. The experimental setup is
shown in Fig. 1 with an insight into the electrode array [2]. The
potentiostat ensures the application of desired potential pulse
across the WE with respect to the RE. The allied electrode
switching circuit switches between the WEs sequentially and
makes sure that the pulse voltage gets applied to only one
electrode at a time. A data acquisition (DAQ) card, USB
6009 from National Instruments, has been used to collect the
response current from the electrode array at the rate of 1000
samples per second.
B. Input LAPV Signal and the Response of Electronic Tongue
The input potential to the system is a large amplitude pulse
voltammetry (LAPV) signal [9], [18]. The LAPV signal is
generally the most preferred candidate for the perturbation
signal to the electronic tongue with a tea sample as an analyte.
Tea is constituted of several compounds each having a different
redox potential. The LAPV signal covers a wide range of
potential levels that bring out redox information of different
compounds present in the tea in its current response. The
features hence obtained from these responses are considered
rich in information. The LAPV signal used here has a pulse width
of 40 ms, and its magnitude ranges from −0.9 to +0.9 V with
an increment of 0.1 V after every 40 ms, as shown in Fig. 2.
Responses for 20 different finished black tea samples have
been analyzed, and the tea liquor samples were prepared using
the method prescribed in [11]. A total of 25 observations were
taken with each tea sample, and the corresponding current data
were recorded for all five WEs in a sequential manner.
The electrodes were rinsed properly after each measurement
using ultrapure water. A response waveform from all five
WEs for the LAPV signal applied to each of them is shown
in Fig. 3.
KUMAR AND GHOSH: FEATURE EXTRACTION METHOD USING LINEAR MODEL IDENTIFICATION 9245
Fig. 2. Applied LAPV potential across the WE and the RE.
Fig. 3. Response of electronic tongue for tea sample S1 to an LAPV input
obtained for all the five WEs.
For one tea sample, 7400 data points (sampled
at 0.001 s) spanning the response of five WEs were logged
into the PC using a customized software interface implemented
in LabVIEW. Thus, 1480 data points correspond to a single
WE. The data array obtained for a tea sample from single
WE and single observation was, therefore, of size [1480 ×1].
The metrological errors in these data arise from the
electrodes, which have an average reproducibility of 99.17%, and from
the DAQ system (0.0125% for a full-scale range of ±500 µA).
The reference concentration values of total TF in tea
samples were obtained using HPLC, a standard instrumental
technique by a method as reported earlier [28]. These TF
values for each tea sample have been provided by Tocklai
Tea Research Institute, Jorhat, India.
III. DATA ANALYSIS
The estimation of the electronic tongue model and its equivalent
circuit parameters is presented in this section. The equivalent
circuit parameters analogous to the estimated model are treated
as taste features of tea samples. This parameter extraction
has been done for 20 tea samples, for which the prediction of
TF content is performed as the main contribution
of this work. The input–output data required for the system
identification task as explained in Section II were recorded
for various tea samples for the entire set of WEs. The identi-
fication procedure was implemented in MATLAB (2017a) to
estimate the model parameters.
Usually, model identification is carried out by considering
only those system dynamics that are significant and
contain the traits of the test solution, which results in a simple
model for future analysis. The system attributes that are left
unexplained by the identified model come under the model
error [6]. A simple model is easy to interpret and characterizes
a system with a very limited number of parameters. This work
implements a simple yet effective model of a voltammetric
electronic tongue, which is defined by only five parameters.
A. Estimation of Equivalent Circuit Parameters
The transfer function model of the electronic tongue introduced
in [2] was identified for each combination of WEs
and tea samples.
Fig. 4. Box plot showing model fit variation with change in tea sample in
the case of various WEs.
Fig. 5. Estimated equivalent circuit of electronic tongue for tea samples.
A common model structure for the
electrode–electrolyte combination was selected with the best
model fit based on an error criterion. The identified transfer
function has a 2-pole 1-zero structure which is given by the
following equation:
$$\frac{I_r(s)}{V_i(s)} = \frac{k(1+Zs)}{(1+P_1 s)(1+P_2 s)} \tag{1}$$
where $I_r(s)$ is the current response and $V_i(s)$ is the applied
potential to the system in the Laplace domain. Here, $k$, $Z$, $P_1$,
and $P_2$ are estimated using system identification. The
normalized root-mean-square error (NRMSE)-based model fit is
estimated for all 20 tea samples in the case of all WEs
over 25 observations. The box plot in Fig. 4 shows the
resulting model fit and thus indicates the
accuracy of the model, whose parameters are used later in this
work to characterize various tea samples. It may be observed
that the median of model fit (%) for the tea samples lies
between 78.52% and 81.75%. The average model fit obtained
is in the range of 77.67%–81.27%. Using the identified model
in (1), the impedance function of the proposed equivalent
circuit is given by the following equation:
$$Z(s) = \frac{V(s)}{I(s)} = \frac{(s+a)(s+b)}{K(s+c)(s+d)} \tag{2}$$
where $K$, $a$, $b$, $c$, and $d$ are estimated using the identified
parameters and the values of resistor and capacitor used in the
current measuring cum filter circuit of the potentiostat [2].
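The NRMSE-based model fit quoted above can be computed as in the following minimal sketch. It assumes the definition used by MATLAB's System Identification Toolbox, fit = 100·(1 − ‖y − ŷ‖ / ‖y − mean(y)‖); the article does not spell the formula out, so this is an assumption, and the function name is illustrative.

```python
import math


def nrmse_fit_percent(measured, simulated):
    """NRMSE-based fit in percent: 100 means a perfect match, 0 means the
    model is no better than predicting the mean of the measured data."""
    mean = sum(measured) / len(measured)
    error_norm = math.sqrt(sum((m - s) ** 2 for m, s in zip(measured, simulated)))
    spread_norm = math.sqrt(sum((m - mean) ** 2 for m in measured))
    return 100.0 * (1.0 - error_norm / spread_norm)


if __name__ == "__main__":
    measured = [1.0, 2.0, 3.0, 4.0]      # dummy response, not experimental data
    print(nrmse_fit_percent(measured, measured))     # perfect model
    print(nrmse_fit_percent(measured, [2.5] * 4))    # mean-only model
```

Under this definition, a fit of about 80% means that the norm of the simulation error is roughly one fifth of the norm of the mean-removed measured current.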
The equivalent circuit obtained by applying network synthesis
to the driving-point impedance function $Z(s)$ is shown
in Fig. 5. The equivalent circuit parameters are calculated
from (2) using the following relations:
$$R_1 = A + B + C \tag{3}$$
$$R_2 = -B \tag{4}$$
$$C_2 = -\frac{1}{Bc} \tag{5}$$
$$R_3 = -C \tag{6}$$
$$C_3 = -\frac{1}{Cd} \tag{7}$$
where $A = \dfrac{ab}{Kcd}$, $C = \dfrac{a+b-d-KAc}{K(c-d)}$,
and $B = \dfrac{1}{K} - A - C$.
Fig. 6. Box plot showing the change in values of equivalent circuit elements
with WE for sample S1 over 25 observations.
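Relations (3)–(7) can be checked numerically with a short sketch. It assumes that the synthesized network is R1 in series with two parallel RC sections (R2 ∥ C2 and R3 ∥ C3), which is the topology implied by the relations themselves; Fig. 5 is not reproduced here, so this reading is an assumption. The function names and the numerical values are hypothetical.

```python
def circuit_elements(K, a, b, c, d):
    """Equivalent circuit elements from the zeros/poles of Z(s), per (3)-(7)."""
    A = a * b / (K * c * d)
    C = (a + b - d - K * A * c) / (K * (c - d))
    B = 1.0 / K - A - C
    return {
        "R1": A + B + C,
        "R2": -B,
        "C2": -1.0 / (B * c),
        "R3": -C,
        "C3": -1.0 / (C * d),
    }


def impedance_from_model(s, K, a, b, c, d):
    """Z(s) as written in (2)."""
    return (s + a) * (s + b) / (K * (s + c) * (s + d))


def impedance_from_circuit(s, e):
    """Impedance of R1 in series with (R2 || C2) and (R3 || C3) (assumed topology)."""
    return (e["R1"]
            + e["R2"] / (1 + s * e["R2"] * e["C2"])
            + e["R3"] / (1 + s * e["R3"] * e["C3"]))


if __name__ == "__main__":
    # Hypothetical zeros, poles, and gain chosen only to exercise the formulas.
    K, a, b, c, d = 0.01, 157.0, 6.37, 50.0, 10.0 / 3.0
    elements = circuit_elements(K, a, b, c, d)
    s = 1j * 10.0  # evaluate at 10 rad/s
    print(elements)
    print(impedance_from_model(s, K, a, b, c, d), impedance_from_circuit(s, elements))
```

The two impedance evaluations agree at any frequency, and the DC value R1 + R2 + R3 equals A while the high-frequency value R1 equals 1/K, which is a quick sanity check on identified parameters.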
The values of circuit elements (resistance and capacitance)
for the equivalent circuit shown in Fig. 5 for all combinations
of the WE and 20 tea samples over 25 observations are calcu-
lated using (3)–(7). The box plot in Fig. 6 typically shows the
magnitude of circuit parameters (Fig. 5) over 25 observations
which are later treated as the features of the tea sample S1. For
sample S1, the percentage standard deviations considering all
WEs and all observations together are 5.14%, 3.93%, 3.29%,
5.19%, and 5.20% for the circuit elements $R_1$, $R_2$, $C_2$, $R_3$, and
$C_3$, respectively.
B. Feature Extraction
A technique of equivalent circuit-based feature extraction
from waveforms obtained from electronic tongue is presented
here. The feature set of a tea sample for a particular
observation consists of the values of $R_1$, $R_2$, $C_2$, $R_3$, and $C_3$ for all
the WEs in the order gold, iridium, palladium, platinum,
and rhodium, arranged in a column matrix of size
[25 × 1]. Therefore, the feature set for a tea sample for
25 repeated observations is of size [25 × 25]. Note the
large reduction in feature size: the 7400 data points
from one tea sample are reduced to
just 25 features. In this work, the feature extraction of 20 tea
samples (S1–S20) was done, and therefore, the resultant
feature matrix is of size [25 × 500]. These features are further
applied to various regression techniques to predict the total TF
content in these black tea samples. In an attempt to compare
the TF prediction results with existing methods of feature
extraction, the DWT [1] and the NCA [29] have been used.
These are preferred for comparison due to their ability to bring
down the number of data points to a feature size comparable
with that of the proposed equivalent circuit-based features.
In this regard, the eighth level of DWT is selected as it
produced 29 features (7400 data points ÷ $2^8$ ≈ 29), which
is close to the proposed number (25) of features.
Fig. 7. PCA plot for various tea samples obtained using the features from
(a) proposed equivalent circuit method, (b) DWT method, and (c) NCA
method.
The NCA is a weight-based feature selection technique that selects the
best subset of features by optimizing the average leave-one-out
prediction accuracy [29] using the input data set (measured
7400 data points) and the target (reference TF values) set. This
results in a weight vector that denotes the weight magnitude
of each data point. Here, the analysis is restricted to 25 most
weighted data points which are selected as features to make
it comparable with that of the proposed feature set.
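The feature sizes discussed in this section can be reproduced with a small sketch. It assembles the 25-element feature vector in the stated electrode order and counts the approximation coefficients left after eight dyadic decomposition levels. The coefficient count assumes a length-2 (Haar-type) filter with rounding up at odd lengths, which the article does not specify; the names and the dummy values are hypothetical.

```python
ELECTRODES = ("gold", "iridium", "palladium", "platinum", "rhodium")
ELEMENTS = ("R1", "R2", "C2", "R3", "C3")


def feature_vector(circuit_by_electrode):
    """Stack R1, R2, C2, R3, C3 of each WE into one 25-element feature vector."""
    return [circuit_by_electrode[we][el] for we in ELECTRODES for el in ELEMENTS]


def dwt_approx_length(n_points, level):
    """Number of approximation coefficients after `level` dyadic decompositions,
    assuming a length-2 filter so that each level keeps ceil(n / 2) values."""
    n = n_points
    for _ in range(level):
        n = (n + 1) // 2
    return n


if __name__ == "__main__":
    # Dummy element values, used only to show the shape of the feature vector.
    dummy = {we: {el: 1.0 for el in ELEMENTS} for we in ELECTRODES}
    print(len(feature_vector(dummy)))      # features per observation
    print(dwt_approx_length(7400, 8))      # DWT features at level 8
    print(100.0 * (1 - 25 / 7400))         # percentage reduction of the raw data
```

Reducing 7400 raw data points to 25 features corresponds to a reduction of about 99.7%.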
IV. RESULTS AND DISCUSSION
The equivalent circuit-based features obtained in Section III
have been analyzed and used here to predict the total TF
content in various tea samples. Before the regression methods
are applied to the feature sets, the clustering
tendency of the proposed features is analyzed using PCA.
The clustering study checks that the features
obtained from different tea samples represent distinct classes
of information and are suitable for use in the prediction task.
A. Principal Component Analysis
The PCA is a dimension reduction and clustering technique
that projects the extracted features along orthogonal
directions (known as PCs) of maximum variance (PC1), second
maximum variance (PC2), and so on. In this work, the PCA
is applied to the proposed features as well as to the features
obtained using the DWT and NCA methods to examine the
effectiveness of the equivalent circuit-based feature extraction.
The PCA plot for ten tea samples obtained using the proposed
features, DWT, and NCA is shown in Fig. 7. The clusters
of similar markers representing individual characteristics of
tea samples are found distinguishable, whose measure of
separation, known as separability index (SI) [1], [2], has been
calculated as per the Fisher criterion [30]. The calculated SI
and the variability explained (%) by the first three PCs for all
the feature extraction methods are shown in Table I. A better
separability is also visible in Fig. 7(a) as compared to Fig. 7(b)
and (c). The higher SI values in case of the proposed feature
set in Table I indicate that the circuit parameters when used
as features offer more distinguishable clusters compared to
DWT- and NCA-based feature extraction.
TABLE I
RESULTS OF PCA APPLIED ON FEATURE EXTRACTION METHODS
Besides this, a much higher share of the variability in the
features is explained by the first three PCs for the proposed
feature set compared to others. Therefore, the proposed circuit
parameters can be seen as a better candidate to be used as
features of different tea samples. This motivates using the
proposed feature set for the prediction of TF content in black
tea samples.
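The projection onto the direction of maximum variance and the "variability explained (%)" figure can be illustrated with a minimal sketch. It extracts only the first PC by power iteration on the sample covariance matrix; this is a swapped-in, simplified procedure rather than the routine used by the authors, and the data are dummy values.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, explained_percent, scores) for the first PC.

    rows: list of observations, each a list of feature values.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p
    for _ in range(iterations):
        # Power iteration: repeatedly apply the covariance matrix and normalize.
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    explained = 100.0 * eigenvalue / sum(cov[i][i] for i in range(p))
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return v, explained, scores


if __name__ == "__main__":
    data = [[x, 2.0 * x] for x in range(5)]  # dummy, perfectly correlated features
    direction, explained, scores = first_principal_component(data)
    print(direction, explained)
```

Subsequent PCs could be obtained by deflating the covariance matrix, and the PC scores are what a plot such as Fig. 7 displays.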
B. Prediction of TF Content in Black Tea
The TF content in 20 different tea samples is predicted using
the proposed feature extraction method with the help of six
different regression methods including PCR [10], ICR [22],
PLSR, SVR, ELM [31], and NNR [11]. The feature matrix is
divided in the ratio of 3:2: 15 out of 25 observations
for each tea sample are used for training, whereas the remaining ten
are used to test the developed regression model, resulting
in a data level analysis. The estimated total TF is compared
with the reference TF values of the tea samples.
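The data level split can be sketched as follows. The sketch assumes that the first 15 observations of each sample go to training, which the article states explicitly only for the sample level analysis; the container layout and names are hypothetical.

```python
def data_level_split(features, n_train=15):
    """Split every sample's observations into training and test parts.

    features: dict mapping a sample id to its list of observations,
    each observation being a feature vector.
    """
    train, test = [], []
    for sample_id, observations in features.items():
        for index, observation in enumerate(observations):
            target = train if index < n_train else test
            target.append((sample_id, observation))
    return train, test


if __name__ == "__main__":
    # 20 samples x 25 observations x 25 features of dummy zeros.
    features = {f"S{i}": [[0.0] * 25 for _ in range(25)] for i in range(1, 21)}
    train, test = data_level_split(features)
    print(len(train), len(test))
```

Every tea sample therefore contributes to both sets, which is why this split is called a data level analysis.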
The PCR routines written in MATLAB were used to train
the regression model using the calibration data obtained from
the electronic tongue along with the respective total TF values
of tea samples. Contributions from 23 principal components
were considered as it produced the best test accuracy. The test
data set was applied to the regression models to predict the
TF content in tea samples. The predicted TF values are
very close to the reference values even for a linear
regression method such as PCR, with a promising prediction
accuracy of 96.43% and a correlation of 0.9493, as shown
in Table II. The PCR results for the DWT and NCA
methods have a lower prediction accuracy and correlation value.
A newly introduced ICR method in the area of electronic
tongue has also been implemented here on the circuit-based
features to estimate the TF content in tea using the technique
adopted in the previous work [22]. The ICR results in Table II
show that the predictions obtained using the proposed feature
extraction method are better than those of the compared
methods. The prediction accuracy and correlation between
the reference and the predicted TF in the case of NCA are
the second best.
Another linear regression method, PLSR, has been used to
predict the TF values in tea. From several iterative
trials, it has been found that 25 PLS components
produce the best prediction accuracy on average during
the process of model development. The PLSR results for
the proposed feature extraction were compared with other
methods, where the proposed method again outperformed the
other techniques. The prediction results for PLSR in Table II
show that the proposed circuit element-based feature extraction
produced a better average prediction accuracy of 95.11% in
comparison to that of the DWT method (93.91%) and NCA
method (94.68%). The correlation obtained using the proposed
feature set is also the best among all.
TABLE II
PREDICTION OF TF CONTENT IN BLACK TEA WITH TEST SET USING
VARIOUS REGRESSION MODELS (DATA LEVEL ANALYSIS)
The PCR, ICR, and PLSR
are linear methods of regression, and therefore, in anticipation
of improved prediction performance, nonlinear regressors
such as SVR, ELM, and NNR are also
implemented for comparison purposes.
The prediction performance of the proposed feature set in
comparison with the DWT and NCA is analyzed with SVR,
an efficient machine learning tool for regression. It uses a
nonlinear inner product kernel that transforms the data into a
high-dimensional feature space, where the patterns are linearly
separable with higher probability [30]. This central feature of
SVR takes care of the nonlinearities in the feature set which
improves the regression performance of a complex feature set.
The radial basis function (RBF) is used here as the kernel
function, and a complexity parameter of 1000 and a mapping
precision of 0.005 are used as training parameters to develop
the SVR model in MATLAB. With the SVR technique, the
predictions obtained for the different feature extraction techniques
are very close. Nevertheless, the proposed feature
extraction method has the upper hand in terms of the mean
prediction accuracy and the correlation, as shown in Table II.
The NNR has also been used for TF prediction using the
feature extraction techniques discussed here. The developed
neural network model is a feedforward backpropagation type,
and it is trained using the Levenberg–Marquardt algorithm
(TRAINLM). The default gradient descent with momentum
weight and bias function (LEARNGDM) in MATLAB was
used as an adaptation learning function. The number of hidden
layers (1), the hidden neurons (30), and the maximum number
of epochs (500) were determined in a systematic fashion
so as to find out the architecture that produces maximum
test accuracy. The default values of the training parameters
as provided by MATLAB function TRAINLM have been
considered for neural network model development. The NNR
is generally more accurate than the linear regression methods.
This is evident from its performance with different feature
extraction techniques. The prediction accuracy, as well as the
correlation, in the case of NNR is better than that of any
other regression method, as shown in Table II.
Fig. 8. Comparison of averaged prediction accuracy obtained from three
feature extraction methods using various regression techniques.
The prediction
performance of the proposed feature extraction method is
much better than that of the DWT and NCA method which
validates the efficacy of the equivalent circuit parameters
when used as features. The average computational time taken
over 20 repetitive trainings on an Intel Core i3 processor,
6-GB RAM computer using the proposed feature set is 1.230 s
compared to 1.258 and 1.251 s for DWT- and NCA-based
feature extraction method, respectively.
The large computational time associated with NNR calls for
an alternative time-efficient and powerful regression method.
In this work, the authors chose the ELM algorithm proposed
by Huang et al. [31] to this end. As a contribution of this
article to the area of electronic tongue,
ELM has been applied to the proposed feature
extraction method to predict TF values in black tea. The
extremely fast learning capability of ELM made it possible
to obtain results for all the architectures with the number
of hidden neurons ranging from 1 to 1000. The best accuracy
for the proposed, DWT-based, and NCA-based features
was obtained with 802, 400, and 121 hidden neurons,
requiring very short computational times of 0.1563,
0.0938, and 0.0313 s, respectively. Similar to the rest of the
regressors, the ELM also gave the best predictions with the
proposed feature set as shown in Table II.
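The reason ELM trains so quickly is that the hidden-layer weights are drawn at random and only the output weights are solved for in a single linear step. The following is a minimal sketch of that idea, not the authors' MATLAB implementation: it replaces the Moore–Penrose pseudoinverse of the original ELM with ridge-regularized normal equations, and the class name, weight ranges, ridge value, and toy data are all assumptions.

```python
import math
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


class TinyELM:
    """Single-hidden-layer network with random, untrained hidden weights."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-6, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.ridge = ridge
        self.beta = None

    def _hidden(self, x):
        # Sigmoid activations of the random hidden layer.
        return [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + bias)))
                for row, bias in zip(self.W, self.b)]

    def fit(self, X, y):
        H = [self._hidden(x) for x in X]
        n = len(self.b)
        # Output weights from (H^T H + ridge * I) beta = H^T y.
        A = [[sum(h[i] * h[j] for h in H) + (self.ridge if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        rhs = [sum(h[i] * t for h, t in zip(H, y)) for i in range(n)]
        self.beta = solve_linear(A, rhs)
        return self

    def predict(self, X):
        return [sum(bj * hj for bj, hj in zip(self.beta, self._hidden(x))) for x in X]


if __name__ == "__main__":
    X = [[i / 19.0] for i in range(20)]      # toy input, not tea features
    y = [x[0] ** 2 for x in X]
    model = TinyELM(n_inputs=1, n_hidden=10).fit(X, y)
    errors = [(p - t) ** 2 for p, t in zip(model.predict(X), y)]
    print(math.sqrt(sum(errors) / len(errors)))
```

Because no iterative weight updates are needed, sweeping the number of hidden neurons over a wide range, as done in the article, only repeats this single linear solve.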
The results in Table II show that the
proposed feature set outperforms the other feature extraction
methods over all the regression techniques used for the
development of prediction models. This is also evident from the
bar plot in Fig. 8, which shows the average prediction accuracy
obtained for all three feature extraction methods with different
regression techniques. The proposed feature set lags
very marginally in terms of the best prediction accuracy;
however, its worst accuracy, mean accuracy, and
correlation are the highest
with no exceptions. Furthermore, the prediction accuracy and
the correlations obtained with the linear regression
methods are lower than those of the nonlinear regressors, as
expected. Among these, the NNR gave the best prediction
performance with accuracy in the range of 95.14%–99.86%.
In Table III, the reference values of TF in tea samples are
compared with the values predicted by NNR with
different feature extraction methods. Since the reference TF
values are not the actual TF values, a two-tailed t-test has
been performed with the null hypothesis that the reference and
predicted TF values are estimated using independent methods
with samples derived from the same population. The null
hypothesis is not rejected in any of the cases. The test results
shown in Table III establish that the proposed method produces
predictions of TF content that do not differ significantly from
the reference values, as indicated by the p-value and t-value
at a 5% significance level.
TABLE III
PREDICTED VALUES OF TF CONTENT IN TEA USING NNR
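The comparison of reference and predicted values can be sketched with a two-sample t statistic. The article does not state which variant of the test was used, so the pooled-variance form below is an assumption, and the numbers are dummy values rather than TF data. The p-value would then follow from the t distribution with the returned degrees of freedom, which is not part of the Python standard library and is omitted here.

```python
import math
from statistics import mean, variance


def two_sample_t(x, y):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    dof = nx + ny - 2
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / dof
    t_value = (mean(x) - mean(y)) / math.sqrt(pooled * (1.0 / nx + 1.0 / ny))
    return t_value, dof


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0, 4.0, 5.0]   # dummy values
    predicted = [2.0, 3.0, 4.0, 5.0, 6.0]   # dummy values
    print(two_sample_t(reference, predicted))
```

The null hypothesis is retained at the 5% level when the magnitude of the t-value stays below the two-tailed critical value for the given degrees of freedom.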
C. Prediction Performance With Sample Level Analysis
More realistic results of TF predictions may be anticipated
with the sample level separation of tea data for calibration.
In sample level analysis, the regression models were trained by
the features from a set of tea samples (S1–S14) and tested with
the remaining independent tea samples (S15–S20). The first
15 observations from each of the tea samples S1–S14 constitute
the training set of size [25 × 210]. The remaining ten observations
are used as the validation set of size [25 × 140].
For sample level prediction, all 25 observations from the six
independent samples S15–S20 constitute the test set of size
[25 × 150]. The sample level prediction performances with the
test set and the validation set are shown in Table IV.
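The sample level partition described above can be sketched as follows. The container layout and function name are hypothetical; the sample ranges and observation counts are those stated in the text.

```python
def sample_level_split(features, train_ids, test_ids, n_train=15):
    """Partition observations so that the test samples are never seen in training.

    features: dict mapping a sample id to its list of observations.
    """
    train, validation, test = [], [], []
    for sample_id in train_ids:
        observations = features[sample_id]
        train += [(sample_id, obs) for obs in observations[:n_train]]
        validation += [(sample_id, obs) for obs in observations[n_train:]]
    for sample_id in test_ids:
        test += [(sample_id, obs) for obs in features[sample_id]]
    return train, validation, test


if __name__ == "__main__":
    # 20 samples x 25 observations x 25 features of dummy zeros.
    features = {f"S{i}": [[0.0] * 25 for _ in range(25)] for i in range(1, 21)}
    train_ids = [f"S{i}" for i in range(1, 15)]    # S1-S14
    test_ids = [f"S{i}" for i in range(15, 21)]    # S15-S20
    train, validation, test = sample_level_split(features, train_ids, test_ids)
    print(len(train), len(validation), len(test))
```

The resulting counts of 210, 140, and 150 observations match the column dimensions of the training, validation, and test sets given in the text.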
The sample level prediction performance with all the regres-
sors clearly favors the proposed feature extraction method.
As expected, the mean percentage accuracy and the correlation
values in Table IV are lower than
those of the data level prediction shown earlier in Table II.
However, the sample level analysis produces a more robust
prediction because it uses independent samples. The performance with
the validation set in Table IV is comparable with the results
of Table II, as both of them are data level predictions. It is
important to note that the data level predictions show a marginal
yet consistent improvement in performance with the proposed
feature set. In contrast, the sample level prediction
shows a significant improvement in prediction with the
proposed feature set. Here, the ELM shows the best results and
outperforms the NNR too. Even when the other feature extraction
methods failed to show a considerable correlation in some
cases, the proposed method outperforms the rest, especially
in the case of SVR and PLSR. Therefore, the sample level
analysis not only highlights the performance of the proposed
method but also yields more reliable results.
TABLE IV
SAMPLE LEVEL PREDICTION PERFORMANCES FOR INDEPENDENT BLACK TEA
TABLE V
SUMMARY OF COMPARISON WITH EARLIER WORKS
D. Comparison With Existing Works
The proposed methodology has transformed the large experimental
data into a significantly reduced number of features,
which is a key contribution. A summarized comparison of the
proposed feature extraction with some of the previous works
is shown in Table V, highlighting the percentage reduction
of features obtained in various cases. The algorithms of the
nonlinear regressors such as NNR and SVR could not be run
for other feature extraction methods due to a large number
of features. However, ELM was found to be useful in this
case as the prediction results could be obtained very rapidly
even for large feature size. Therefore, the average prediction
accuracy of TF reported for various cases in Table IV has been
determined only by using PCR, ICR, PLSR, and ELM.
The proposed feature extraction clearly outperforms all the
other methods in terms of feature reduction and prediction
accuracy, with sample level analysis in particular. Furthermore,
in previous feature extraction works, the size of
the feature set depends on the size of the raw waveforms.
The proposed method, in contrast, is independent of the
response size and depends on the morphology of the current
response, which captures the dynamics of chemical events during
redox reactions. The number of features depends on the
structure of the equivalent circuit that fits the electronic tongue
response waveforms. This is a major advantage of the proposed
method over others: even if the raw data are sampled at a
faster rate, the feature size remains the same.
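As a worked illustration of the feature reduction, the figures stated in the conclusion (1480 data points reduced to five features per electrode) can be put into a simple ratio. The symbols N_r (raw data points per electrode) and N_f (features per electrode) are notation introduced here, and the doubled sampling rate in the second line is a hypothetical case, not a reported experiment.

```latex
\[
\text{reduction} = \left(1 - \frac{N_f}{N_r}\right) \times 100\%
                 = \left(1 - \frac{5}{1480}\right) \times 100\% \approx 99.66\%
\]
\[
N_r \to 2N_r = 2960,\quad N_f = 5:\qquad
\left(1 - \frac{5}{2960}\right) \times 100\% \approx 99.83\%
\]
```

Because N_f is fixed by the structure of the equivalent circuit, a faster sampling rate only increases N_r, so the relative reduction grows while the feature count stays at five per electrode.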
V. CONCLUSION
The notion of using the equivalent circuit parameters as
taste-determining features of a liquid is a new idea in feature
extraction. The characteristic information of the
analyte under observation conveyed by these features has
been effectively utilized to estimate the TF content of black
tea samples. The treatment of circuit parameters as system
features is validated by PCA plots. The formation of distinct
clusters for different tea samples indicates that the values of the
resistors and capacitors of the equivalent circuits are related
to the biochemical information of the tea samples and
are distinct in nature. For data level prediction, the average
accuracy of TF prediction with the proposed method, considering
all the regression techniques together, is 96.79%, compared to
95.28% and 95.44% for the DWT- and NCA-based feature
extraction, respectively. More realistic results were obtained
with sample level prediction, in which the average accuracy
for the proposed method is 85.81%, compared to 79.38%
and 79.31% for DWT and NCA, respectively. The inclusion of
NCA for the first time on electronic tongue responses and the
reduction of 1480 data points to only five features per electrode
are the noteworthy contributions of the work. The results for the
proposed idea of feature extraction validate that the parameters
of the equivalent circuit are the tokens of taste for different tea
samples. They thus open a gateway for further research in which
prediction and classification tasks related to an electrolyte
under test may be conducted by analyzing the parameter values
of the corresponding equivalent circuit for a given electrode
and electrolyte combination.
ACKNOWLEDGMENT
The authors are thankful to Prof. Rajib Bandyopadhyay
and Prof. Bipan Tudu, Department of Instrumentation and
Electronics Engineering, Jadavpur University, Kolkata, India.
REFERENCES
[1] M. Palit et al., “Classification of black tea taste and correlation with
tea Taster’s mark using voltammetric electronic tongue,” IEEE Trans.
Instrum. Meas., vol. 59, no. 8, pp. 2230–2239, Aug. 2010.
[2] S. Kumar, A. Ghosh, B. Tudu, and R. Bandyopadhyay, “A circuit model
estimation of voltammetric taste measurement system for black tea,”
Measurement, vol. 140, pp. 609–621, Jul. 2019.
[3] A. Forrai, “Modeling, system identification, and control of electro-
magnetic actuators,” in Actuators. Rijeka, Croatia: IntechOpen, 2018,
pp. 85–105.
[4] T. Liu, S. Dong, S. Rong, and C. Zhong, “Identification of discrete-time
model with integer delay and control design for cooling processes with
application to jacketed crystallizers,” IEEE Trans. Control Syst. Technol.,
vol. 25, no. 5, pp. 1775–1789, Sep. 2017.
Authorized licensed use limited to: National Institute of Technology Patna. Downloaded on October 09,2020 at 07:08:38 UTC from IEEE Xplore. Restrictions apply.
9250 IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 69, NO. 11, NOVEMBER 2020
[5] T. Oomen, R. van Herpen, S. Quist, M. van de Wal, O. Bosgra, and
M. Steinbuch, “Connecting system identification and robust control for
next-generation motion control of a wafer stage,” IEEE Trans. Control
Syst. Technol., vol. 22, no. 1, pp. 102–118, Jan. 2014.
[6] K.J.Keesman,System Identification: An Introduction. London, U.K.:
Springer, 2011.
[7] L. De Tommasi, D. Deschrijver, and T. Dhaene, “Transfer function
identification from phase response data,” AEU-Int. J. Electron. Commun.,
vol. 64, no. 3, pp. 218–223, Mar. 2010.
[8] C. Li, H. Li, and P. Kou, “Piecewise function based gravitational
search algorithm and its application on parameter identification of AVR
system,” Neurocomputing, vol. 124, pp. 139–148, Jan. 2014.
[9] A. Ghosh et al., “Monitoring the fermentation process and detection of
optimum fermentation time of black tea using an electronic tongue,”
IEEE Sensors J., vol. 15, no. 11, pp. 6255–6262, Nov. 2015.
[10] A. Ghosh et al., “Detection of optimum fermentation time of black CTC
tea using a voltammetric electronic tongue,” IEEE Trans. Instrum. Meas.,
vol. 64, no. 10, pp. 2720–2729, Oct. 2015.
[11] A. Ghosh, B. Tudu, P. Tamuly, N. Bhattacharyya, and R. Bandyopad-
hyay, “Prediction of theaflavin and thearubigin content in black tea
using a voltammetric electronic tongue,” Chemometric Intell. Lab. Syst.,
vol. 116, pp. 57–66, Jul. 2012.
[12] N. Togari, A. Kobayashi, and T. Aishima, “Relating sensory properties
of tea aroma to gas chromatographic data by chemometric calibration
methods,” Food Res. Int., vol. 28, no. 5, pp. 485–493, Jan. 1995.
[13] Y. Zuo, “Simultaneous determination of catechins, caffeine and gallic
acids in green, oolong, black and pu-erh teas using HPLC with a photo-
diode array detector,” Talanta, vol. 57, no. 2, pp. 307–316, May 2002.
[14] H. Horie, T. Mukai, and K. Kohata, “Simultaneous determination of
qualitatively important components in green tea infusions using capillary
electrophoresis,” J. Chromatography A, vol. 758, no. 2, pp. 332–335,
Jan. 1997.
[15] P. Saha, S. Ghorai, B. Tudu, R. Bandyopadhyay, and
N. Bhattacharyya, “A novel technique of black tea quality prediction
using electronic tongue signals,” IEEE Trans. Instrum. Meas., vol. 63,
no. 10, pp. 2472–2479, Oct. 2014.
[16] P. Ivarsson, S. Holmin, N.-E. Höjer, C. Krantz-Rülcker, and
F. Winquist, “Discrimination of tea by means of a voltammetric elec-
tronic tongue and different applied waveforms,” Sens. Actuators B,
Chem., vol. 76, nos. 1–3, pp. 449–454, Jun. 2001.
[17] A. P. Bhondekar, R. Vig, A. Gulati, M. L. Singla, and P. Kapur, “Perfor-
mance evaluation of a novel iTongue for indian black tea discrimination,”
IEEE Sensors J., vol. 11, no. 12, pp. 3462–3468, Dec. 2011.
[18] F. Winquist, P. Wide, and I. Lundström, “An electronic tongue based
on voltammetry,” Analytica Chim. Acta, vol. 357, nos. 1–2, pp. 21–31,
Dec. 1997.
[19] P. Ivarsson, “A voltammetric electronic tongue,” Chem. Senses, vol. 30,
no. 1, pp. i258–i259, Jan. 2005.
[20] D. Szollosi, Z. Kovacs, A. Gere, L. Sipos, Z. Kokai, and A. Fekete,
“Sweetener recognition and taste prediction of coke drinks by electronic
tongue,” IEEE Sensors J., vol. 12, no. 11, pp. 3119–3123, Nov. 2012.
[21] M. Scampicchio, S. Benedetti, B. Brunetti, and S. Mannino, “Amper-
ometric electronic tongue for the evaluation of the tea astringency,”
Electroanalysis, vol. 18, no. 17, pp. 1643–1648, Sep. 2006.
[22] S. Kumar, P. Kumar, and A. Ghosh, “Independent component regression
for the development of prediction model for analysis of electronic tongue
response,” in Proc. 5th Int. Conf. Emerg. Appl. Inf. Technol. (EAIT),
Jan. 2018, pp. 1–4.
[23] A. J. Bard and L. R. Faulkner, Electrochemical Methods: Fundamentals
and Applications, 2nd ed. New York, NY, USA: Wiley, 2001.
[24] A. M. Dhirde, N. V. Dale, H. Salehfar, M. D. Mann, and T.-H. Han,
“Equivalent electric circuit modeling and performance analysis of a
PEM fuel cell stack using impedance spectroscopy,” IEEE Trans. Energy
Convers., vol. 25, no. 3, pp. 778–786, Sep. 2010.
[25] W. Franks, I. Schenker, P. Schmutz, and A. Hierlemann, “Impedance
characterization and modeling of electrodes for biomedical applications,”
IEEE Trans. Biomed. Eng., vol. 52, no. 7, pp. 1295–1302, Jul. 2005.
[26] T. Pajkossy, “Impedance spectroscopy at interfaces of metals and aque-
ous solutions—Surface roughness, CPE and related issues,” Solid State
Ionics, vol. 176, no. 25, pp. 1997–2003, Aug. 2005.
[27] S. Kumar, A. Ghosh, B. Tudu, and R. Bandyopadhyay, “An equivalent
electrical network of an electronic tongue: A case study with tea
samples,” in Proc. ISOCS/IEEE Int. Symp. Olfaction Electron. Nose
(ISOEN), May 2017, pp. 1–3.
[28] A. Ghosh, B. Tudu, P. Tamuly, N. Bhattacharyya, and R. Bandy-
opadhyay, “Electronic tongue for the estimation of important quality
compounds in finished tea,” in Electronic Nose and Tongue in Food
Science. New York, NY, USA: Academic, 2016, pp. 245–253.
[29] N. S. Malan and S. Sharma, “Feature selection using regularized neigh-
bourhood component analysis to enhance the classification performance
of motor imagery signals,” Comput. Biol. Med., vol. 107, pp. 118–126,
Apr. 2019.
[30] S. Haykin, Neural Networks and Learning Machine, 3rd ed. New Jersey,
NJ, USA: Pearson, 2009.
[31] G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, “Extreme learning
machine: Theory and applications,” Neurocomputing, vol. 70,
nos. 1–3, pp. 489–501, Dec. 2006.
Sanjeev Kumar (Student Member, IEEE) received
the M.Tech. degree in electrical engineering from
the National Institute of Technology Patna, Patna,
India, in 2015, where he is currently pursuing the
Ph.D. degree with the Department of Electrical
Engineering.
His current research interests include control engi-
neering, system identification, fractional-order mod-
elling, and taste sensors.
Arunangshu Ghosh (Member, IEEE) received the
Ph.D. degree from Jadavpur University, Kolkata,
India, in 2015.
He is currently an Assistant Professor with
the Department of Electrical Engineering, National
Institute of Technology Patna, Patna, India. His
research interests include the electronic tongue,
machine olfaction, quartz crystal microbalance sen-
sors, and pattern recognition.