Discrimination of the Red Jujube Varieties Using a Portable NIR Spectrometer and Fuzzy Improved Linear Discriminant Analysis



Citation: Qi, Z.; Wu, X.; Yang, Y.; Wu, B.; Fu, H. Discrimination of the Red Jujube Varieties Using a Portable NIR Spectrometer and Fuzzy Improved Linear Discriminant Analysis. Foods 2022, 11, 763. https://doi.org/10.3390/foods11050763
Received: 25 December 2021
Accepted: 25 February 2022
Published: 7 March 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zuxuan Qi 1,2, Xiaohong Wu 1,2, Yangjian Yang 3,*, Bin Wu 4 and Haijun Fu 1
1 School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, China; 2221907084@stmail.ujs.edu.cn (Z.Q.); wxh419@ujs.edu.cn (X.W.); fuhaijun21@ujs.edu.cn (H.F.)
2 High-Tech Key Laboratory of Agricultural Equipment and Intelligence of Jiangsu Province, Jiangsu University, Zhenjiang 212013, China
3 Research Institute of Zhejiang University-Taizhou, Taizhou 317700, China
4 Department of Information Engineering, Chuzhou Polytechnic, Chuzhou 239000, China; wubin2003@163.com
* Correspondence: yangfgh123@126.com
Abstract: To distinguish red jujube varieties quickly, nondestructively, and effectively, a fuzzy improved linear discriminant analysis (FiLDA) algorithm, which combines fuzzy theory with improved LDA (iLDA), was proposed to classify near-infrared reflectance (NIR) spectra of red jujube samples. FiLDA performs better than iLDA on NIR spectra that contain noise. Firstly, a portable NIR spectrometer was employed to gather the NIR spectra of five kinds of red jujube, and the initial NIR spectra were pretreated by standard normal variate transformation (SNV), multiplicative scatter correction (MSC), Savitzky-Golay smoothing (S-G smoothing), mean centering (MC) and Savitzky-Golay filter (S-G filter). Secondly, the dimensionality of the high-dimensional spectra was reduced by principal component analysis (PCA). Then, linear discriminant analysis (LDA), iLDA and FiLDA were each applied to extract features from the NIR spectra. Finally, K nearest neighbor (KNN) served as the classifier for the red jujube samples. The highest classification accuracy of this identification system, obtained with FiLDA and KNN, was 94.4%. These results indicate that FiLDA combined with NIR spectroscopy is a feasible method for identifying red jujube varieties and has wide application prospects.
Keywords: red jujube; near-infrared spectroscopy; feature extraction; fuzzy set theory; classification
1. Introduction
Red jujube is an agricultural product with a long history. It is popular around the world and is widely planted in China. Red jujube is rich in nutrients that are beneficial to the human body, including sugars, fats, organic acids, amino acids, vitamins, flavonoids, and a variety of trace elements, which can prevent cancer and cardiovascular and cerebrovascular diseases [1]. Red jujubes of different origins differ markedly in taste and nutritional value [2]. However, the methods currently used in markets to test red jujube varieties are too complicated and are unsuitable for large-scale application. Furthermore, these methods are not friendly to consumers, so a fast, simple, cheap, and reliable method for recognizing red jujube varieties is needed.
Some traditional identification methods for red jujube varieties have been used extensively. Professional jujube appraisers can identify the type of red jujube by its shape, colour, and clarity. However, their judgment is easily affected by the environment and by their physical state. Furthermore, training a professional red jujube appraiser takes considerable time and money. In recent years, researchers in China and abroad have actively developed methods for identifying red jujube varieties. For example, Wang et al. explored the electrical characteristics of red jujube fruits for variety identification in 2014 [3].
NIR spectroscopy is now quite mature: several new types of spectral instruments have emerged, and the technique offers advantages such as speed and low cost [4–10]. NIR has been widely utilized in the testing of agricultural products [11–19], food engineering [20,21], and many other fields. Fan et al. [22] extracted NIR hyperspectral images of red jujube and built a model based on thermometric methods to identify the types of red jujube in 2017. Zhang et al. [23] employed NIR spectroscopy and partial least squares discriminant analysis (PLSDA) to identify red jujube varieties in 2017. Luo et al. [24] established an online NIR spectral correction model for the quality of jujube from Southern Xinjiang in 2012. Guo, Gu, Liu, and Shang [25] identified peach varieties with 100% classification accuracy in 2016 by combining least squares support vector machine (LSSVM) and extreme learning machine (ELM) with NIR spectroscopy. The genetic algorithm (GA) was utilized to study the NIR spectra of grapes, and the classification accuracy for different grape varieties reached 96.58% [26]. Sánchez et al. [27] employed PLSDA combined with a local algorithm to classify and recognize strawberry varieties in 2012. Pérez-Marín et al. [28] employed PLSDA in conjunction with spectral data to accurately classify plum varieties in 2010.
Fuzzy recognition is an analytical method that uses fuzzy mathematics to solve pattern recognition problems. Compared with other pattern recognition methods, fuzzy recognition is stable and can accurately describe the diversity of sample information. Fuzzy set theory has been used in many fields. Yan et al. [29] combined the maximum boundary criterion (MMC) with fuzzy set theory and proposed a new algorithm, the fuzzy maximum boundary criterion. Huang et al. [30] applied the fuzzy k-nearest neighbor algorithm (FKNN) to face recognition and obtained high accuracy. Xie et al. [31] applied the fuzzy method to spectral extraction, providing a new approach to two-dimensional optical fiber spectral extraction. Few scholars have applied fuzzy feature extraction algorithms to the classification of red jujube. Traditional feature extraction methods do not describe the diversity of sample class information, whereas fuzzy pattern recognition is characterized by a complete representation of sample information and good discriminant stability. Traditional LDA suffers from the small sample size problem and the rank limit, which restrict the extraction of discriminant information; improved linear discriminant analysis (iLDA) solves both problems by using exponential scatter matrices [32]. Moreover, iLDA can identify the valid discriminant information in the null space of the within-class scatter matrix Sw, which LDA cannot do. Fuzzy improved linear discriminant analysis (FiLDA), the combination of fuzzy theory and iLDA, is not only an innovation in fuzzy feature extraction but also performs better than iLDA on NIR spectra that contain noise, so it can improve the classification accuracy for different types of red jujube. Because it builds on iLDA and exponential fuzzy scatter matrices, FiLDA overcomes the two problems of LDA and, through fuzzy theory, also captures the diversity of sample class information.
LDA is a supervised pattern recognition technique and an effective method for feature extraction and dimensionality reduction [33]. It has been used on a large scale to identify different varieties in fields such as beverages and liquor [34–36]. In many applications, the dimensionality of the data exceeds the number of samples (the small sample size problem), which can make the within-class scatter matrix singular. Classical LDA, however, requires the within-class scatter matrix to be nonsingular, which is its limitation [37]. Researchers have therefore improved LDA in many respects. iLDA is a feature extraction and dimensionality reduction algorithm based on LDA that overcomes this problem.
The purpose of this experiment was to combine fuzzy set theory with feature extraction algorithms to establish a classification model for identifying red jujube varieties. The experimental steps were as follows: (1) employ a portable NIR spectrometer to collect the spectra of red jujube samples; (2) preprocess the spectral data, and then use feature extraction algorithms to extract features from the data; (3) utilize KNN to build the identification model of red jujube samples, in order to identify different red jujube varieties rapidly.
2. Materials and Methods
2.1. Sample Preparation
The red jujube samples comprised five varieties from five production areas in China (Henan, Shanxi, Xinjiang, Hebei and Gansu), with one variety corresponding to one production area. Each variety had 60 samples, so 300 samples were selected in total. Subsequently, all of the red jujube samples were divided into training and test samples in a certain proportion. The samples were selected to meet the following requirements: the size (length: 3–5 cm, width: 2–3 cm), weight (10–20 g) and maturity of red jujubes of the same variety differed little. The experimenters also ensured that the surface of each red jujube was clean and free from obvious defects.
2.2. Spectra Collection
The NIR-M-R2 spectrometer (Shenzhen Pynect Science and Technology Co. Ltd., Shenzhen, China), a portable spectrometer, was employed to collect NIR spectral data of the red jujube samples. It has a wavelength range of 900–1700 nm, a signal-to-noise ratio of 6000:1, an InGaAs detector, and a slit size of 1.8 × 0.025 mm. During the whole collection process, the experimental temperature and relative humidity were kept at about 25 °C and 50–60%, respectively. Before the NIR spectral data were collected, the spectrometer was preheated for one hour. The wavelength range of the collected NIR spectra was 900–1700 nm, and the resolution was 10 nm. Each collected NIR spectrum was a 228-dimensional data vector. Each red jujube sample was scanned three times around its equator, and the final spectrum was the average of the three scans. FiLDA can deal with noisy data better than LDA and iLDA, so we used the whole range of the spectra to show this advantage of FiLDA. The final spectra are displayed in Figure 1.
Figure 1. The raw spectra of red jujube samples.
2.3. NIR Spectra Preprocessing
The original spectra were easily influenced by the physical properties of the samples. The data shown in Figure 1 not only carried the required sample characteristics but were also mixed with unnecessary information and noise [38]. Therefore, it was necessary to preprocess the spectra to enhance the stability of the model [39]. To obtain the best experimental results, we employed five preprocessing methods: MSC, SNV, S-G smoothing, MC and S-G filter [40,41]. Their functions include eliminating scattering, reducing the impact of diffuse reflection, decreasing random error, and deleting redundant data. For the S-G filter, we used the Matlab function y = sgolayfilt(x, order, framelen). If x is a matrix, sgolayfilt operates on each column. The polynomial order must be less than the frame length framelen, and framelen must be odd. If order = framelen − 1, the filter produces no smoothing. In this experiment, the polynomial order was 2 and the frame length framelen was 53. Figure 2 shows the NIR spectra of the red jujube samples after pre-treatment.
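As a rough illustration of the preprocessing options above, the following Python sketch applies SNV, MSC, mean centering and a Savitzky-Golay filter to a spectra matrix with one sample per row. The study itself used Matlab; the function names and the random demo data here are assumptions made for illustration, and `scipy.signal.savgol_filter` stands in for `sgolayfilt` with the polynomial order (2) and frame length (53) given in the text. Note that `sgolayfilt` filters columns by default, whereas this sketch filters along each row.

```python
import numpy as np
from scipy.signal import savgol_filter


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        # Fit row = slope * ref + intercept, then undo the slope and offset
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected


def mean_center(spectra):
    """Subtract the mean spectrum so that every wavelength has zero mean."""
    return spectra - spectra.mean(axis=0)


def sg_filter(spectra, order=2, framelen=53):
    """Savitzky-Golay filter along the wavelength axis of each row."""
    return savgol_filter(spectra, window_length=framelen, polyorder=order, axis=1)


# Hypothetical demo data: 10 noisy spectra with 228 points each
rng = np.random.default_rng(0)
demo = 0.5 + 0.1 * np.sin(np.linspace(0, 6, 228)) + 0.01 * rng.normal(size=(10, 228))
filtered = sg_filter(demo)
```

A frame length of 53 out of 228 points is a wide window, so the filter removes most point-to-point noise while a second-order polynomial still follows broad absorption bands.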
Figure 2. NIR spectra of red jujube samples under different pretreatment methods: (a) S-G filter, (b) MC, (c) MSC, (d) SNV, (e) S-G smoothing.
2.4. Data Analysis Methods
2.4.1. Principal Component Analysis
The dimensionality of the collected red jujube NIR spectra was 228. These initial NIR spectra included redundant information and noise, which could make classification more difficult and less accurate. To obtain the effective information in the NIR spectra, it was necessary to extract multiple features for analysis. However, too many features would not only affect the subsequent spectral analysis but also increase the difficulty of the experiment. The purpose of dimensionality reduction is to find features that directly reflect the differences among NIR spectra. PCA is a widely used analytical method that can be employed to reduce dimension and remove redundant information [42,43]. Meanwhile, PCA preserves the characteristic information of the NIR spectra by retaining the components with the largest eigenvalues [44].
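A minimal sketch of this reduction step, assuming a spectra matrix with one sample per row, is given below. It computes PCA through the singular value decomposition; the function name `pca` and the random demo data are illustrative assumptions, not the authors' Matlab code.

```python
import numpy as np


def pca(spectra, n_components):
    """Project mean-centered spectra onto the leading principal components."""
    centered = spectra - spectra.mean(axis=0)
    # Rows of vt are the principal axes; s holds the singular values
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigenvalues = s**2 / (spectra.shape[0] - 1)  # variance along each axis
    scores = centered @ vt[:n_components].T
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    return scores, eigenvalues[:n_components], cumulative[:n_components]


# Hypothetical demo data with the same shape as the study (300 samples x 228 points)
rng = np.random.default_rng(1)
demo = rng.normal(size=(300, 228)) * np.linspace(5.0, 0.1, 228)
scores, eigvals, cum_ratio = pca(demo, n_components=7)
```

The cumulative ratio returned last is the quantity usually inspected to decide how many components to keep.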
2.4.2. Linear Discriminant Analysis
LDA is a traditional algorithm for reducing the spectral dimension [45]. As a supervised method, it uses prior knowledge of the samples, namely their class labels, in the dimensionality reduction process [46]. The ultimate purpose of LDA is to project spectral data from a higher dimensional space to a lower dimensional space in which the distance between classes is maximized and the distance within classes is minimized.
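In terms of the between-class and within-class scatter matrices defined in Section 2.4.3, this objective is commonly written as the Fisher criterion. The form below is the standard textbook statement, given as a reference point rather than as the exact formulation used in this study:

```latex
W^{*} = \arg\max_{W} \; \operatorname{tr}\!\left( \left( W^{T} S_w W \right)^{-1} W^{T} S_b W \right)
```

The maximizer is spanned by the leading eigenvectors of $S_w^{-1} S_b$, which is why classical LDA needs $S_w$ to be nonsingular.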
2.4.3. Improved Linear Discriminant Analysis
iLDA is also a feature extraction algorithm; it can extract the discriminant information associated with the zero eigenvalues of Sw [36].
In this study, iLDA served two purposes: since the NIR spectra of red jujube were high-dimensional data, iLDA was employed to process the spectral data, and it also extracted characteristic information from them. The steps of iLDA are listed as follows (Input: data matrix D; Output: transformation matrix W):
Step 1. Define the matrices $S_t$, $S_b$ and $S_w$;
Step 2. $B \leftarrow (\exp(S_w))^{-1} \exp(S_b)$;
Step 3. Eigen decomposition of $B$ as $B = U V U^{T}$;
Step 4. $W \leftarrow U_q$, $q = c - 1$;
In Step 1, the three matrices, called the total scatter matrix $S_t$, the between-class scatter matrix $S_b$ and the within-class scatter matrix $S_w$, are defined as follows.
$$S_t = \sum_{i=1}^{n} (d_i - \bar{d})(d_i - \bar{d})^{T}$$
$$S_b = \sum_{j=1}^{c} (v_j - \bar{d})(v_j - \bar{d})^{T}$$
$$S_w = \sum_{j=1}^{c} \sum_{d_i \in D_j} (d_i - v_j)(d_i - v_j)^{T}$$
Here, $d_i$ is the $i$th sample; $c$ is the number of classes of experimental samples; $n$ is the number of samples; $\bar{d}$ is the mean of all the samples; $v_j$ denotes the mean of the class $j$ samples in the sample set, and $D_j$ is the set of samples in class $j$.
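Steps 1 to 4 can be sketched in Python as follows. This is an illustrative reading of the steps, not the authors' implementation: the function names, the demo data and the use of a general eigen-solver for $B$ (which need not be symmetric) are assumptions. Because a scatter matrix is symmetric, its exponential is computed here from its eigendecomposition; $\exp(S_w)$ has eigenvalues $e^{\lambda_i} > 0$, so it is invertible even when $S_w$ is singular, and its inverse equals $\exp(-S_w)$.

```python
import numpy as np


def scatter_matrices(D, labels):
    """Return (St, Sb, Sw) following the definitions in the text."""
    d_bar = D.mean(axis=0)
    centered = D - d_bar
    St = centered.T @ centered
    dim = D.shape[1]
    Sb = np.zeros((dim, dim))
    Sw = np.zeros((dim, dim))
    for c in np.unique(labels):
        Dj = D[labels == c]
        vj = Dj.mean(axis=0)
        Sb += np.outer(vj - d_bar, vj - d_bar)
        Sw += (Dj - vj).T @ (Dj - vj)
    return St, Sb, Sw


def sym_expm(S, sign=1.0):
    """exp(sign * S) for a symmetric matrix S via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs * np.exp(sign * vals)) @ vecs.T


def ilda(D, labels):
    """W holds the q = c - 1 leading eigenvectors of exp(Sw)^-1 exp(Sb)."""
    _, Sb, Sw = scatter_matrices(D, labels)
    B = sym_expm(Sw, sign=-1.0) @ sym_expm(Sb)  # exp(Sw)^-1 = exp(-Sw)
    vals, vecs = np.linalg.eig(B)
    order = np.argsort(-vals.real)  # largest eigenvalues first
    q = len(np.unique(labels)) - 1
    return vecs[:, order[:q]].real


# Hypothetical demo: 3 classes of 20 samples in 4 dimensions,
# separated only along the first two coordinates
rng = np.random.default_rng(2)
labels = np.repeat(np.arange(3), 20)
centers = 2.0 * np.array([[0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
D = centers[labels] + 0.1 * rng.normal(size=(60, 4))
W = ilda(D, labels)
features = D @ W  # discriminant scores, shape (60, 2)
```

The demo data are kept at a small scale so that the matrix exponentials stay well-conditioned; that scaling is a choice of this sketch, not something stated in the source.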
2.4.4. Fuzzy Improved Linear Discriminant Analysis
The steps of FiLDA are listed as follows (Input: data matrix D; Output: transformation matrix W):
1. Define the matrices $S_{ft}$, $S_{fb}$ and $S_{fw}$;
2. $B \leftarrow (\exp(S_{fw}))^{-1} \exp(S_{fb})$;
3. Eigen decomposition of $B$ as $B = U V U^{T}$;
4. $W \leftarrow U_q$, $q = c - 1$;
The three matrices, called the fuzzy total scatter matrix $S_{ft}$, the fuzzy between-class scatter matrix $S_{fb}$ and the fuzzy within-class scatter matrix $S_{fw}$, are defined as follows:
$$S_{ft} = \sum_{j=1}^{c} \sum_{i=1}^{n} u_{ij}^{\eta} (d_i - \bar{d})(d_i - \bar{d})^{T}$$
$$S_{fb} = \sum_{j=1}^{c} \sum_{i=1}^{n} u_{ij}^{\eta} (v_j - \bar{d})(v_j - \bar{d})^{T}$$
$$S_{fw} = \sum_{j=1}^{c} \sum_{i=1}^{n} u_{ij}^{\eta} (d_i - v_j)(d_i - v_j)^{T}$$
where $c$ is the number of sample categories and $n$ is the number of training samples; $u_{ij}$ is the fuzzy membership value of the $i$th data point in the $j$th class, and $\eta$ is the weight index. The FiLDA algorithm combines a fuzzy membership function with the iLDA algorithm; it can not only describe the diversity of sample information but also solve the small sample size problem of LDA.
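The fuzzy scatter matrices and the four FiLDA steps can be sketched as below. The membership matrix `U` and the class centers `V` are taken as inputs because this section does not spell out how they are computed; the function names, the soft memberships (0.8 for the own class, 0.1 elsewhere) and the demo data are assumptions for illustration only.

```python
import numpy as np


def fuzzy_scatter_matrices(D, U, V, eta):
    """Return (Sft, Sfb, Sfw) for data D (n x p), memberships U (n x c), centers V (c x p)."""
    d_bar = D.mean(axis=0)
    weights = U**eta  # u_ij^eta, shape (n, c)
    p = D.shape[1]
    Sft, Sfb, Sfw = (np.zeros((p, p)) for _ in range(3))
    dc = D - d_bar
    for j in range(V.shape[0]):
        w = weights[:, j]
        Sft += (dc * w[:, None]).T @ dc
        vb = V[j] - d_bar
        Sfb += w.sum() * np.outer(vb, vb)  # the inner sum over i only scales this term
        dv = D - V[j]
        Sfw += (dv * w[:, None]).T @ dv
    return Sft, Sfb, Sfw


def sym_expm(S, sign=1.0):
    """exp(sign * S) for a symmetric matrix S via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs * np.exp(sign * vals)) @ vecs.T


def filda(D, U, V, eta):
    """W holds the q = c - 1 leading eigenvectors of exp(Sfw)^-1 exp(Sfb)."""
    _, Sfb, Sfw = fuzzy_scatter_matrices(D, U, V, eta)
    B = sym_expm(Sfw, sign=-1.0) @ sym_expm(Sfb)
    vals, vecs = np.linalg.eig(B)
    order = np.argsort(-vals.real)
    return vecs[:, order[: V.shape[0] - 1]].real


# Hypothetical demo: 3 classes of 20 samples in 4 dimensions
rng = np.random.default_rng(3)
labels = np.repeat(np.arange(3), 20)
centers = 0.5 * np.array([[0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
D = centers[labels] + 0.1 * rng.normal(size=(60, 4))
U = np.full((60, 3), 0.1)
U[np.arange(60), labels] = 0.8  # assumed soft memberships, rows sum to 1
V = np.array([D[labels == j].mean(axis=0) for j in range(3)])
W = filda(D, U, V, eta=4)
```

With crisp (one-hot) memberships and class means as centers, $S_{ft}$ and $S_{fw}$ reduce to $S_t$ and $S_w$ of iLDA, which is a convenient sanity check for an implementation.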
2.4.5. K Nearest Neighbor
KNN is a supervised pattern recognition algorithm whose basic principle is that experimental samples of the same kind lie close to each other, while samples of different kinds lie far apart [47].
We employed PCA + LDA, PCA + iLDA, and PCA + FiLDA to extract features from the NIR spectra, and then used the KNN algorithm to establish a classification model of red jujube varieties. The classification accuracy of the model is affected by the number of samples and by the internal parameter K chosen when the model is built.
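A minimal KNN classifier of this kind might look as follows. The Euclidean distance, the majority vote and the value k = 3 are assumptions made for the sketch; the text does not state which distance or which K was used.

```python
import numpy as np


def knn_predict(train_X, train_y, test_X, k=3):
    """Label each test row by majority vote among its k nearest training rows."""
    # Pairwise Euclidean distances, shape (n_test, n_train)
    dists = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]  # integer class labels of the k neighbours
    return np.array([np.bincount(row).argmax() for row in votes])


# Hypothetical two-class demo in a 2-D feature space
train_X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
train_y = np.array([0, 0, 0, 1, 1, 1])
test_X = np.array([[0.5, 0.5], [5.5, 5.5]])
pred = knn_predict(train_X, train_y, test_X, k=3)
print(pred)  # [0 1]
```

In the pipeline described here, `train_X` and `test_X` would be the discriminant scores produced by LDA, iLDA or FiLDA rather than raw spectra.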
2.5. Software
In this article, all of the algorithms were implemented in Matlab 2014a (The MathWorks, Natick, MA, USA).
3. Results and Discussion
3.1. Spectral Analysis
In this study, the wavelength range of the collected NIR spectra of red jujube was 900–1700 nm. As shown in Figure 1, the NIR spectra contain a lot of characteristic functional group information. There are two distinct peaks in the NIR spectra of the red jujube samples, at 1180 nm and 1430 nm. After 1350 nm, the absorbance of all of the red jujube samples changes dramatically, which is due to the absorption of O-H and water [48]. Figure 1 also shows that the absorbance of the red jujube samples reaches its maximum over the whole spectrum at 1430 nm. The first peak is connected with the first and second overtones of the C-H group stretching vibration; these absorptions reflect protein-like substances. The peak at 1430 nm may be related to the first and second overtones of the O-H group in water [49]. Since red jujube samples of the five varieties carry different functional group information, the NIR spectra can be used to characterize and distinguish the samples.
3.2. Spectral Preprocessing
Figure 2 shows the NIR spectra of the red jujube samples under the different preprocessing methods employed in this article: S-G smoothing, S-G filter, MC, MSC and SNV. The spectra preprocessed by MC (Figure 2b) had no obvious peaks and troughs, while the spectra preprocessed by the other methods all showed obvious peaks and troughs. We tried the five preprocessing methods on the NIR spectra and found that the S-G filter gave the best results, so we chose the S-G filter to preprocess the spectra in this paper. After spectral preprocessing, we applied PCA + LDA, PCA + iLDA and PCA + FiLDA to extract features from the NIR spectra. The classification accuracies for jujube varieties under PCA + LDA, PCA + iLDA and PCA + FiLDA are presented below.
3.3. Classification with PCA + LDA
The preprocessed data could not be used directly because the spectral data contained a lot of repetitive information, which was unfavourable for the classification of red jujube varieties. Therefore, to obtain the principal components of the spectra of the red jujube samples and remove the redundant information, the spectral dimension had to be reduced first [11]. In this experiment, the cumulative contribution of the first 7 principal components was more than 99.98%, so the NIR spectral data were projected onto the first seven principal components, which could improve the classification accuracy of the experiment. The eigenvalues were as follows: λ1 = 133.189, λ2 = 7.711, λ3 = 7.258, λ4 = 0.425, λ5 = 0.117, λ6 = 0.062, λ7 = 0.029. Since the first 3 principal components (PC1, PC2, and PC3) accounted for 99.6% of the total variance, they not only preserved the characteristic information of the NIR spectral data but also eliminated the redundant information. Therefore, a three-dimensional feature space of the NIR spectral data of red jujube was established. Figure 3 displays the PCA scores plots for PC1, PC2, and PC3. Since the experiment used different pre-treatment methods, the PCA scores of the red jujube spectra differed between methods. Figure 3 shows that each kind of red jujube sample clustered at a different position, which indicates that feature extraction algorithms can be used to classify and identify red jujube from different origins. Among the panels, Figure 3a showed the best separation and Figure 3b the worst. The eigenvalue of PC1 accounted for 89.9% of the sum of the eigenvalues of the first 3 principal components (PC1–PC3). However, the red jujube samples still could not be well recognized by PCA alone. Therefore, to obtain a better classification, it was necessary to adopt further feature extraction methods to obtain the identification information from the red jujube samples. In this experiment, PCA + LDA is a two-stage algorithm: PCA first reduced the dimensionality of the red jujube NIR spectral data to 7 latent variables, then LDA extracted the discriminant information, and the test samples were projected onto the discriminant vectors of LDA.
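The two percentages quoted for the principal components can be checked directly from the seven eigenvalues listed above. The short Python calculation below takes the shares relative to the seven listed eigenvalues, which is an assumption of this check; according to the text those seven components hold more than 99.98% of the total, so the difference from the full total is small.

```python
eigenvalues = [133.189, 7.711, 7.258, 0.425, 0.117, 0.062, 0.029]

total_first7 = sum(eigenvalues)
first3 = sum(eigenvalues[:3])

share_first3 = first3 / total_first7           # PC1-PC3 relative to PC1-PC7
share_pc1_in_first3 = eigenvalues[0] / first3  # PC1 relative to PC1-PC3

print(f"{share_first3:.1%}")         # 99.6%
print(f"{share_pc1_in_first3:.1%}")  # 89.9%
```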
Figure 3. PCA scores plot of vectors with PC1, PC2 and PC3 under different pretreatment methods: (a) S-G filter, (b) MC, (c) MSC, (d) SNV, (e) S-G smoothing.
The LDA scores plot for DV1, DV2, and DV3 is shown in Figure 4. In Figure 4, the samples of 2 varieties of red jujube (Henan and Shanxi) overlapped each other, but most of the experimental samples could be easily distinguished.
Figure 4. LDA scores plot of vectors with DV1, DV2 and DV3 under S-G filter + PCA + LDA.
3.4. Classification with iLDA
iLDA extracted discriminant information from the 7-dimensional spectral data. The 300 red jujube samples were divided into a training set (35 training samples per variety, 175 in total) and a test set (25 test samples per variety, 125 in total). After the training set was processed by iLDA to produce 3 optimal discriminant vectors (DV1, DV2 and DV3), the 7-dimensional spectral data of the 125 test samples were projected onto DV1, DV2 and DV3. Figure 5 shows the scores plot for the three optimal discriminant vectors. As shown in Figure 5, the test samples were well distributed. However, 13 samples from Hebei were misclassified as Xinjiang, 10 samples from Shanxi were misclassified as Henan, 3 samples from Xinjiang were misclassified as Shanxi, and 1 sample from Gansu was misclassified as Hebei. Therefore, the classification accuracy was only 77.6%.
Figure 5. iLDA scores plot of vectors with DV1, DV2 and DV3 under S-G filter + PCA + iLDA.
3.5. Classification with FiLDA
In this section, FiLDA was applied to extract feature information from the NIR spectral data after PCA dimension reduction. The parameters were as follows: the fuzzy weight parameter η = 4 and the number of sample categories c = 5. The initial cluster centers of FiLDA were:
$$V^{(0)} = \begin{bmatrix} v_1^{(0)} & v_2^{(0)} & v_3^{(0)} & v_4^{(0)} & v_5^{(0)} \end{bmatrix} = \begin{bmatrix}
0.9550 & 0.3765 & 0.1315 & 0.2947 & 0.9564 \\
0.1284 & 0.2290 & 0.0897 & 0.0378 & 0.1167 \\
0.1512 & 0.0789 & 0.0984 & 0.0864 & 0.0497 \\
0.0214 & 0.0086 & 0.0295 & 0.0084 & 0.0397 \\
0.0184 & 0.0002 & 0.0005 & 0.0038 & 0.0112 \\
0.0084 & 0.0074 & 0.0121 & 0.0206 & 0.0120 \\
0.0049 & 0.0066 & 0.0164 & 0.0033 & 0.0033
\end{bmatrix}$$
The initial fuzzy membership values of FiLDA are displayed in Figure 6. The abscissa represents the samples and the ordinate the fuzzy membership values. There were five varieties in this experiment, so Figure 6 has five panels. Each panel represents red jujube from one origin: Henan, Shanxi, Xinjiang, Hebei, and Gansu, respectively. A membership value above 0.5 means that the sample belongs to the red jujube of that origin. More generally, when the fuzzy membership value u_ij of the ith sample was largest for the jth class, we assigned the ith sample to the jth class.
Figure 6. Initial fuzzy membership values.
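The assignment rule described above (largest membership wins, with values above 0.5 read as clear membership) can be sketched as follows. The text does not give the formula used to obtain the initial memberships from the cluster centers; the `fcm_membership` function below uses the standard fuzzy c-means formula purely as an assumption, and the sample values are hypothetical.

```python
import numpy as np


def fcm_membership(D, V, eta):
    """Fuzzy c-means style memberships (an assumed formula, not confirmed by the source)."""
    dist = np.linalg.norm(D[:, None, :] - V[None, :, :], axis=2)
    dist = np.maximum(dist, 1e-12)  # guard against division by zero
    inv = dist ** (-2.0 / (eta - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1


def assign_by_membership(U):
    """Return the index of the class with the largest membership for each sample (row)."""
    return np.argmax(U, axis=1)


# Hypothetical memberships of 3 samples in 5 classes
U = np.array([[0.70, 0.10, 0.10, 0.05, 0.05],
              [0.20, 0.55, 0.10, 0.10, 0.05],
              [0.30, 0.25, 0.20, 0.15, 0.10]])
labels = assign_by_membership(U)  # classes 0, 1 and 0
confident = U.max(axis=1) > 0.5   # the third sample stays below 0.5
```

The third sample shows why the two rules differ: it is assigned to class 0 by the largest-membership rule even though no membership exceeds 0.5.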
The discriminant analysis models in Figures 4, 5 and 7 were all developed from the first seven PCs, with the S-G filter as the preprocessing method. Figure 7 displays the three-dimensional scores plot obtained when FiLDA was used to extract the identification information from the test set of red jujube samples. The 5 kinds of red jujube samples could be clearly identified by using FiLDA, with a classification accuracy of 94.4%. The data distribution in Figure 7 was clearly better than that in Figure 5, which further demonstrates the effectiveness of FiLDA in extracting identification information from the NIR spectra of red jujube.
3.6. Classification Results of KNN
Table 1 displays the recognition accuracies for red jujube varieties from different origins obtained with several preprocessing methods and feature extraction algorithms. All other conditions were kept unchanged; in particular, the number of training samples n_training was 175 and the number of test samples n_test was 125.
Figure 7. FiLDA scores plot of vectors with DV1, DV2 and DV3 under S-G filter + PCA + FiLDA.
With the S-G filter as the preprocessing method and LDA as the feature extraction algorithm, the classification accuracy of KNN was 75.2%: 14 samples from Shanxi were misclassified as Henan, 4 samples from Xinjiang as Shanxi, 11 samples from Hebei as Xinjiang, and 2 samples from Gansu as Hebei. With the S-G filter and FiLDA, the classification accuracy of KNN reached 94.4%: 2 samples from Hebei were misclassified as Shanxi, 2 samples from Gansu as Hebei, 1 sample from Shanxi as Henan, and 1 sample from Xinjiang as Shanxi. This shows that FiLDA can classify red jujube varieties well. It was also apparent that, under the same preprocessing method, the classification accuracies of LDA were generally not as good as those of iLDA and FiLDA.
Table 1. Classification accuracies by several preprocessing methods and feature extraction algorithms.
Algorithm      SNV     MSC     MC      S-G Smoothing   S-G Filter
PCA + LDA      47.2%   44.0%   44.6%   45.6%           75.2%
PCA + iLDA     50.1%   44.0%   47.2%   58.4%           77.6%
PCA + FiLDA    52.5%   68.5%   62.5%   75.0%           94.4%
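The link between the misclassification counts in the text and the accuracies in Table 1 is simple arithmetic on the 125 test samples, sketched here in Python with a hypothetical helper `accuracy`. For the LDA case, the four listed counts sum to 31 errors and reproduce the reported 75.2%. For iLDA and FiLDA, the counts listed in the text sum to 27 and 6, while the reported accuracies correspond to 97 and 118 correct samples (28 and 7 errors), so one misclassified sample in each case appears not to be itemized.

```python
def accuracy(n_test, misclassified):
    """Fraction of test samples classified correctly."""
    return (n_test - sum(misclassified)) / n_test


n_test = 125
lda_errors = [14, 4, 11, 2]  # S-G filter + PCA + LDA, as listed in the text
print(f"{accuracy(n_test, lda_errors):.1%}")  # 75.2%

# Number of correct test samples implied by each reported accuracy
implied = {name: round(acc * n_test)
           for name, acc in [("LDA", 0.752), ("iLDA", 0.776), ("FiLDA", 0.944)]}
print(implied)  # {'LDA': 94, 'iLDA': 97, 'FiLDA': 118}
```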
3.7. Discussion
The NIR spectral data were collected by the NIR-M-R2 spectrometer and then processed by the S-G filter, PCA, and LDA, iLDA or FiLDA. Then, KNN was applied to classify the test samples. Table 1 shows that the classification accuracies for red jujube varieties differed when different feature extraction algorithms were used. The classification accuracies were below 90% when PCA + LDA or PCA + iLDA was employed for feature extraction, whereas PCA + FiLDA exceeded 90% with the S-G filter. As shown in Table 1, the classification accuracy was highest when FiLDA and the S-G filter preprocessing method were used together in this classification system for processing the NIR spectra of red jujube samples.
The numbers of training and test samples were then varied while the other experimental conditions were kept the same. Table 2 displays the classification accuracies for red jujube varieties obtained with the feature extraction methods for different numbers of training and test data. In Table 2, n_training indicates the number of training samples, and n_test the number of test samples. The classification accuracies changed with these 2 parameters. Table 2 clearly shows that PCA + FiLDA classified the different kinds of red jujube samples better than PCA + LDA and PCA + iLDA. When n_training and n_test were 175 and 125, respectively, the classification accuracy of PCA + FiLDA reached its highest value, 94.4%.
Table 2. Classification accuracies with different number of training data and test data.
n_training   n_test   PCA + LDA   PCA + iLDA   PCA + FiLDA
150          150      77.3%       79.3%        92.0%
175          125      75.2%       77.6%        94.4%
200          100      75.0%       76.0%        90.0%
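The three splits in Table 2 can be reproduced in size with a per-variety (stratified) split of the 300 samples, sketched below. The text confirms 35 training and 25 test samples per variety for the 175/125 case; equal per-variety counts for the other two cases, the random selection and the function name `stratified_split` are assumptions of this sketch.

```python
import numpy as np


def stratified_split(labels, n_train_per_class, seed=0):
    """Pick n_train_per_class random samples of each class for training, the rest for testing."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        train_idx.extend(idx[:n_train_per_class])
        test_idx.extend(idx[n_train_per_class:])
    return np.array(train_idx), np.array(test_idx)


labels = np.repeat(np.arange(5), 60)  # 5 varieties x 60 samples
for per_class in (30, 35, 40):
    train, test = stratified_split(labels, per_class)
    print(len(train), len(test))  # 150 150, then 175 125, then 200 100
```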
4. Conclusions
To classify red jujube varieties quickly, nondestructively, and effectively, the FiLDA algorithm coupled with NIR spectroscopy was proposed in this study. FiLDA is a new fuzzy feature extraction algorithm that combines fuzzy set theory with iLDA, and it was applied here to the identification of red jujube varieties. NIR spectral data were collected for 300 red jujube samples of 5 types by using the NIR-M-R2 spectrometer. The NIR spectra were processed by the S-G filter and PCA, followed by LDA, iLDA or FiLDA. Finally, KNN was employed as a classifier to recognize the red jujube varieties. FiLDA identified the red jujube samples accurately and achieved higher classification accuracy than the other feature extraction algorithms. In addition, NIR spectroscopy has been widely used in food inspection and in the food supply chain. The experimental results indicate that the FiLDA algorithm coupled with NIR spectroscopy could play an important role in the classification of red jujube varieties.
Author Contributions: Conceptualization, Z.Q. and X.W.; methodology, X.W.; software, Z.Q.; validation, Z.Q. and X.W.; investigation, Z.Q. and X.W.; resources, B.W. and H.F.; data curation, B.W.
and H.F.; writing—original draft preparation, Z.Q.; writing—review and editing, X.W. and Y.Y.;
visualization, B.W. and H.F.; supervision, X.W. and Y.Y.; project administration, X.W. All authors have
read and agreed to the published version of the manuscript.
Funding: This research was funded by the Priority Academic Program Development of Jiangsu Higher
Education Institutions (PAPD), the Talent Program of Chuzhou Polytechnic (YG2019026 and YG2019024)
and the Key Science Research Project of Chuzhou Polytechnic (YJZ-2020-12).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data sharing is not applicable to this article.
Acknowledgments: We would like to thank Haoxiang Zhou for his help with this article.
Conflicts of Interest: The authors declare no conflict of interest.
References
1.
Yang, J.; Hou, Y.; Chang, N. Determination of amino acid content and principal component analysis of Shanxi jujube. Food Res.
Dev. 2021,42, 141–145.
2.
Mairemu, S.Y. Establishment of near infrared spectroscopy for Jun jujube sugar of different mature period. Anhui Agric. Sci. Bull.
2017,23, 143–145.
3.
Wang, H.Q.; Zhang, H.H.; Zhuo, S.P.; Zhang, Z.; Li, H.F. Identification of jujube fruit species based on dielectric properties. Food
Sci. Technol. 2014,7, 304–308.
Foods 2022,11, 763 13 of 14
4.
Chen, Q.S.; Chen, M.; Liu, Y.; Wu, J.Z.; Wang, X.Y.; Ouyang, Q.; Chen, X.H. Application of FT-NIR spectroscopy for simultaneous
estimation of taste quality and taste-related compounds content of black tea. Food Sci. Technol. Mysore.
2018
,55, 4363–4368.
[CrossRef]
5.
Jiang, H.; Chen, Q.S. Chemometric models for the quantitative descriptive sensory properties of green tea (Camellia sinensis L.)
using Fourier transform near infrared (FT-NIR) spectroscopy. Food Anal. Method. 2015,8, 954–962. [CrossRef]
6. Ripoll, G.; Lobón, S.; Joy, M. Use of visible and near infrared reflectance spectra to predict lipid peroxidation of light lamb meat
and discriminate dam’s feeding systems. Meat Sci. 2018,143, 24–29. [CrossRef]
7.
Wang, J.J.; Zareef, M.; He, P.H.; Sun, H.; Chen, Q.S.; Li, H.H.; Xu, D.L. Evaluation of matcha tea quality index using portable NIR
spectroscopy coupled with chemometric algorithms. J. Sci. Food Agric. 2019,99, 5019–5027. [CrossRef]
8.
Zhang, H.; Jiang, H.; Liu, G.H.; Mei, C.L.; Huang, Y.H. Identification of radix puerariae starch from different geographical origins
by FT-NIR spectroscopy. Int. J. Food Prop. 2017,20, 1567–1577. [CrossRef]
9.
Tan, B.; Xiao, T.F.; Liu, Q.L.; Li, G.; Huang, C.X.; Li, G. Nondestructive detection experiment of typical economic fruit near-infrared
diffuse reflection and its spectral data analysis. Hubei Agric. Sci. 2020,59, 154–158.
10.
Lei, S.Z.; Yao, H.G. Applications of near infrared spectrum technique for non-destructive measurement of fruit quality. Chinese J.
Spectrosc. Lab. 2009,26, 775–779.
11.
Shang, J.; Zhang, Y.; Meng, Q.L. Nondestructive identification of apple varieties by VIS/NIR spectroscopy. Storage. Process
2019
,
19, 8–14.
12.
Zhan, Y.; Peng, Y.F.; Peng, H.G.; Luo, H.P. Application of near-infrared spectroscopy nondestructive testing of jujube in south
xinjiang sugar content. J. Agric. Mech. Res. 2014,36, 179–183.
13.
Wu, X.H.; Wu, B.; Sun, J.; Li, M. Rapid discrimination of apple varieties via near-infrared reflectance spectroscopy and fast allied
fuzzy c-means clustering. Int. J. Food Eng. 2014,11, 23–30. [CrossRef]
14.
Wu, X.H.; Wu, B.; Sun, J.; Li, M.; Du, H. Discrimination of apples using near infrared spectroscopy and sorting discriminant
analysis. Int. J. Food Prop. 2016,19, 1016–1028. [CrossRef]
15.
Zhao, J.; Hao, L.; Chen, Q.; Huang, X.; Sun, Z.; Fang, Z. Identification of egg’s freshness using NIR and support vector data
description. J. Food Eng. 2010,98, 408–414. [CrossRef]
16.
Teye, E.; Huang, X.; Takrama, J.; Haiyang, G. Integrating NIR spectroscopy and electronic tongue together with chemometric
analysis for accurate classification of cocoa bean varieties. J. Food Process Eng. 2014,37, 560–566. [CrossRef]
17.
Xing, Z.; Hou, X.; Tang, Y.; He, R.; Mintah, B.K.; Dabbour, M. Monitoring of polypeptide content in the solid-state fermentation
process of rapeseed meal using NIRS and chemometrics. J. Food Process Eng. 2018,41, e12853. [CrossRef]
18.
Guo, Z.; Barimah, A.O.; Shujat, A.; Zhang, Z.; Chen, Q. Simultaneous quantification of active constituents and antioxidant
capability of green tea using NIR spectroscopy coupled with swarm intelligence algorithm. LWT-Food Sci. Technol.
2020
,
129, 109510. [CrossRef]
19.
Cai, J.R.; Chen, Q.S.; Wan, X.M.; Zhao, J.W. Determination of total volatile basic nitrogen (TVB-N) content and warner-bratzler
shear force (WBSF) in pork using Fourier transform near infrared (FT-NIR) spectroscopy. Food Chem.
2011
,126, 1354–1360.
[CrossRef]
20.
Huang, X.Y.; Xu, H.X.; Wu, L.; Dai, H.; Yao, L.Y.; Han, F.K. A data fusion detection method for fish freshness based on computer
vision and near-infrared spectroscopy. Anal. Method 2016,8, 2929–2935. [CrossRef]
21.
Wu, X.H.; Fu, H.J.; Tian, X.Y.; Wu, B.; Sun, J. Prediction of pork storage time using Fourier transform near infrared spectroscopy
and adaboost ULDA. J. Food Process Eng. 2017,40, e12566. [CrossRef]
22.
Fan, Y.; Qiu, Z.; Chen., J.; Wu, X.; He, Y. Identification of varieties of dried red jujubes with near-infrared hyperspectral imaging.
Spectrosc. Spectr. Anal. 2017,37, 836–840.
23.
Zhang, J.C.; Zhang, X.; Bai, T.C.; Shi, L.Z. Jujube species identification based on near infrared spectroscopy and PLS-DA. Sci.
Technol. Food Ind. 2017,38, 68–71, 76.
24.
Luo, H.P.; Wang, L.; Guo, L.; Xuan, Z.Y. The research to detection the moisture content of southern jujube rapidly with near
infrared spectroscopy. Int. Acad. Annu. Meet China Agric. Mach. Soc. 2012,14, 25–28.
25.
Guo, W.C.; Gu, J.S.; Liu, D.Y.; Shang, L. Peach variety identification using near-infrared diffuse reflectance spectroscopy. Comput.
Electron. Agric. 2016,123, 297–303. [CrossRef]
26.
Cao, F.; Wu, D.; He, Y. Soluble solids content and pH prediction and varieties discrimination of grapes based on visible–near
infrared spectroscopy. Comput. Electron. Argic. 2010,71, 15–18. [CrossRef]
27.
Sánchez, M.T.; Haba, M.J.D.L.; Benítez-López, M.; Fernández-Novales, J.; Garrido-Varo, A.; Perez-Marin, D. Non-destructive
characterization and quality control of intact strawberries based on NIR spectral data. J. Food Eng.
2012
,110, 102–108. [CrossRef]
28.
Pérez-Marín, D.; Paz, P.; Guerrero, J.M.; Garrido-Varo, A.; Sánchez, M.T. Miniature handheld NIR sensor for the on-site non-
destructive assessment of post-harvest quality and refrigerated storage behavior in plums. J. Food Eng.
2010
,99, 294–302.
[CrossRef]
29. Yan, C.; Fan, L. Feature extraction using fuzzy maximum margin criterion. Neurocomputing 2012,86, 52–58.
30.
Huang, P.; Yang, Z.J.; Chen, C.K. Fuzzy local discriminant embedding for image feature extraction. Comput. Elect. Eng.
2015
,46,
231–240. [CrossRef]
31.
Xie, J.; Li, J.; Wang, H.; Zeng, W.; Guo, P. The methods for two-dimensional fiber spectra extraction. In Proceedings of the 2016
12th International Conference on Computational Intelligence and Security (CIS), Wuxi, China, 16–19 December 2016; pp. 487–491.
Foods 2022,11, 763 14 of 14
32. Liu, Z.B. An improved LDA algorithm and its application to face recognition. Comput. Eng. Sci. 2011,33, 89–93.
33.
Huang, Y.; Guan, Y. On the linear discriminant analysis for large number of classes. Eng. Appl. Artif. Intell.
2015
,43, 15–26.
[CrossRef]
34.
Liang, J.F.; Wu, W.; Chen, D.W. Identification of liquor authenticity based on FTIR with PCA- LDA. Sci. Technol. Food Ind.
2016
,37,
309–312.
35.
Yang, Z.; Wang, N.; Ullah, N.; Liang, Y.; Yang, X.; Cheng, Z. Quality of jujube beverage fermented by lactic acid based on electronic
nose analysis. Acta. Agric. Boreali Occiden Sin. 2015,24, 149–156.
36.
Wei, Y.; Lin, L.; Yang, X.; Li, D.; Fu, H.; Yang, T. NIR fiber technology combined with pattern recognition forrapid identification of
melamine adulteration in milk. China Dairy Ind. 2016,44, 48–51.
37.
Ye, J.P. Characterization of a family of algorithms for generalized discriminant analysis on undersampled problems. J. Mach.
Learn. Res. 2005,6, 483–502.
38.
Shen, Y.; Wu, X.; Wu, B.; Tan, Y.; Liu, J. Qualitative analysis of lambda-cyhalothrin on Chinese cabbage using mid-infrared
spectroscopy combined with fuzzy feature extraction algorithms. Agriculture 2021,11, 275. [CrossRef]
39.
Xiong, C.C.; Li, L.; Wang, T.Y. Establishment of a cinnamon habitat model based on near infrared spectroscopy. Northwest Pharm.
J. 2016,31, 221–225.
40.
Yu, M.; Li, S.; Yang, F.; Zheng, Y.; Li, P.; Jiang, L.; Liu, X. Identification on different origins of citri reticulatae pericarpium using
near infrared spectroscopy combined with optimized spectral pretreatments. J. Instrum. Anal. 2021,40, 65–71.
41.
Chen, J.; Jonsson, P. A simple method for reconstructing a high-quality NDVI time-series data set based on the savitzky-golay
filter. Remote Sens. Environ. 2004,91, 332–344. [CrossRef]
42.
Chen, S.Y.; Zhao, Q.M.; Dong, D.M. Application of near infrared spectroscopy combined with comparative principal component
analysis for pesticide residue detection in fruit. Spectrosc. Spectr. Anal. 2020,40, 917–921.
43.
Wu, X.H.; Zhou, H.X.; Wu, B.; Fu, H.J. Determination of apple varieties by near infrared reflectance spectroscopy coupled with
improved possibilistic Gath–Geva clustering algorithm. J. Food Process Preserv. 2020,44, e14561. [CrossRef]
44. Dixon, S.J.; Brereton, R.G. Comparison of performance of five common classifiers represented as boundary methods: Euclidean
distance to centroids, linear discriminant analysis, quadratic discriminant analysis, learning vector quantization and support
vector machines, as dependent on data structure. Chemometr. Intell. Lab. Syst. 2009,95, 1–17.
45.
Dogantekin, E.; Dogantekin, A.; Avci, D. An automatic diagnosis system based on thyroid gland: ADSTG. Expert. Syst. Appl.
2010,37, 6368–6372. [CrossRef]
46.
Dixon, S.J. Application of classification methods when group sizes are unequal by incorporation of prior probabilities to three
common approaches: Application to simulations and mouse urinary chemosignals. Chemometr. Intell. Lab. Syst.
2009
,99, 111–120.
[CrossRef]
47.
Wu, B.; Wang, D.Z.; Ji, G. Classification of vinegars based on orthogonal linear discriminant analysis and electronic nose
technology. Food Ferment. Ind. 2020,46, 263–268.
48.
Wu, L.G.; He, J.G.; Liu, G.S.; Wang, S.L.; He, X.G. Detection of common defects on jujube using Vis-NIR and NIR hyperspectral
imaging. Postharvest Biol. Technol. 2016,112, 134–142. [CrossRef]
49.
Wang, J.; Nakano, K.; Ohashi, S. Nondestructive detection of internal insect infestation in jujubes using visible and near-infrared
spectroscopy. Postharvest Biol. Technol. 2010,59, 272–279. [CrossRef]