
Application of Machine Learning to Terahertz Spectroscopic Imaging of Reagents Hidden By Thick Shielding Materials

Authors: Kosuke Murate, Hiroki Kanai, and Kodo Kawase

Abstract

We achieved high identification accuracy of reagents hidden by thick shielding materials, by combining injection-seeded terahertz (THz) wave parametric generator measurements and machine learning analysis. The analysis performance of three methods, support vector machine (SVM), k-nearest neighbor, and random forest, was compared in an attempt to identify the optimal approach. SVM proved to be the best model. Conventional systems could only identify reagents through pre-measured shields; however, incorporation of machine learning allowed us to identify the reagents through shielding materials that had not been pre-measured. Moreover, spectroscopic imaging of the reagents revealed the distribution pattern of the reagents, even through thick shielding materials that attenuated THz frequencies such that they were close to the noise level.
Index Terms: Terahertz wave parametric generator, Terahertz
radiation, Machine learning, Nondestructive testing.
I. INTRODUCTION
Numerous methods have been applied to detect illicit drugs hidden
in envelopes and other containers. X-ray scanners [1] and drug
detection dogs are commonly used. However, although an X-ray scanner
can inspect the interior of a package, it cannot identify specific
substances, and drug detection dogs are prone to errors. Moreover,
suspicious mail cannot be opened without a search warrant. Therefore,
in recent years, non-destructive drug detection using terahertz (THz)
waves has been researched [2, 7]. THz waves can pass through many
materials, similar to microwaves, and can also be guided by lenses or
mirrors like infrared (IR) light. In addition, many reagents have
fingerprint spectra, making it possible to identify illegal drugs under
shielding materials in a non-destructive/non-contact manner.
THz wave parametric generators with MgO:LiNbO3 crystals have
been studied since the 1990s in terms of their nondestructive
inspection applications [7–11]. Significant improvements in the
performance of spectroscopic systems using an injection-seeded THz-
wave parametric generator (is-TPG) and in THz parametric detection
have led to a wide dynamic range of up to 125 dB [12, 13]. Thus,
spectroscopic systems can identify reagents through shielding
materials up to 5 cm thick [7, 8]. However, while these systems have
shown sufficient performance, analysis methods for reagent
identification require further improvement. A previous study used a
reagent identification method based on simple regression analysis with
matrix operations [2, 7]. Although this method showed high accuracy
for discriminating reagents under specific shields that were assumed in
advance, it was not accurate enough for unknown shields or noisy data.
For real-world applications, the ability to discriminate reagents hidden
by a wide range of shielding materials is necessary.
Manuscript received XXX XX, 2019; revised XXX XX, 2021; accepted
XXX XX, 2021. Date of publication XXX XX, 2021; date of current version
XXX XX, 2021. This work was partially supported by Japan Society for the
Promotion of Science KAKENHI (18H03887, 19H02627); Research
Foundation for Opto-Science and Technology; and The Hibi Science
In this study, we introduced machine learning analysis for is-TPG
measurements to detect and identify shielded reagents. Machine
learning algorithms learn patterns by analyzing a large amount of data.
The method we used in previous studies [2, 7] is also a type of machine
learning, but it requires identification thresholds to be set. Thus, it is
distinct from the methods proposed in the current study.
Machine learning is widely used for material identification and
quantitative testing, and is also attracting attention as a sample
identification method for THz spectroscopy [14–19]. However, the
identification is usually carried out on a sample or barrier material that
the algorithm has already been trained on. To our knowledge, no
current system is capable of identifying reagents through various kinds
of shielding materials, which is required for practical application. In
this study, we developed a versatile system that can discriminate
reagents through various shielding materials using is-TPG
spectroscopy with machine learning methods.
II. EXPERIMENTAL SETUP AND ANALYSIS METHOD
Our objective was illicit drug detection, but obtaining real samples
was difficult. Therefore, samples of three saccharides (maltose, glucose,
and lactose), which have fingerprint spectra similar to those of illicit
drugs in the THz band, were analyzed in this study. Maltose has
absorption peaks at 1.12 and 1.60 THz, while glucose has a peak at
1.44 THz, and lactose at 1.37 THz. Powders of these saccharides
(particle diameter: 30–100 μm) were enclosed in 10 × 10 mm²
polyethylene bags to form 1-mm-thick samples.
To discriminate these reagent samples when shielded, it is
necessary for the algorithm to learn their spectra in advance through
various shielding materials. Therefore, our reagent samples were
placed under the following five types of shielding materials to generate
training data.
-Two pieces of cotton (thickness: 5 mm; attenuation: ≈ 10 dB);
-Two pieces of corrugated cardboard (thickness: 6 mm; attenuation: ≈ 20 dB);
-Two pieces of denim fabric (thickness: 0.7 mm; attenuation: ≈ 30 dB);
-Four pieces of polyurethane cushioning material (thickness: 31 mm; attenuation: ≈ 25 dB);
-Two layers of polyethylene cushioning material and two postage envelopes (thickness: 33 mm; attenuation: ≈ 12 dB).
Here the attenuation at 1.5 THz is shown because the spectroscopic
system used in this study was optimized at 1.5 THz. We also evaluated
whether the system was capable of identifying samples hidden by the
following two types of shielding materials that the algorithms were not
trained on:
-A low-attenuation shielding model consisting of cotton and envelopes (thickness: 6 mm; attenuation: ≈ 10 dB);
-A high-attenuation shielding model consisting of four pieces of corrugated cardboard, two pieces of cushioning material (polyethylene), two pieces of bubble wrap (polyethylene), and two pieces of envelope (thickness: 35 mm; attenuation: ≈ 65 dB).
Foundation. (Corresponding author: Kosuke Murate)
K. Murate, H. Kanai, and K. Kawase are with the Department of Electronics,
Graduate School of Engineering, Nagoya University, Furocho, Chikusa,
Nagoya, 464-8603, Japan (e-mail: murate@nuee.nagoya-u.ac.jp,
kanai.hiroki@f.mbox.nagoya-u.ac.jp, kodo@nagoya-u.jp).
The transmittance spectra of reagents and shielding materials are
shown in Fig. 1. We also show the near-infrared (NIR) detection beam
intensity according to the attenuation rate of the THz-wave, to
demonstrate the degree to which the reagents and shielding attenuated
the THz waves.
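For orientation, the attenuation values quoted for the shields can be converted to a linear transmittance with T = 10^(−A/10). The following minimal sketch assumes that the dB figures refer to power attenuation; the function name is illustrative and does not come from the source code of this study.

```python
def attenuation_db_to_transmittance(attenuation_db: float) -> float:
    """Convert a power attenuation in dB to a linear transmittance ratio."""
    return 10.0 ** (-attenuation_db / 10.0)


# Attenuation at 1.5 THz quoted in the text for the two untrained shields.
shields = [("low-attenuation model", 10.0), ("high-attenuation model", 65.0)]
for name, attenuation_db in shields:
    transmittance = attenuation_db_to_transmittance(attenuation_db)
    print(f"{name}: {attenuation_db:.0f} dB -> transmittance {transmittance:.2e}")
```

Under this assumption, 10 dB corresponds to one tenth of the incident power, whereas 65 dB leaves roughly 3 × 10⁻⁷ of it.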
Figure 2 shows a schematic diagram of the is-TPG measurement
system. A microchip Nd:YAG laser was used as the pump source, and
an external cavity laser diode (ECLD) was used as the seed source.
The pump and seed beams were injected into a MgO:LiNbO3 crystal under
non-collinear phase-matching conditions to generate the THz wave [8,
9]. The THz wave passed through the sample and was then input to
the MgO:LiNbO3 crystal together with the pump beam for detection.
The THz wave was converted into an NIR detection beam using the
inverse generation process; the detection beam was measured by an
NIR pyroelectric detector. A THz-wave variable attenuator was
inserted into the THz beam path, and the dynamic range was
confirmed to be up to eight orders of magnitude. The tuning
range was about 0.8–2.6 THz.
Normally, the relationship between the THz-wave intensity and
NIR detection beam intensity is not linear. As the input THz-wave
intensity increases, the change in NIR detection beam intensity decreases due
to saturation of the parametric gain, as shown in Fig. 1 [12]; therefore,
it is usually necessary to convert the detection beam intensity into THz
wave intensity using a pre-specified equation to obtain the correct
value. However, in this study, we used the unconverted detection beam
intensity, as the exact THz-wave transmittance was not required for
discrimination.
We compared the discrimination accuracy of three machine
learning methods: support vector machine (SVM), k-nearest neighbor
(kNN), and random forest (RF) algorithms. These methods are widely
used for classification. SVM determines the discriminant function
that maximizes the margin (the distance between the decision boundary
and the nearest data points) and classifies the data [20, 21]. kNN finds
the k known data points nearest to an unknown sample and classifies the
sample by majority vote [22]. RF takes a majority vote over the
predictions of a number of decision trees [23].
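The three classifiers can be illustrated with scikit-learn as follows. This is only a sketch on synthetic data, not the code used in this study: the toy spectra (a flat baseline with Gaussian dips at the quoted absorption peaks), the frequency grid, the noise level, and the hyperparameter values are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
freqs = np.linspace(1.1, 1.8, 70)  # assumed frequency grid in THz

# Absorption peak positions (THz) quoted in the text.
peaks = {"maltose": [1.12, 1.60], "glucose": [1.44], "lactose": [1.37]}


def toy_spectrum(label):
    """Flat baseline with Gaussian dips at the peak positions, plus noise."""
    spectrum = np.ones_like(freqs)
    for peak in peaks[label]:
        spectrum -= 0.5 * np.exp(-((freqs - peak) / 0.03) ** 2)
    return spectrum + rng.normal(0.0, 0.05, freqs.size)


labels = [name for name in peaks for _ in range(40)]
X = np.array([toy_spectrum(name) for name in labels])
y = np.array(labels)

models = {
    "SVM": SVC(kernel="rbf"),
    "kNN": KNeighborsClassifier(n_neighbors=3),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X[::2], y[::2])                     # even rows: training
    scores[name] = model.score(X[1::2], y[1::2])  # odd rows: testing
    print(f"{name}: accuracy = {scores[name]:.3f}")
```

On such clean toy data all three methods separate the classes easily; the differences reported in this paper only appear for spectra measured through strongly attenuating shields.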
Fig. 1. Transmission spectra of the (a) reagents, (b) trained shielding materials,
and (c) untrained shielding materials used in this study. The near-infrared (NIR)
detection beam intensities according to the attenuation rate of the terahertz
(THz) wave are also shown.
[Figure 1 panels: (a) Reagents (maltose, glucose, lactose); (b) Shielding materials (Trained): cotton, corrugated cardboard, denim, cushioning material (polyurethane), cushioning material (polyethylene) and envelope; (c) Shielding materials (Untrained): low-attenuation and high-attenuation shielding models. Axes: transmittance versus frequency [THz], with reference levels from -10 dB to -80 dB and the noise level marked.]
Fig. 2. THz spectroscopic system using an injection-seeded THz parametric
generator (is-TPG). (HWP: half wave plate; ECLD: external cavity laser
diode; SOA: semiconductor optical amplifier; NIR: near-infrared.)
[Figure 2 labels: microchip Nd:YAG laser (1064.4 nm, 450 ps, 50 Hz, 0.7 mJ/pulse); Nd:YAG amplifier (0.7 mJ, 18 mJ); ECLD (1068–1075 nm, CW, 400 mW); SOA; PBS; HWP; grating (1200 L/mm); lenses (f = 100 mm); cylindrical lens (f = 100 mm); MgO:LiNbO3 crystals with Si prisms; pump, seed, and detection beams; sample; NIR pyroelectric detector; lock-in amplifier + PC, synchronized with the pump laser. Blocks: injection-seeded terahertz-wave parametric generator (is-TPG) and THz parametric detector.]
We applied these machine learning methods using scikit-learn [24].
For the SVM method, we optimized parameter γ in the radial basis
function kernel and cost parameter C. γ controls the complexity of the
decision boundary; the larger the value, the more complex the boundary.
Cost parameter C is a measure of how much misclassification is allowed;
the larger the value of C, the less misclassification is tolerated, and
the more complex the classification becomes. These parameters were
optimized over the range of 1.0 × 10⁻⁵ to 1.0 × 10⁴ using
cross-validation and grid search [25], which allowed us to test all
parameter combinations in the grid. For the kNN algorithm, we optimized
parameter k over the range of 1–5 and the weight parameter ("uniform" or
"distance") by cross-validation and grid search [25].
Parameter k describes how many nearby points are used
for the majority vote, and the weight parameter determines whether
the vote is weighted by distance; "uniform" indicates no weighting, and
"distance" indicates weighting by the inverse of the distance. For the
RF method, the number of decision trees was optimized from 10 to
150. The data were normalized before learning and identification for
all machine learning methods. The source code used in this study is
available at https://github.com/knhiroki/program.
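A hyperparameter search of this kind can be sketched with scikit-learn's GridSearchCV. The example below is a hedged illustration rather than the authors' script: the placeholder data from make_classification, the use of StandardScaler for normalization, the five-fold cross-validation, the ten logarithmically spaced grid values, and the reading of the search range as 1.0 × 10⁻⁵ to 1.0 × 10⁴ are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data standing in for measured spectra (3 classes, 70 features).
X, y = make_classification(n_samples=150, n_features=70, n_informative=10,
                           n_classes=3, random_state=0)

# SVM: search C and gamma of the RBF kernel on a logarithmic grid.
svm_search = GridSearchCV(
    make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    {"svc__C": np.logspace(-5, 4, 10), "svc__gamma": np.logspace(-5, 4, 10)},
    cv=5,
)
svm_search.fit(X, y)

# kNN: search k from 1 to 5 and the two weighting schemes.
knn_search = GridSearchCV(
    make_pipeline(StandardScaler(), KNeighborsClassifier()),
    {"kneighborsclassifier__n_neighbors": [1, 2, 3, 4, 5],
     "kneighborsclassifier__weights": ["uniform", "distance"]},
    cv=5,
)
knn_search.fit(X, y)

print("SVM:", svm_search.best_params_, round(svm_search.best_score_, 3))
print("kNN:", knn_search.best_params_, round(knn_search.best_score_, 3))
```

Placing the scaler inside the pipeline means the normalization is fitted on the training folds only, which avoids leaking information from the validation fold into the search.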
All three of the supervised machine learning algorithms had to be
trained in advance using a large amount of data. In total, 852
transmission spectra (n = 317, 275, and 260 for lactose, maltose and
glucose, respectively) were acquired through the five types of shields.
Among those spectra, we used 627 spectra for training data and 225
spectra for test data.
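The split sizes quoted above can be reproduced with scikit-learn's train_test_split, as in the sketch below. The feature matrix is a zero-filled placeholder, and the stratified random split is an assumption; the text does not state how the 627 training and 225 test spectra were chosen.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Class sizes quoted in the text (852 spectra in total).
counts = {"lactose": 317, "maltose": 275, "glucose": 260}
y = np.concatenate([np.full(n, name) for name, n in counts.items()])
X = np.zeros((y.size, 70))  # placeholder: 70 frequencies per spectrum

# Hold out 225 spectra for testing, keeping the class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=225, stratify=y, random_state=0
)
print(X_train.shape, X_test.shape)
```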
We also prepared the spectra of reagents under untrained
shielding materials as test data; there were 17 samples for the low-
attenuation shield (6 maltose, 6 glucose, and 5 lactose samples), and
45 for the high-attenuation shield (15 maltose, 15 glucose, and 15
lactose samples).
Representative transmission spectra of the reagents obtained through
each shield are shown in Fig. 3. Each spectrum was acquired within 1
min. Identification was performed using 70 frequencies from 1.1 to
1.8 THz. Although the absorption peaks of each reagent are evident in
the figures, some of them differ in shape from the pure absorption
peaks, due to the disruption of the waveforms by the shielding
materials. Moreover, when the high-attenuation shield was used, the
waveform was almost equivalent to the noise level.
III. RESULTS
The reagent identification results through the trained and untrained
shields are shown in the upper and lower parts of Table 1, respectively.
The three machine learning methods were compared. The values in
the table reflect the accuracy of the test data identification. Through the
trained shields, all methods identified the reagents with nearly 100%
accuracy. Thus, we confirmed that the spectroscopic system used in
this study could discriminate reagents through the trained shields.
Through the low-attenuation untrained shield, all learning methods
achieved 100% accuracy. Through the high-attenuation untrained
shield, the SVM, kNN and RF algorithms achieved 88.9%, 77.8%,
and 80.0% accuracy, respectively. The 100% accuracy for the
untrained low-attenuation shield was attributed to the spectrum being
similar to those obtained through the trained shields. In contrast,
although the spectra obtained through the high-attenuation shield were
close to the noise level and the original spectral shape was not
maintained, 88.9% accuracy was achieved.
We obtained a reagent discrimination rate of more than 88%,
regardless of the attenuation rate of the shielding material. SVM
combined with is-TPG spectroscopy showed the highest performance
among the learning methods used for identifying reagents through
shields. Using the same system but with conventional identification
methods, it was difficult to identify both the low-attenuation samples
and samples that were buried in noise.
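As a quick consistency check, the reported percentages can be related to whole numbers of correctly identified spectra. The counts computed below are inferred from the stated test-set sizes (225 spectra for the trained shields and 45 for the high-attenuation shield); they are not reported in the paper, and the helper name is illustrative.

```python
def implied_correct(accuracy_percent: float, n_samples: int) -> int:
    """Nearest whole number of correct identifications for a given accuracy."""
    return round(accuracy_percent / 100.0 * n_samples)


reported = {
    ("trained shields", 225): {"SVM": 98.7, "kNN": 96.9, "RF": 98.2},
    ("high-attenuation shield", 45): {"SVM": 88.9, "kNN": 77.8, "RF": 80.0},
}

for (shield, n_samples), accuracies in reported.items():
    for method, accuracy in accuracies.items():
        n_correct = implied_correct(accuracy, n_samples)
        recomputed = round(100.0 * n_correct / n_samples, 1)
        print(f"{shield}, {method}: {n_correct}/{n_samples} = {recomputed} %")
```

With 45 test spectra, each misidentified spectrum changes the accuracy by about 2.2 percentage points, which is worth keeping in mind when comparing the three methods for the high-attenuation shield.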
TABLE 1
DISCRIMINATION ACCURACY FOR VARIOUS SAMPLES.
SVM: support vector machine; kNN: k-nearest neighbor; RF: random forest.

Shielding materials                                     SVM      kNN      RF
Trained (average of 5 kinds of shielding)               98.7 %   96.9 %   98.2 %
Untrained: low-attenuation shielding model (-10 dB)     100 %    100 %    100 %
Untrained: high-attenuation shielding model (-65 dB)    88.9 %   77.8 %   80.0 %

Fig. 3. Typical transmission spectra of the reagents obtained through various
shielding materials. The transmittance was calculated based on the NIR
detection beam output and was not converted to THz-wave transmittance.
[Figure 3 panels: spectra of maltose, glucose, and lactose through denim, corrugated cardboard, cotton, cushioning material (polyurethane), and cushioning material (polyethylene) and envelope (included in training data set), and through the low-attenuation (-10 dB) and high-attenuation (-65 dB) shielding models (not included in training data). Axes: transmittance versus frequency [THz].]
Next, we attempted to reduce the number of measurement points
(i.e., frequencies) to make the analysis more efficient. The RF machine
learning method provides information on the contribution ratio of
each frequency to the identification results, as shown in Fig. 4. From a
total of 70 frequencies, only the top 7 frequencies (i.e., those making
the largest contributions) were selected. Table 2 shows the reagent
identification results obtained through the two untrained (low- and
high-attenuation) shielding materials. The discrimination accuracy of
the SVM, kNN, and RF algorithms was 100% for the low-attenuation
shield, compared to 77.8%, 66.7%, and 77.8%, respectively, for the
high-attenuation shield. Although the overall accuracy was lower than
that when using all frequencies, nearly 80% accuracy was achieved
with the SVM and RF methods. Considering the expected applications,
such as the identification of chemicals inside mail envelopes or parcels,
an accuracy rate of 80% would likely be acceptable, given that no other
method is currently available. If seven frequencies are sufficient for
discrimination, significant time savings could be achieved.
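The frequency selection described above can be sketched with the impurity-based feature_importances_ attribute of scikit-learn's RandomForestClassifier. Whether this attribute is exactly the "contribution ratio" used in this study is an assumption, as are the synthetic spectra, the frequency grid, and the even/odd train/test split.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
freqs = np.linspace(1.1, 1.8, 70)  # assumed frequency grid in THz
peaks = {"maltose": [1.12, 1.60], "glucose": [1.44], "lactose": [1.37]}


def toy_spectrum(label):
    """Flat baseline with Gaussian dips at the quoted peaks, plus noise."""
    spectrum = np.ones_like(freqs)
    for peak in peaks[label]:
        spectrum -= 0.5 * np.exp(-((freqs - peak) / 0.03) ** 2)
    return spectrum + rng.normal(0.0, 0.05, freqs.size)


labels = [name for name in peaks for _ in range(60)]
X = np.array([toy_spectrum(name) for name in labels])
y = np.array(labels)
X_train, y_train = X[::2], y[::2]
X_test, y_test = X[1::2], y[1::2]

# Rank frequencies by importance on the training data; keep the seven largest.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
top7 = np.argsort(forest.feature_importances_)[::-1][:7]
print("selected frequencies [THz]:", np.round(np.sort(freqs[top7]), 2))

# Retrain a classifier that only sees the seven selected frequencies.
svm = SVC(kernel="rbf").fit(X_train[:, top7], y_train)
reduced_accuracy = svm.score(X_test[:, top7], y_test)
print(f"accuracy with 7 frequencies: {reduced_accuracy:.3f}")
```

In a measurement setting, only the selected frequencies would then need to be acquired, which is where the time saving comes from.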
We are currently working on the simultaneous generation of
multiple THz wavelengths using the is-TPG system to achieve real-
time spectroscopy [8, 26]. Using the frequencies with large
contribution ratios in the RF method, real-time and highly accurate
discrimination can be realized.
Finally, spectroscopic imaging was performed using the proposed
system. First, a sample concealed by a high-attenuation shielding
model was placed at the THz focal point and raster-scanned using an
X-Y stage with a spatial resolution of about 1 mm, as shown in Fig. 5.
The seven frequencies with the highest contribution ratios in the RF
method were applied. Notably, the selected frequencies differed from
those shown in Fig. 4, as the relative contributions of the frequencies
varied slightly depending on the given dataset. The spectral results at
each measurement point were classified by SVM, and spectroscopic
imaging was then performed. To classify the shields and reagents, the
spectral data of many shield types were included as a new class in the
training dataset (the algorithms were not trained on the high
attenuation shield itself).
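The pixel-wise classification step can be sketched as follows: a classifier trained on four classes (three reagents plus a shield class) is applied to every spectrum of a raster-scanned data cube, and the predicted labels are reshaped into an image. The class prototypes, noise level, image size, and sample layout are toy assumptions made for illustration only.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
classes = ["shield", "maltose", "glucose", "lactose"]

# Toy 7-frequency prototype spectra, one row per class (assumed values).
prototypes = np.array([
    [0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9],  # shield: featureless
    [0.4, 0.9, 0.9, 0.9, 0.9, 0.9, 0.4],  # maltose: two dips
    [0.9, 0.9, 0.9, 0.4, 0.9, 0.9, 0.9],  # glucose: one dip
    [0.9, 0.4, 0.9, 0.9, 0.9, 0.9, 0.9],  # lactose: one dip
])
n_freq = prototypes.shape[1]

# Training set: 50 noisy copies of each prototype.
X_train = np.vstack([p + rng.normal(0.0, 0.02, (50, n_freq)) for p in prototypes])
y_train = np.repeat(np.arange(len(classes)), 50)
classifier = SVC(kernel="rbf").fit(X_train, y_train)

# Raster-scanned data cube (height x width x frequency) with three patches.
height, width = 20, 30
truth = np.zeros((height, width), dtype=int)  # background pixels are shield
truth[5:15, 2:8] = 1     # maltose patch
truth[5:15, 12:18] = 2   # glucose patch
truth[5:15, 22:28] = 3   # lactose patch
cube = prototypes[truth] + rng.normal(0.0, 0.02, (height, width, n_freq))

# Classify every pixel spectrum and reshape the labels back into an image.
label_map = classifier.predict(cube.reshape(-1, n_freq)).reshape(height, width)
pixel_accuracy = (label_map == truth).mean()
print(f"pixel accuracy: {pixel_accuracy:.3f}")
```

Including the shield as its own class is what lets background pixels be labeled explicitly instead of being forced into one of the reagent classes.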
We were able to reveal the spatial distribution of the reagents, even
through untrained, high-attenuation shielding materials, as shown in
Fig. 6(a). In the overlay of the photograph shown in Fig. 6(b), it can be
seen that some pixels were misrecognized. For example, point "A"
was misidentified as maltose, even though it was a shielding material.
Although there was no absorption at 1.1 THz, transmittance was high
at 1.44 THz [Fig. 6(c)], suggesting that it was misidentified as maltose.
On the other hand, point "B" was misidentified as a shield, even though
it was glucose. Figure 6(d) shows that the absorption of "B" at 1.44
THz was low; it was misidentified because its waveform resembled
that of the shielding material. This error could be due to differences in
the amount of sample powder in the plastic bags. The identification
accuracy was about 63% based on Fig. 6(a). The accuracy of
spectroscopic imaging was lower than indicated in Table 2, because
areas without reagents had to be identified as background (i.e., one
more target had to be identified). Moreover, the sample thickness was
low in some places, making it difficult to obtain sufficient information
from some of the point locations.
Fig. 4. Example contribution ratios for each frequency obtained using the
random forest algorithm. The top seven frequencies are shown in red.
Frequencies near the absorption peaks make larger contributions.
[Figure 4 axes: contribution rate versus frequency [THz].]
TABLE 2
IDENTIFICATION RESULTS USING ONLY SEVEN FREQUENCIES.

Untrained shielding materials (identified with 7 frequencies)   SVM      kNN      RF
Low-attenuation shielding model (-10 dB)                         100 %    100 %    100 %
High-attenuation shielding model (-65 dB)                        77.8 %   66.7 %   77.8 %

Fig. 5. Shielding materials and reagent samples used for the spectroscopic
imaging measurements. Reagents were sandwiched between four kinds of
shielding materials that attenuated the THz wave (to −65 dB at 1.5 THz).
Fig. 6. (a) Spectroscopic imaging results showing the spatial distribution of
maltose, glucose, and lactose. The spatial resolution was about 1 mm and the
measurement time for this image was less than 2 h. (b) Imaging results
overlaid on the samples. (c) Comparison of the spectra obtained at point A
in (b), which was misidentified as maltose, with the spectra of the shield and
maltose. (d) Comparison of the spectra obtained at point B in (b), which was
misidentified as a shield, with the spectra of the shield and glucose.
[Figure 6 panels: (a) Spectroscopic imaging result; (b) Imaging results overlaid on the samples, with points A and B marked; (c) Spectra at point A (misidentified as maltose) compared with the shield and with maltose in the training data; (d) Spectra at point B (misidentified as a shield) compared with the shield and with glucose in the training data. Axes in (c) and (d): transmittance [a.u.] versus frequency [THz].]
In this case, we performed measurements using a sample that
attenuated the THz wave to almost the noise level, which resulted in
some misidentified points. As suggested by Tables 1 and 2, spectroscopic
imaging would be more accurate through low-attenuation shields. In
addition, improvement in the dynamic range using a highly sensitive
multi-stage THz parametric detector [13] would limit misidentification,
even through high-attenuation shields.
IV. CONCLUSION
We combined machine learning with is-TPG spectroscopic
measurements to identify reagents hidden by various shielding
materials. By training the algorithms on a large amount of
spectroscopic data, sufficient discrimination accuracy was obtained;
this was also the case through untrained shields. Three machine
learning algorithms were compared: SVM, kNN, and RF; SVM
showed the best discrimination performance. To shorten the
measurement time and allow the use of a multi-wavelength THz source,
identification was also performed using only the top seven
frequencies ranked by the RF machine learning method. Finally,
spectroscopic imaging of the reagents through untrained high-
attenuation shielding materials was performed, and the spatial
distribution of the reagents was revealed. In this study, we used SVM,
kNN, and RF because the amount of data was not particularly large;
however, if the dataset had been considerably larger, a deep neural
network would have been an appropriate option. We believe that
machine learning is essential to identify illicit drugs and other
substances hidden in packages using THz-wave methods, and we are
confident that this research will contribute to the future development
of THz-wave applications.
ACKNOWLEDGMENT
The authors appreciate the assistance in the experiments and useful
discussions with Mr. T. Horiuchi and Mr. R. Mitsuhashi.
REFERENCES
[1] I. Drakos, P. Kenny, T. Fearn, and R. Speller, “Multivariate analysis of
energy dispersive X-ray diffraction data for the detection of illicit drugs
in border control,” Crime Sci., vol. 6, no. 1, p. 1, 2017.
[2] K. Kawase, Y. Ogawa, Y. Watanabe, and H. Inoue, “Non-destructive
terahertz imaging of illicit drugs using spectral fingerprints,” Opt.
Express, vol. 11, no. 20, pp. 2549–2554, 2003.
[3] U. Puc, A. Abina, M. Rutar, A. Zidanšek, A. Jeglič, and G. Valušis,
“Terahertz spectroscopic identification of explosive and drug simulants
concealed by various hiding techniques,” Appl. Opt., vol. 54, no. 14, pp.
4495–4502, May 2015.
[4] V. A. Trofimov and S. A. Varentsova, “Detection and identification of
drugs under real conditions by using noisy terahertz broadband pulse,”
Appl. Opt., vol. 55, no. 33, pp. 9605–9618, Nov. 2016.
[5] P. Dean et al., “Absorption-sensitive diffuse reflection imaging of
concealed powders using a terahertz quantum cascade laser,” Opt.
Express, vol. 16, no. 9, pp. 5997–6007, Apr. 2008.
[6] M. Bauer et al., “Antenna-coupled field-effect transistors for multi-
spectral terahertz imaging up to 4.25 THz,” Opt. Express, vol. 22, no.
16, pp. 19235–19241, Aug. 2014.
[7] M. Kato, S. R. Tripathi, K. Murate, K. Imayama, and K. Kawase, “Non-
destructive drug inspection in covering materials using a terahertz
spectral imaging system with injection-seeded terahertz parametric
generation and detection,” Opt. Express, vol. 24, no. 6, p. 6425, Mar.
2016.
[8] K. Murate and K. Kawase, “Perspective: Terahertz wave parametric
generator and its applications,” J. Appl. Phys., vol. 124, no. 16, p.
160901, Oct. 2018.
[9] S. Hayashi, K. Nawata, T. Taira, J. Shikata, K. Kawase, and H.
Minamide, “Ultrabright continuously tunable terahertz-wave
generation at room temperature,” Sci. Rep., vol. 4, p. 5045, Jun. 2014.
[10] K. Kawase, J. Shikata, and H. Ito, “Terahertz wave parametric source,”
J. Phys. D: Appl. Phys., vol. 35, no. 3, p. R1, 2002.
[11] Y. Takida, K. Nawata, and H. Minamide,
“Security screening system based on terahertz-wave spectroscopic gas
detection,” Opt. Express, vol. 29, no. 2, pp. 2529–2537, Jan. 2021.
[12] K. Murate et al., “A High Dynamic Range and Spectrally Flat Terahertz
Spectrometer Based on Optical Parametric Processes in LiNbO3,” IEEE
Trans. Terahertz Sci. Technol., vol. 4, no. 4, pp. 523–526, Jul. 2014.
[13] H. Sakai, K. Kawase, and K. Murate, “Highly sensitive multi-stage
terahertz parametric detector,” Opt. Lett., vol. 45, no. 14, pp. 3905–3908,
Jul. 2020.
[14] D. S. Bulgarevich, M. Talara, M. Tani, and M. Watanabe, “Machine
learning for pattern and waveform recognitions in terahertz image data,”
Sci. Rep., vol. 11, Jan. 2021.
[15] H. Ge, Y. Jiang, Z. Xu, F. Lian, Y. Zhang, and S. Xia, “Identification
of wheat quality using THz spectrum,” Opt. Express, vol. 22, no. 10,
pp. 12533–12544, May 2014.
[16] Y. Sun et al., “Quantitative characterization of bovine serum albumin
thin-films using terahertz spectroscopy and machine learning methods,”
Biomed. Opt. Express, vol. 9, no. 7, pp. 2917–2929, Jul. 2018.
[17] C. Cao, Z. Zhang, X. Zhao, and T. Zhang, “Terahertz spectroscopy and
machine learning algorithm for non-destructive evaluation of protein
conformation,” Opt. Quantum Electron., vol. 52, no. 4, p. 225, Apr.
2020.
[18] W. Liu et al., “Automatic recognition of breast invasive ductal
carcinoma based on terahertz spectroscopy with wavelet packet
transform and machine learning,” Biomed. Opt. Express, vol. 11, no. 2,
pp. 971–981, Feb. 2020.
[19] R. Mitsuhashi, K. Murate, S. Niijima, T. Horiuchi, and K. Kawase,
“Terahertz tag identifiable through shielding materials using machine
learning,” Opt. Express, vol. 28, no. 3, pp. 3517–3527, Feb. 2020.
[20] C. J. C. Burges, “A Tutorial on Support Vector Machines for Pattern
Recognition,” Data Min. Knowl. Discov., vol. 2, no. 2, pp. 121–167,
Jun. 1998.
[21] C. Bishop, Pattern Recognition and Machine Learning. New York:
Springer-Verlag, 2006.
[22] S. A. Dudani, “The Distance-Weighted k-Nearest-Neighbor Rule,”
IEEE Trans. Syst. Man Cybern., vol. SMC-6, no. 4, pp. 325–327, Apr.
1976.
[23] L. Breiman, “Random Forests,” Mach. Learn., vol. 45, no. 1, pp. 5–32,
Oct. 2001.
[24] F. Pedregosa et al., “Scikit-learn: Machine Learning in Python,” J.
Mach. Learn. Res., vol. 12, no. 85, pp. 2825–2830, 2011.
[25] J. Bergstra and Y. Bengio, “Random search for hyper-parameter
optimization,” J. Mach. Learn. Res., vol. 13, pp. 281–305, Feb.
2012.
[26] K. Murate, S. Hayashi, and K. Kawase, “Multiwavelength terahertz-
wave parametric generator for one-pulse spectroscopy,” Appl. Phys.
Express, vol. 10, no. 3, p. 032401, Feb. 2017.
Kosuke Murate received B.S., M.S.,
and Ph.D. degrees from Nagoya
University, Japan, in 2013, 2015, and
2018, respectively. He has been an
assistant professor at Nagoya
University since 2018. He received the
Ikushi Prize from the Japan Society for the
Promotion of Science in 2018.
Hiroki Kanai received the B.S. degree
from the Department of Electrical
Engineering and Electronics, and
Information Engineering in 2020, and
is now a master's student in the
Department of Electronics, Graduate
School of Engineering, Nagoya
University, Japan.
Kodo Kawase received the B.S. degree
from Kyoto Univ. in 1989, and the
Ph.D. degree from Tohoku Univ. in 1996.
He became a team leader at RIKEN in
2001 and a Professor at
Nagoya University in 2005. He
received the 2005 Young Scientists’
Prize from the Minister of Education.
... Not only can it generate high-intensity, widely tunable terahertz-waves, but it also achieves a wide dynamic range of 125 dB in combination with parametric detection of the inverse process [8]. Furthermore, we are exploring various applications of machine learning in terahertz spectroscopy, such as the detection of illicit drugs in envelopes [9] and terahertz tag measurements [10]. ...
... Photonics 2022,9, 258 ...
Article
Full-text available
In this study, we developed a multi-wavelength terahertz-wave parametric generator that operates with only one injection seeding laser. Tunable lasers used as an injection seeder must be single-frequency oscillators, and conventional multi-wavelength terahertz-wave parametric generator requires basically the same number of lasers as the number of wavelengths. In order to solve this problem, we developed a new external cavity semiconductor laser that incorporates a DMD in its wavelength-selective mechanism. In this process, stable multi-wavelength oscillation from a single laser was made possible by efficiently causing four-wave mixing. This seed laser can be applied to practical real-time terahertz spectroscopy by arbitrarily switching the desired wavelength to be generated and the interval between multiple wavelengths.
... However, the identification usually involves a sample or barrier material for which the algorithm has already received training. In this study, we developed a versatile system that can discriminate saccharides through various packaging materials using is-TPG spectroscopy with machine learning methods [62]. The experimental system is equivalent to the system shown in Fig. 2, but the methods for sample processing and data management differ. ...
Article
Here, we introduce an injection-seeded terahertz (THz)-wave parametric generator (is-TPG) spectroscopic system and its application to nondestructive inspection through a packaging material with high attenuation. Recent technological innovations have dramatically improved is-TPG output. Combined with THz parametric detection, whereby detection is performed in the reverse process of generation, a spectrometer with an extremely high dynamic range has been achieved. THz spectroscopic imaging has enabled the previously difficult visualization of substance spatial distributions, even through thick packaging materials. Moreover, the introduction of machine learning has improved the accuracy of identification. High-speed wavelength tuning and multi-wavelength generation enable real-time acquisition of sample information and real-time identification by image recognition, thus broadening the range of applicability of the is-TPG. Additionally, detection sensitivity has improved to a level of < 1 aJ through multi-stage THz parametric detection. The system combining is-TPG and THz parametric detection now exhibits a dynamic range of 125 dB, enabling imaging through thick, highly scattering materials with an attenuation factor of −100 dB; to our knowledge, such measurements are difficult to achieve with other THz-wave systems.
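To make the decibel figures above concrete, the following minimal sketch converts between decibels and linear ratios and computes the margin left when a 125 dB dynamic range is used to measure through a −100 dB attenuator. It assumes decibels are defined on a power (energy) basis, i.e. 10·log10 of the ratio; the function names are illustrative and are not taken from the cited work.

```python
import math


def db_to_ratio(db):
    """Convert a decibel value to a linear power ratio (assumes 10*log10 convention)."""
    return 10.0 ** (db / 10.0)


def ratio_to_db(ratio):
    """Convert a linear power ratio to decibels (assumes 10*log10 convention)."""
    return 10.0 * math.log10(ratio)


def headroom_db(dynamic_range_db, attenuation_db):
    """Margin above the noise floor after the sample attenuates the signal.

    attenuation_db may be given as a negative number (e.g. -100.0) or as a
    positive loss; only its magnitude is used.
    """
    return dynamic_range_db - abs(attenuation_db)


# Figures quoted in the abstract above.
dynamic_range_db = 125.0
attenuation_db = -100.0

margin_db = headroom_db(dynamic_range_db, attenuation_db)
margin_ratio = db_to_ratio(margin_db)

print(f"Remaining margin: {margin_db:.1f} dB")
print(f"Equivalent signal-to-noise power ratio: {margin_ratio:.0f}")
```

Under these assumptions, the remaining margin is the simple difference of the two decibel values, which is why a measurement through such an attenuator still sits above the noise floor.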
... Moreover, the concomitant basic detecting requirements include origin [57,58]
Protein T-SNE-XGBoost Nondestructive evaluation of protein conformation [59]
Sunflower seed K-means Nondestructive detection of internal mass by terahertz time-domain transmission imaging [60]
Tablets PCR; PLS Quantitative determination of both active ingredient and excipient concentrations of tablets [61]
Aminopyrine DFT Adulteration of traditional Chinese medicine with western medicine [62]
Angelica; Eucommia ulmoides Analysis of Chinese medicinal materials identification [48]
Arabidopsis thaliana Determination of osmotic potential; Determination of stomatal conductance. [63]
Artemisia annua 2DCOS; 2DCOS-PLS Explore the differences in molecular structure [64]
Grifola PCA; K-means Authenticity identification [65]
Lateral root Authenticity identification; control of product quality [66]
Gastrodia elata DFT; Second-order damped oscillator Establish the chromatographic characteristics of gastrodin; Classification of different species [67,68]
Genistein and Biochanin Solid-State Theory Drug ingredients identification [69]
Herba Solani Lyrati PCA; SVM; DT; RF Rapid identification of Chinese medicinal materials [70]
Bovine serum albumin Low-frequency absorption spectra and refractive indices [71]
Drugs-of-abuse Noninvasive imaging for chemical and structural information about substances [72]
Apple SSC Prediction Model; Deep Learning Predicting the multiple geographical origins [73]
Soybean oil PCA-SVM Material (transgenic soybeans) detection [74]
Rice SNV; BC; First Derivative PLS-DA; SVM; BPNN Analysis and identification of rice adulteration [75]
Reagents SVM; KNN; RF Identification of reagents hidden by thick shielding materials [76]
Impurities CNN; ResNet Detection of impurities in wheat [77]
Pesticides Voigt Function; PLS Determination of Pesticides in Flour [78]
Abbreviations identification, substance composition identification, and so on.
Furthermore, the newer iterations of radiation sources and other pivotal apparatus also drive THz instrumentation out of previous limitations [79] in medicinal plant quality detection. ...
Article
Herbal medicine (HM), derived from various therapeutic plants, has garnered considerable attention for its remarkable effectiveness in treating diseases. However, numerous issues, including selection of improved varieties, detection of hazardous residues, and concoction management, affect herb quality throughout the manufacturing process. Therefore, a practical, rapid, nondestructive detection technology is necessary. Terahertz (THz) spectroscopy, with its low energy, penetration, and fingerprint features, has become a preferable method for herb quality appraisal. Our review has three parts. THz techniques, data processing, and modeling methods are introduced in Part I. Three primary applications of THz in medicinal plant quality detection during industrial processing and marketing (authenticity, composition and active ingredients, and origin detection) are detailed in Part II. A thorough investigation and outlook on the well-known applications and advancements of this field are presented in Part III. This review aims to bring new insight to in-depth THz application research on herbal medicinal plants.
Article
Following the recent progress in the development of Terahertz (THz) generation and detection, THz technology is being widely used to characterize test sample properties in various applications including nondestructive testing, security inspection and medical applications. In this paper, we present a broad review of the recent usage of artificial intelligence (AI), particularly deep learning techniques, in various THz sensing, imaging, and spectroscopic applications, with emphasis on their implementation for medical imaging of cancerous cells. Initially, the fundamental principles and techniques for THz generation and detection, imaging and spectroscopy are introduced. Subsequently, a brief overview of AI – machine learning and deep learning techniques is given, and their performance is compared. Further, the usage of deep learning algorithms in various THz applications is reported, with a focus on metamaterials design and classification, detection, reconstruction, segmentation, parameter extraction and denoising tasks. Moreover, we also report the metrics used to evaluate the performance of deep learning models and, finally, the existing research challenges in the application of deep learning in THz cancer imaging applications are identified and possible solutions are suggested through emerging trends. With the continuous increase of acquired THz data – sensing, spectral and imaging, artificial intelligence has emerged as a dominant paradigm for embedded data extraction, understanding, perception, decision making and analysis. Towards this end, the integration of state-of-the-art machine learning techniques such as deep learning with THz applications enables detailed computational and theoretical analysis for better validation and verification than modelling techniques that precede the era of machine learning. 
The study will facilitate the large-scale clinical applications of deep learning enabled THz imaging systems for the development of smart and connected next generation healthcare systems as well as provide a roadmap for future research direction.
Article
This paper proposes to detect heavy metal pollutants in wheat using terahertz spectroscopy and a deep support vector machine (DSVM). Five heavy metal pollutants, arsenic, lead, mercury, chromium, and cadmium, were considered for detection in wheat samples. THz spectral data were pre-processed by wavelet denoising. DSVM was introduced to further enhance the accuracy of the SVM classification model. According to the relationship between the accuracy and the training time with the number of hidden layers ranging from 1 to 4, the model performs best when the hidden layer network has three layers. In addition, the back-propagation algorithm was used to optimize the entire DSVM network. Compared with deep neural network (DNN) and SVM models, the proposed DSVM model achieved the highest accuracy on the comprehensive evaluation index, 91.3%. The approach thus enhanced the classification accuracy for heavy metal pollutants in wheat.
Article
Several machine learning (ML) techniques were tested for the feasibility of performing automated pattern and waveform recognition on terahertz time-domain spectroscopy datasets. Out of all the ML techniques under test, the random forest algorithm was observed to work well with the THz datasets in both the frequency and time domains. With this ML algorithm, a classifier can be created with less than 1% out-of-bag error for segmentation of rusted and non-rusted sample regions of the image datasets in the frequency domain. The degree of linear correlation between the rusted area percentage and the image spatial resolution with terahertz frequency can be used as an additional cross-validation criterion for the evaluation of classifier quality. However, for datasets measured at different rust stages, a standardized procedure of image pre-processing is necessary to create/apply a single classifier, and its usage is limited to 1 ± 0.2 THz. Moreover, random forest is practically the best choice among the several popular ML techniques under test for waveform recognition of time-domain data in terms of classification accuracy and timing. Our results demonstrate the usefulness of random forest and several other machine learning algorithms for terahertz hyperspectral pattern recognition.
Article
Tunable terahertz (THz)-wave absorption spectroscopy is a promising technique to detect trace gases suspended in ambient air owing to their strong absorption fingerprints in the THz-wave spectral region. Here, we present a THz-wave spectroscopic gas detection platform based on a frequency-tunable injection-seeded THz-wave parametric generator and compact multipass gas absorption cells. Using a 1.8-m-path-length multipass cell, we detected gas-phase methanol (CH3OH) down to a trace concentration of 0.2 ppm at the 1.48-THz transparent atmospheric window. We also developed a transportable walk-through screening prototype using a 6-m-path-length multipass cell to identify suspicious subjects. Our results demonstrate the potential of the proposed system for security screening applications.
Article
In this Letter, we developed a high-sensitivity multi-stage terahertz (THz)-wave parametric detection system that operates at room temperature. This detection system has high sensitivity over a wide wavelength range through upconversion of a THz wave to near-infrared light. The broadband noise associated with parametric generation limited the detection sensitivity in the previous setup; however, in the multi-stage configuration using multiple $\mathrm{LiNbO}_3$ crystals, the THz parametric detection sensitivity was improved by spatially eliminating the broadband noise using an iris between the former and latter stages. With this improvement, the minimum detectable sensitivity at 1.05 THz approached 130 zJ ($\mathrm{zJ} = 10^{-21}\;\mathrm{J}$), which is equivalent to 90 photons or less. Furthermore, by combining this detector with an injection-seeded THz-wave parametric generator, which is a high-power, tunable THz-wave source, the THz-wave measurement system achieved a maximum dynamic range of 125 dB.
Article
Given that protein conformation and activity are highly susceptible to environmental factors such as temperature and pH, evaluation of protein conformation and activity is urgently needed in many fields. For example, most protein drugs need a stable and proper environment during production, storage and transportation, and it is an enormous challenge to maintain protein activity throughout the whole process. Therefore, it is necessary to ensure the safety and effectiveness of protein drugs by monitoring their activity before use. In our study, we presented an improved method for non-destructive evaluation of protein conformation and biological activity by terahertz spectroscopy combined with t-SNE-XGBoost. Firstly, bovine serum albumin (BSA) samples heated to different temperatures were measured with THz-TDS. The obtained results indicated that native-conformation BSA undergoes transient states in the process of temperature-induced denaturation. However, for any single given sample, it is difficult to identify its conformation and activity directly from the measured raw terahertz data. Therefore, we applied several different algorithms to the raw data for recognition of BSA samples with different temperature-induced conformations and activities. Finally, the models obtained by the different algorithms were evaluated by calculating the root mean square error of prediction (RMSEP) and the correlation coefficient of prediction (\(R_p\)). THz-TDS combined with t-SNE-XGBoost proved to be an effective non-destructive and label-free method for evaluation of protein conformation and activity. It can provide a new technique for many applications, such as the pharmaceutical industry, clinical diagnosis and quality control.
Article
In recent years, there has been great interest in chipless radio-frequency identification (RFID) devices that work in the terahertz (THz) frequency range. Despite advances in RFID technology, its practical use in the THz range has yet to be realized, due to cost and detection accuracy issues associated with shielding materials. In this study, we propose two types of low-cost THz-tags; one is based on the thickness variation of coated polyethylene and the other on the fingerprint spectra of reagents. In the proposed approach, machine learning, specifically a deep-learning method, is used for high-precision tag identification even with weak signals, or when the spectrum is disturbed by passing through shielding materials. We achieved almost 100% identification accuracy despite using an inexpensive tag placed under thick shielding materials with an attenuation rate of about −50 dB. Furthermore, real-time tag identification was demonstrated by combining a multiwavelength injection-seeded THz parametric generator and a convolutional neural network.
Article
We demonstrate an automatic recognition strategy for terahertz (THz) pulsed signals of breast invasive ductal carcinoma (IDC) based on a wavelet entropy feature extraction and a machine learning classifier. The wavelet packet transform was implemented into the complexity analysis of the transmission THz signal from a breast tissue sample. A novel index of energy to Shannon entropy ratio (ESER) was proposed to distinguish different tissues. Furthermore, the principal component analysis (PCA) method and machine learning classifier were further adopted and optimized for automatic classification of the THz signal from breast IDC sample. The areas under the receiver operating characteristic curves are all larger than 0.89 for the three adopted classifiers. The best breast IDC recognition performance is with the precision, sensitivity and specificity of 92.85%, 89.66% and 96.67%, respectively. The results demonstrate the effectiveness of the ESER index together with the machine learning classifier for automatically identifying different breast tissues.
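The precision, sensitivity and specificity quoted above are standard quantities derived from a binary confusion matrix. A minimal sketch of how they are computed follows; the counts used are hypothetical placeholders chosen for illustration and are not data from the cited study, and the function name is likewise an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (precision, sensitivity, specificity) from confusion-matrix counts.

    tp: true positives, fp: false positives,
    tn: true negatives, fn: false negatives.
    """
    precision = tp / (tp + fp)      # fraction of positive calls that are correct
    sensitivity = tp / (tp + fn)    # fraction of actual positives that are found
    specificity = tn / (tn + fp)    # fraction of actual negatives that are found
    return precision, sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the study).
precision, sensitivity, specificity = classification_metrics(tp=45, fp=5, tn=40, fn=10)

print(f"precision   = {precision:.4f}")
print(f"sensitivity = {sensitivity:.4f}")
print(f"specificity = {specificity:.4f}")
```

Reporting all three together, as the abstract does, guards against a classifier that scores well on one measure simply by favoring one class.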
Article
The injection-seeded terahertz (THz) parametric generator (is-TPG) is one of the highest-power single-longitudinal-mode THz-wave sources. Our system is less influenced by scattering, refraction, and multiple reflections by samples because it is a narrow-linewidth source, and the detection area of the THz parametric detector is large. Thus, it is suitable for nondestructive inspection of practical samples in the real world. In 2003, we reported on the development of a mail inspection system that employed a THz parametric oscillator. However, with a dynamic range of less than four orders of magnitude, this system could only identify reagents through thin envelopes. Recently, we succeeded in developing a high-power, highly sensitive THz-wave spectroscopic imaging system with a dynamic range of 100 dB using the is-TPG and a THz parametric detector. Nondestructive inspection of reagents inside thick envelopes and three-dimensional computed tomography of plastics, which attenuate THz-waves by more than 60 dB, were conducted using this system. More recently, we have focused our efforts on a real-time measurement system using a multiwavelength is-TPG, which gives rise to numerous potential applications, given the significantly shorter measurement times. Thus, this system will facilitate the implementation of THz-wave measurements in real-world applications. In this paper, we report on our recent results and provide a perspective on the is-TPG.
Article
The development of new spectral analysis methods in bio thin-film detection has generated intense interest in terahertz (THz) spectroscopy and its application in a wide range of fields. In this paper, machine learning methods are applied for the first time to the quantitative characterization of bovine serum albumin (BSA) deposited thin-films detected by terahertz time-domain spectroscopy. The spectral data of BSA thin-films prepared from solutions with concentrations ranging from 0.5 to 35 mg/ml are analyzed using the support vector regression method to learn the underlying model of the frequency against the target concentration. The learned model successfully predicts the concentrations of the unknown test samples with a coefficient of determination R² = 0.97932. Furthermore, to identify the relevance of each frequency to the concentration, the maximal information coefficient statistical analysis is used, and the three most discriminating frequencies in the THz range are identified at 1.2, 1.1 and 0.5 THz, respectively, which means that a good prediction of BSA concentration can be achieved by using the top three relevant frequencies. Moreover, the top discriminating frequencies are in good agreement with the frequencies predicted by a long-wavelength elastic vibration model for BSA protein.
Article
A system using energy dispersive X-ray diffraction has been tested to detect the presence of illicit drugs concealed within parcels typical of those which are imported into the UK via postal and courier services. The system was used to record diffraction data from calibration samples of diamorphine (heroin) and common cutting agents and a partial least squares regression model was established between diamorphine concentration and diffraction spectra. Parcels containing various crystalline and amorphous materials, including diamorphine, were then scanned to obtain multiple localised diffraction spectra and to form a hyperspectral image. The calibration model was used for the prediction of diamorphine concentration throughout the volume of parcels and enabled the presence and location of diamorphine to be determined from the visual inspection of concentration maps. This research demonstrates for the first time the potential of an EDXRD system to generate continuous hyperspectral images of real parcels from volume scanning in security applications and introduces the opportunity to explore hyperspectral image analysis in chemical and material identification. However, more work must be done to make the system ready for implementation in border control operations by bringing down the procedure time to operational requirements and by proving the system’s portability.
Article
In this study, the simultaneous generation of multiwavelength terahertz (THz) waves by an injection-seeded THz parametric generator (is-TPG) was achieved for the first time. The output and stability of the multiwavelength THz waves were equivalent to those of the THz waves generated via a single-wavelength is-TPG. Spatial separation of frequencies and high-sensitivity detection were achieved by converting the THz waves to near-infrared detection beams. Furthermore, one-pulse spectroscopy of saccharides was realized, and a dynamic range of more than 60 dB was obtained. The results demonstrated the possibility of using the is-TPG to significantly shorten the measurement times of spectroscopic systems.