
Application of Machine Learning to Terahertz Spectroscopic Imaging of Reagents Hidden By Thick Shielding Materials

July 2021
IEEE Transactions on Terahertz Science and Technology PP(99):1-1

DOI:10.1109/TTHZ.2021.3094128

Authors:

Kosuke Murate

Nagoya University

Kodo Kawase

Nagoya University

We achieved high identification accuracy of reagents hidden by thick shielding materials, by combining injection-seeded terahertz (THz) wave parametric generator measurements and machine learning analysis. The analysis performance of three methods, support vector machine (SVM), k-nearest neighbor, and random forest, was compared in an attempt to identify the optimal approach. SVM proved to be the best model. Conventional systems could only identify reagents through pre-measured shields; however, incorporation of machine learning allowed us to identify the reagents through shielding materials that had not been pre-measured. Moreover, spectroscopic imaging of the reagents revealed the distribution pattern of the reagents, even through thick shielding materials that attenuated THz frequencies such that they were close to the noise level.



Index Terms—Terahertz wave parametric generator, Terahertz

radiation, Machine learning, Nondestructive testing.

I. INTRODUCTION

Numerous methods have been applied to detect illicit drugs hidden

in envelopes and other containers. X-ray scanners [1] and drug

detection dogs are commonly used. However, although an X-ray scanner

can be used for the inspection of interiors, it cannot identify specific

substances, and drug detection dogs are prone to errors. Moreover,

suspicious mail cannot be opened without a search warrant. Therefore,

in recent years, non-destructive drug detection using terahertz (THz)

waves has been researched [2, 7]. THz waves can pass through many

materials, similar to microwaves, and can also be guided by lenses or

mirrors like infrared (IR) light. In addition, many reagents have

fingerprint spectra, making it possible to identify illegal drugs under

shielding materials in a non-destructive/non-contact manner.

THz wave parametric generators with MgO:LiNbO3 crystals have

been studied since the 1990s in terms of their nondestructive

inspection applications [7–11]. Significant improvements in the

performance of spectroscopic systems using an injection-seeded THz-

wave parametric generator (is-TPG) and in THz parametric detection

have led to a wide dynamic range of up to 125 dB [12, 13]. Thus,

spectroscopic systems can identify reagents through shielding

materials up to 5 cm thick [7, 8]. However, while these systems have

shown sufficient performance, analysis methods for reagent identification require further improvement. A previous study used a reagent identification method based on simple regression analysis with matrix operations [2, 7]. Although this method showed high accuracy for discriminating reagents under specific shields that were assumed in advance, it was not good enough for unknown shields or noisy data. For real-world applications, the ability to discriminate reagents hidden by a wide range of shielding materials is necessary.

In this study, we introduced machine learning analysis for is-TPG measurements to detect and identify shielded reagents. Machine learning algorithms learn patterns by analyzing a large amount of data. The method we used in previous studies [2, 7] is also a type of machine learning, but it requires identification thresholds to be set. Thus, it is distinct from the methods proposed in the current study.

Machine learning is widely used for material identification and quantitative testing, and is also attracting attention as a sample identification method for THz spectroscopy [14–19]. However, the identification is usually carried out on a sample or barrier material that the algorithm has already been trained on. To our knowledge, no current system is capable of identifying reagents through various kinds of shielding materials, which is required for practical application. In this study, we developed a versatile system that can discriminate reagents through various shielding materials using is-TPG spectroscopy with machine learning methods.

Manuscript received XXX XX, 2019; revised XXX XX, 2021; accepted XXX XX, 2021. Date of publication XXX XX, 2021; date of current version XXX XX, 2021. This work was partially supported by Japan Society for the Promotion of Science KAKENHI (18H03887, 19H02627); Research Foundation for Opto-Science and Technology; and The Hibi Science Foundation. (Corresponding author: Kosuke Murate.) K. Murate, H. Kanai, and K. Kawase are with the Department of Electronics, Graduate School of Engineering, Nagoya University, Furocho, Chikusa, Nagoya, 4648603, Japan (e-mail: murate@nuee.nagoya-u.ac.jp, kanai.hiroki@f.mbox.nagoya-u.ac.jp, kodo@nagoya-u.jp).

II. EXPERIMENTAL SETUP AND ANALYSIS METHOD

Our objective was illicit drug detection, but obtaining real samples was difficult. Therefore, samples of three saccharides (maltose, glucose, and lactose), which have fingerprint spectra similar to those of illicit drugs in the THz band, were analyzed in this study. Maltose has absorption peaks at 1.12 and 1.60 THz, while glucose has a peak at 1.44 THz, and lactose at 1.37 THz. Powders of these saccharides (particle diameter: 30–100 μm) were enclosed in 10 × 10 mm² polyethylene bags as 1-mm-thick samples.

To discriminate these reagent samples when shielded, it is necessary for the algorithm to learn their spectra in advance through various shielding materials. Therefore, our reagent samples were placed under the following five types of shielding materials to generate training data.

-Two pieces of cotton (thickness: 5 mm; attenuation: ≈ 10 dB);
-Two pieces of corrugated cardboard (thickness: 6 mm; attenuation: ≈ 20 dB);
-Two pieces of denim fabric (thickness: 0.7 mm; attenuation: ≈ 30 dB);
-Four pieces of polyurethane cushioning material (thickness: 31 mm; attenuation: ≈ 25 dB);
-Two layers of polyethylene cushioning material and two postage envelopes (thickness: 33 mm; attenuation: ≈ 12 dB).

Here the attenuation at 1.5 THz is shown because the spectroscopic

system used in this study was optimized at 1.5 THz. We also evaluated

whether the system was capable of identifying samples hidden by the

following two types of shielding materials that the algorithms were not

trained on:

-A low-attenuation shielding model consisting of cotton and envelopes

(thickness: 6 mm; attenuation: ≈ 10 dB);

-A high-attenuation shielding model consisting of four pieces of

corrugated cardboard, two pieces of cushioning material

(polyethylene), two pieces of bubble wrap (polyethylene), and two

pieces of envelope (thickness: 35 mm; attenuation: ≈ 65 dB).
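To give a feel for the attenuation figures listed above, the following minimal sketch converts them to transmitted fractions. It assumes the quoted values are power attenuations, i.e., 10·log10 of the power ratio, which the text does not state explicitly; the dictionary keys are informal labels chosen for this illustration.

```python
# Attenuation at 1.5 THz as quoted in the text (dB); the labels are informal.
shields_db = {
    "cotton (trained)": 10,
    "corrugated cardboard (trained)": 20,
    "denim (trained)": 30,
    "polyurethane cushioning (trained)": 25,
    "polyethylene cushioning + envelopes (trained)": 12,
    "low-attenuation model (untrained)": 10,
    "high-attenuation model (untrained)": 65,
}


def db_to_power_fraction(attenuation_db: float) -> float:
    """Transmitted power fraction for a power attenuation given in dB (assumed convention)."""
    return 10 ** (-attenuation_db / 10)


for name, db in shields_db.items():
    print(f"{name:48s} {db:3d} dB -> {db_to_power_fraction(db):.2e}")
```

Under this assumption, the untrained high-attenuation model passes roughly 3 × 10⁻⁷ of the incident THz power, which illustrates why the measured spectra approach the noise level.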

The transmittance spectra of reagents and shielding materials are

shown in Fig. 1. We also show the near-infrared (NIR) detection beam

intensity according to the attenuation rate of the THz-wave, to

demonstrate the degree to which the reagents and shielding attenuated

the THz waves.

Figure 2 shows a schematic diagram of the is-TPG measurement

system. A microchip Nd:YAG laser was used as the pump source, and

an external cavity laser diode (ECLD) was used as the seed source.

The pump and seed beams were injected into a MgO:LiNbO3 crystal under

non-collinear phase-matching conditions to generate the THz wave [8,

9]. The THz wave passed through the sample and then was input to

the MgO:LiNbO3 crystal together with the pump beam for detection.

The THz wave was converted into an NIR detection beam using the

inverse generation process; the detection beam was measured by an

NIR pyro-electric detector. A THz-wave variable attenuator was

inserted into the THz beam path, and the dynamic range was

confirmed to be up to eight orders of magnitude. The tuning

range was about 0.8–2.6 THz.

Normally, the relationship between the THz-wave intensity and

NIR detection beam intensity is not linear. As the input THz-wave

intensity increases, the change in NIR detection beam intensity decreases due

to “saturation” of the parametric gain, as shown in Fig. 1 [12]; therefore,

it is usually necessary to convert the detection beam intensity into THz

wave intensity using a pre-specified equation to obtain the correct

value. However, in this study, we used the unconverted detection beam

intensity, as the exact THz-wave transmittance was not required for

discrimination.

We compared the discrimination accuracy of three machine learning methods: the support vector machine (SVM), k-nearest neighbor (kNN), and random forest (RF) algorithms. These methods are widely used and easy to apply for classification. SVM determines the discriminant function that maximizes the margin (the distance between the decision boundary and the nearest data points of each class) and classifies the data accordingly [20, 21]. kNN finds the k known data points nearest to an unknown data point and classifies it by majority vote [22]. RF takes a majority vote over the predictions of a number of decision trees [23].
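The comparison of the three classifiers can be sketched with scikit-learn as follows. This is an illustrative example only and not the authors' code: the spectra are synthetic (a flat baseline with Gaussian dips at the absorption peaks quoted in the text), and the dip depth, dip width, noise level, uniform frequency grid, and default hyperparameters are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
freqs = np.linspace(1.1, 1.8, 70)  # assumed uniform grid of 70 frequencies (THz)
peaks = {"maltose": [1.12, 1.60], "glucose": [1.44], "lactose": [1.37]}


def synthetic_spectrum(centers, noise=0.05):
    """Toy transmittance: flat baseline, Gaussian absorption dips, additive noise."""
    t = np.ones_like(freqs)
    for c in centers:
        t -= 0.5 * np.exp(-((freqs - c) / 0.03) ** 2)
    return t + rng.normal(0.0, noise, freqs.size)


# 60 synthetic spectra per reagent, in the same order as the labels.
X = np.array([synthetic_spectrum(c) for c in peaks.values() for _ in range(60)])
y = np.repeat(list(peaks), 60)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "SVM": SVC(kernel="rbf"),
    "kNN": KNeighborsClassifier(n_neighbors=3),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # fraction of test spectra classified correctly
    print(name, scores[name])
```

On clean synthetic data such as this, all three methods are expected to perform similarly; differences of the kind reported in the paper only emerge with real, heavily attenuated spectra.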

We applied these machine learning methods using scikit-learn [24]. For the SVM method, we optimized parameter γ of the radial basis function kernel and cost parameter C. γ denotes the degree of complexity of the decision boundary; the larger the value, the more complex the boundary. Cost parameter C is a measure of how much misclassification is allowed; the larger the value of C, the less misclassification is tolerated, and thus the more complex the classification becomes. These parameters were optimized over the range of 1.0 × 10⁻⁵ to 1.0 × 10⁴ using cross-validation and grid search [25], which allowed us to test every parameter combination on the grid. For the kNN algorithm, we optimized parameter k over the range of 1–5 and the weight parameter (“uniform” or “distance”) by cross-validation and grid search [25]. Parameter k describes how many points in close proximity are used for the majority vote, and the weight parameter specifies whether the votes are weighted by distance; “uniform” indicates no weighting, and “distance” indicates weighting by the inverse of the distance. For the RF method, the number of decision trees was optimized from 10 to 150. The data were normalized before learning and identification for all machine learning methods. The source code used in this study is available at https://github.com/knhiroki/program.

Fig. 1. Transmission spectra of the (a) reagents, (b) trained shielding materials, and (c) untrained shielding materials used in this study. The near-infrared (NIR) detection beam intensities according to the attenuation rate of the terahertz (THz) wave are also shown. [Panels plot transmittance versus frequency (THz). Curves: maltose, glucose, and lactose in (a); cotton, corrugated cardboard, denim, polyurethane cushioning material, and polyethylene cushioning material with envelope in (b); low- and high-attenuation shielding models in (c). Reference levels in 10-dB steps and the noise level are marked.]

Fig. 2. THz spectroscopic system using an injection-seeded THz parametric generator (is-TPG). (HWP: half wave plate; ECLD: external cavity laser diode; SOA: semiconductor optical amplifier; NIR: near-infrared.) [Diagram labels: microchip Nd:YAG laser (1064.4 nm, 450 ps, 50 Hz, 0.7 mJ/pulse); Nd:YAG amplifier (0.7 mJ → 18 mJ); seed beam from ECLD and SOA (1068–1075 nm, CW, 400 mW); PBS; HWP; grating (1200 L/mm); cylindrical lens (f = 100 mm); MgO:LiNbO3 crystals with Si prism in the is-TPG and in the THz parametric detector; sample in the THz-wave path; NIR pyroelectric detector; lock-in amplifier and PC, synchronized with the pump laser.]
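The parameter search described above could be set up along the following lines. This is a minimal sketch, not the authors' published code (which is linked in the text): the toy data, the use of `StandardScaler` as the normalization step, the three-fold cross-validation, the ten logarithmically spaced grid values, and the tree counts `[10, 50, 100, 150]` are all assumptions made for illustration. Only the parameter ranges come from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy stand-in for spectra: 3 classes, 70 "frequencies", class-specific dips.
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 70))
y = np.repeat([0, 1, 2], 30)
X[y == 1, 20:25] -= 2.0
X[y == 2, 40:45] -= 2.0

powers = np.logspace(-5, 4, 10)  # 1e-5, 1e-4, ..., 1e4 (grid spacing is an assumption)

searches = {
    "SVM": GridSearchCV(
        make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        {"svc__gamma": powers, "svc__C": powers},
        cv=3,
    ),
    "kNN": GridSearchCV(
        make_pipeline(StandardScaler(), KNeighborsClassifier()),
        {
            "kneighborsclassifier__n_neighbors": [1, 2, 3, 4, 5],
            "kneighborsclassifier__weights": ["uniform", "distance"],
        },
        cv=3,
    ),
    "RF": GridSearchCV(
        make_pipeline(StandardScaler(), RandomForestClassifier(random_state=0)),
        {"randomforestclassifier__n_estimators": [10, 50, 100, 150]},
        cv=3,
    ),
}

for name, search in searches.items():
    search.fit(X, y)  # exhaustively evaluates every grid point with cross-validation
    print(name, search.best_params_, round(search.best_score_, 3))
```

Placing the scaler inside the pipeline means it is refit on each training fold, so no information from the validation fold leaks into the normalization.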

All three of the supervised machine learning algorithms had to be

trained in advance using a large amount of data. In total, 852

transmission spectra (n = 317, 275, and 260 for lactose, maltose and

glucose, respectively) were acquired through the five types of shields.

Among those spectra, we used 627 spectra for training data and 225

spectra for test data.
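The split sizes quoted above can be reproduced with a short sketch. The spectra here are zero-filled placeholders, and the random, class-stratified split is an assumption; the text does not say how the 627 training and 225 test spectra were chosen.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Spectrum counts per reagent as reported in the text.
counts = {"lactose": 317, "maltose": 275, "glucose": 260}
y = np.concatenate([np.full(n, name) for name, n in counts.items()])
X = np.zeros((y.size, 70))  # placeholder for 70-frequency spectra

# An integer test_size requests an absolute number of test samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=225, random_state=0, stratify=y
)
print(X_train.shape, X_test.shape)  # (627, 70) (225, 70)
```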

We also prepared the spectra of reagents under untrained

shielding materials as test data; there were 17 samples for the low-

attenuation shield (6 maltose, 6 glucose, and 5 lactose samples), and

45 for the high-attenuation shield (15 maltose, 15 glucose, and 15

lactose samples).

Representative absorption spectra of the reagents obtained through

each shield are shown in Fig. 3. Each spectrum was acquired within 1

min. Identification was performed using 70 frequencies from 1.1 to

1.8 THz. Although the absorption peaks of each reagent are evident in

the figures, some of them differ in shape from the pure absorption

peaks, due to the disruption of the waveforms by the shielding

materials. Moreover, when the high-attenuation shield was used, the

waveform was almost equivalent to the noise level.

III. RESULTS

The reagent identification results through the trained and untrained

shields are shown in the upper and lower parts of Table 1, respectively.

The three machine learning methods were compared. The values in

the table reflect the accuracy of the test data identification. Through the

trained shields, all methods identified the reagents with nearly 100%

accuracy. Thus, we confirmed that the spectroscopic system used in

this study could discriminate reagents through the trained shields.

Through the low-attenuation untrained shield, all learning methods

achieved 100% accuracy; through the high-attenuation untrained

shield, the SVM, kNN and RF algorithms achieved 88.9%, 77.8%,

and 80.0% accuracy, respectively. The 100% accuracy for the

untrained low-attenuation shield was attributed to the spectra being

similar to those obtained through the trained shields. In contrast,

although the spectra obtained through the high-attenuation shield were

close to the noise level and the original spectral shape was not

maintained, 88.9% accuracy was achieved.
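As a quick arithmetic check, the percentages reported for the high-attenuation shield are consistent with whole numbers of correctly identified spectra out of the 45 test spectra. The counts below are inferred from the percentages and are not stated in the paper.

```python
n_test = 45  # high-attenuation test spectra (15 per reagent)
reported = {"SVM": 88.9, "kNN": 77.8, "RF": 80.0}  # accuracies from the text (%)

implied = {}
for name, pct in reported.items():
    correct = round(pct / 100 * n_test)        # nearest whole number of spectra
    implied[name] = correct
    back = round(100 * correct / n_test, 1)    # recompute the percentage
    print(f"{name}: {correct}/{n_test} = {back} %")
# SVM: 40/45 = 88.9 %
# kNN: 35/45 = 77.8 %
# RF: 36/45 = 80.0 %
```

On this reading, the gap between SVM and kNN corresponds to five spectra, which is worth keeping in mind when comparing methods on a test set of this size.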

We obtained a reagent discrimination rate of more than 88%,

regardless of the attenuation rate of the shielding material. SVM

combined with is-TPG spectroscopy showed the highest performance

among the learning methods used for identifying reagents through

shields. Using the same system but with conventional identification

methods, it was difficult to identify both the low-attenuation samples

and samples that were buried in noise.

Next, we attempted to reduce the number of measurement points

(i.e., frequencies) to make the analysis more efficient. The RF machine

learning method provides information on the contribution ratios of

each frequency to the identification results, as shown in Fig. 4. From a

total of 70 frequencies, only the top 7 frequencies (i.e., those making

the largest contributions) were selected. Table 2 shows the reagent

identification results obtained through the two untrained (low- and

high-attenuation) shielding materials. The discrimination accuracy of

the SVM, kNN, and RF algorithms was 100% for the low-attenuation

shield, compared to 77.8%, 66.7%, and 77.8%, respectively, for the

high-attenuation shield. Although the overall accuracy was lower than

that when using all frequencies, nearly 80% accuracy was achieved

with the SVM and RF methods. Considering the expected applications, such as the identification of chemicals inside mail envelopes or parcels, an accuracy rate of 80% would likely be acceptable, given that no other method is currently available. If seven frequencies are sufficient for discrimination, significant time savings could be achieved.

TABLE I
DISCRIMINATION ACCURACY FOR VARIOUS SAMPLES.
SVM: support vector machine; kNN: k-nearest neighbor; RF: random forest.

Shielding materials                                      SVM      kNN      RF
Trained (average of 5 kinds of shielding)                98.7 %   96.9 %   98.2 %
Untrained, low-attenuation shielding model (−10 dB)      100 %    100 %    100 %
Untrained, high-attenuation shielding model (−65 dB)     88.9 %   77.8 %   80.0 %

Fig. 3. Typical transmission spectra of the reagents obtained through various shielding materials. The transmittance was calculated based on the NIR detection beam output and was not converted to THz-wave transmittance. [Panels plot transmittance versus frequency (THz) for maltose, glucose, and lactose. Included in training data set: denim, corrugated cardboard, cotton, polyurethane cushioning material, and polyethylene cushioning material with envelope. Not included in training data: low-attenuation shielding model (−10 dB) and high-attenuation shielding model (−65 dB).]

We are currently working on the simultaneous generation of

multiple THz wavelengths using the is-TPG system to achieve real-

time spectroscopy [8, 26]. Using the frequencies with large

contribution ratios in the RF method, real-time and highly accurate

discrimination can be realized.
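The frequency-reduction step described above (ranking frequencies by their RF contribution and keeping the top seven) might look like the following in scikit-learn. This is a hedged sketch on synthetic spectra: the data generator, the default hyperparameters, and the use of `feature_importances_` as the "contribution ratio" are assumptions, not the authors' confirmed procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
freqs = np.linspace(1.1, 1.8, 70)  # assumed uniform frequency grid (THz)
peaks = {"maltose": [1.12, 1.60], "glucose": [1.44], "lactose": [1.37]}


def synthetic_spectrum(centers, noise=0.05):
    """Toy transmittance: flat baseline, Gaussian absorption dips, additive noise."""
    t = np.ones_like(freqs)
    for c in centers:
        t -= 0.5 * np.exp(-((freqs - c) / 0.03) ** 2)
    return t + rng.normal(0.0, noise, freqs.size)


X = np.array([synthetic_spectrum(c) for c in peaks.values() for _ in range(60)])
y = np.repeat(list(peaks), 60)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Rank frequencies by random-forest importance and keep the seven largest.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
top7 = np.argsort(rf.feature_importances_)[::-1][:7]
print("selected frequencies (THz):", np.round(np.sort(freqs[top7]), 2))

# Retrain a classifier that only sees the selected frequencies.
svm = SVC(kernel="rbf").fit(X_train[:, top7], y_train)
reduced_accuracy = svm.score(X_test[:, top7], y_test)
print("accuracy with 7 frequencies:", reduced_accuracy)
```

Because the importances are computed on the training split only, the held-out accuracy is not biased by the selection step.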

Finally, spectroscopic imaging was performed using the proposed

system. First, a sample concealed by a high-attenuation shielding

model was placed at the THz focal point and raster-scanned using an

X-Y stage with a spatial resolution of about 1 mm, as shown in Fig. 5.

The seven frequencies with the highest contribution ratios in the RF

method were applied. Notably, the selected frequencies differed from

those shown in Fig. 4, as the relative contributions of the frequencies

varied slightly depending on the given dataset. The spectral results at

each measurement point were classified by SVM, and spectroscopic

imaging was then performed. To classify the shields and reagents, the

spectral data of many shield types were included as a new class in the

training dataset (the algorithms were not trained on the high

attenuation shield itself).
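The per-pixel classification behind the spectroscopic image can be sketched as below. Everything here is hypothetical: the class "signatures" are random seven-frequency vectors, the 20 × 30 pixel layout and noise level are invented, and the classifier uses default settings. The sketch only shows the mechanics of adding a "shield" class, classifying each pixel's spectrum, and reshaping the predictions into a map.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_freq = 7
classes = ["shield", "maltose", "glucose", "lactose"]  # "shield" is the extra background class

# Hypothetical mean transmittance at the seven selected frequencies for each class.
signatures = rng.uniform(0.2, 1.0, size=(len(classes), n_freq))


def measure(class_idx, n, noise=0.05):
    """Simulate n noisy seven-frequency measurements of one class."""
    return signatures[class_idx] + rng.normal(0.0, noise, size=(n, n_freq))


X_train = np.vstack([measure(i, 50) for i in range(len(classes))])
y_train = np.repeat(np.arange(len(classes)), 50)
clf = SVC(kernel="rbf").fit(X_train, y_train)

# Invented ground-truth layout: three reagent patches on a shield-only background.
truth = np.zeros((20, 30), dtype=int)
truth[6:14, 2:10] = 1    # maltose
truth[6:14, 11:19] = 2   # glucose
truth[6:14, 20:28] = 3   # lactose

# Raster scan: one seven-frequency spectrum per pixel, then classify and reshape.
scan = signatures[truth] + rng.normal(0.0, 0.05, size=truth.shape + (n_freq,))
class_map = clf.predict(scan.reshape(-1, n_freq)).reshape(truth.shape)
pixel_accuracy = (class_map == truth).mean()
print("pixel accuracy:", pixel_accuracy)
```

A pixel-wise accuracy computed this way counts background pixels as well as reagent pixels, so it is not directly comparable with the per-spectrum accuracies in the tables.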

We were able to reveal the spatial distribution of the reagents, even

through untrained, high-attenuation shielding materials, as shown in

Fig. 6(a). In the overlay of the photograph shown in Fig. 6(b), it can be

seen that some pixels were misrecognized. For example, point "A"

was misidentified as maltose, even though it was a shielding material.

Although there was no absorption at 1.1 THz, transmittance was high

at 1.44 THz [Fig. 6(c)], suggesting that it was misidentified as maltose.

On the other hand, point "B" was misidentified as a shield, even though

it was glucose. Figure 6(d) shows that the absorption of "B" at 1.44

THz was low; it was misidentified because its waveform resembled

that of the shielding material. This error could be due to differences in

the amount of sample powder in the plastic bags. The identification

accuracy was about 63% based on Fig. 6(a). The accuracy of

spectroscopic imaging was lower than indicated in Table 2, because

areas without reagents had to be identified as background (i.e., one

more target had to be identified). Moreover, the sample thickness was

low in some places, making it difficult to obtain sufficient information

from some of the point locations.

In this case, we performed measurements using the sample that attenuated the THz wave to almost the noise level, which resulted in some misidentified points. As shown in Tables 1 and 2, spectroscopic imaging was more accurate through low-attenuation shields. In addition, improvement in the dynamic range using a highly sensitive multi-stage THz parametric detector [13] would limit misidentification, even through high-attenuation shields.

Fig. 4. Example contribution ratios for each frequency obtained using the random forest algorithm. The top seven frequencies are shown in red. Frequencies near the absorption peaks make larger contributions. [Plot of contribution rate versus frequency (THz).]

TABLE II
IDENTIFICATION RESULTS USING ONLY SEVEN FREQUENCIES.

Shielding materials (untrained), identified with 7 frequencies   SVM      kNN      RF
Low-attenuation shielding model (−10 dB)                         100 %    100 %    100 %
High-attenuation shielding model (−65 dB)                        77.8 %   66.7 %   77.8 %

Fig. 5. Shielding materials and reagent samples used for the spectroscopic imaging measurements. Reagents were sandwiched between four kinds of shielding materials that attenuated the THz wave (to −65 dB at 1.5 THz).

Fig. 6. (a) Spectroscopic imaging results showing the spatial distribution of maltose, glucose, and lactose. The spatial resolution was about 1 mm and the measurement time for this image was less than 2 h. (b) Imaging results overlaid on the samples. (c) Comparison of the spectra obtained at point A in (b), which was misidentified as maltose, with the spectra of the shield and maltose. (d) Comparison of the spectra obtained at point B in (b), which was misidentified as a shield, with the spectra of the shield and glucose. [Panels (c) and (d) plot transmittance (a.u.) versus frequency (THz); the reference curves are the shield and the maltose or glucose spectra from the training data.]

IV. CONCLUSION

We combined machine learning with is-TPG spectroscopic

measurements to identify reagents hidden by various shielding

materials. By training the algorithms on a large amount of

spectroscopic data, sufficient discrimination accuracy was obtained;

this was also the case through untrained shields. Three machine

learning algorithms were compared: SVM, kNN, and RF; SVM

showed the best discrimination performance. To shorten the

measurement time and allow the use of a multi-wavelength THz source,

identification was also performed using only the top seven

frequencies ranked by the RF machine learning method. Finally,

spectroscopic imaging of the reagents through untrained high-

attenuation shielding materials was performed; accurate spatial

distributions of the reagents were obtained. In this study, we used SVM,

kNN, and RF because the amount of data was not particularly large;

however, if the dataset had been considerably larger, a deep neural

network would have been an appropriate option. We believe that

machine learning is essential to identify illicit drugs and other

substances hidden in packages using THz-wave methods, and we are

confident that this research will contribute to the future development

of THz-wave applications.

ACKNOWLEDGMENT

The authors appreciate the assistance with the experiments and the useful

discussions with Mr. T. Horiuchi and Mr. R. Mitsuhashi.

REFERENCES

[1] I. Drakos, P. Kenny, T. Fearn, and R. Speller, “Multivariate analysis of

energy dispersive X-ray diffraction data for the detection of illicit drugs

in border control,” Crime Sci., vol. 6, no. 1, p. 1, 2017.

[2] K. Kawase, Y. Ogawa, Y. Watanabe, and H. Inoue, “Non-destructive

terahertz imaging of illicit drugs using spectral fingerprints,” Opt.

Express, vol. 11, no. 20, pp. 2549–2554, 2003.

[3] U. Puc, A. Abina, M. Rutar, A. Zidanšek, A. Jeglič, and G. Valušis,

“Terahertz spectroscopic identification of explosive and drug simulants

concealed by various hiding techniques,” Appl. Opt., vol. 54, no. 14, pp.

4495–4502, May 2015.

[4] V. A. Trofimov and S. A. Varentsova, “Detection and identification of

drugs under real conditions by using noisy terahertz broadband pulse,”

Appl. Opt., vol. 55, no. 33, pp. 9605–9618, Nov. 2016.

[5] P. Dean et al., “Absorption-sensitive diffuse reflection imaging of

concealed powders using a terahertz quantum cascade laser,” Opt.

Express, vol. 16, no. 9, pp. 5997–6007, Apr. 2008.

[6] M. Bauer et al., “Antenna-coupled field-effect transistors for multi-

spectral terahertz imaging up to 4.25 THz,” Opt. Express, vol. 22, no.

16, pp. 19235–19241, Aug. 2014.

[7] M. Kato, S. R. Tripathi, K. Murate, K. Imayama, and K. Kawase, “Non-

destructive drug inspection in covering materials using a terahertz

spectral imaging system with injection-seeded terahertz parametric

generation and detection,” Opt. Express, vol. 24, no. 6, p. 6425, Mar.

2016.

[8] K. Murate and K. Kawase, “Perspective: Terahertz wave parametric

generator and its applications,” J. Appl. Phys., vol. 124, no. 16, p.

160901, Oct. 2018.

[9] S. Hayashi, K. Nawata, T. Taira, J. Shikata, K. Kawase, and H.

Minamide, “Ultrabright continuously tunable terahertz-wave

generation at room temperature,” Sci. Rep., vol. 4, p. 5045, Jun. 2014.

[10] K. Kawase, J. Shikata, and H. Ito, “Terahertz wave parametric source,”

J. Phys. D: Appl. Phys., vol. 35, no. 3, p. R1, 2002.

[11] Y. Takida, K. Nawata, and H. Minamide,

“Security screening system based on terahertz-wave spectroscopic gas

detection,” Opt. Express, vol. 29, no. 2, pp. 2529–2537, Jan. 2021.

[12] K. Murate et al., “A High Dynamic Range and Spectrally Flat Terahertz

Spectrometer Based on Optical Parametric Processes in LiNbO3,” IEEE

Trans. Terahertz Sci. Technol., vol. 4, no. 4, pp. 523–526, Jul. 2014.

[13] H. Sakai, K. Kawase, and K. Murate, “Highly sensitive multi-stage

terahertz parametric detector,” Opt. Lett., vol. 45, no. 14, pp. 3905–

3908, Jul. 2020.

[14] D. S. Bulgarevich, M. Talara, M. Tani, and M. Watanabe, “Machine

learning for pattern and waveform recognitions in terahertz image data,”

Sci. Rep., vol. 11, Jan. 2021.

[15] H. Ge, Y. Jiang, Z. Xu, F. Lian, Y. Zhang, and S. Xia, “Identification

of wheat quality using THz spectrum,” Opt. Express, vol. 22, no. 10,

pp. 12533–12544, May 2014.

[16] Y. Sun et al., “Quantitative characterization of bovine serum albumin

thin-films using terahertz spectroscopy and machine learning methods,”

Biomed. Opt. Express, vol. 9, no. 7, pp. 2917–2929, Jul. 2018.

[17] C. Cao, Z. Zhang, X. Zhao, and T. Zhang, “Terahertz spectroscopy and

machine learning algorithm for non-destructive evaluation of protein

conformation,” Opt. Quantum Electron., vol. 52, no. 4, p. 225, Apr.

2020.

[18] W. Liu et al., “Automatic recognition of breast invasive ductal

carcinoma based on terahertz spectroscopy with wavelet packet

transform and machine learning,” Biomed. Opt. Express, vol. 11, no. 2,

pp. 971–981, Feb. 2020.

[19] R. Mitsuhashi, K. Murate, S. Niijima, T. Horiuchi, and K. Kawase,

“Terahertz tag identifiable through shielding materials using machine

learning,” Opt. Express, vol. 28, no. 3, pp. 3517–3527, Feb. 2020.

[20] C. J. C. Burges, “A Tutorial on Support Vector Machines for Pattern

Recognition,” Data Min. Knowl. Discov., vol. 2, no. 2, pp. 121–167,

Jun. 1998.

[21] C. Bishop, Pattern Recognition and Machine Learning. New York:

Springer-Verlag, 2006.

[22] S. A. Dudani, “The Distance-Weighted k-Nearest-Neighbor Rule,”

IEEE Trans. Syst. Man Cybern., vol. SMC-6, no. 4, pp. 325–327, Apr.

1976.

[23] L. Breiman, “Random Forests,” Mach. Learn., vol. 45, no. 1, pp. 5–32,

Oct. 2001.

[24] F. Pedregosa et al., “Scikit-learn: Machine Learning in Python,” J.

Mach. Learn. Res., vol. 12, no. 85, pp. 2825–2830, 2011.

[25] J. Bergstra and Y. Bengio, “Random search for hyper-parameter

optimization,” J. Mach. Learn. Res., vol. 13, pp. 281–305, Feb.

2012.

[26] K. Murate, S. Hayashi, and K. Kawase, “Multiwavelength terahertz-

wave parametric generator for one-pulse spectroscopy,” Appl. Phys.

Express, vol. 10, no. 3, p. 032401, Feb. 2017.

Kosuke Murate received the B.S., M.S., and Ph.D. degrees from Nagoya University, Japan, in 2013, 2015, and 2018, respectively. He has been an assistant professor at Nagoya University since 2018. He received the Ikushi Prize from the Japan Society for the Promotion of Science in 2018.

Hiroki Kanai received the B.S. degree from the Department of Electrical Engineering and Electronics, and Information Engineering in 2020, and is now a master's student in the Department of Electronics, Graduate School of Engineering, Nagoya University, Japan.

Kodo Kawase received the B.S. degree from Kyoto University in 1989 and the Ph.D. degree from Tohoku University in 1996. He became a team leader at RIKEN in 2001 and a Professor at Nagoya University in 2005. He received the 2005 Young Scientists’ Prize from the Minister of Education.
