
LEQNet: Light Earthquake Deep Neural Network for Earthquake Detection and Phase Picking

July 2022
10:848237

DOI:10.3389/feart.2022.848237

License
CC BY 4.0


Developing seismic signal detection and phase picking is an essential step for an on-site early earthquake warning system. A few deep learning approaches have been developed to improve the accuracy of seismic signal detection and phase picking. To run the existing deep learning models, high-throughput computing resources are required. In addition, the deep learning architecture must be optimized for mounting the model in small devices using low-cost sensors for earthquake detection. In this study, we designed a lightweight deep neural network model that operates on a very small device. We reduced the size of the deep learning model using the deeper bottleneck, recursive structure, and depthwise separable convolution. We evaluated our lightweight deep learning model using the Stanford Earthquake Dataset and compared it with EQTransformer. While our model size is reduced by 87.68% compared to EQTransformer, the performance of our model is comparable to that of EQTransformer.




Jongseong Lim†, Sunghun Jung†, Chan JeGal†, Gwanghoon Jung, Jung Ho Yoo, Jin Kyu Gahm* and Giltae Song

Division of Artificial Intelligence, Pusan National University, Busan, South Korea

School of Computer Science and Engineering, Pusan National University, Busan, South Korea

Department of Industrial Engineering, Pusan National University, Busan, South Korea

FACDAMM, Busan, South Korea


Keywords: seismic wave, earthquake detection, lightweight technology, deep learning, convolutional neural network

1 INTRODUCTION

Detection of seismic signals is essential for an on-site early earthquake warning system (EEWS). Major approaches for seismic signal detection, such as short-time average/long-time average (STA/LTA) (Allen, 1978), require an advance analysis of the ambient noise and structural vibration at each site and the tuning of threshold values by trial and error to reduce false detections. Because an on-site EEWS needs multiple seismic sensors together with data acquisition and management systems, the sensing systems must be cost-effective, and detection should work without any pre-analysis of on-site ambient noise.

Traditional approaches to detecting P/S waves, such as the STA/LTA and AIC (Maeda, 1985) techniques, process seismic signals using short-term and long-term averages of the amplitudes of continuous seismic waveforms. This process cannot be fully automated because the various kinds of noise caused by the geographical location and environment of the seismic monitoring equipment need to be removed. To overcome these issues, several studies have detected seismic signals using machine learning and deep learning (Ross et al., 2019; Zhu et al., 2019; Mousavi et al., 2020). Among these methods, EQTransformer is considered to perform the best (Mousavi et al., 2020). The EQTransformer structure consists of one encoder and three decoders, which compress and restore the data, connected by long short-term memory (LSTM) and an attention mechanism. The EQTransformer model outputs probabilistic statistics for earthquakes and P/S waves (Figure 1).

Edited by: Said Gaci, Algerian Petroleum Institute, Algeria

Reviewed by: Stephen Wu, Institute of Statistical Mathematics, Japan; Chong Xu, Ministry of Emergency Management, China

*Correspondence: Jin Kyu Gahm, gahmj@pusan.ac.kr; Giltae Song, gsong@pusan.ac.kr

†These authors have contributed equally to this work and share first authorship

Specialty section: This article was submitted to Geohazards and Georisks, a section of the journal Frontiers in Earth Science

Received: 04 January 2022

Accepted: 17 May 2022

Published: 04 July 2022

Citation: Lim J, Jung S, JeGal C, Jung G, Yoo JH, Gahm JK and Song G (2022) LEQNet: Light Earthquake Deep Neural Network for Earthquake Detection and Phase Picking. Front. Earth Sci. 10:848237. doi: 10.3389/feart.2022.848237

Frontiers in Earth Science | www.frontiersin.org July 2022 | Volume 10 | Article 8482371

ORIGINAL RESEARCH

published: 04 July 2022

doi: 10.3389/feart.2022.848237

Although the EQTransformer model detects earthquakes and picks P/S phases with high accuracy, it is almost impossible to mount the model on small IoT (Internet of Things) devices, such as earthquake detection sensors, that have limited computing resources.

In this study, we propose a lightweight seismic signal detection model called LEQNet that can operate even on ultra-small devices. We applied several lightweight deep learning techniques, namely the recursive structure, the deeper bottleneck, and depthwise and pointwise separable convolutions, to reduce the size of the deep learning model. We evaluated our lightweight model using the STEAD dataset and compared it with EQTransformer (Mousavi et al., 2020). Compared to EQTransformer, LEQNet reduced the number of parameters substantially with no significant performance degradation. The LEQNet model can therefore detect seismic signals even on small devices. The source code of LEQNet is available at https://github.com/LEQNet/LEQNet.

2 MATERIALS AND METHODS

2.1 Datasets and Preprocessing

STEAD (Mousavi et al., 2019), a high-quality, large-scale, global dataset, was used in this experiment. STEAD consists of seismic waveforms recorded at epicentral distances of less than 350 km and noise waveforms without seismic signals. In total, about 1.2M waveforms are provided, covering roughly 450K earthquakes plus noise recordings, or about 19,000 h of data. Each waveform contains 6,000 samples, recorded at 100 Hz for 1 min, on three channels: E on the east–west axis, N on the north–south axis, and Z perpendicular to the ground. Based on the discussion of EQTransformer, in which the size of the training dataset did not significantly affect the performance, the earthquake data and noise data were each under-sampled to 50,000 waveforms. The two classes were kept in equal ratio to resolve the data imbalance when building the deep learning detection model. The data were split into training, validation, and test sets in the common ratio of 8:1:1.
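The under-sampling and 8:1:1 split described above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the function name `balanced_split`, the label encoding (0 = noise, 1 = earthquake), and the toy class sizes are assumptions made for the example.

```python
import numpy as np

def balanced_split(labels, per_class=50_000, ratios=(0.8, 0.1, 0.1), seed=0):
    """Under-sample every class to `per_class` items, then split 8:1:1.

    Returns three arrays of indices into `labels` (train, validation, test).
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    picked = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        # Draw without replacement so no waveform is used twice.
        picked.append(rng.choice(idx, size=per_class, replace=False))
    picked = rng.permutation(np.concatenate(picked))
    n_train = round(ratios[0] * len(picked))
    n_val = round(ratios[1] * len(picked))
    return picked[:n_train], picked[n_train:n_train + n_val], picked[n_train + n_val:]

# Toy labels (hypothetical sizes): 300 noise traces and 900 earthquake traces.
labels = np.array([0] * 300 + [1] * 900)
train, val, test = balanced_split(labels, per_class=200)
```

With `per_class=50_000` on the real dataset, the same call would yield 80,000 training, 10,000 validation, and 10,000 test indices.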

To add more diverse situations to the training data, we applied data augmentation techniques (Van Dyk and Meng, 2001). The augmentation techniques, each applied with a random probability, include adding events, shifting sequences, adding noise, deleting channels, and adding empty sequences.
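A few of the augmentations listed above (shifting, adding noise, deleting a channel) might look like the sketch below. The probability `p`, the noise scale, and the use of a circular shift are assumptions chosen for illustration; the source does not give these details.

```python
import numpy as np

def augment(wave, rng, p=0.5, noise_std=0.1):
    """Randomly augment one waveform of shape (n_samples, 3) for E, N, Z.

    Each augmentation is applied independently with probability `p`.
    """
    out = wave.copy()
    if rng.random() < p:
        # Shift the whole sequence in time (circular shift, an assumption).
        out = np.roll(out, int(rng.integers(0, len(out))), axis=0)
    if rng.random() < p:
        # Add Gaussian noise scaled to the signal's standard deviation.
        out = out + rng.normal(0.0, noise_std * out.std(), size=out.shape)
    if rng.random() < p:
        # Delete one randomly chosen channel by zeroing it.
        out[:, int(rng.integers(0, out.shape[1]))] = 0.0
    return out

rng = np.random.default_rng(0)
wave = rng.normal(size=(6000, 3))   # synthetic stand-in for a 60 s, 100 Hz trace
augmented = augment(wave, rng, p=1.0)
```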

2.2 Performance Evaluation

2.2.1 Confusion Matrix

The confusion matrix is typically used as an evaluation index in binary classification problems. When real seismic waves are detected as earthquakes, the result is regarded as a true positive (TP). Conversely, if they are not detected, it is a false negative (FN). Noise data are a true negative (TN) when predicted as noise and a false positive (FP) when called earthquakes. For phase picking, a pick is counted as a TP when the arrival time predicted by the model is within 0.5 s of the actual P- or S-wave arrival time. When actual P and S waves do not exist and our model does not call P and S arrivals, the result is regarded as a TN. We calculated precision and recall as shown in Eqs 1 and 2. The F1-score, that is, the harmonic mean of precision and recall, was calculated as presented in Eq. 3.

precision TP

TP +FP,(1)

recall TP

TP +FN,(2)

F1−score 2×precision ×recall

precision +recall.(3)
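Eqs 1 to 3 and the 0.5 s picking tolerance can be written as a short sketch. The function names and the example counts are hypothetical, and treating a pick exactly 0.5 s away as a true positive is an assumption, since the source only says "within 0.5 s".

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1

def is_true_positive_pick(predicted_s, actual_s, tolerance_s=0.5):
    """A phase pick counts as TP if it lies within the tolerance of the label."""
    return abs(predicted_s - actual_s) <= tolerance_s

# Hypothetical counts, for illustration only.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
```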

FIGURE 1 | EQTransformer takes input signals, detects earthquakes in the waveforms, marks P/S wave arrivals, and outputs their probabilistic statistics. There are four plots in the output. Light blue lines indicate the earthquake locations in the three upper plots. In the last output plot, a green dotted line indicates the probability of an earthquake, a blue dotted line indicates the probability for the starting point of the P wave, and a red dotted line indicates the probability for the starting point of the S wave.


Lim et al. Light Earthquake Deep Neural Network

2.2.2 Information Density

Information density is used to evaluate the lightweight technology. When the accuracy of the model is denoted as a(N) and the number of parameters of the model as p(N), the information density D(N) is calculated as shown in Eq. 4. This evaluation considers both the number of parameters and the accuracy.

$$D(N) = \frac{a(N)}{p(N)}. \qquad (4)$$
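Eq. 4 is a single division, but a small sketch makes one property explicit: when two models have the same accuracy, the ratio of their information densities is just the inverse ratio of their parameter counts. The parameter counts below are the totals reported in this paper; the accuracy value of 99.0 is a placeholder assumption, not a reported result.

```python
def information_density(accuracy, n_params):
    """Information density D(N) = a(N) / p(N)."""
    return accuracy / n_params

# Parameter totals from the paper; the shared accuracy is a hypothetical value.
ASSUMED_ACCURACY = 99.0
d_eqt = information_density(ASSUMED_ACCURACY, 323_063)
d_leq = information_density(ASSUMED_ACCURACY, 39_776)

# With equal accuracy the gain depends only on the parameter counts.
gain = d_leq / d_eqt
```

Under this equal-accuracy assumption the gain is about 8.1; a reported gain can differ slightly when the two accuracies are not identical.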

2.2.3 Netscore

To consider the amount of computation and inference speed, we measured the Netscore Ω(N), defined in Eq. 5 (Wong, 2019). The number of multiply-accumulate operations of the model is denoted by m(N). α, β, and γ are coefficients that control the influence of network accuracy, model complexity, and computational complexity, respectively. They are usually set to α = 2, β = 0.5, and γ = 0.5 (Wong, 2019). As a model becomes lighter, it has fewer parameters and fewer operations, so if its performance remains unchanged, the Netscore increases. Thus, for comparable accuracy, a higher Netscore indicates a lighter model.

$$\Omega(N) = 20 \log\left(\frac{a(N)^{\alpha}}{p(N)^{\beta}\, m(N)^{\gamma}}\right). \qquad (5)$$
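A minimal sketch of Eq. 5 follows. It assumes a base-10 logarithm and accuracy expressed in percent, which is consistent with the theoretical maximum of 80 mentioned later in the paper (a = 100, α = 2, p = m = 1). The function name and the example inputs are illustrative.

```python
import math

def netscore(accuracy, n_params, n_macs, alpha=2.0, beta=0.5, gamma=0.5):
    """Netscore = 20 * log10(a^alpha / (p^beta * m^gamma)).

    Evaluated in log space to avoid very large intermediate values.
    """
    return 20.0 * (
        alpha * math.log10(accuracy)
        - beta * math.log10(n_params)
        - gamma * math.log10(n_macs)
    )

# Upper bound: perfect accuracy with a single parameter and a single operation.
upper_bound = netscore(100.0, 1, 1)

# Hypothetical models with equal accuracy: the lighter one scores higher.
heavy = netscore(99.0, n_params=300_000, n_macs=80_000_000)
light = netscore(99.0, n_params=40_000, n_macs=5_000_000)
```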

2.3 Baseline Model EQTransformer

Figure 2A illustrates the structure of EQTransformer, which comprises an encoder, decoders, ResNet, and LSTM & transformer. The encoder consists of seven convolutional neural network (CNN) layers. It reduces the STEAD seismic signal data to a lower dimension along the signal length, which reduces the number of parameters in the model. Each decoder consists of seven CNN layers and restores the extracted features to the original dimension. EQTransformer has three decoders, for P-phase picking, S-phase picking, and earthquake detection. The ResNet part of EQTransformer consists of five residual blocks. Each residual block adds its input to the output of two sequentially connected CNN layers. This avoids the performance degradation caused by a large number of layers. The LSTM learns sequential features extracted by the encoder. The transformer carries the characteristics of the seismic signals to each decoder (for P, S, and earthquake detection). Because it focuses on maximizing predictive accuracy, EQTransformer comprises many layers and more than 300k learnable parameters.

Figure 2A and Table 1 show the structure of EQTransformer, which uses a total of 323,063 parameters: 34,672 for the encoder, 109,696 for ResNet, and 42,372 for LSTM & transformer. Because EQTransformer uses separate decoders for earthquake, P-wave, and S-wave detection, the decoders use 136,323 parameters. As a share of the total, the encoder accounts for 11% of the parameters, ResNet for 34%, LSTM & transformer for 13%, and the decoders for 42%. The parts with the most parameters are the encoder, the decoders, and ResNet, all of which are built from CNN layers and which together account for 87% of the total model parameters. The lightweight methods used in this study simplify this CNN-based layer structure with no significant performance degradation.

FIGURE 2 | Structure of the EQTransformer. (A) EQTransformer is divided into four main parts: encoder, decoder, ResNet, and LSTM & transformer. (B) Structure of our LEQNet. LEQNet comprises four main parts: an encoder that uses the depthwise separable CNN with recurrent CNN, a decoder that uses the depthwise separable CNN with recurrent CNN, residual bottleneck CNN, and LSTM & transformer.

TABLE 1 | Number of parameters in the EQTransformer and LEQNet.

| Section | EQTransformer | LEQNet |
|---|---|---|
| Encoder | 34,672 | 1,289 |
| ResNet | 109,696 | 3,712 |
| Decoder | 136,323 | 4,131 |
| LSTM & transformer | 42,372 | 30,644 |

2.4 Model Lightweight Techniques

2.4.1 Depthwise Separable CNN

Depthwise separable convolution is a CNN operation that combines the depthwise and pointwise methods. It was introduced in MobileNet v1 (Howard et al., 2017; Chollet, 2017). Depthwise convolution extracts channel features by performing a convolution operation per channel. Consequently, the input and output have the same number of channels. In addition, computation is reduced substantially by skipping convolution between channels. Pointwise convolution compresses data at the same location across the channels. It extracts the features between channels and controls the number of channels between the input and output (Hua et al., 2018).

When convolution operations in a CNN are performed redundantly among multiple channels, depthwise separable convolution reduces the large amount of computation required among the channels (Paoletti et al., 2020).
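The saving from depthwise separable convolution can be made concrete by counting parameters for a 1D layer. The sketch below uses standard counting formulas; the channel count of 32 and kernel size of 3 are illustrative assumptions, not the layer sizes of LEQNet.

```python
def conv1d_params(c_in, c_out, kernel_size, bias=True):
    """Parameters of a standard 1D convolution: k * C_in * C_out (+ C_out)."""
    return kernel_size * c_in * c_out + (c_out if bias else 0)

def depthwise_separable_params(c_in, c_out, kernel_size, bias=True):
    """Depthwise (one k-tap filter per input channel) plus pointwise (1x1)."""
    depthwise = kernel_size * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise

# Hypothetical layer: 32 input channels, 32 output channels, kernel size 3.
standard = conv1d_params(32, 32, 3)                 # 3*32*32 + 32
separable = depthwise_separable_params(32, 32, 3)   # (3*32 + 32) + (32*32 + 32)
```

For this example the separable layer needs 1,184 parameters against 3,104 for the standard layer, and the gap widens as the kernel size grows.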

2.4.2 Deeper Bottleneck Architecture

The deeper bottleneck architecture was proposed in ResNet to reduce training time (He et al., 2016). In this method, an additional convolution is performed before the main convolution to reduce the number of input channels. After the convolution on the reduced number of channels, another convolution restores the number of channels. Whereas one residual block consists of two layers in EQTransformer, we changed it to three layers. Although this adds a layer, we used a kernel of size one in the first and last layers. This substantially reduces the number of parameters in the intermediate layer that performs the actual operation.
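The effect of the three-layer bottleneck can be illustrated by comparing parameter counts with a two-layer residual block. The channel widths (64 reduced to 16) and kernel size 3 below are assumptions picked for the example; the source does not state the widths used in LEQNet.

```python
def conv1d_params(c_in, c_out, kernel_size):
    """Weights plus biases of a standard 1D convolution."""
    return kernel_size * c_in * c_out + c_out

def two_layer_block_params(channels, kernel_size):
    """Residual block with two full-width convolutions."""
    return 2 * conv1d_params(channels, channels, kernel_size)

def bottleneck_block_params(channels, reduced, kernel_size):
    """1x1 reduce -> k-tap convolution on fewer channels -> 1x1 restore."""
    return (
        conv1d_params(channels, reduced, 1)
        + conv1d_params(reduced, reduced, kernel_size)
        + conv1d_params(reduced, channels, 1)
    )

# Hypothetical sizes: 64 channels squeezed to 16 inside the bottleneck.
plain = two_layer_block_params(64, 3)
bottleneck = bottleneck_block_params(64, 16, 3)
```

Here the bottleneck block has 2,912 parameters against 24,704 for the plain block, even though it has one more layer.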

2.4.3 Recurrent CNN

Recurrent CNN is a concept that reuses a CNN layer by feeding its output back through the same layer. To use the recurrent CNN, the input and output must have the same number of channels and the same kernel shape, and the layers must be consecutive. We applied the recurrent CNN to both the encoder and the decoder. This reduced the model size by reusing parameters.

The use of the recurrent CNN has the additional benefits of decreasing the memory access cost of loading initial parameters, increasing nonlinearity, and updating gradients in multiple parts when training the model (Köpüklü et al., 2019).
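The layer-reuse idea can be sketched in NumPy: one set of convolution weights is applied several times in a row, so the depth grows while the parameter count stays fixed. This is a toy forward pass under assumed shapes (stride 1, "same" padding, ReLU), not the LEQNet implementation.

```python
import numpy as np

def conv1d_same(x, w, b):
    """1D convolution with 'same' padding, stride 1, followed by ReLU.

    x: (length, c_in), w: (kernel, c_in, c_out), b: (c_out,). Kernel must be odd.
    """
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.stack([
        np.tensordot(xp[i:i + k], w, axes=([0, 1], [0, 1]))
        for i in range(x.shape[0])
    ])
    return np.maximum(out + b, 0.0)

def recurrent_conv(x, w, b, n_reuse=2):
    """Apply the same convolution n_reuse times; requires c_in == c_out."""
    assert w.shape[1] == w.shape[2], "reuse needs equal input/output channels"
    for _ in range(n_reuse):
        x = conv1d_same(x, w, b)
    return x

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 8))            # toy feature map: 100 steps, 8 channels
w = rng.normal(size=(3, 8, 8)) * 0.1     # one shared kernel
b = np.zeros(8)
y = recurrent_conv(x, w, b, n_reuse=3)   # three layers deep, one layer of weights
n_params = w.size + b.size
```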

3 RESULTS

We trained a detection model using 50,000 samples each of earthquake data and noise data drawn from STEAD and ran the training for 10 epochs. The probability thresholds for seismic detection and for P- and S-phase picking were set to detection = 0.5, P = 0.3, and S = 0.3, the same values as in EQTransformer.
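One simple way to turn the output probability traces into decisions with these thresholds is sketched below. The peak-picking rule (take the highest sample and accept it if it reaches the threshold) is an assumption for illustration; the actual post-processing in EQTransformer or LEQNet may differ. The 100 Hz sampling rate comes from the dataset description.

```python
import numpy as np

# Thresholds stated in the text.
THRESHOLDS = {"detection": 0.5, "P": 0.3, "S": 0.3}

def pick_phase(prob, threshold, sampling_rate=100.0):
    """Return the arrival time (s) of the highest peak, or None if too weak."""
    i = int(np.argmax(prob))
    return i / sampling_rate if prob[i] >= threshold else None

def detect_event(prob, threshold=THRESHOLDS["detection"]):
    """Flag an earthquake if the detection probability ever reaches the threshold."""
    return bool(np.max(prob) >= threshold)

# Synthetic P-phase probability trace with a peak of 0.8 at sample 250.
p_prob = np.zeros(6000)
p_prob[250] = 0.8
p_time = pick_phase(p_prob, THRESHOLDS["P"])
```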

3.1 LEQNet

Figure 2B illustrates the structure of LEQNet, which was designed using the depthwise separable CNN, deeper bottleneck, and recurrent CNN. The depthwise separable CNN was applied to the encoder and decoder, which perform the feature compression and decompression steps. In addition, the recurrent CNN layers in LEQNet reduced the memory access time in the encoder and decoder. These recurrent CNN layers also decreased the inference time and model size by reusing existing parameters.

Table 1 summarizes the reduction in the number of parameters. The number of parameters was reduced from 34,672 to 1,289 in the encoder and from 136,323 to 4,131 in the decoder. The ResNet used for central feature extraction in EQTransformer was replaced by the deeper bottleneck CNN in LEQNet, which reduced the number of parameters from 109,696 to 3,712. Although there was no structural change in LSTM & transformer, the smaller input to this part alone reduced its number of parameters from 42,372 to 30,644.

Figure 3 illustrates the LEQNet architecture. The encoder and decoder of LEQNet each consist of five layers, and the deeper bottleneck consists of four blocks, whereas EQTransformer has seven layers in each encoder and decoder and five blocks in ResNet. The number of output channels of the encoder in LEQNet was changed to 32, similar to the number of input channels, whereas it was 64 in EQTransformer.

3.2 Model Size Reduction

LEQNet reduces the number of parameters and the computation time substantially via the lightweight techniques (Table 2 and Figure 4A). The number of parameters in LEQNet decreased by about 88% compared to EQTransformer. The amount of computation, measured in floating-point operations (FLOPs), decreased from 79,687,040 to 5,271,488 (Table 2). Together, these reductions shrank the model size by about 79% compared to EQTransformer (from 4.5 MB to 0.94 MB). This suggests that our LEQNet model can operate on small devices with very little memory.
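The reductions quoted above can be checked directly from the numbers reported in Tables 1 and 2 and in the text. The sketch below only reproduces that arithmetic; the variable names are illustrative.

```python
# Parameter counts per section, as reported in Table 1.
eqtransformer = {"encoder": 34_672, "resnet": 109_696,
                 "decoder": 136_323, "lstm_transformer": 42_372}
leqnet = {"encoder": 1_289, "resnet": 3_712,
          "decoder": 4_131, "lstm_transformer": 30_644}

def reduction_pct(before, after):
    """Relative reduction in percent."""
    return 100.0 * (before - after) / before

total_eqt = sum(eqtransformer.values())   # 323,063
total_leq = sum(leqnet.values())          # 39,776

param_cut = reduction_pct(total_eqt, total_leq)     # about 88%
flop_cut = reduction_pct(79_687_040, 5_271_488)     # FLOPs from Table 2
size_cut = reduction_pct(4.5, 0.94)                 # model size in MB

# Share of CNN-based parts (encoder, decoder, ResNet) in EQTransformer.
cnn_share = 100.0 * (total_eqt - eqtransformer["lstm_transformer"]) / total_eqt
```

The section totals match the 323,063 and 39,776 parameters given in Table 2, and the CNN-based share comes out at about 87%, as stated in the Methods.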

Table 2 compares EQTransformer, the Yews (Zhu et al., 2019) earthquake detection deep learning model, and LEQNet in terms of information density and Netscore. Because the accuracy of EQTransformer and LEQNet does not differ significantly, the difference in their information density depends mainly on the number of parameters. LEQNet improved the information density by about 8.09 times compared to EQTransformer.

Figure 4A shows the Netscore, which accounts for the amount of computation in FLOPs. Although the Netscore coefficients are usually set to α = 2, β = 0.5, and γ = 0.5, we changed γ to 0.1 because the Netscore was measured according to the FLOPs in the CNN layers. The theoretical maximum value of the Netscore is 80. EQTransformer scored 8.29 and Yews 9.82, whereas LEQNet scored 20.30 (see Figure 4A), indicating that LEQNet improves the Netscore by 2.44 times compared to EQTransformer.


3.3 Detection Performance

We evaluated the detection performance of LEQNet and compared it with other existing methods, such as EQTransformer, PhaseNet (Ross et al., 2019), Yews, and STA/LTA (Table 3). F1-scores for earthquake detection, P-phase picking, and S-phase picking were compared. EQTransformer showed F1-scores of 1.0, 0.99, and 0.98 in these three tasks. The F1-scores of LEQNet (0.99, 0.98, and 0.97) were close to those of EQTransformer. Our F1-scores were higher than those of PhaseNet by 0.02 in P-phase picking and by 0.03 in S-phase picking, and higher than those of Yews by 0.37 in P-phase picking, by 0.31 in S-phase picking, and by 0.14 in earthquake detection. Interestingly, the FLOPs of LEQNet were similar to those of Yews, although LEQNet performed better than Yews in terms of the F1-score.

FIGURE 3 | LEQNet model architecture. Depthwise separable CNN and recurrent CNN were applied to both the encoder and the decoder. The number of layers in the encoder was reduced to five, while the EQTransformer had seven layers in the encoder. Features are extracted via four blocks using the deeper bottleneck, compared with five blocks in the EQTransformer. A detailed description of each block is provided in the Methods section. The numbers in the convolutional layers indicate the number of kernels, and kr is the kernel size.

4 DISCUSSION

EQTransformer and PhaseNet substantially improved the performance of earthquake detection, P-phase picking, and S-phase picking using deep neural networks. However, these models are difficult to operate on low-cost embedded devices for earthquake detection, which is where the models of an EEWS need to run. Therefore, we aimed to construct a lightweight model while maintaining high performance.

Compared to existing deep neural network models, the size of our LEQNet model is reduced while high performance is maintained, as supported by the findings of this study. Remarkably, we reduced the number of parameters by 87.68% and the FLOPs in CNN layers by 93.38% compared to EQTransformer. This model optimization was achieved by applying the depthwise separable CNN and recurrent CNN and by removing some layers in the encoder and decoder, which occupy 53% of the model. In addition, decreasing the number of output channels in the ResNet also contributed to reducing the number of parameters in LEQNet. However, performance degradation was observed when some parameters were reduced in the encoder and decoder. To resolve this issue, we applied the deeper bottleneck architecture in place of the ResNet used in EQTransformer. As a result, the information density and Netscore of LEQNet improved over those of EQTransformer: information density increased from 3.06e-4 to 24.76e-4 (8.09 times higher), and Netscore increased from 8.29 to 20.30 (2.44 times higher).

Although we reduced the model size substantially, there remains room for improvement. Our model needs to run in a conventional TensorFlow-like environment (Abadi et al., 2016), which may not be suitable for small devices. Therefore, we aim to address this issue using TensorFlow Lite (Google, 2020) in future work.

FIGURE 4 | (A) Comparison of the Netscore of LEQNet, EQTransformer, and Yews. (B) Comparison of detection, S-phase, and P-phase scores between LEQNet and EQTransformer.

TABLE 2 | Model size comparison results.

| Model | Parameters | FLOPs | Information density | Netscore |
|---|---|---|---|---|
| EQTransformer | 323,063 | 79,687,040 | 3.06e-4 | 8.29 |
| Yews | 108,691 | 7,806,976 | 6.47e-4 | 9.82 |
| LEQNet | 39,776 | 5,271,488 | 24.76e-4 | 20.30 |

TABLE 3 | Model result of detection and P-phase and S-phase scores.

| Model | Detection F1-score | P-phase F1-score | S-phase F1-score | Training data | Training size | Reference |
|---|---|---|---|---|---|---|
| EQTransformer | 1.0 | 0.99 | 0.98 | Global | 1.2M | Mousavi et al. (2020) |
| PhaseNet | - | 0.96 | 0.94 | Northern California | 780K | Ross et al. (2019) |
| CDRP (DetNet) | 0.94 | 0.90 | 0.95 | China and Japan | 30K | Zhou et al. (2019) |
| Yews | 0.85 | 0.61 | 0.66 | Taiwan | 1.4M | Zhu et al. (2019) |
| STA/LTA | 0.95 | - | - | - | - | Allen (1978) |
| LEQNet | 0.99 | 0.98 | 0.97 | Global | 100K | - |


5 CONCLUSION

A seismic signal detection model for the EEWS needs to operate on low-cost sensors in an environment without the support of a server or interconnection with other management systems. As the EQTransformer has a model size of about 4.5 MB and needs a lot of computation, it is difficult to operate in limited environments such as low-power wireless communication devices and Arduino boards.

To resolve this issue, we developed LEQNet using lightweight deep learning techniques. LEQNet reduced the number of parameters of the detection model by 88% and the model size by 79% (from 4.5 MB to 0.94 MB) compared to the EQTransformer, with no significant performance degradation. Our model can be mounted on IoT devices that have less than 1 MB of embedded RAM, with no special external management systems.

Although the LEQNet model reduces the model size drastically compared to the EQTransformer, there remains room for improvement. Recently, tiny AI models, which can operate on smaller devices with less than 256 kB of memory, have been in high demand. Therefore, our model needs to be reduced further to meet this requirement. Our LEQNet package is easy to install and is available for public use. We believe that this lightweight earthquake deep neural network can be a useful tool in a community burdened with geohazards and georisks.

DATA AVAILABILITY STATEMENT

Publicly available datasets were analyzed in this study. These data can be found at: https://github.com/smousavi05/STEAD.

AUTHOR CONTRIBUTIONS

JL, SJ, CJ, JG, and GS contributed to the conception and design of the study. JL, SJ, and CJ wrote the first draft of the manuscript. GS, JG, GJ, and JY wrote sections of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.

FUNDING

This work was supported by the Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2020-0-01450, Artificial Intelligence Convergence Research Center (Pusan National University)) to GS.

REFERENCES

Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., et al. (2016). "Tensorflow: A System for Large-Scale Machine Learning," in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 265–283.

Allen, R. V. (1978). Automatic Earthquake Recognition and Timing from Single Traces. Bull. Seismol. Soc. Am. 68, 1521–1532. doi:10.1785/bssa0680051521

Chollet, F. (2017). "Xception: Deep Learning with Depthwise Separable Convolutions," in Proceedings of the IEEE conference on computer vision and pattern recognition, 1251–1258. doi:10.1109/cvpr.2017.195

Google, L. L. C. (2020). Tensorflow Lite. Available at: https://www.tensorflow.org/lite (Accessed February 26, 2022).

He, K., Zhang, X., Ren, S., and Sun, J. (2016). "Deep Residual Learning for Image Recognition," in Proceedings of the IEEE conference on computer vision and pattern recognition, 770–778. doi:10.1109/cvpr.2016.90

Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., et al. (2017). Mobilenets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv Prepr. arXiv:1704.04861.

Hua, B.-S., Tran, M.-K., and Yeung, S.-K. (2018). "Pointwise Convolutional Neural Networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 984–993. doi:10.1109/cvpr.2018.00109

Köpüklü, O., Babaee, M., Hörmann, S., and Rigoll, G. (2019). "Convolutional Neural Networks with Layer Reuse," in 2019 IEEE International Conference on Image Processing (ICIP) (Piscataway, NJ, USA: IEEE), 345–349. doi:10.1109/icip.2019.8802998

Maeda, N. (1985). A Method for Reading and Checking Phase Time in Auto-Processing System of Seismic Wave Data. J. Seismol. Soc. Jpn. 38, 365–379. doi:10.4294/zisin1948.38.3_365

Mousavi, S. M., Ellsworth, W. L., Zhu, W., Chuang, L. Y., and Beroza, G. C. (2020). Earthquake Transformer-An Attentive Deep-Learning Model for Simultaneous Earthquake Detection and Phase Picking. Nat. Commun. 11, 3952. doi:10.1038/s41467-020-17591-w

[Dataset] Mousavi, S. M., Sheng, Y., Zhu, W., and Beroza, G. C. (2019). Stanford Earthquake Dataset (Stead): A Global Data Set of Seismic Signals for Ai. IEEE Access 7, 179464–179476. doi:10.1109/ACCESS.2019.2947848

Paoletti, M. E., Haut, J. M., Tao, X., Plaza, J., and Plaza, A. (2020). Flop-reduction through Memory Allocations within Cnn for Hyperspectral Image Classification. IEEE Trans. Geoscience Remote Sens. 59, 5938. doi:10.1109/TGRS.2020.3024730

Ross, Z. E., Yue, Y., Meier, M. A., Hauksson, E., and Heaton, T. H. (2019). Phaselink: A Deep Learning Approach to Seismic Phase Association. J. Geophys. Res. Solid Earth 124, 856–869. doi:10.1029/2018jb016674

Van Dyk, D. A., and Meng, X.-L. (2001). The Art of Data Augmentation. J. Comput. Graph. Statistics 10, 1–50. doi:10.1198/10618600152418584

Wong, A. (2019). "Netscore: Towards Universal Metrics for Large-Scale Performance Analysis of Deep Neural Networks for Practical On-Device Edge Usage," in International Conference on Image Analysis and Recognition (Berlin, Germany: Springer), 15–26. doi:10.1007/978-3-030-27272-2_2

Zhou, Y., Yue, H., Kong, Q., and Zhou, S. (2019). Hybrid Event Detection and Phase-Picking Algorithm Using Convolutional and Recurrent Neural Networks. Seismol. Res. Lett. 90, 1079–1087. doi:10.1785/0220180319

Zhu, L., Peng, Z., McClellan, J., Li, C., Yao, D., Li, Z., et al. (2019). Deep Learning for Seismic Phase Detection and Picking in the Aftershock Zone of 2008 M7.9 Wenchuan Earthquake. Phys. Earth Planet. Interiors 293, 106261. doi:10.1016/j.pepi.2019.05.004

Conflict of Interest: Author JY was employed by FACDAMM.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


LightEQ: On-Device Earthquake Detection with Embedded Machine Learning

Conference Paper

May 2023

1D Convolutional Seismic Event Classification Method Based on Attention Mechanism and Light Inception Block

Article

Jun 2024
APPL GEOPHYS

Waveforms of artificially induced explosions and collapse events recorded by the seismic network share similarities with natural earthquakes. Failure to identify and screen them in a timely manner can introduce confusion into the earthquake catalog established using these recordings, thereby impacting future seismological research. Therefore, the identification and separation of natural earthquakes from continuous seismic signals contribute to the monitoring and early warning of destructive tectonic earthquakes. A 1D convolutional neural network (CNN) is proposed for seismic event classification using an efficient channel attention mechanism and an improved light inception block. A total of 9937 seismic sample records are obtained after waveform interception, filtering, and normalization. The proposed model can obtain better classification performance than other major existing methods, exhibiting 96.79% overall classification accuracy and 96.73%, 94.85%, and 96.35% classification accuracy for natural seismic events, collapse events, and blasting events, respectively. Meanwhile, the proposed model is lighter than the 2D convolutional and common inception networks. We also apply the proposed model to the seismic data recorded at the University of Utah seismograph stations and compare its performance with that of the CNN-waveform model.

A Lightweight Network for Seismic Phase Picking on Embedded Systems

Article

Full-text available

Jan 2024

Phase picking is a critical task in seismic data processing, where deep learning methods have been applied to enhance its accuracy. While lightweight deep learning networks have been optimized for edge computing devices, there is a lack of networks developed explicitly for embedded systems. This paper presents a seismic phase picking model, a hybrid network integrating convolutional neural networks and Transformer, designed for embedded systems. Optimizing network parameters and computational resources, the model significantly reduces resource consumption while guaranteeing accuracy. It employs a multi-branch architecture. Specifically, the global branch employs a modified self-attention mechanism, effectively extracting global features through shared contextual information. The local branch retains local information from the input features. Such a multi-branch architecture facilitates effective interaction between global features and local details, thereby more efficiently capturing the relationships among features. The model can be configured into variants with different sizes to match various embedded systems. This research evaluated the model using the Stanford Earthquake Dataset, achieving a precision of 99.9% for the P-phase and 99.3% for the S-phase. On Raspberry Pi, the model reduced inference time by 58.1% compared to the earthquake transformer while maintaining comparable detection performance.

LimitNet: Progressive, Content-Aware Image Offloading for Extremely Weak Devices & Networks

Conference Paper

Jun 2024

Identifying Earthquakes in Low-Cost Sensor Signals Contaminated with Vehicular Noise

Article

Full-text available

Sep 2023

The importance of monitoring earthquakes for disaster management, public safety, and scientific research can hardly be overstated. The emergence of low-cost seismic sensors offers potential for widespread deployment due to their affordability. Nevertheless, vehicular noise in low-cost seismic sensors presents as a significant challenge in urban environments where such sensors are often deployed. In order to address these challenges, this work proposes the use of an amalgamated deep neural network constituent of a DNN trained on earthquake signals from professional sensory equipment as well as a DNN trained on vehicular signals from low-cost sensors for the purpose of earthquake identification in signals from low-cost sensors contaminated with vehicular noise. To this end, we present low-cost seismic sensory equipment and three discrete datasets that—when the proposed methodology is applied—are shown to significantly outperform a generic stochastic differential model in terms of effectiveness and efficiency.

Real-time arrival picking of rock microfracture signals based on convolutional-recurrent neural network and its engineering application

Article

Aug 2023

FLOP-Reduction Through Memory Allocations Within CNN for Hyperspectral Image Classification

Article

Full-text available

Sep 2020

Convolutional neural networks (CNNs) have proven to be a powerful tool for the classification of hyperspectral images (HSIs). The CNN kernels are able to naturally include spatial information to smooth out the spectral variability and the noise present in HSI data. However, these kernels are composed of a large number of learning parameters that must be correctly adjusted to achieve good performance. This forces the model to consume a large amount of training data, being prone to overfitting when limited labeled samples are available. In addition, the execution of kernels is computationally very expensive, increasing quadratically with respect to the size of the convolution filter. This significantly reduces the performance of the model. To overcome the aforementioned limitations, this work presents a new few-parameter CNN (based on shift operations) for HSI classification that dramatically reduces both the number of parameters and the computational complexity of the model in terms of floating-point operations (FLOPs). The operational module combines a shift kernel (which adjusts the input data in particular directions without involving any parameters nor FLOPs) with pointwise convolutions that perform the feature extraction stage. The newly developed shift-based CNN has been employed to conduct HSI classification over five widely used and challenging data sets, achieving very promising results in terms of computational performance and classification accuracy.

Earthquake transformer—an attentive deep-learning model for simultaneous earthquake detection and phase picking

Article

Full-text available

Aug 2020

Earthquake signal detection and seismic phase picking are challenging tasks in the processing of noisy data and the monitoring of microearthquakes. Here we present a global deep-learning model for simultaneous earthquake detection and phase picking. Performing these two related tasks in tandem improves model performance in each individual task by combining information in phases and in the full waveform of earthquake signals by using a hierarchical attention mechanism. We show that our model outperforms previous deep-learning and traditional phase-picking and detection algorithms. Applying our model to 5 weeks of continuous data recorded during 2000 Tottori earthquakes in Japan, we were able to detect and locate two times more earthquakes using only a portion (less than 1/3) of seismic stations. Our model picks P and S phases with precision close to manual picks by human analysts; however, its high efficiency and higher sensitivity can result in detecting and characterizing more and smaller events. The authors here present a deep learning model that simultaneously detects earthquake signals and measures seismic-phase arrival times. The model performs particularly well for cases with high background noise and the challenging task of picking the S wave arrival.

STanford EArthquake Dataset (STEAD): A Global Data Set of Seismic Signals for AI

Article

Full-text available

Oct 2019

Seismology is a data rich and data-driven science. Application of machine learning for gaining new insights from seismic data is a rapidly evolving sub-field of seismology. The availability of a large amount of seismic data and computational resources, together with the development of advanced techniques can foster more robust models and algorithms to process and analyze seismic signals. Known examples or labeled data sets, are the essential requisite for building supervised models. Seismology has labeled data, but the reliability of those labels is highly variable, and the lack of high-quality labeled data sets to serve as ground truth as well as the lack of standard benchmarks are obstacles to more rapid progress. In this paper we present a high-quality, large-scale, and global data set of local earthquake and non-earthquake signals recorded by seismic instruments. The data set in its current state contains two categories: (1) local earthquake waveforms (recorded at “local” distances within 350 km of earthquakes) and (2) seismic noise waveforms that are free of earthquake signals. Together these data comprise ∼ 1.2 million time series or more than 19,000 hours of seismic signal recordings. Constructing such a large-scale database with reliable labels is a challenging task. Here, we present the properties of the data set, describe the data collection, quality control procedures, and processing steps we undertook to insure accurate labeling, and discuss potential applications. We hope that the scale and accuracy of STEAD presents new and unparalleled opportunities to researchers in the seismological community and beyond.

Deep learning for seismic phase detection and picking in the aftershock zone of 2008 M7.9 Wenchuan Earthquake

Article

Full-text available

May 2019
PHYS EARTH PLANET IN

The increasing volume of seismic data from long-term continuous monitoring motivates the development of algorithms based on convolutional neural network (CNN) for faster and more reliable phase detection and picking. However, many less studied regions lack a significant amount of labeled events needed for traditional CNN approaches. In this paper, we present a CNN-based Phase-Identification Classifier (CPIC) designed for phase detection and picking on small to medium sized training datasets. When trained on 30,146 labeled phases and applied to one-month of continuous recordings during the aftershock sequences of the 2008 M_W 7.9 Wenchuan Earthquake in Sichuan, China, CPIC detects 97.5% of the manually picked phases in the standard catalog and predicts their arrival times with a five-times improvement over the ObsPy AR picker. In addition, unlike other CNN-based approaches that require millions of training samples, when the off-line training set size of CPIC is reduced to only a few thousand training samples the accuracy stays above 95%. The online implementation of CPIC takes less than 12 hours to pick arrivals in 31-day recordings on 14 stations. In addition to the catalog phases manually picked by analysts, CPIC finds more phases for existing events and new events missed in the catalog. Among those additional detections, some are confirmed by a matched filter method while others require further investigation. Finally, when tested on a small dataset from a different region (Oklahoma, US), CPIC achieves 97% accuracy after fine tuning only the fully connected layer of the model. This result suggests that the CPIC developed in this study can be used to identify and pick P/S arrivals in other regions with no or minimum labeled phases.

Hybrid Event Detection and Phase‐Picking Algorithm Using Convolutional and Recurrent Neural Networks

Article

Full-text available

Apr 2019

We developed a hybrid algorithm using both convolutional and recurrent neural networks (CNNs and RNNs, respectively) to pick phases from archived continuous waveforms in two steps. First, an eight‐layer CNN is trained to detect earthquake events from 30‐second‐long three‐component seismograms. The event seismograms are then sent to a two‐layer bidirectional RNN to pick P‐ and S‐arrival times. The data for training and validation and testing of the networks are obtained from the continuous waveforms of 16 stations recording the aftershock sequence of the 2008 Wenchuan earthquake. The augmented training set has 135,966 P–S‐wave arrival‐time pairs. The CNN achieved 94% and 98% hit rate for event and noise segments in the test set, respectively. The RNN picking accuracies for P and S waves are −0.03±0.48 (mean error ± standard deviation) and 0.03±0.56 s⁠, respectively.

Pointwise Convolutional Neural Networks

Conference Paper

Full-text available

Jun 2018

PhaseLink: A Deep Learning Approach to Seismic Phase Association

Article

Full-text available

Jan 2019

Seismic phase association is a fundamental task in seismology that pertains to linking together phase detections on different sensors that originate from a common earthquake. It is widely employed to detect earthquakes on permanent and temporary seismic networks and underlies most seismicity catalogs produced around the world. This task can be challenging because the number of sources is unknown, events frequently overlap in time, or can occur simultaneously in different parts of a network. We present PhaseLink, a framework based on recent advances in deep learning for grid‐free earthquake phase association. Our approach learns to link phases together that share a common origin and is trained entirely on millions of synthetic sequences of P and S wave arrival times generated using a 1‐D velocity model. Our approach is simple to implement for any tectonic regime, suitable for real‐time processing, and can naturally incorporate errors in arrival time picks. Rather than tuning a set of ad hoc hyperparameters to improve performance, PhaseLink can be improved by simply adding examples of problematic cases to the training data set. We demonstrate the state‐of‐the‐art performance of PhaseLink on a challenging sequence from southern California and synthesized sequences from Japan designed to test the point at which the method fails. For the examined data sets, PhaseLink can precisely associate phases to events that occur only ∼12 s apart in origin time. This approach is expected to improve the resolution of seismicity catalogs, add stability to real‐time seismic monitoring, and streamline automated processing of large seismic data sets.

Convolutional Neural Networks with Layer Reuse

Conference Paper

Sep 2019

NetScore: Towards Universal Metrics for Large-Scale Performance Analysis of Deep Neural Networks for Practical On-Device Edge Usage

Chapter

Aug 2019

Alexander Wong

Much of the focus in the design of deep neural networks has been on improving accuracy, leading to more powerful yet highly complex network architectures that are difficult to deploy in practical scenarios, particularly on edge devices such as mobile and other consumer devices given their high computational and memory requirements. As a result, there has been a recent interest in the design of quantitative metrics for evaluating deep neural networks that accounts for more than just model accuracy as the sole indicator of network performance. In this study, we continue the conversation towards universal metrics for evaluating the performance of deep neural networks for practical on-device edge usage. In particular, we propose a new balanced metric called NetScore, which is designed specifically to provide a quantitative assessment of the balance between accuracy, computational complexity, and network architecture complexity of a deep neural network, which is important for on-device edge operation. In what is one of the largest comparative analysis between deep neural networks in literature, the NetScore metric, the top-1 accuracy metric, and the popular information density metric were compared across a diverse set of 60 different deep convolutional neural networks for image classification on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC 2012) dataset. The evaluation results across these three metrics for this diverse set of networks are presented in this study to act as a reference guide for practitioners in the field. The proposed NetScore metric, along with the other tested metrics, are by no means perfect, but the hope is to push the conversation towards better universal metrics for evaluating deep neural networks for use in practical on-device edge scenarios to help guide practitioners in model design for such scenarios.

Xception: Deep Learning with Depthwise Separable Convolutions

Conference Paper