ArticlePDF Available

LEQNet: Light Earthquake Deep Neural Network for Earthquake Detection and Phase Picking

Frontiers
Frontiers in Earth Science
Authors:

Abstract and Figures

Developing seismic signal detection and phase picking is an essential step for an on-site early earthquake warning system. A few deep learning approaches have been developed to improve the accuracy of seismic signal detection and phase picking. To run the existing deep learning models, high-throughput computing resources are required. In addition, the deep learning architecture must be optimized for mounting the model in small devices using low-cost sensors for earthquake detection. In this study, we designed a lightweight deep neural network model that operates on a very small device. We reduced the size of the deep learning model using the deeper bottleneck, recursive structure, and depthwise separable convolution. We evaluated our lightweight deep learning model using the Stanford Earthquake Dataset and compared it with EQTransformer. While our model size is reduced by 87.68% compared to EQTransformer, the performance of our model is comparable to that of EQTransformer.
Content may be subject to copyright.
LEQNet: Light Earthquake Deep
Neural Network for Earthquake
Detection and Phase Picking
Jongseong Lim
1
, Sunghun Jung
2
, Chan JeGal
2
, Gwanghoon Jung
3
, Jung Ho Yoo
4
,
Jin Kyu Gahm
1
,
2
* and Giltae Song
1
,
2
*
1
Division of Articial Intelligence, Pusan National University, Busan, South Korea,
2
School of Computer Science and Engineering,
Pusan National University, Busan, South Korea,
3
Department of Industrial Engineering, Pusan National University, Busan, South
Korea,
4
FACDAMM, Busan, South Korea
Developing seismic signal detection and phase picking is an essential step for an on-site
early earthquake warning system. A few deep learning approaches have been developed
to improve the accuracy of seismic signal detection and phase picking. To run the existing
deep learning models, high-throughput computing resources are required. In addition, the
deep learning architecture must be optimized for mounting the model in small devices
using low-cost sensors for earthquake detection. In this study, we designed a lightweight
deep neural network model that operates on a very small device. We reduced the size of
the deep learning model using the deeper bottleneck, recursive structure, and depthwise
separable convolution. We evaluated our lightweight deep learning model using the
Stanford Earthquake Dataset and compared it with EQTransformer. While our model
size is reduced by 87.68% compared to EQTransformer, the performance of our model is
comparable to that of EQTransformer.
Keywords: seismic wave, earthquake detection, lightweight technology, deep learning, convolutional neural
network
1 INTRODUCTION
Detection of seismic signals is essential for an on-site early earthquake warning system (EEWS).
Major approaches for seismic signal detection, such as short-time average/long-time average (STA/
LTA) (Allen, 1978), require an analysis of the ambient noise and structural vibration of sites in
advance and the optimization of threshold values via multiple trials and errors for reducing false
detection. Since the on-site EEWS needs multiple seismic sensors, data acquisition, and management
systems, it is important that sensing systems are cost-effective, and detection is performed with no
pre-analysis on on-site ambient noises.
Traditional approaches to detect P/S waves, such as STA/LTA and AIC (Maeda, 1985) techniques,
process seismic signals using short-term and long-term averages for the amplitudes of continuous
seismic waveforms. This process cannot be fully automated because various noises caused by the
geographical locations and environments of the seismic monitoring equipment need to be removed.
To overcome these issues, several studies have detected seismic signals using machine learning and
deep learning (Ross et al., 2019;Zhu et al., 2019;Mousavi et al., 2020). Among these methods,
EQTransformer is considered to perform the best (Mousavi et al., 2020). The EQTransformer
structure consists of one encoder and three decoders that compress and restore data and is connected
using long short-term memory (LSTM) and an attention mechanism. The EQTransformer model
outputs probabilistic statistics for earthquakes and P/S waves Figure 1.
Edited by:
Said Gaci,
Algerian Petroleum Institute, Algeria
Reviewed by:
Stephen Wu,
Institute of Statistical Mathematics,
Japan
Chong Xu,
Ministry of Emergency Management,
China
*Correspondence:
Jin Kyu Gahm
gahmj@pusan.ac.kr
Giltae Song
gsong@pusan.ac.kr
These authors have contributed
equally to this work and share rst
authorship
Specialty section:
This article was submitted to
Geohazards and Georisks,
a section of the journal
Frontiers in Earth Science
Received: 04 January 2022
Accepted: 17 May 2022
Published: 04 July 2022
Citation:
Lim J, Jung S, JeGal C, Jung G,
Yoo JH, Gahm JK and Song G (2022)
LEQNet: Light Earthquake Deep
Neural Network for Earthquake
Detection and Phase Picking.
Front. Earth Sci. 10:848237.
doi: 10.3389/feart.2022.848237
Frontiers in Earth Science | www.frontiersin.org July 2022 | Volume 10 | Article 8482371
ORIGINAL RESEARCH
published: 04 July 2022
doi: 10.3389/feart.2022.848237
Although the EQTransformer model detects earthquakes and
P/S phase pickings with high accuracy, it is almost impossible to
mount the model on small IoT (Internet of Things) devices such
as earthquake detection sensors that equip limited computing
resources.
In this study, we proposed a lightweight seismic signal
detection model called LEQNet to operate even in ultra-small
devices. We applied various lightweight deep learning techniques,
such as the recursive, deeper bottleneck, depthwise, and pointwise
separable structures, to reduce the size of the deep learning
model. We evaluated our lightweight deep learning system
using STEAD datasets and compared them with
EQTransformer. Compared to EQTransformer, our LEQNet
reduced the number of parameters substantially, with no
signicant performance degradation (Mousavi et al., 2020).
Our LEQNet model can be operated to detect seismic signals
even in small devices. The source code of LEQNet is available at
https://github.com/LEQNet/LEQNet.
2 MATERIALS AND METHODS
2.1 Datasets and Preprocessing
STEAD (Mousavi et al., 2019), a high-quality, large-scale, global
dataset was used in this experiment. STEAD consists of seismic
waveforms with an epicenter of less than 350 km and noise
waveforms without seismic signals. Regarding the
conguration of the data, 120M data are provided, including
450K time series seismic data and noise data for 19,000 h.
One data include 6,000 data points, collected at 100 Hz for
1 min, and three channels, namely, E on the eastwest axis, N on
the northsouth axis, and Z perpendicular to the ground. Based
on the discussion of EQTransformer, in which the size of the
training dataset did not signicantly affect the performance,
earthquake data and noise data were under-sampled by 50,000
each. The two classes were in equal ratio to resolve data imbalance
and build a deep learning detection model. The ratio of training,
validation, and test data set was divided into a general ratio of 8:
1:1.
To add more diverse situations to the training data, we applied
data aggregation techniques (Van Dyk and Meng, 2001).
Augmented techniques include adding events, moving
sequences in parallel, adding noise, deleting channels, and
adding empty sequences with random probabilities.
2.2 Performance Evaluation
2.2.1 Confusion Matrix
The confusion matrix is typically used as an evaluation index
in binary classication problems. When real seismic waves are
detected as earthquakes, it is regarded as a true positive (TP).
Conversely, if they are not detected, it is a false negative (FN).
Noise data are true negative (TN) when predicted as noise and
false positive (FP) when called earthquakes. For phase picking,
it is counted as TP when the actual arrival time of P and S
waves and time of the model are within 0.5 s. When actual P
and S waves do not exist and our model does not call P and S
arrivals, it is regarded as TN. We calculated precision and
recall, as shown in Eqs 1 and 2.F1-score,thatis,theharmonic
average of precision and recall, was calculated as presented in
Eq. 3.
precision TP
TP +FP,(1)
recall TP
TP +FN,(2)
F1score precision ×recall
precision +recall.(3)
FIGURE 1 | EQTransformer to input signals, detect earthquakes in earthquake waveforms, mark P/S wave arrivals, and output their probabilistic statistics. There
are four plots in the output. Light blue lines indicate the earthquake locations in the three upper plots. In the last output plot, a green dotted line indicates the probability of
an earthquake, a blue dotted line indicates the probability for the starting point of the P wave, and a red dotted line indicates the probability for the starting point of the S
wave.
Frontiers in Earth Science | www.frontiersin.org July 2022 | Volume 10 | Article 8482372
Lim et al. Light Earthquake Deep Neural Network
2.2.2 Information Density
Information density is used to evaluate the lightweight
technology. When the accuracy of the model is denoted as
a(N), the number of parameters of the model as p(N) and the
information density D(N) is calculated as shown in Eq. 4. This
evaluation considers the number of parameters and accuracy.
DN
()
aN
()
pN
()
.(4)
2.2.3 Netscore
To consider the amount of computation and inference speed, we
measured Netscore Ω(N), as proposed in Eq. 5.(Wong, 2019).
The multiply-accumulate of the model is denoted by m(N). α,β,
and γare coefcients for controlling the inuence of network
accuracy, model complexity, and computational complexity,
respectively. They are usually set to α=2,β= 0.5, and γ=
0.5 (Wong, 2019). When the model became lighter, it had a
smaller number of parameters and fewer operations. If the
performance of the model remains unchanged, the Netscore
increases. The higher the value of Netscore, the lighter is the
model.
ΩN
()
20 × log aN
()
α
pN
()
β×mN
()
γ

.(5)
2.3 Baseline Model EQTransformer
Figure 2A illustrates the structure of EQTransformer that was
designed with encoder, decoder, ResNet, and LSTM &
transformer. The encoder consists of seven convolutional
neural network (CNN) layers. This reduces STEAD seismic
signal data into a lower dimension in terms of seismic signal
length and allows to reduce the number of parameters in the
model. The decoder consists of seven CNN layers. This decoder
restores the extracted features to the original dimension. In the
EQTransformer, there are three decoders for P, S, and earthquake
detection. The ResNet part of the EQTransformer consists of ve
residual blocks. Each residual block outputs the addition of input
and output in two linearly connected CNN layers. This allows to
avoid performance degradation caused by a high number of
layers. LSTM learns sequential features extracted by the
encoder. The transformer carries the characteristics of the
seismic signals to each decoder (for P, S, and earthquake
detection). EQTransformer comprises several layers and
learning parameters, which exceed 300k as it focuses on
maximizing the predictive accuracy.
Figure 2A and Table 1 show the structure of EQTransformer
that uses a total of 323,063 parameters. A total of 34,672
parameters were used for the encoder, 109,696 for ResNet, and
42,372 for LSTM & transformer. As EQTransformer uses
separate decoders in earthquake, P-wave, and S-wave
detections, 136,323 parameters are used. The ratio of the total
number of parameters is 11% for the encoder, 34% for ResNet,
13% for LSTM & transformer, and 42% for the decoder. The
layers with the most parameters in the structure include an
encoder, a decoder, and ResNet composed of CNN layer-based
FIGURE 2 | Structure of the EQTransformer. (A) EQTransformer is divided into four main parts: encoder, decoder, ResNet, and LSTM & transformer. (B) Structure
of our LEQNet. LEQNet comprises four main parts: an encoder that uses the depthwise separable CNN with recurrent CNN, a decoder that uses the depthwise
separable CNN with recurrent CNN, residual Bottleneck CNN, and LSTM & Transformer.
TABLE 1 | Number of parameters in the EQTransformer and LEQNet.
Section EQTransformer LEQNet
Encoder 34,672 1,289
ResNet 109,696 3,712
Decoder 136,323 4,131
LSTM & transformer 42,372 30,644
Frontiers in Earth Science | www.frontiersin.org July 2022 | Volume 10 | Article 8482373
Lim et al. Light Earthquake Deep Neural Network
structures, accounting for 87% of the total model parameters. The
lightweight methods used in this study simplify the CNN-based
layer structure with no signicant performance degradation.
2.4 Model Lightweight Techniques
2.4.1 Depthwise Separable CNN
Depthwise separable convolution is a CNN that combines the
depthwise and pointwise methods. It was rst introduced in
mobile net v1 (Howard et al., 2017)(Chollet, 2017).
Depthwise convolution extracts channel features by
performing a convolution operation per channel.
Consequently, the input and output channels comprise the
same number of channels. In addition, computations are
exponentially reduced by skipping convolution between
channels. Pointwise convolution compresses data at the same
location inside each channel. It extracts the features between
channels and controls the number of channels between the input
and output. (Hua et al., 2018).
When convolution operations in the CNN perform
redundantly among multiple channels, the depthwise separable
convolution reduces the huge amount of computation required
among the channels. (Paoletti et al., 2020).
2.4.2 Deeper Bottleneck Architecture
Deeper bottleneck architecture was proposed in ResNet to reduce
training time (He et al., 2016). In this method, an additional
convolution is performed before the convolution to reduce the
size of the input channel. After convolution with input channels
of reduced size, a convolution operation to restore the number of
channels is executed. When one residual block consists of two
layers in the EQTransformer, we changed it to three layers. While
we have an additional layer, we used a kernel of size one in the
rst and last layers. This reduces the number of parameters
substantially in the intermediate layer that performs the actual
operation.
2.4.3 Recurrent CNN
Recurrent CNN is a concept that reuses the output of the CNN
layer. To use the recurrent CNN, the number of channels in input
and output needs to be equal with the same kernel shape. The
layers also need to be continuous. We applied the recurrent CNN
to the encoder and the decoder, respectively. This reduced the
model size by reusing parameters.
The use of the recurrent CNN has an additional benetof
decreasing memory access costs needed for loading initial
parameters, increasing nonlinearity, and updating gradients in
multiple parts for learning the model (Köpüklü et al., 2019).
3 RESULTS
We learned a detection model using 50,000 sets of earthquake
data and noise data sampled from STEAD and repeated the
learning process for 10 epochs. The threshold values were set for
the probability of seismic detection and P- and S-phase pickings
as detection = 0.5, p= 0.3, and S = 0.3, which were equal to
EQTransformer.
3.1 LEQNet
Figure 2B illustrates the structure of LEQNet that was designed
using the depthwise separable CNN, deeper bottleneck, and
recurrent CNN. Depthwise separable CNN was applied to the
encoder and decoder which were used in the feature compression
and decompression steps. In addition, memory access time in the
encoder and decoder was reduced using the recurrent CNN layers
in LEQNet. These recurrent CNN layers also decreased the
inference time and model size via the reuse of existing
parameters.
Table 1 summarizes the reduction in the number of
parameters. The number of parameters in the encoder was
reduced from 34,672 to 1,289, and from 136,323 to 4,131 in
the decoder. ResNet for central feature extraction in the
EQTransformer was replaced by the deeper bottleneck CNN
in our LEQNet. This reduced the number of parameters from
109,696 to 3,712. Although there was no structural change in
LSTM & transformer, the decrease in the number of input
parameters itself reduced the number of parameters from
42,372 to 30,644 in LSTM & transformer.
Figure 3 illustrates the LEQNet architecture. Each encoder
and decoder of LEQNet consists of ve layers, and the deeper
bottleneck consists of four layers; however, EQTranformer has
seven layers in each encoder and decoder and ve layers for
ResNet. The number of output channels in the encoder in the
LEQNet has changed to 32, similar to the number of input
channels, while the number of channels in the EQTransformer
was 64.
3.2 Model Size Reduction
LEQNet reduces the number of parameters and computation
time substantially via the lightweight techniques (2 and
Figure 4A). The number of parameters for the LEQNet in this
study decreased by about 88% compared to the EQTransformer.
The amount of computation oating-point operations (FLOPs)
decreased from 79,687,040 to 5,271,488 (Table. 2). These results
reduced the model size by about 79% compared to the
EQTransformer (from 4.5MB to 0.94 MB). This suggests that
our LEQNet model can operate in small devices with tiny
memory.
Table 2 shows the degree of model size reduction for the
EQTransformer, Yews (Zhu et al., 2019) earthquake detection
deep learning model, and LEQNet in terms of information
density and Netscore. As the accuracy between these two
models does not differ signicantly, the information density
score only depends on the number of parameters. LEQNet has
improved the information density by about 8.09 times compared
to EQTransformer.
Figure 4A shows Netscore, which measures the amount of
computation according to FLOPs. When Netscore coefcients are
usually set to α=2,β= 0.5, and γ= 0.5, we changed γto 0.1
because the Netscore was measured according to FLOPs in CNN
layers. The theoretical maximum value of Netscore is 80, while
EQTransformer showed 8.29 and Yews 9.82 in Netscore, LEQNet
scored 20.30 (see Figure 4A), indicating that our LEQNet
improves the Netscore by 2.44 times compared to the
EQTransformer.
Frontiers in Earth Science | www.frontiersin.org July 2022 | Volume 10 | Article 8482374
Lim et al. Light Earthquake Deep Neural Network
3.3 Detection Performance
We evaluated the detection performance of LEQNet and
compared it with other existing methods, such as
EQTransformer, PhaseNet (Ross et al., 2019), Yews, and STA/
LTA (Table 3). F1-scores for earthquake detection, P-phase
picking, and S-phase picking were also compared.
EQTransformer showed F1-scores of 1.0, 0.99, and 0.98 in
these three tasks. F1-scores (0.99, 0.98, and 0.97) of our
LEQNet were almost similar to those of EQTransformer. Our
F1-scores were higher than those of PhaseNet by 0.02 in P-phase
picking and by 0.03 in S-phase picking, higher than those of Yews
by 0.37 in P-phase picking and by 0.31 in S-phase picking, and by
FIGURE 3 | LEQNet model architecture. Depthwise separable CNN and recurrent CNN were applied to the encoder and decoder, respectively. The number of
layers in the encoder was reduced to ve, while the EQTransformer had seven layers in the encoder. Features can be extracted via four blocks using deeper bottleneck
and via ve blocks in the EQTransformer. Detailed description of each block is provided in the Methods section. The convolutional layers represent the number of kernels,
and kr is the kernel size.
Frontiers in Earth Science | www.frontiersin.org July 2022 | Volume 10 | Article 8482375
Lim et al. Light Earthquake Deep Neural Network
0.14 in earthquake detection. Interestingly, FLOPs in our LEQNet
was similar to those in Yews, although our LEQNet performed
better than Yews in terms of the F1-score.
4 DISCUSSION
EQTransformer and PhaseNet improved the performance of
earthquake detection, P-phase picking, and S-phase picking
substantially using deep neural networks. These models are
difcult to operate on low-cost embedded devices for earthquake
detection. The models should be processed in low-cost embedded
devices using EEWS. Therefore, we aimed to construct a lightweight
model while maintaining high performance.
Compared to existing deep neural network models, the size
of our LEQNet model is reduced while maintaining high
performance, as supported by the nding of this study.
Remarkably, we reduced the number of parameters by
87.68% and FLOPs in CNN layers by 93.38%, compared to
the EQTransformer. This model optimization was achieved by
applying the depthwise separable CNN and recurrent CNN
and by removing some layers in the encoder and decoder,
which occupy 53% of the model. In addition, decreasing the
number of output channels in the ResNet also contributed to
reducing the number of parameters in our LEQNet. However,
performance degradation was observed when some
parameters were reduced in the encoder and decoder. To
resolve this issue, we applied the deeper bottleneck
architecture to our model, as in the EQTransformer. This
increased the information density and Netscore of our
LEQNet model: information density increased from 3.06e-4
to 24.76e-4, 8.09 times higher than that of EQTransformer,
and Netscore increased from 8.29 to 20.30, 2.44 times higher
than that of EQTransformer.
Although we reduced the model size substantially, there
remains room for improvements. Our model needs to run on
the traditional TensorFlow-like environments (Abadi et al.,
2016), which may not be suitable for small devices. Therefore,
we aimed to address this issue using TensorFlow Lite (Google,
2020) in future works.
FIGURE 4 | (A) Comparison of the Netscore, LEQNet, EQTransformer, and Yews. (B) Comparison of detection, S-phase, and P-phase scores between LEQNet
and EQTransformer.
TABLE 2 | Model size comparison results.
Model Parameter FLOP Information density Netscore
EQTransformer 323,063 79,687,040 3.06e-4 8.29
Yews 108,691 7,806,976 6.47e-4 9.82
LEQNet 39,776 5,271,488 24.76e-4 20.30
TABLE 3 | Model result of detection and P-phase and S-phase scores.
Model Detection F1-score P-phase F1-score S-phase F1-score Training data Training size Reference
EQTransformer 1.0 0.99 0.98 Global 1.2M Mousavi et al. (2020)
PhaseNet - 0.96 0.94 North California 780K Ross et al. (2019)
CDRP (DetNet) 0.94 0.90 0.95 China and Japan 30K Zhou et al. (2019)
Yews 0.85 0.61 0.66 Taiwan 1.4M Zhu et al. (2019)
STA/LTA 0.95 - - - - Allen, (1978)
LEQNet 0.99 0.98 0.97 Global 100K
Frontiers in Earth Science | www.frontiersin.org July 2022 | Volume 10 | Article 8482376
Lim et al. Light Earthquake Deep Neural Network
5 CONCLUSION
A model for seismic signal detection for the EEWS needs to
operate in low-cost sensors in an environment without the
support of a server and interconnection with other
management systems. As the EQTransformer has a model size
of about 4.5 MB and needs a lot of computation, it is difcult to
operate in limited environments such as low-power wireless
communication devices and Arduino.
To resolve this issue, we developed LEQNet using lightweight
deep learning techniques. LEQNet reduced the number of
parameters of the detection model by 88% and the model size
by 79% (from 4.5 MB to 0.94 MB), as compared to the
EQTransformer, with no signicant performance degradation.
Our model can be mounted on IoT devices that include
embedded RAM of less than 1 MB with no special external
management systems.
Although this LEQNet model reduces the model size drastically
compared to the EQTransformer, there remains room for
improvement. Recently, tiny AI models, which can operate in
smaller devices with memory less than 256 kB, are in high
demand. Therefore, our model needs to be reduced more than the
current model size to meet this requirement. Our LEQNet package is
easy to install and is available for public use. We believe that this
lightweight earthquake deep neural network can be a useful tool in a
community burdened with geohazards and georisks.
DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. These data
can be found at: https://github.com/smousavi05/STEAD.
AUTHOR CONTRIBUTIONS
JL, SJ, CJ, JG, and GS contributed to the conception and design of the
study. JL, SJ, and CJ wrote the rst draft of the manuscript. GS, JG,
GJ, and JY wrote sections of the manuscript. All authors contributed
to manuscript revision, read, and approved the submitted version.
FUNDING
This work was supported by the Institute of Information and
Communications Technology Planning and Evaluation (IITP)
grant funded by the Korean government (MSIT) (No. 2020-0-
01450, Articial Intelligence Convergence Research Center
(Pusan National University)) to GS.
REFERENCES
Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., et al. (2016).
Tensorow: A System for Large-Scale Machine Learning,in 12th \{USENIX\}
Symposium on Operating Systems Design and Implementation (\{OSDI\} 16),
265283.
Allen, R. V. (1978). Automatic Earthquake Recognition and Timing from Single
Traces. Bull. Seismol. Soc. Am. 68, 15211532. doi:10.1785/bssa0680051521
Chollet, F. (2017). Xception: Deep Learning with Depthwise Separable
Convolutions,in Proceedings of the IEEE conference on computer vision
and pattern recognition, 12511258. doi:10.1109/cvpr.2017.195
Google, L. L. C. (2020). Tensorow Lite. Available at: https://www.tensorow.org/
lite (Accessed Febuary 26, 2022).
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep Residual Learning for Image
Recognition,in Proceedings of the IEEE conference on computer vision and
pattern recognition, 770778. doi:10.1109/cvpr.2016.90
Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., et al.
(2017). Mobilenets: Efcient Convolutional Neural Networks for Mobile Vision
Applications. arXiv Prepr. arXiv:1704.04861.
Hua, B.-S., Tran, M.-K., and Yeung, S.-K. (2018). Pointwise Convolutional Neural
Networks,in Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition, 984993. doi:10.1109/cvpr.2018.00109
Köpüklü, O., Babaee, M., Hörmann, S., and Rigoll, G. (2019). Convolutional
Neural Networks with Layer Reuse,in 2019 IEEE International Conference on
Image Processing (ICIP) (Piscataway, NJ, USA: IEEE), 345349. doi:10.1109/
icip.2019.8802998
Maeda, N. (1985). A Method for Reading and Checking Phase Time in Auto-
Processing System of Seismic Wave Data. J. Seismol. Soc. Jpn. 38, 365379.
doi:10.4294/zisin1948.38.3_365
Mousavi, S. M., Ellsworth, W. L., Zhu, W., Chuang, L. Y., and Beroza, G. C. (2020).
Earthquake Transformer-An Attentive Deep-Learning Model for Simultaneous
Earthquake Detection and Phase Picking. Nat. Commun. 11, 3952. doi:10.1038/
s41467-020-17591-w
[Dataset] Mousavi, S. M., Sheng, Y., Zhu, W., and Beroza, G. C. (2019). Stanford
Earthquake Dataset (Stead): A Global Data Set of Seismic Signals for Ai. IEEE
Access 7, 179464179476. doi:10.1109/ACCESS.2019.2947848
Paoletti,M.E.,Haut,J.M.,Tao,X.,Plaza,J.,andPlaza,A.(2020).Flop-reduction
through Memory Allocations within Cnn for Hyperspectral Image Classication.
IEEE Trans. Geoscience Remote Sens. 59, 5938. doi:10.1109/TGRS.2020.3024730
Ross, Z. E., Yue, Y., Meier, M. A., Hauksson, E., and Heaton, T. H. (2019).
Phaselink: A Deep Learning Approach to Seismic Phase Association. J. Geophys.
Res. Solid Earth 124, 856869. doi:10.1029/2018jb016674
Van Dyk, D. A., and Meng, X.-L. (2001). The Art of Data Augmentation.
J. Comput. Graph. Statistics 10, 150. doi:10.1198/10618600152418584
Wong,A.(2019).Netscore: Towards Universal Metrics for Large-Scale Performance
Analysis of Deep Neural Networks for Practical On-Device Edge Usage,in
International Conference on Image Analysis and Recognition (Berlin, Germany:
Springer), 1526. doi:10.1007/978-3-030-27272-2_2
Zhou, Y., Yue, H., Kong, Q., and Zhou, S. (2019). Hybrid Event Detection and
Phase-Picking Algorithm Using Convolutional and Recurrent Neural
Networks. Seismol. Res. Lett. 90, 10791087. doi:10.1785/0220180319
Zhu, L., Peng, Z., McClellan, J., Li, C., Yao, D., Li, Z., et al. (2019). Deep Learning for Seismic
Phase Detection and Picking in the Aftershock Zone of 2008 M7.9 Wenchuan
Earthquake. Phys. Earth Planet. Interiors 293, 106261. doi:10.1016/j.pepi.2019.05.004
Conict of Interest: Author JY was employed by FACDAMM.
The remaining authors declare that the research was conducted in the absence of
any commercial or nancial relationships that could be construed as a potential
conict of interest.
Publishers Note: All claims expressed in this article are solely those of the authors
and do not necessarily represent those of their afliated organizations, or those of
the publisher, the editors, and the reviewers. Any product that may be evaluated in
this article, or claim that may be made by its manufacturer, is not guaranteed or
endorsed by the publisher.
Copyright © 2022 Lim, Jung, JeGal, Jung, Yoo, Gahm and Song. This is an open-
access article distributed under the terms of the Creative Commons Attribution License
(CC BY). The use, distribution or reproduction in other forums is permitted, provided the
original author(s) and the copyright owner(s) are credited and that the original
publication in this journal is cited, in accordance with accepted academic practice.
No use, distribution or reproduction is permitted which does not comply with these terms.
Frontiers in Earth Science | www.frontiersin.org July 2022 | Volume 10 | Article 8482377
Lim et al. Light Earthquake Deep Neural Network
... While instances of simultaneous application of ResNet and LSTM models can be found in existing literature, their specific implementations vary. For instance, Lim et al. [43] employed ResNet to diminish training time by parameter reduction. Moreover, Karim et al. [44] utilized LSTM fully convolutional networks and ResNet to classify time series data, albeit without forecasting future values. ...
Article
This paper presents a novel approach for predicting the behavior of buckling restrained braces (BRBs) using long short-term memory (LSTM) and residual neural network (ResNet) models. Nonlinear finite element analysis (NLFEA) usually requires 3D modeling of complex parts and assigning various material and simulation properties before conducting the analysis, which can be time-consuming and computationally costly. LSTM networks are particularly well-suited for modeling sequential data, such as time series, and ResNet is a discriminative deep learning-based model that can handle deeper architectures without the issue of vanishing gradient. Developing a general LSTM model requires a comprehensive database for training, cross-validation, and blind testing. Even if such a database were available, concatenating data from different BRB specimens would not be feasible when modeling a data series. Such a method would intertwine the features of the individual specimens, rendering the data useless for machine-learning purposes. On the other hand, a well-trained LSTM can predict cyclic behavior for specimens that resemble the training data. As such, a group of well-trained models can produce superior predictions when used on individually familiar data. Hence, there is a need for a prescreening classification stage. The framework presented herein utilizes an AI classifier (ResNet) that automatically recognizes the BRB specimen type of the input data and directs it to the appropriate LSTM model for prediction. The presented framework demonstrates excellent predictions with an accuracy greater than 99%. It is customizable, adaptable, and scalable to include different BRB specimens. The input data consisted of the displacement of the BRB specimens based on the AISC 341–10 qualification recommendations, while the output was the corresponding force (hysteretic response to cyclic loading). The proposed framework can be valuable in earthquake engineering applications as it demonstrates superior predictions.
... Networks for the Sensor Plane. LEQNet [19] proposes a deep neural network model that uses a deeper bottleneck, recursive structure and depthwise separable CNN which reduces the model size by 87.68% as compared to EQTransformer [22] and have the F1 score of 0.99. However, their model is based on a transformer architecture, which is not available in today's frameworks for embedded devices such as Tensorflow-Lite, due to its complexity and heavyweight nature is for MCUs [6]. ...
Article
Waveforms of artificially induced explosions and collapse events recorded by the seismic network share similarities with natural earthquakes. Failure to identify and screen them in a timely manner can introduce confusion into the earthquake catalog established using these recordings, thereby impacting future seismological research. Therefore, the identification and separation of natural earthquakes from continuous seismic signals contribute to the monitoring and early warning of destructive tectonic earthquakes. A 1D convolutional neural network (CNN) is proposed for seismic event classification using an efficient channel attention mechanism and an improved light inception block. A total of 9937 seismic sample records are obtained after waveform interception, filtering, and normalization. The proposed model can obtain better classification performance than other major existing methods, exhibiting 96.79% overall classification accuracy and 96.73%, 94.85%, and 96.35% classification accuracy for natural seismic events, collapse events, and blasting events, respectively. Meanwhile, the proposed model is lighter than the 2D convolutional and common inception networks. We also apply the proposed model to the seismic data recorded at the University of Utah seismograph stations and compare its performance with that of the CNN-waveform model.
Article
Full-text available
Phase picking is a critical task in seismic data processing, where deep learning methods have been applied to enhance its accuracy. While lightweight deep learning networks have been optimized for edge computing devices, there is a lack of networks developed explicitly for embedded systems. This paper presents a seismic phase picking model, a hybrid network integrating convolutional neural networks and Transformer, designed for embedded systems. Optimizing network parameters and computational resources, the model significantly reduces resource consumption while guaranteeing accuracy. It employs a multi-branch architecture. Specifically, the global branch employs a modified self-attention mechanism, effectively extracting global features through shared contextual information. The local branch retains local information from the input features. Such a multi-branch architecture facilitates effective interaction between global features and local details, thereby more efficiently capturing the relationships among features. The model can be configured into variants with different sizes to match various embedded systems. This research evaluated the model using the Stanford Earthquake Dataset, achieving a precision of 99.9% for the P-phase and 99.3% for the S-phase. On Raspberry Pi, the model reduced inference time by 58.1% compared to the earthquake transformer while maintaining comparable detection performance.
Article
Full-text available
The importance of monitoring earthquakes for disaster management, public safety, and scientific research can hardly be overstated. The emergence of low-cost seismic sensors offers potential for widespread deployment due to their affordability. Nevertheless, vehicular noise in low-cost seismic sensors presents as a significant challenge in urban environments where such sensors are often deployed. In order to address these challenges, this work proposes the use of an amalgamated deep neural network constituent of a DNN trained on earthquake signals from professional sensory equipment as well as a DNN trained on vehicular signals from low-cost sensors for the purpose of earthquake identification in signals from low-cost sensors contaminated with vehicular noise. To this end, we present low-cost seismic sensory equipment and three discrete datasets that—when the proposed methodology is applied—are shown to significantly outperform a generic stochastic differential model in terms of effectiveness and efficiency.
Article
Full-text available
Convolutional neural networks (CNNs) have proven to be a powerful tool for the classification of hyperspectral images (HSIs). The CNN kernels are able to naturally include spatial information to smooth out the spectral variability and the noise present in HSI data. However, these kernels are composed of a large number of learning parameters that must be correctly adjusted to achieve good performance. This forces the model to consume a large amount of training data, being prone to overfitting when limited labeled samples are available. In addition, the execution of kernels is computationally very expensive, increasing quadratically with respect to the size of the convolution filter. This significantly reduces the performance of the model. To overcome the aforementioned limitations, this work presents a new few-parameter CNN (based on shift operations) for HSI classification that dramatically reduces both the number of parameters and the computational complexity of the model in terms of floating-point operations (FLOPs). The operational module combines a shift kernel (which adjusts the input data in particular directions without involving any parameters nor FLOPs) with pointwise convolutions that perform the feature extraction stage. The newly developed shift-based CNN has been employed to conduct HSI classification over five widely used and challenging data sets, achieving very promising results in terms of computational performance and classification accuracy.
Article
Full-text available
Earthquake signal detection and seismic phase picking are challenging tasks in the processing of noisy data and the monitoring of microearthquakes. Here we present a global deep-learning model for simultaneous earthquake detection and phase picking. Performing these two related tasks in tandem improves model performance in each individual task by combining information in phases and in the full waveform of earthquake signals by using a hierarchical attention mechanism. We show that our model outperforms previous deep-learning and traditional phase-picking and detection algorithms. Applying our model to 5 weeks of continuous data recorded during 2000 Tottori earthquakes in Japan, we were able to detect and locate two times more earthquakes using only a portion (less than 1/3) of seismic stations. Our model picks P and S phases with precision close to manual picks by human analysts; however, its high efficiency and higher sensitivity can result in detecting and characterizing more and smaller events. The authors here present a deep learning model that simultaneously detects earthquake signals and measures seismic-phase arrival times. The model performs particularly well for cases with high background noise and the challenging task of picking the S wave arrival.
Article
Full-text available
Seismology is a data rich and data-driven science. Application of machine learning for gaining new insights from seismic data is a rapidly evolving sub-field of seismology. The availability of a large amount of seismic data and computational resources, together with the development of advanced techniques can foster more robust models and algorithms to process and analyze seismic signals. Known examples or labeled data sets, are the essential requisite for building supervised models. Seismology has labeled data, but the reliability of those labels is highly variable, and the lack of high-quality labeled data sets to serve as ground truth as well as the lack of standard benchmarks are obstacles to more rapid progress. In this paper we present a high-quality, large-scale, and global data set of local earthquake and non-earthquake signals recorded by seismic instruments. The data set in its current state contains two categories: (1) local earthquake waveforms (recorded at “local” distances within 350 km of earthquakes) and (2) seismic noise waveforms that are free of earthquake signals. Together these data comprise ∼ 1.2 million time series or more than 19,000 hours of seismic signal recordings. Constructing such a large-scale database with reliable labels is a challenging task. Here, we present the properties of the data set, describe the data collection, quality control procedures, and processing steps we undertook to insure accurate labeling, and discuss potential applications. We hope that the scale and accuracy of STEAD presents new and unparalleled opportunities to researchers in the seismological community and beyond.
Article
Full-text available
The increasing volume of seismic data from long-term continuous monitoring motivates the development of algorithms based on convolutional neural network (CNN) for faster and more reliable phase detection and picking. However, many less studied regions lack a significant amount of labeled events needed for traditional CNN approaches. In this paper, we present a CNN-based Phase-Identification Classifier (CPIC) designed for phase detection and picking on small to medium sized training datasets. When trained on 30,146 labeled phases and applied to one-month of continuous recordings during the aftershock sequences of the 2008 M_W 7.9 Wenchuan Earthquake in Sichuan, China, CPIC detects 97.5% of the manually picked phases in the standard catalog and predicts their arrival times with a five-times improvement over the ObsPy AR picker. In addition, unlike other CNN-based approaches that require millions of training samples, when the off-line training set size of CPIC is reduced to only a few thousand training samples the accuracy stays above 95%. The online implementation of CPIC takes less than 12 hours to pick arrivals in 31-day recordings on 14 stations. In addition to the catalog phases manually picked by analysts, CPIC finds more phases for existing events and new events missed in the catalog. Among those additional detections, some are confirmed by a matched filter method while others require further investigation. Finally, when tested on a small dataset from a different region (Oklahoma, US), CPIC achieves 97% accuracy after fine tuning only the fully connected layer of the model. This result suggests that the CPIC developed in this study can be used to identify and pick P/S arrivals in other regions with no or minimum labeled phases.
Article
Full-text available
We developed a hybrid algorithm using both convolutional and recurrent neural networks (CNNs and RNNs, respectively) to pick phases from archived continuous waveforms in two steps. First, an eight‐layer CNN is trained to detect earthquake events from 30‐second‐long three‐component seismograms. The event seismograms are then sent to a two‐layer bidirectional RNN to pick P‐ and S‐arrival times. The data for training and validation and testing of the networks are obtained from the continuous waveforms of 16 stations recording the aftershock sequence of the 2008 Wenchuan earthquake. The augmented training set has 135,966 P–S‐wave arrival‐time pairs. The CNN achieved 94% and 98% hit rate for event and noise segments in the test set, respectively. The RNN picking accuracies for P and S waves are −0.03±0.48 (mean error ± standard deviation) and 0.03±0.56 s⁠, respectively.
Article
Full-text available
Seismic phase association is a fundamental task in seismology that pertains to linking together phase detections on different sensors that originate from a common earthquake. It is widely employed to detect earthquakes on permanent and temporary seismic networks and underlies most seismicity catalogs produced around the world. This task can be challenging because the number of sources is unknown, events frequently overlap in time, or can occur simultaneously in different parts of a network. We present PhaseLink, a framework based on recent advances in deep learning for grid‐free earthquake phase association. Our approach learns to link phases together that share a common origin and is trained entirely on millions of synthetic sequences of P and S wave arrival times generated using a 1‐D velocity model. Our approach is simple to implement for any tectonic regime, suitable for real‐time processing, and can naturally incorporate errors in arrival time picks. Rather than tuning a set of ad hoc hyperparameters to improve performance, PhaseLink can be improved by simply adding examples of problematic cases to the training data set. We demonstrate the state‐of‐the‐art performance of PhaseLink on a challenging sequence from southern California and synthesized sequences from Japan designed to test the point at which the method fails. For the examined data sets, PhaseLink can precisely associate phases to events that occur only ∼12 s apart in origin time. This approach is expected to improve the resolution of seismicity catalogs, add stability to real‐time seismic monitoring, and streamline automated processing of large seismic data sets.
Chapter
Much of the focus in the design of deep neural networks has been on improving accuracy, leading to more powerful yet highly complex network architectures that are difficult to deploy in practical scenarios, particularly on edge devices such as mobile and other consumer devices given their high computational and memory requirements. As a result, there has been a recent interest in the design of quantitative metrics for evaluating deep neural networks that accounts for more than just model accuracy as the sole indicator of network performance. In this study, we continue the conversation towards universal metrics for evaluating the performance of deep neural networks for practical on-device edge usage. In particular, we propose a new balanced metric called NetScore, which is designed specifically to provide a quantitative assessment of the balance between accuracy, computational complexity, and network architecture complexity of a deep neural network, which is important for on-device edge operation. In what is one of the largest comparative analysis between deep neural networks in literature, the NetScore metric, the top-1 accuracy metric, and the popular information density metric were compared across a diverse set of 60 different deep convolutional neural networks for image classification on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC 2012) dataset. The evaluation results across these three metrics for this diverse set of networks are presented in this study to act as a reference guide for practitioners in the field. The proposed NetScore metric, along with the other tested metrics, are by no means perfect, but the hope is to push the conversation towards better universal metrics for evaluating deep neural networks for use in practical on-device edge scenarios to help guide practitioners in model design for such scenarios.