Analysis and Prediction of Earthquakes using different
Machine Learning techniques
Manaswi Mondol
University of Twente
P.O. Box 217, 7500AE Enschede
The Netherlands
m.mondol@student.utwente.nl
ABSTRACT
A reliable and accurate method for earthquake prediction
has the potential to save countless human lives. With that
objective in mind, this paper looks into various methods to
predict the magnitude and depth of earthquakes. In this
paper, real-world earthquake data is analysed to identify
patterns and gain insight into this natural calamity. This
data is then used to train four machine learning models, namely random forest, linear regression, polynomial regression, and Long Short-Term Memory, for predicting the magnitude and depth of earthquakes. The performances are compared to find the most effective model. It is very difficult to accurately predict the magnitude of earthquakes; however, the results presented here show that polynomial regression gives the best overall performance, while random forests are remarkably effective in predicting the depth of an earthquake.
Keywords
Earthquake Prediction, Machine Learning, Regression Analysis, Random Forest, Linear Regression, Polynomial Regression, Long Short-Term Memory
1. INTRODUCTION
Earthquakes are devastating natural calamities responsi-
ble for widespread death and destruction throughout the
world. It is impossible to prevent earthquakes, and their occurrence does not follow any noticeable pattern. A reasonably accurate and timely prediction of an earthquake
can potentially save thousands of lives and also prevent
damage to property. In the last two decades, countless
studies have been conducted on the topic of earthquake
prediction. Predicting an earthquake involves stating the
exact time, magnitude and location of an upcoming earth-
quake. This prediction can be classified into short-term,
intermediate and long-term prediction based on the dura-
tion of the time scale. This problem is extremely difficult to solve; indeed, in 1997 Geller et al. [5] concluded that predicting earthquakes is impossible. Over time, however, research has shown that earthquake prediction is not completely out of reach. In this paper, some of the studies that have been successful in developing reasonably accurate prediction models are discussed [2, 8, 10, 3, 9]. Nevertheless, in spite of such great effort by scientists throughout the world, a valid and reliable method for making perfectly accurate predictions has not yet been found.
this paper, the general problem of earthquake prediction
is narrowed down to predicting the magnitude and depth
of an earthquake.
In this paper, the following research question will be answered:

How can machine learning techniques be used to predict the approximate magnitude and depth of earthquakes by analysing earthquake data?

In order to answer this question, the data set is first analysed, different machine learning models are then trained, and finally their performances are compared. Consequently, the following sub-research questions will be answered.

Which parameters of earthquake data are relevant for this research?

What insights can be obtained from analysing the data?

Which machine learning techniques are most effective and produce the best results?
This research attempts to analyse earthquake data and
predict the magnitude and depth of earthquakes. The
analysis of earthquake data performed here helps to iden-
tify the important attributes of the data which can be
used for prediction. The analysis also provides some in-
teresting insights like the existence of outliers, the most
frequent earthquake magnitude, and the distribution of
earthquakes on the world map based on the location and
magnitude. Four machine learning models were trained
using the data set namely: random forest, linear regres-
sion, polynomial regression, and Long Short Term Mem-
ory. The results from the different models are compared
to identify the best performing model.
In section 2, the different machine learning models used
in this research are briefly explained. Prior research con-
ducted on this topic is also explored in this section. In
section 3, exploratory data analysis and data cleaning is
performed on the data set to make sure that relevant data
is used to train the machine learning models. In section
4, the machine learning models are trained using the data, and cross-validation is performed to find the best hyperparameters. In section 5, the results from the different mod-
els are evaluated and compared to find the best-performing
machine learning model.
2. BACKGROUND
2.1 Regression Analysis
The magnitude and depth of an earthquake being numer-
ical data, regression analysis will be used for the purpose
of prediction. Regression analysis is a statistical method
used for finding relationships between independent and de-
pendent variables. Regression analysis can be used for
prediction and forecasting. In this paper, regression anal-
ysis will be used to predict the magnitude and depth of
earthquakes. Linear regression is the most widely used re-
gression algorithm where a linear model is fitted to explain
the relationship between the independent and dependent
variables[1]. When linear regression has to be performed
for two or more dependent variables, multiple linear re-
gression is used [1]. Multiple linear regression will be used
in this research to predict both magnitude and depth.
Sometimes, a straight line is unable to properly explain the relationship between the dependent and independent variables. In such cases, fitting a polynomial of degree n to the data can perform significantly better and help model
non-linear relationships. This is referred to as polynomial
regression and can be viewed as an extension of the simple
linear regression model [1]. Other Non-linear regression
models like random forests and neural networks will also
be explored.
2.2 Random Forest
Decision trees are non-parametric supervised machine learn-
ing techniques used for classification and regression. Ran-
dom forest is a supervised machine learning algorithm
based on decision trees but has significantly better per-
formance. Random forest builds a large number of deci-
sion trees and merges them through bagging to produce
more accurate and stable predictions[4]. Random forests
are very versatile as they are effective for both classifi-
cation and regression problems. A common problem for machine learning algorithms is overfitting of the data; however, this problem can be largely mitigated in a random forest model if a sufficient number of decision trees is used [4]. For these reasons, a random forest model is used in this research.
2.3 LSTM
Neural networks are modelled after the human brain and
contain sensory units called neurons which are connected
to each other through weights[15]. Neural networks are
very effective in modelling complex non-linear relation-
ships. Recurrent neural networks are a type of neural
network that is capable of tracking information from previ-
ous states or sequences[15]. RNNs are very effective when
dealing with sequential data. LSTM, or Long Short-Term Memory, is a recurrent neural network architecture that solves the vanishing gradient problem [7]. An LSTM cell contains four neural networks: three control gates, namely the input, output, and forget gates, and a fourth network used to estimate the memory cell parameter [15]. The introduction of these gates improves the learning capability and the memory capacity of an LSTM network, which makes LSTM networks very useful when dealing with large sequential data.
2.4 Literature Review
The occurrence of earthquakes is a highly random phe-
nomenon and so the existence of a model that can ac-
curately predict the time, location and magnitude of an
earthquake is not known. Several studies have been con-
ducted throughout the years by researchers in this field.
Researchers have tried to approach this problem from dif-
ferent perspectives to find a potential solution. In this
section, some of these studies which are relevant and help-
ful for this research will be discussed.
Since this research involves predicting the magnitude and
depth of earthquakes, the paper written by Mallouhy et al.
[10] serves as a good starting point. In this paper, eight
different machine learning algorithms are implemented to
predict if an earthquake event can be classified as major or
minor. This paper helps in understanding how different machine learning algorithms behave when applied to earthquake data. Random forest and the K-nearest-neighbour algorithm showed the best results compared to the other models [10].
There are two general approaches to predicting earthquakes, namely precursor-based and trend-based [11]. Precursors are phenomena or signals that precede an impending earthquake, e.g., radon gas emissions and unusual animal behaviour. A great example of precursor-based prediction
can be found in the paper by Kuyuk et al. [8]. In this
paper, they explain how they were able to train a Long
Short Term Memory (LSTM) to analyze information col-
lected from Earthquake Early Warning systems to detect
earthquakes before the seismic waves reach the centre of
a city or a densely populated area[8]. This can poten-
tially save many lives as being aware of an earthquake even
a bit earlier can drastically improve evacuation measures
[8]. Trend-based methods deal with identifying patterns in real-world data, e.g., seismicity and prior earthquakes, to predict earthquakes. This research also involves trend-based prediction, so finding similar studies proves helpful.
In their paper, Asim et al.[2] predicted the magnitude of
earthquakes in the Hindukush region using historic seis-
mic data. They applied four machine learning algorithms to the data set, namely a pattern recognition neural network, a recurrent neural network (LSTM), a random forest with 50 trees, and a linear programming boost ensemble [2]. Since the study focused on predicting earthquakes in the same location, the pattern recognition neural network showed the best results, with an accuracy of 65% [2].
Li et al. [9] explore the possibility of predicting aftershocks with a magnitude greater than 4.0 by proposing a new model called PR-KNN, a combination of the polynomial regression method and the K-nearest-neighbours (KNN) algorithm. This proposed method was applied to experimental aftershock data from the Wenchuan earthquake. PR-KNN achieved noticeably better results than
the traditional KNN and Distance-weighted KNN algo-
rithms.
Lastly, in the study conducted by Bhandarkar et al., earthquake trends were predicted using Long Short-Term Memory (LSTM) and a Feed Forward Neural Network (FFNN), and the results were compared [3]. That study is very similar to this one and provides valuable information regarding the performance of LSTM networks in this context. Its results show that an LSTM network is 59% more effective than an FFNN [3].
3. METHODOLOGY
The United States Geological Survey (USGS) provides
real-world earthquake data on past earthquakes. The USGS
classifies “significant events” based on a combination of the magnitude of an earthquake, the number of ‘Did You Feel It’ responses, and the PAGER alert level [14]. The decision to
work with only significant earthquakes was motivated by
two reasons namely, the time constraint and the reduction
in the volume of the data used to train the machine learn-
ing models. The data set used for this research was ob-
tained from www.kaggle.com/usgs/earthquake-database
which contains earthquake data collected from the USGS
website by Kaggle. This data set includes a record of the
date, time, location, depth, magnitude, and source of ev-
ery earthquake with a reported magnitude of 5.5 or higher
from 1965 until 2016.
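As a minimal sketch of the setup with pandas (the file name database.csv reflects how the Kaggle data set is distributed and is an assumption; adjust the path as needed):

```python
import pandas as pd

# Load the Kaggle/USGS significant-earthquake catalogue (1965-2016).
df = pd.read_csv("database.csv")

print(df.shape)  # expected: 23412 rows, one per recorded earthquake
```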
3.1 Data Cleaning
The data set contains the following attributes describ-
ing earthquakes - ‘Date’, ‘Time’, ‘Latitude’, ‘Longitude’,
‘Type’, ‘Depth’, ‘Depth Error’, ‘Depth Seismic Stations’,
‘Magnitude’, ‘Magnitude Type’, ‘Magnitude Error’, ‘Mag-
nitude Seismic Stations’, ‘Azimuthal Gap’, ‘Horizontal Dis-
tance’, ‘Horizontal Error’, ‘Root Mean Square’, ‘ID’, ‘Source’,
‘Location Source’, ‘Magnitude Source’, and ‘Status’.
Out of these attributes ‘Date’, ‘Time’, ‘Latitude’, ‘Lon-
gitude’, ‘Type’, ‘Depth’, ‘Magnitude’, ‘Magnitude Type’,
‘ID’, ‘Source’, ‘Location Source’, ‘Magnitude Source’, and
‘Status’ have no null values. The attributes ‘Magnitude
Type’, ‘ID’, ‘Source’, ‘Location Source’, ‘Magnitude Source’,
and ‘Status’ are categorical variables and thus have no
significance in regression analysis. Hence, the attributes
‘Date’, ‘Time’, ‘Latitude’, ‘Longitude’, ‘Magnitude’, and
‘Depth’ will be used for regression analysis in this research.
The built-in functions of the pandas library for Python were used to identify the number of null and non-null values for each attribute. A detailed overview can be found in table 1.
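A minimal sketch of this check, assuming the catalogue is loaded as in the previous sketch:

```python
import pandas as pd

df = pd.read_csv("database.csv")

# Non-null and null counts per attribute; these are the values in table 1.
print(df.notnull().sum())
print(df.isnull().sum())

# Restrict to the attributes used for regression analysis.
data = df[["Date", "Time", "Latitude", "Longitude", "Magnitude", "Depth"]]
```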
3.2 Exploratory Data Analysis
A total of 23412 earthquakes are recorded in this data set.
By analysing the magnitude types of different earthquakes
it can be seen that 99.5% of all the earthquakes fall under
six categories, as observed in table 2 [13]. The different magnitude types can be explained as follows:
MW (Moment Magnitude) - Derived from seismic
moment (most common and general type).
MWC (Moment Centroid) - Derived from a centroid
moment tensor inversion of long-period surface waves.
MWB (Moment Body Wave) - Derived from a cen-
troid moment tensor inversion of body waves.
MB (Moment Short Body Wave) - Derived from a
centroid moment tensor inversion of short-period body
waves.
MWW (Moment W-phase) - Derived from a centroid
moment tensor inversion of the W-phase.
MS (Moment Surface Wave) - Derived from a cen-
troid moment tensor inversion of surface waves.
It makes sense to study basic statistics for the attributes magnitude and depth; the relevant values can be found in table 3. First, the outliers in the data set are detected based on the magnitude and depth values. Outlier detection can be done using the interquartile range (IQR) and the 3-sigma rule.

The IQR is the difference between the 75th (Q3) and 25th (Q1) percentiles. Outliers are defined as observations that fall below Q1 − 1.5·IQR or above Q3 + 1.5·IQR.
Earthquakes with a magnitude outside the range (4.99, 6.60) are considered outliers; this can be seen in figure 1. The earthquake with the highest magnitude (9.1 on the Richter scale) in this data set was recorded on December 26, 2004 and is known as the Sumatra–Andaman earthquake [13].
Figure 1. Boxplot: Earthquake magnitude
Figure 2. Boxplot: Earthquake depth
Earthquakes with a depth outside the range (−44.69, 113.21) are considered outliers; this can be seen in figure 2.
The 3-sigma rule, or the empirical rule, states that 68%, 95%, and 99.7% of the values in a data set lie within one, two, and three standard deviations of the mean, respectively. Thus, the absolute value of the z-score of the magnitude and depth of an earthquake should be less than 3. Using the 3-sigma rule, 1050 and 447 earthquakes in this data set are classified as outliers based on their magnitude and depth values, respectively.
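Both outlier checks can be reproduced with a short sketch; this is an illustration under the assumption that the data is loaded as before, not the paper's exact code:

```python
import pandas as pd

df = pd.read_csv("database.csv")

def iqr_bounds(series):
    # Tukey's rule: values outside (Q1 - 1.5*IQR, Q3 + 1.5*IQR) are outliers.
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

for col in ["Magnitude", "Depth"]:
    lo, hi = iqr_bounds(df[col])
    print(col, "IQR bounds:", (round(lo, 2), round(hi, 2)))

    # 3-sigma rule: an absolute z-score of 3 or more marks an outlier.
    z = (df[col] - df[col].mean()) / df[col].std()
    print(col, "3-sigma outliers:", int((z.abs() >= 3).sum()))
```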
In figures 3 and 4, the number of occurrences of earthquakes with varying magnitudes is plotted.
Table 1. Data set null values

Attribute                    Non-Null   Null
Date                         23412      0
Time                         23412      0
Latitude                     23412      0
Longitude                    23412      0
Depth                        23412      0
Depth Error                  4461       18951
Depth Seismic Stations       7097       16315
Magnitude                    23412      0
Magnitude Type               23409      3
Magnitude Error              327        23085
Magnitude Seismic Stations   2564       20848
Azimuthal Gap                7299       16113
Horizontal Distance          1604       21808
Horizontal Error             1156       22256
Root Mean Square             17352      6060
ID                           23412      0
Source                       23412      0
Location Source              23412      0
Magnitude Source             23412      0
Status                       23412      0
Figure 3. Earthquake Magnitude vs Number of Occurrences
Figure 4. Earthquake Magnitude vs Number of Occurrences
From figure 4, it is clearly visible that most earthquakes have a magnitude between 5.5 and 6.6. In figure 3, it can be seen that relatively few earthquakes have a magnitude higher than 6.6, which is consistent with the boxplot in figure 1.
Figure 5. Number of Earthquakes in each year
Table 2. Different Magnitude Types

Magnitude Type   MW   MWC    MB   MWB   MWW   MS    Others
Percentage (%)   33   24.2   16   10.5  8.5   7.3   0.5
Table 3. Magnitude and Depth statistics

           count   mean    std     min    25%    50%   75%   max
Magnitude  23412   5.88    0.42    5.5    5.6    5.7   6.0   9.1
Depth      23412   70.77   122.65  -1.1   14.5   33    54    700
3.3 Understanding the Data
The number of earthquakes that occurred each year from
1965 to 2016 can provide some important insight into the
data. In figure 5, a count plot shows the number of earth-
quakes in each year. From this count plot it can be seen
that in the last 50 years the highest number of earthquakes
were recorded in 2011 with 713 earthquakes followed by
2007 when a total of 608 earthquakes were recorded. In
1996, the lowest number of earthquakes were detected with
a total of 234.
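A sketch of this yearly count, assuming the catalogue's 'Date' column; the errors="coerce" flag is a defensive assumption for the handful of irregular timestamps in the Kaggle file:

```python
import pandas as pd

df = pd.read_csv("database.csv")

# Extract the year of each event; irregular timestamps become NaT.
year = pd.to_datetime(df["Date"], errors="coerce").dt.year

counts = year.value_counts().sort_index()
print(counts.idxmax(), counts.max())  # busiest year
print(counts.idxmin(), counts.min())  # quietest year
```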
It is useful to know whether the two attributes magnitude and depth are correlated and, if possible, to identify the underlying relationship. The correlation coefficient between the magnitude and depth of the earthquakes recorded in the data set is 0.0234. A correlation coefficient approaching 0 indicates that the two attributes have no correlation at all, so it can be concluded that, in the context of this data set, magnitude and depth are not correlated.
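This check is a one-liner with pandas; a minimal sketch, assuming the data is loaded as before:

```python
import pandas as pd

df = pd.read_csv("database.csv")

# Pearson correlation between magnitude and depth; a value near 0
# indicates no linear relationship between the two attributes.
print(df["Magnitude"].corr(df["Depth"]))  # approx. 0.0234 for this data set
```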
An earthquake can occur anywhere between 0 and 700 km below the Earth's surface and is categorized as shallow, intermediate or deep based on its depth. Earthquakes can be detected on land and also on the ocean floor. However, earthquakes of similar magnitude can have varying depths depending on whether the detected location is below the ocean surface or on land.
Out of the 23412 earthquakes in the data set, 21937 earthquakes have a magnitude lower than 6.6 and the remaining 1475 earthquakes have a magnitude higher than 6.6. This information can be visually represented by plot-
ting the latitude and longitude of different earthquakes on
a world map. This provides a better context to the data
and helps understand how different earthquakes are dis-
tributed throughout the Earth based on their source.
Figure 6. Earthquakes with magnitude <6.6
Figure 7. Earthquakes with magnitude >6.6
In figure 6, the earthquakes with magnitude less than 6.6 are shown, represented by green markers. In figure 7, the earthquakes with magnitude greater than 6.6 are shown, represented by red markers.
4. EXPERIMENT
The following four models were trained using the data:
1. Random forest
2. Linear regression
3. Polynomial regression
4. Long Short Term Memory
The date and time of occurrence of an earthquake do not follow a pattern, and the interval between two subsequent earthquakes is never the same. Hence, the data cannot be considered a time series. For all models, the input is the latitude and longitude of the earthquake, and the output is the magnitude and depth. The data is split into training and testing sets, with the training set containing 80% and the testing set containing 20% of the data.
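A minimal sketch of this input/output setup and the 80/20 split with scikit-learn; the random_state value is an arbitrary choice for reproducibility, not taken from the paper:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("database.csv")

X = df[["Latitude", "Longitude"]]   # model inputs
y = df[["Magnitude", "Depth"]]      # prediction targets

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
```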
4.1 Random Forest
The built-in random forest regression function from the scikit-learn library is used. The model is initially trained with 10 decision trees, as this is the default value.
In supervised machine learning, when a model performs very well on the training data but is unable to perform similarly on the testing data, the situation is called overfitting. This problem can be mitigated by finding the optimal values of the hyperparameters of a model. A hyperparameter is a parameter whose value is used to control the learning process of a model. In the case of random forest, the hyperparameter tuned here is the number of decision trees generated during the learning process. The built-in ‘GridSearchCV’ function in scikit-learn is used for hyperparameter tuning.
Since the default number of trees is 10, the model is further trained with the number of decision trees ranging from 20 to 200 in steps of 10. After running this process, it is observed that the model shows the best performance with 120 decision trees.
Random forest uses decision trees for regression and hence does not require feature scaling: the algorithm partitions the data set, so even if feature scaling such as min-max normalization is applied, the results remain unchanged.
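A sketch of the tuning procedure described above, using RandomForestRegressor and GridSearchCV; the cv and scoring settings are assumptions, since the paper does not state them, and X_train/y_train come from the split sketched earlier:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Number of trees from 20 to 200 in steps of 10, as described above.
param_grid = {"n_estimators": list(range(20, 201, 10))}

grid = GridSearchCV(RandomForestRegressor(random_state=42),
                    param_grid, cv=5, scoring="r2")
grid.fit(X_train, y_train)

print(grid.best_params_)  # the paper reports 120 trees as the best value
preds = grid.best_estimator_.predict(X_test)
```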
4.2 Linear Regression and Polynomial Regression
A linear regression model is trained on the data using the built-in functions of scikit-learn. Linear regression and polynomial regression are sensitive to the scale of the input values. For this reason feature scaling is important: the attributes in the data set are measured in different units and vary across wide ranges, which could otherwise introduce a bias. Min-max normalization is applied to bring all values into the range (0, 1).
For polynomial regression, the relationship between the dependent and independent variables is modelled with a polynomial of degree n. The degree parameter is varied from 2 to 20, and a polynomial of degree 16 shows the best results.
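A minimal sketch of both models as scikit-learn pipelines, treating polynomial regression as a polynomial feature expansion followed by a linear fit; the pipeline structure is an illustration, not the paper's exact code:

```python
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures

# Multiple linear regression with min-max scaling.
linear = make_pipeline(MinMaxScaler(), LinearRegression())
linear.fit(X_train, y_train)

# Polynomial regression: expand the features to degree 16, then fit linearly.
poly = make_pipeline(MinMaxScaler(), PolynomialFeatures(degree=16),
                     LinearRegression())
poly.fit(X_train, y_train)

print(linear.score(X_test, y_test), poly.score(X_test, y_test))  # R^2 scores
```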
4.3 Long Short Term Memory
The Keras library is used to implement the LSTM model in this research. As in the previous models, the magnitude and depth of an earthquake are predicted using regression analysis. Min-max normalization is applied to the data set. The data set is split into training and testing sets containing 80% and 20% of the data respectively, with 10% of the data set used for validation. For an LSTM network, a lookback value has to be set: the number of previous inputs the model takes into account when predicting the next value. The model is tested with lookback values of 50 and 100.
The Stacked LSTM is an extension to a standard LSTM
model which can have multiple hidden LSTM layers[6].
The addition of hidden layers makes the model deeper
and helps identify the complex relationship between mul-
tiple variables in a data set [6]. The network used in this research has a visible input layer, two hidden LSTM layers with 100 neurons each and a sigmoid activation function, and an output layer that predicts 4 values. The linear activation function is used in the output layer, as it returns the weighted sum of the inputs unchanged.
Dropout is a regularization technique for neural network models in which randomly selected neurons are ignored during training [12]. This makes the model less sensitive to individual neuron weights and also helps prevent overfitting [12]. Dropout with a rate of 20% (i.e., 20% of the neurons are discarded in each weight update cycle) is implemented in this model using the built-in dropout layer in Keras.
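A minimal Keras sketch of the stacked architecture described above; the input shape, optimizer, and training epochs are assumptions, as the paper does not state them:

```python
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.models import Sequential

lookback, n_features, n_outputs = 50, 4, 4  # assumed shapes; lookback per the text

model = Sequential([
    # Two stacked hidden LSTM layers with 100 units and sigmoid activation.
    LSTM(100, activation="sigmoid", return_sequences=True,
         input_shape=(lookback, n_features)),
    Dropout(0.2),  # 20% of the units are dropped during training
    LSTM(100, activation="sigmoid"),
    Dropout(0.2),
    Dense(n_outputs, activation="linear"),  # output layer predicting 4 values
])

model.compile(optimizer="adam", loss="mse")
model.summary()
# model.fit(X_seq, y_seq, validation_split=0.1, epochs=20)  # hypothetical arrays
```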
5. RESULTS
5.1 Performance Metrics
The R² score, explained variance score (ExVar) and mean squared error (MSE) will be used to evaluate the performance of the models implemented in this research.

The R² score, or coefficient of determination, determines the effectiveness of the regression model and is defined by:

R^2(y, \hat{y}) = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}

where \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i and \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \epsilon_i^2; \hat{y} is the predicted value and y is the corresponding true value.

The explained variance score is defined as:

ExVar(y, \hat{y}) = 1 - \frac{Var\{y - \hat{y}\}}{Var\{y\}}

where \hat{y} is the estimated target output, y the corresponding true target output, and Var is the variance, i.e. the square of the standard deviation.

The mean squared error measures the average of the squared errors and is defined as:

MSE(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}} - 1} (y_i - \hat{y}_i)^2

where \hat{y}_i is the predicted output and y_i the true output of the i-th sample.
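All three metrics are available in scikit-learn; a minimal sketch, assuming y_test holds the true targets and preds the predictions of one of the fitted models above:

```python
from sklearn.metrics import (explained_variance_score, mean_squared_error,
                             r2_score)

print("R^2  :", r2_score(y_test, preds))
print("ExVar:", explained_variance_score(y_test, preds))
print("MSE  :", mean_squared_error(y_test, preds))
```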
5.2 Discussion
Table 4 shows the evaluation metrics for all the models
that have been tested and their relative effectiveness when
trying to predict the magnitude and depth of earthquakes.
Figure 8. R² score: Magnitude

Figure 9. R² score: Depth

The R² score and explained variance score determine the
regression score of a model; the best possible score is 1.0. A score of 0.0 indicates that the model always predicts the expected value of the output, disregarding the input features, and a negative value implies that the model performs worse than always predicting the mean. This can be seen from the scores in table 4: both the R² score and the explained variance score are negative when predicting magnitude with random forest and LSTM. Figure 8 shows this, and also shows that linear regression performs better in comparison.
In figure 12, the relative performances of the polynomial regression models of different degrees are compared. The performance gradually increases from degree 2 to degree 12 and then falls abruptly, yet a polynomial of degree 16 shows the best performance. Polynomial regression with a polynomial of degree 16 is also the best performing model overall, showing comparatively better results than the others. For predicting the magnitude of an earthquake, polynomial regression yields the best results, followed by linear regression, random forest, and LSTM. This can be seen in figures 8 and 10, where the R² scores and mean squared errors for magnitude prediction are shown.
However, an interesting result is obtained from the random forest model, which predicts the depth of an earthquake remarkably well. Both the R² score and the explained variance score in this case are 0.8574, very close to the perfect value of 1.0, as seen in figure 9. The mean squared error for depth is also the lowest for random forest when compared to the other models, as seen in figure 11.
6. CONCLUSION
The performance of the four different machine learning models used in this research further supports the argument that it is very difficult to accurately predict earthquakes.
Figure 10. Mean squared error: Magnitude

Figure 11. Mean squared error: Depth

Analysis of the data set provided great insight and helped identify patterns and key attributes relevant for this research. The latitude and longitude of detected earthquakes can be used as independent variables to predict the magnitude and depth of future earthquakes. This can be helpful for estimating the magnitude and depth of earthquakes in seismic zones where the frequency of recurrent earthquakes is high.
Furthermore, this paper presents three important findings:

1. Polynomial regression (with a polynomial of degree 16) is the best method for predicting the magnitude of an earthquake.

2. Random forest can be extremely effective in predicting the depth of earthquakes.

3. Polynomial regression is the overall best performing model.
For further improvement, seismic data collected from seismographs positioned all around the world could be analysed and used to improve the models implemented here. Also, given the nature of polynomial regression and the size of the data set used, there is a high probability of overfitting in the polynomial regression model used in this research; addressing this is another topic for further work.
7. REFERENCES
[1] Andrews, D. F. A robust method for multiple
linear regression. Technometrics 16, 4 (1974),
523–531.
[2] Asim, K., Martínez-Álvarez, F., Basit, A., and Iqbal, T. Earthquake magnitude prediction in Hindukush region using machine learning techniques. Natural Hazards 85 (01 2017), 471–486.
[3] Bhandarkar, T., K, V., Satish, N., Sridhar, S.,
Sivakumar, R., and Ghosh, S. Earthquake trend
prediction using long short-term memory rnn.
International Journal of Electrical and Computer
Engineering (IJECE) 9 (04 2019), 1304.
[4] Breiman, L. Random forests. Mach. Learn. 45, 1
(Oct. 2001), 5–32.
Figure 12. Polynomial Regression Performance Comparison
Table 4. Performance: Polynomial regression, Linear regression, Random forest, and LSTM

Model           R² (Mag)   R² (Depth)   ExVar (Mag)   ExVar (Depth)   MSE (Mag)   MSE (Depth)
Poly Reg        0.0132     0.1416       0.0132        0.1420          0.1809      6862
Linear Reg      0.0013     0.0102       0.0013        0.0103          0.1831      14308
Random Forest   -0.1207    0.8574       -0.1207       0.8574          0.2055      2061
LSTM            -0.1553    -0.2275      -0.0223       -0.0006         0.2131      5350
[5] Geller, R., Jackson, D., Kagan, Y., and Mulargia, F. Earthquakes cannot be predicted. Science 275 (1997), 1616–1617.
[6] Graves, A., Mohamed, A.-r., and Hinton, G.
Speech recognition with deep recurrent neural
networks. In 2013 IEEE International Conference
on Acoustics, Speech and Signal Processing (2013),
pp. 6645–6649.
[7] Hochreiter, S., and Schmidhuber, J. Long
short-term memory. Neural computation 9 (12
1997), 1735–80.
[8] Kuyuk, H. S., and Susumu, O. Real-time
classification of earthquake using deep learning.
Procedia Computer Science 140 (2018), 298–305.
[9] Li, A., and Kang, L. Knn-based modeling and its
application in aftershock prediction. In Proceedings
of the 2009 International Asia Symposium on
Intelligent Interaction and Affective Computing
(USA, 2009), ASIA ’09, IEEE Computer Society,
p. 83–86.
[10] Mallouhy, R., Abou Jaoude, C., Guyeux, C.,
and Makhoul, A. Major earthquake event
prediction using various machine learning
algorithms. In International Conference on
Information and Communication Technologies for
Disaster Management (Paris, France, Dec. 2019).
[11] Shearer, P. M. Introduction to Seismology, 2 ed.
Cambridge University Press, 2009.
[12] Srivastava, N., Hinton, G., Krizhevsky, A.,
Sutskever, I., and Salakhutdinov, R. Dropout:
A simple way to prevent neural networks from
overfitting. Journal of Machine Learning Research
15 (06 2014), 1929–1958.
[13] USGS. Earthquake hazards. www.usgs.gov/natural-hazards/earthquake-hazards/earthquakes. Accessed: 2021-06-17.

[14] USGS. Significant earthquakes - 2021. https://earthquake.usgs.gov/earthquakes/browse/significant.php. Accessed: 2021-06-17.
[15] Zhang, A., Lipton, Z. C., Li, M., and Smola,
A. J. Dive into Deep Learning. 2019.
http://www.d2l.ai.
APPENDIX
A. PERFORMANCE METRICS
Table 5. Performance Ranking: Random forest

Rank          1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17   18   19   20
No of Trees   120  200  190  180  170  160  150  140  110  30   20   10   60   130  50   70   80   40   100  90
Table 6. LSTM: 2 hidden layers with 50 neurons (lookback = 50)

LSTM    R² (Mag)   R² (Depth)   ExVar (Mag)   ExVar (Depth)   MSE (Mag)   MSE (Depth)
50-50   -0.3830    -0.1911      -0.3324       -0.0008         0.3152      7476
B. FIGURES
Figure 13. Scatter plot: Magnitude vs Depth