Analysis and Prediction of Earthquakes using different
Machine Learning techniques
Manaswi Mondol
University of Twente
P.O. Box 217, 7500AE Enschede
The Netherlands
m.mondol@student.utwente.nl
ABSTRACT
A reliable and accurate method for earthquake prediction
has the potential to save countless human lives. With that
objective in mind, this paper looks into various methods to
predict the magnitude and depth of earthquakes. In this
paper, real-world earthquake data is analysed to identify
patterns and gain insight into this natural calamity. This
data is then used to train four machine learning models, namely random forest, linear regression, polynomial regression, and Long Short-Term Memory, for predicting the magnitude and depth of earthquakes. The performances are compared to find the most effective model. It is very difficult to accurately predict the magnitude of earthquakes; however, the results presented here show that polynomial regression gives the best overall performance, while random forests are remarkably effective in predicting the depth of an earthquake.
Keywords
Earthquake Prediction, Machine Learning, Regression Analysis, Random Forest, Linear Regression, Polynomial Regression, Long Short-Term Memory
1. INTRODUCTION
Earthquakes are devastating natural calamities responsi-
ble for widespread death and destruction throughout the
world. It is impossible to prevent earthquakes, and their occurrence does not follow any noticeable pattern. A reasonably accurate and timely prediction of an earthquake
can potentially save thousands of lives and also prevent
damage to property. In the last two decades, countless
studies have been conducted on the topic of earthquake
prediction. Predicting an earthquake involves stating the
exact time, magnitude and location of an upcoming earth-
quake. This prediction can be classified into short-term,
intermediate and long-term prediction based on the dura-
tion of the time scale. This problem is extremely difficult to solve; indeed, in 1997 Geller et al. [5] concluded that predicting earthquakes is impossible. Over time, however, research has shown that earthquake prediction is not completely out of reach. In this paper, some of the studies that have been successful in developing reasonably accurate prediction models are discussed [2, 8, 10, 3, 9]. Nevertheless, in spite of such great effort by scientists throughout the world, a valid and reliable method for making perfectly accurate predictions has not yet been found.
this paper, the general problem of earthquake prediction
is narrowed down to predicting the magnitude and depth
of an earthquake.
In this paper, the following research question will be answered:

How can machine learning techniques be used to predict the approximate magnitude and depth of earthquakes by analysing earthquake data?

In order to answer this question, the data set is first analysed, different machine learning models are then trained, and finally their performances are compared. Consequently, the following sub-research questions will be answered.

Which parameters of earthquake data are relevant for this research?

What insights can be obtained from analysing the data?

Which machine learning techniques are most effective and produce the best results?
This research attempts to analyse earthquake data and
predict the magnitude and depth of earthquakes. The
analysis of earthquake data performed here helps to iden-
tify the important attributes of the data which can be
used for prediction. The analysis also provides some in-
teresting insights like the existence of outliers, the most
frequent earthquake magnitude, and the distribution of
earthquakes on the world map based on the location and
magnitude. Four machine learning models were trained
using the data set namely: random forest, linear regres-
sion, polynomial regression, and Long Short Term Mem-
ory. The results from the different models are compared
to identify the best performing model.
In section 2, the different machine learning models used
in this research are briefly explained. Prior research con-
ducted on this topic is also explored in this section. In
section 3, exploratory data analysis and data cleaning is
performed on the data set to make sure that relevant data
is used to train the machine learning models. In section
4, the machine learning models are trained using the data, and cross-validation is performed to find the best hyperparameters. In section 5, the results from the different mod-
els are evaluated and compared to find the best-performing
machine learning model.
2. BACKGROUND
2.1 Regression Analysis
The magnitude and depth of an earthquake being numer-
ical data, regression analysis will be used for the purpose
of prediction. Regression analysis is a statistical method
used for finding relationships between independent and de-
pendent variables. Regression analysis can be used for
prediction and forecasting. In this paper, regression anal-
ysis will be used to predict the magnitude and depth of
earthquakes. Linear regression is the most widely used re-
gression algorithm where a linear model is fitted to explain
the relationship between the independent and dependent
variables[1]. When linear regression has to be performed
for two or more dependent variables, multiple linear re-
gression is used [1]. Multiple linear regression will be used
in this research to predict both magnitude and depth.
Sometimes, a straight line is unable to properly explain the relationship between the dependent and independent variables. In such cases, fitting a polynomial of degree n to the data can perform significantly better and help model
non-linear relationships. This is referred to as polynomial
regression and can be viewed as an extension of the simple
linear regression model [1]. Other Non-linear regression
models like random forests and neural networks will also
be explored.
2.2 Random Forest
Decision trees are non-parametric supervised machine learn-
ing techniques used for classification and regression. Ran-
dom forest is a supervised machine learning algorithm
based on decision trees but has significantly better per-
formance. Random forest builds a large number of deci-
sion trees and merges them through bagging to produce
more accurate and stable predictions[4]. Random forests
are very versatile as they are effective for both classifi-
cation and regression problems. A common problem for machine learning algorithms is overfitting of the data; however, this problem can be largely mitigated in a random forest model if a sufficient number of decision trees is used [4]. For these reasons, a random forest model is used in this research.
2.3 LSTM
Neural networks are modelled after the human brain and
contain sensory units called neurons which are connected
to each other through weights[15]. Neural networks are
very effective in modelling complex non-linear relation-
ships. Recurrent neural networks are a type of neural
network that is capable of tracking information from previ-
ous states or sequences[15]. RNNs are very effective when
dealing with sequential data. LSTM, or Long Short-Term Memory, is a recurrent neural network architecture that solves the vanishing gradient problem [7]. An LSTM cell contains four neural networks: three control gates, namely the input, output, and forget gates, and a fourth network used to estimate the memory cell parameter [15]. The introduction of these gates improves the learning capability and the memory capacity of an LSTM network, which makes LSTM networks very useful when dealing with large sequential data.
2.4 Literature Review
The occurrence of earthquakes is a highly random phe-
nomenon and so the existence of a model that can ac-
curately predict the time, location and magnitude of an
earthquake is not known. Several studies have been con-
ducted throughout the years by researchers in this field.
Researchers have tried to approach this problem from dif-
ferent perspectives to find a potential solution. In this
section, some of these studies which are relevant and help-
ful for this research will be discussed.
Since this research involves predicting the magnitude and
depth of earthquakes, the paper written by Mallouhy et al.
[10] serves as a good starting point. In this paper, eight
different machine learning algorithms are implemented to
predict if an earthquake event can be classified as major or
minor. This paper helps in understanding how different machine learning algorithms behave when applied to earthquake data. Random forest and the K-nearest-neighbour algorithm showed the best results compared to the other models [10].
There are two general approaches to predicting earthquakes, namely precursor-based and trend-based [11]. Precursors are phenomena or signals that precede an impending earthquake, e.g., radon gas emissions and unusual animal behaviour. A great example of precursor-based prediction
can be found in the paper by Kuyuk et al. [8]. In this
paper, they explain how they were able to train a Long
Short Term Memory (LSTM) to analyze information col-
lected from Earthquake Early Warning systems to detect
earthquakes before the seismic waves reach the centre of
a city or a densely populated area[8]. This can poten-
tially save many lives as being aware of an earthquake even
a bit earlier can drastically improve evacuation measures
[8]. Trend-based methods deal with identifying patterns in real-world data, e.g., seismicity and prior earthquakes, to predict earthquakes. This research also involves trend-based prediction, so finding similar studies proves helpful.
In their paper, Asim et al.[2] predicted the magnitude of
earthquakes in the Hindukush region using historic seis-
mic data. They applied four machine learning algorithms to the data set, namely a pattern recognition neural network, a recurrent neural network (LSTM), a random forest with 50 trees, and a linear programming boost ensemble [2]. Since the study focused on predicting earthquakes in the same location, the pattern recognition neural network showed the best results, with an accuracy of 65% [2].
Li et al. [9] explore the possibility of predicting aftershocks with a magnitude greater than 4.0 by proposing a new model called PR-KNN, a combination of the polynomial regression method and the K-nearest-neighbours (KNN) algorithm. This proposed method was applied to experimental aftershock data from the Wenchuan earthquake. PR-KNN achieved noticeably better results than
the traditional KNN and Distance-weighted KNN algo-
rithms.
Lastly, in the study conducted by Bhandarkar et al., earthquake trends were predicted using Long Short-Term Memory (LSTM) and a Feed Forward Neural Network (FFNN), and the results were compared [3]. That study is very similar to this one and provides valuable information regarding the performance of LSTM networks in this context. Its results show that an LSTM network is 59% more effective than an FFNN [3].
3. METHODOLOGY
The United States Geological Survey (USGS) provides
real-world earthquake data on past earthquakes. The USGS
classifies “significant events” based on a combination of the magnitude of an earthquake, the number of ‘Did You Feel It’ responses, and the PAGER alert level [14]. The decision to
work with only significant earthquakes was motivated by
two reasons namely, the time constraint and the reduction
in the volume of the data used to train the machine learn-
ing models. The data set used for this research was ob-
tained from www.kaggle.com/usgs/earthquake-database
which contains earthquake data collected from the USGS
website by Kaggle. This data set includes a record of the
date, time, location, depth, magnitude, and source of ev-
ery earthquake with a reported magnitude of 5.5 or higher
from 1965 until 2016.
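As a minimal sketch of the setup with pandas (the file name database.csv reflects how the Kaggle data set is distributed and is an assumption; adjust the path as needed):

```python
import pandas as pd

# Load the Kaggle/USGS significant-earthquake catalogue (1965-2016).
df = pd.read_csv("database.csv")

print(df.shape)  # expected: 23412 rows, one per recorded earthquake
```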
3.1 Data Cleaning
The data set contains the following attributes describ-
ing earthquakes - ‘Date’, ‘Time’, ‘Latitude’, ‘Longitude’,
‘Type’, ‘Depth’, ‘Depth Error’, ‘Depth Seismic Stations’,
‘Magnitude’, ‘Magnitude Type’, ‘Magnitude Error’, ‘Mag-
nitude Seismic Stations’, ‘Azimuthal Gap’, ‘Horizontal Dis-
tance’, ‘Horizontal Error’, ‘Root Mean Square’, ‘ID’, ‘Source’,
‘Location Source’, ‘Magnitude Source’, and ‘Status’.
Out of these attributes ‘Date’, ‘Time’, ‘Latitude’, ‘Lon-
gitude’, ‘Type’, ‘Depth’, ‘Magnitude’, ‘Magnitude Type’,
‘ID’, ‘Source’, ‘Location Source’, ‘Magnitude Source’, and
‘Status’ have no null values. The attributes ‘Magnitude
Type’, ‘ID’, ‘Source’, ‘Location Source’, ‘Magnitude Source’,
and ‘Status’ are categorical variables and thus have no
significance in regression analysis. Hence, the attributes
‘Date’, ‘Time’, ‘Latitude’, ‘Longitude’, ‘Magnitude’, and
‘Depth’ will be used for regression analysis in this research.
The built-in functions of the pandas library for Python were used to identify the number of null and non-null values for each attribute. A detailed overview can be found in table 1.
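A minimal sketch of this check, assuming the catalogue is loaded as in the previous sketch:

```python
import pandas as pd

df = pd.read_csv("database.csv")

# Non-null and null counts per attribute; these are the values in table 1.
print(df.notnull().sum())
print(df.isnull().sum())

# Restrict to the attributes used for regression analysis.
data = df[["Date", "Time", "Latitude", "Longitude", "Magnitude", "Depth"]]
```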
3.2 Exploratory Data Analysis
A total of 23412 earthquakes are recorded in this data set.
By analysing the magnitude types of different earthquakes
it can be seen that 99.5% of all the earthquakes fall under
six categories, as observed in table 2 [13]. The different magnitude types can be explained as follows:
MW (Moment Magnitude) - Derived from seismic
moment (most common and general type).
MWC (Moment Centroid) - Derived from a centroid
moment tensor inversion of long-period surface waves.
MWB (Moment Body Wave) - Derived from a cen-
troid moment tensor inversion of body waves.
MB (Moment Short Body Wave) - Derived from a
centroid moment tensor inversion of short-period body
waves.
MWW (Moment W-phase) - Derived from a centroid
moment tensor inversion of the W-phase.
MS (Moment Surface Wave) - Derived from a cen-
troid moment tensor inversion of surface waves.
It makes sense to study basic statistics for the attributes magnitude and depth; the relevant values can be found in table 3. First, the outliers in the data set are detected based on the magnitude and depth values. Outlier detection can be done using the interquartile range (IQR) and the 3-sigma rule.

The IQR is the difference between the 75th (Q3) and 25th (Q1) percentiles. Outliers are defined as observations that fall below Q1 − 1.5·IQR or above Q3 + 1.5·IQR.
Earthquakes with a magnitude outside the range (4.99, 6.60) are considered outliers; this can be seen in figure 1. The earthquake with the highest magnitude (9.1 on the Richter scale) in this data set was recorded on December 26, 2004 and is known as the Sumatra–Andaman earthquake [13].
Figure 1. Boxplot: Earthquake magnitude
Figure 2. Boxplot: Earthquake depth
Earthquakes with a depth outside the range (−44.69, 113.21) are considered outliers; this can be seen in figure 2.
The 3-sigma rule, or the empirical rule, states that 68%, 95%, and 99.7% of the values in a data set lie within one, two, and three standard deviations of the mean, respectively. Thus, the absolute value of the z-score of the magnitude and depth of an earthquake should be less than 3. Using the 3-sigma rule, 1050 and 447 earthquakes in this data set are classified as outliers based on their magnitude and depth values, respectively.
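Both outlier checks can be reproduced with a short sketch; this is an illustration under the assumption that the data is loaded as before, not the paper's exact code:

```python
import pandas as pd

df = pd.read_csv("database.csv")

def iqr_bounds(series):
    # Tukey's rule: values outside (Q1 - 1.5*IQR, Q3 + 1.5*IQR) are outliers.
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

for col in ["Magnitude", "Depth"]:
    lo, hi = iqr_bounds(df[col])
    print(col, "IQR bounds:", (round(lo, 2), round(hi, 2)))

    # 3-sigma rule: an absolute z-score of 3 or more marks an outlier.
    z = (df[col] - df[col].mean()) / df[col].std()
    print(col, "3-sigma outliers:", int((z.abs() >= 3).sum()))
```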
In figures 3 and 4, the number of occurrences of earthquakes with varying magnitudes is plotted.
Table 1. Data set null values

Attribute                    Non-Null   Null
Date                         23412      0
Time                         23412      0
Latitude                     23412      0
Longitude                    23412      0
Depth                        23412      0
Depth Error                  4461       18951
Depth Seismic Stations       7097       16315
Magnitude                    23412      0
Magnitude Type               23409      3
Magnitude Error              327        23085
Magnitude Seismic Stations   2564       20848
Azimuthal Gap                7299       16113
Horizontal Distance          1604       21808
Horizontal Error             1156       22256
Root Mean Square             17352      6060
ID                           23412      0
Source                       23412      0
Location Source              23412      0
Magnitude Source             23412      0
Status                       23412      0
Figure 3. Earthquake Magnitude vs Number of Occurrences
Figure 4. Earthquake Magnitude vs Number of Occurrences
From figure 4, it is clearly visible that most earthquakes have a magnitude between 5.5 and 6.6. In figure 3, it can be seen that relatively few earthquakes have a magnitude higher than 6.6, which is consistent with the boxplot in figure 1.
Figure 5. Number of Earthquakes in each year
Table 2. Different Magnitude Types

Magnitude Type   MW   MWC    MB   MWB   MWW   MS    Others
Percentage (%)   33   24.2   16   10.5  8.5   7.3   0.5
Table 3. Magnitude and Depth statistics

           count   mean    std     min    25%    50%   75%   max
Magnitude  23412   5.88    0.42    5.5    5.6    5.7   6.0   9.1
Depth      23412   70.77   122.65  -1.1   14.5   33    54    700
3.3 Understanding the Data
The number of earthquakes that occurred each year from
1965 to 2016 can provide some important insight into the
data. In figure 5, a count plot shows the number of earth-
quakes in each year. From this count plot it can be seen
that in the last 50 years the highest number of earthquakes
were recorded in 2011 with 713 earthquakes followed by
2007 when a total of 608 earthquakes were recorded. In
1996, the lowest number of earthquakes were detected with
a total of 234.
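A sketch of this yearly count, assuming the catalogue's 'Date' column; the errors="coerce" flag is a defensive assumption for the handful of irregular timestamps in the Kaggle file:

```python
import pandas as pd

df = pd.read_csv("database.csv")

# Extract the year of each event; irregular timestamps become NaT.
year = pd.to_datetime(df["Date"], errors="coerce").dt.year

counts = year.value_counts().sort_index()
print(counts.idxmax(), counts.max())  # busiest year
print(counts.idxmin(), counts.min())  # quietest year
```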
It is useful to know whether the two attributes magnitude and depth are correlated and, if possible, to identify the underlying relationship. The correlation coefficient between the magnitude and depth of the earthquakes recorded in the data set is 0.0234. A correlation coefficient approaching 0 indicates that the two attributes have no correlation at all, so it can be concluded that, in the context of this data set, magnitude and depth are not correlated.
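This check is a one-liner with pandas; a minimal sketch, assuming the data is loaded as before:

```python
import pandas as pd

df = pd.read_csv("database.csv")

# Pearson correlation between magnitude and depth; a value near 0
# indicates no linear relationship between the two attributes.
print(df["Magnitude"].corr(df["Depth"]))  # approx. 0.0234 for this data set
```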
An earthquake can occur anywhere between 0 and 700 km below the Earth's surface and is categorized as shallow, intermediate or deep based on its depth. Earthquakes can be detected on land and also on the ocean floor. However, earthquakes of similar magnitude can have varying depths depending on whether the detected location is below the ocean surface or on land.
Out of the 23412 earthquakes in the data set, 21937 earthquakes have a magnitude lower than 6.6 and the remaining 1475 earthquakes have a magnitude higher than 6.6. This information can be visually represented by plot-
ting the latitude and longitude of different earthquakes on
a world map. This provides a better context to the data
and helps understand how different earthquakes are dis-
tributed throughout the Earth based on their source.
Figure 6. Earthquakes with magnitude <6.6
Figure 7. Earthquakes with magnitude >6.6
In figure 6, the earthquakes with magnitude less than 6.6 are shown, represented by green markers. In figure 7, the earthquakes with magnitude greater than 6.6 are shown, represented by red markers.
4. EXPERIMENT
The following four models were trained using the data:
1. Random forest
2. Linear regression
3. Polynomial regression
4. Long Short Term Memory
The date and time of occurrence of an earthquake do not follow a pattern, and the interval between two subsequent earthquakes is never the same. Hence, the data cannot be considered a time series. For all models, the input is the latitude and longitude of the earthquake, and the output is the magnitude and depth. The data is split into training and testing sets, with the training set containing 80% and the testing set containing 20% of the data.
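A minimal sketch of this input/output setup and the 80/20 split with scikit-learn; the random_state value is an arbitrary choice for reproducibility, not taken from the paper:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("database.csv")

X = df[["Latitude", "Longitude"]]   # model inputs
y = df[["Magnitude", "Depth"]]      # prediction targets

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
```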
4.1 Random Forest
The built-in random forest regression function from the scikit-learn library is used. The model is initially trained with 10 decision trees, as this is the default value.
In supervised machine learning, when a model performs very well on the training data but is unable to perform similarly on the testing data, the situation is called overfitting. This problem can be mitigated by finding the optimal values of the hyperparameters of a model. A hyperparameter is a parameter whose value is used to control the learning process of a model. In the case of random forest, the hyperparameter tuned here is the number of decision trees generated during the learning process. The built-in ‘GridSearchCV’ function in scikit-learn is used for hyperparameter tuning.
Since the default number of trees is 10, the model is further trained with the number of decision trees ranging from 20 to 200 in steps of 10. After running this process, it is observed that the model shows the best performance with 120 decision trees.
Random forest uses decision trees for regression and hence does not require feature scaling: the algorithm partitions the data set, so even if feature scaling such as min-max normalization is applied, the results remain unchanged.
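A sketch of the tuning procedure described above, using RandomForestRegressor and GridSearchCV; the cv and scoring settings are assumptions, since the paper does not state them, and X_train/y_train come from the split sketched earlier:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Number of trees from 20 to 200 in steps of 10, as described above.
param_grid = {"n_estimators": list(range(20, 201, 10))}

grid = GridSearchCV(RandomForestRegressor(random_state=42),
                    param_grid, cv=5, scoring="r2")
grid.fit(X_train, y_train)

print(grid.best_params_)  # the paper reports 120 trees as the best value
preds = grid.best_estimator_.predict(X_test)
```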
4.2 Linear Regression and Polynomial Regression
A linear regression model is trained on the data using the built-in functions of scikit-learn. Linear regression and polynomial regression are sensitive to the scale of the input values. For this reason feature scaling is important: the attributes in the data set are measured in different units and vary across wide ranges, which could otherwise introduce a bias. Min-max normalization is applied to bring all values into the range (0, 1).
For polynomial regression, the relationship between the dependent and independent variables is modelled with a polynomial of degree n. The degree parameter is varied from 2 to 20, and a polynomial of degree 16 shows the best results.
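A minimal sketch of both models as scikit-learn pipelines, treating polynomial regression as a polynomial feature expansion followed by a linear fit; the pipeline structure is an illustration, not the paper's exact code:

```python
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures

# Multiple linear regression with min-max scaling.
linear = make_pipeline(MinMaxScaler(), LinearRegression())
linear.fit(X_train, y_train)

# Polynomial regression: expand the features to degree 16, then fit linearly.
poly = make_pipeline(MinMaxScaler(), PolynomialFeatures(degree=16),
                     LinearRegression())
poly.fit(X_train, y_train)

print(linear.score(X_test, y_test), poly.score(X_test, y_test))  # R^2 scores
```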
4.3 Long Short Term Memory
The Keras library is used to implement the LSTM model in this research. As in the previous models, the magnitude and depth of an earthquake are predicted using regression analysis. Min-max normalization is applied to the data set. The data set is split into training and testing sets containing 80% and 20% of the data respectively, with 10% of the data set used for validation. For an LSTM network, a lookback value has to be set: the number of previous inputs the model takes into account when predicting the next value. The model is tested with lookback values of 50 and 100.
The Stacked LSTM is an extension to a standard LSTM
model which can have multiple hidden LSTM layers[6].
The addition of hidden layers makes the model deeper
and helps identify the complex relationship between mul-
tiple variables in a data set [6]. The network used in this research has a visible input layer, two hidden LSTM layers with 100 neurons each and a sigmoid activation function, and an output layer that predicts 4 values. The linear activation function is used in the output layer, as it returns the weighted sum of the inputs unchanged.
Dropout is a regularization technique for neural network models in which randomly selected neurons are ignored during training [12]. This makes the model less sensitive to individual neuron weights and also helps prevent overfitting [12]. Dropout with a rate of 20% (i.e., 20% of the neurons are discarded in each weight update cycle) is implemented in this model using the built-in dropout layer in Keras.
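A minimal Keras sketch of the stacked architecture described above; the input shape, optimizer, and training epochs are assumptions, as the paper does not state them:

```python
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.models import Sequential

lookback, n_features, n_outputs = 50, 4, 4  # assumed shapes; lookback per the text

model = Sequential([
    # Two stacked hidden LSTM layers with 100 units and sigmoid activation.
    LSTM(100, activation="sigmoid", return_sequences=True,
         input_shape=(lookback, n_features)),
    Dropout(0.2),  # 20% of the units are dropped during training
    LSTM(100, activation="sigmoid"),
    Dropout(0.2),
    Dense(n_outputs, activation="linear"),  # output layer predicting 4 values
])

model.compile(optimizer="adam", loss="mse")
model.summary()
# model.fit(X_seq, y_seq, validation_split=0.1, epochs=20)  # hypothetical arrays
```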
5. RESULTS
5.1 Performance Metrics
The R² score, explained variance score (ExVar) and mean squared error (MSE) will be used to evaluate the performance of the models implemented in this research.

The R² score, or coefficient of determination, determines the effectiveness of the regression model and is defined by:

R^2(y, \hat{y}) = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}

where \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i and \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \epsilon_i^2; \hat{y} is the predicted value and y is the corresponding true value.

The explained variance score is defined as:

ExVar(y, \hat{y}) = 1 - \frac{Var\{y - \hat{y}\}}{Var\{y\}}

where \hat{y} is the estimated target output, y the corresponding true target output, and Var is the variance, i.e. the square of the standard deviation.

The mean squared error measures the average of the squared errors and is defined as:

MSE(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}} - 1} (y_i - \hat{y}_i)^2

where \hat{y}_i is the predicted output and y_i the true output of the i-th sample.
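All three metrics are available in scikit-learn; a minimal sketch, assuming y_test holds the true targets and preds the predictions of one of the fitted models above:

```python
from sklearn.metrics import (explained_variance_score, mean_squared_error,
                             r2_score)

print("R^2  :", r2_score(y_test, preds))
print("ExVar:", explained_variance_score(y_test, preds))
print("MSE  :", mean_squared_error(y_test, preds))
```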
5.2 Discussion
Table 4 shows the evaluation metrics for all the models
that have been tested and their relative effectiveness when
trying to predict the magnitude and depth of earthquakes.
Figure 8. R² score: Magnitude

Figure 9. R² score: Depth

The R² score and explained variance score determine the
regression score of a model; the best possible score is 1.0. A score of 0.0 indicates that the model always predicts the expected value of the output, disregarding the input features, and a negative value implies that the model performs worse than always predicting the mean. This can be seen from the scores in table 4: both the R² score and the explained variance score are negative when predicting magnitude with random forest and LSTM. Figure 8 shows this, and also shows that linear regression performs better in comparison.
In figure 12, the relative performances of the polynomial regression models of different degrees are compared. The performance gradually increases from degree 2 to degree 12 and then falls abruptly, yet a polynomial of degree 16 shows the best performance. Polynomial regression with a polynomial of degree 16 is also the best performing model overall, showing comparatively better results than the others. For predicting the magnitude of an earthquake, polynomial regression yields the best results, followed by linear regression, random forest, and LSTM. This can be seen in figures 8 and 10, where the R² scores and mean squared errors for magnitude prediction are shown.
However, an interesting result is obtained from the random forest model, which predicts the depth of an earthquake remarkably well. Both the R² score and the explained variance score in this case are 0.8574, very close to the perfect value of 1.0, as seen in figure 9. The mean squared error for depth is also the lowest for random forest when compared to the other models, as seen in figure 11.
6. CONCLUSION
The performance of the four different machine learning models used in this research further supports the argument that it is very difficult to accurately predict earthquakes.
Figure 10. Mean squared error: Magnitude

Figure 11. Mean squared error: Depth

Analysis of the data set provided great insight and helped identify patterns and key attributes relevant for this research. The latitude and longitude of detected earthquakes can be used as independent variables to predict the magnitude and depth of future earthquakes. This can be helpful for estimating the magnitude and depth of earthquakes in seismic zones where the frequency of recurrent earthquakes is high.
Furthermore, this paper presents three important findings:

1. Polynomial regression (with a polynomial of degree 16) is the best method for predicting the magnitude of an earthquake.

2. Random forest can be extremely effective in predicting the depth of earthquakes.

3. Polynomial regression is the overall best performing model.
For further improvement, seismic data collected from seismographs positioned all around the world could be analysed and used to improve the models implemented here. Also, given the nature of polynomial regression and the size of the data set used, there is a high probability of overfitting in the polynomial regression model used in this research; addressing this is another topic for further work.
7. REFERENCES
[1] Andrews, D. F. A robust method for multiple
linear regression. Technometrics 16, 4 (1974),
523–531.
[2] Asim, K., Martínez-Álvarez, F., Basit, A., and Iqbal, T. Earthquake magnitude prediction in Hindukush region using machine learning techniques. Natural Hazards 85 (01 2017), 471–486.
[3] Bhandarkar, T., K, V., Satish, N., Sridhar, S.,
Sivakumar, R., and Ghosh, S. Earthquake trend
prediction using long short-term memory rnn.
International Journal of Electrical and Computer
Engineering (IJECE) 9 (04 2019), 1304.
[4] Breiman, L. Random forests. Mach. Learn. 45, 1
(Oct. 2001), 5–32.
Figure 12. Polynomial Regression Performance Comparison
Table 4. Performance: Polynomial regression, Linear regression, Random forest, and LSTM

Model           R² (Mag)   R² (Depth)   ExVar (Mag)   ExVar (Depth)   MSE (Mag)   MSE (Depth)
Poly Reg        0.0132     0.1416       0.0132        0.1420          0.1809      6862
Linear Reg      0.0013     0.0102       0.0013        0.0103          0.1831      14308
Random Forest   -0.1207    0.8574       -0.1207       0.8574          0.2055      2061
LSTM            -0.1553    -0.2275      -0.0223       -0.0006         0.2131      5350
[5] Geller, R., Jackson, D., Kagan, Y., and Mulargia, F. Earthquakes cannot be predicted. Science 275 (1997), 1616–1617.
[6] Graves, A., Mohamed, A.-r., and Hinton, G.
Speech recognition with deep recurrent neural
networks. In 2013 IEEE International Conference
on Acoustics, Speech and Signal Processing (2013),
pp. 6645–6649.
[7] Hochreiter, S., and Schmidhuber, J. Long
short-term memory. Neural computation 9 (12
1997), 1735–80.
[8] Kuyuk, H. S., and Susumu, O. Real-time
classification of earthquake using deep learning.
Procedia Computer Science 140 (2018), 298–305.
[9] Li, A., and Kang, L. Knn-based modeling and its
application in aftershock prediction. In Proceedings
of the 2009 International Asia Symposium on
Intelligent Interaction and Affective Computing
(USA, 2009), ASIA ’09, IEEE Computer Society,
p. 83–86.
[10] Mallouhy, R., Abou Jaoude, C., Guyeux, C.,
and Makhoul, A. Major earthquake event
prediction using various machine learning
algorithms. In International Conference on
Information and Communication Technologies for
Disaster Management (Paris, France, Dec. 2019).
[11] Shearer, P. M. Introduction to Seismology, 2 ed.
Cambridge University Press, 2009.
[12] Srivastava, N., Hinton, G., Krizhevsky, A.,
Sutskever, I., and Salakhutdinov, R. Dropout:
A simple way to prevent neural networks from
overfitting. Journal of Machine Learning Research
15 (06 2014), 1929–1958.
[13] USGS. Earthquake hazards. www.usgs.gov/natural-hazards/earthquake-hazards/earthquakes. Accessed: 2021-06-17.

[14] USGS. Significant earthquakes - 2021. https://earthquake.usgs.gov/earthquakes/browse/significant.php. Accessed: 2021-06-17.
[15] Zhang, A., Lipton, Z. C., Li, M., and Smola,
A. J. Dive into Deep Learning. 2019.
http://www.d2l.ai.
APPENDIX
A. PERFORMANCE METRICS
Table 5. Performance Ranking: Random forest

Rank          1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17   18   19   20
No of Trees   120  200  190  180  170  160  150  140  110  30   20   10   60   130  50   70   80   40   100  90
Table 6. LSTM: 2 hidden layers with 50 neurons (lookback = 50)

LSTM    R² (Mag)   R² (Depth)   ExVar (Mag)   ExVar (Depth)   MSE (Mag)   MSE (Depth)
50-50   -0.3830    -0.1911      -0.3324       -0.0008         0.3152      7476
B. FIGURES
Figure 13. Scatter plot: Magnitude vs Depth