
A relational database to support post-earthquake building damage and recovery assessment

Authors:

Abstract

Systematically collected and curated data sets from historical events provide a strong basis for simulating the physical and functional effects of natural hazards on the built environment. This article develops a relational database to support post-earthquake damage and recovery modeling of building portfolios. The current version of the database has been populated with information on the 3695 buildings affected by the 2014 South Napa, California, earthquake. The associated data categories include general building characteristics, site properties and shaking intensities, building damage and repair permitting (timing and type) information, and census-block-level sociodemographics. The Napa data set can be used to validate post-earthquake recovery simulation methodologies and explore the effectiveness of different modeling techniques in predicting damage. The database can be expanded to include other earthquakes and the overall framework can be adapted to other types of natural hazards (e.g. hurricanes, flooding).
Data Paper
Earthquake Spectra, 1–21
© The Author(s) 2022
Article reuse guidelines: sagepub.com/journals-permissions
DOI: 10.1177/87552930211061167
journals.sagepub.com/home/eqs
Morolake Omoya, M.EERI¹, Itohan Ero¹, Mohsen Zaker Esteghamati, M.EERI², Henry V Burton, M.EERI¹, Scott Brandenberg¹, Han Sun¹, Zhengxiang Yi¹, Hua Kang¹, and Chukuebuka C Nweke¹
Keywords
Relational database, post-earthquake damage and recovery assessment, 2014 South
Napa earthquake
Date received: 1 February 2021; accepted: 28 October 2021
¹ Department of Civil & Environmental Engineering, The University of California, Los Angeles, Los Angeles, CA, USA
² Department of Civil and Environmental Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
Corresponding author:
Morolake Omoya, Department of Civil & Environmental Engineering, The University of California, Los Angeles, Los
Angeles, CA 90095, USA.
Email: morolake@ucla.edu
Introduction
Post-earthquake damage and recovery assessment of building portfolios is essential to seismic risk mitigation and resilience planning. Developing prediction models to estimate the damage to buildings is a necessary step in quantifying the socioeconomic impacts of an earthquake. For example, the probable state of damage to a building conditioned on the ground shaking intensity at its site is a key input into regional seismic loss estimation models. Damage assessments are also useful for quantifying the societal cost of earthquakes such as the loss of housing and essential services (e.g. education, healthcare, emergency operations) and the need for temporary building facilities (e.g. temporary shelter). The current focus on seismic resilience has also underscored the importance of post-earthquake recovery models, which are useful for quantifying the initial (immediately following an event) and cumulative (during the period following an event) loss of functionality in building portfolios.
Like many other types of engineering models, the ones used to assess post-earthquake damage and recovery rely on empirical data from buildings that have been subjected to historical seismic events. The HAZUS methodology (Federal Emergency Management Agency (FEMA), 2003a), which represents the current state of the art in regional earthquake impact assessment, utilizes a hybrid or semi-empirical model for estimating building damage. More specifically, while the relationship between ground shaking and building response is based on simplified engineering models (i.e. the advanced engineering building module), the damage conditioned on the response (i.e. via fragility functions) is largely informed by empirical data. The second-generation performance-based earthquake engineering (PBEE) framework (FEMA, 2012), which is increasingly being used for regional seismic impact assessments, utilizes more sophisticated (relative to HAZUS) mechanics-based simulations to quantify structural response. However, similar to HAZUS, the fragility functions that link the engineering demand parameters from response history analysis to component damage are informed by empirical data. In addition, researchers have begun to explore the use of machine learning models to predict earthquake-induced building damage (e.g. Mangalathu and Burton, 2019; Mangalathu et al., 2020). Unlike the HAZUS and PBEE methods, these models are exclusively reliant on empirical data.
Because of the complex dynamic interactions that they attempt to simulate, mathematical models of post-earthquake recovery (building-specific or portfolio) are heavily reliant on empirical data. One type of input to such models consists of the variables that capture the duration of recovery-related activities (e.g., time to inspection, time to obtain financing and any necessary permits, repair time), which are inescapably linked to empirical observations. As a specific example from structural engineering practice, the resilience-based earthquake design initiative (REDi) (Almufti and Willford, 2013) framework is used by structural engineering practitioners to estimate building and portfolio-scale post-earthquake recovery. The framework specifies probability distributions for the times associated with so-called "impeding factors" (e.g., time to inspection, permit and finance acquisition, and engineering and contractor mobilization) that are based on empirical data. In the research literature, empirical data have been used to evaluate (Kang et al., 2018) and calibrate (Miles and Chang, 2007) post-earthquake recovery models. Besides the duration-based variables, sociodemographic data for the affected population have been used as inputs into post-earthquake recovery models that seek to capture stakeholder decision-making (Burton et al., 2019) and their influence on building recovery trajectories (Kang et al., 2018).
This article presents a relational database (RDB) to support post-earthquake damage and recovery modeling for building portfolios and includes site characteristics, damage, recovery, and sociodemographic data for buildings affected by the 2014 South Napa Earthquake. The database is publicly available (Omoya et al., 2021) and is hosted on the DesignSafe cyberinfrastructure (Rathje et al., 2017). The structure of the database is such that it can be expanded in the future to include other events. The subsequent section provides a high-level summary of the information contained and sources used to develop the data set. A detailed description of the data is then presented, followed by a discussion of the structure of the RDB, including the schema, a description of the tables, attributes, and relationships. The tools that have been developed to facilitate querying the data are also presented along with some illustrative examples. Specific examples of prior and possible future applications and extensions of the data set are also discussed.
Summary and sources used to develop the database
The data set includes information that is relevant to building damage and recovery assessments. The assembled data are from the 2014 South Napa earthquake. A high-level summary of the data set is presented in Table 1, which includes the relevant sources, categories, and number of observations for each data type. The data categories for this event include sociodemographic data, general information about each building and its respective site and shaking intensity, and damage and recovery data that are specific to that event. Prior empirical research has demonstrated the influence of sociodemographics on the pace and effectiveness of disaster recovery (e.g., Elliott, 2015; Kang et al., 2018; Zhang and Peacock, 2009). All of the demographic information in Table 1 was obtained from census data (United States Census, 2020).
The general building (e.g., number of stories, construction year, building value) and site (location) information are relevant to both post-earthquake damage and recovery assessments. The time-averaged shear wave velocity to a 30 m depth (VS30) for each site, which is an important parameter in ground motion models (e.g., Boore et al., 2014), was obtained indirectly based on correlations with simplified geologic units (Wills et al., 2015). VS30 has also been used as a feature in machine learning–based building damage prediction models (e.g., Mangalathu et al., 2020). The site class for each building location is determined using the United States Geological Survey (USGS, 2020) application program interface based on the VS30 and criteria suggested by FEMA 450-1 (FEMA, 2003b). A study by Boatwright et al. (2015) found that the extent of building damage during the 2014 South Napa earthquake was strongly correlated with the underlying sedimentary basin. For this reason, three basin depth parameters are provided at each site, namely the vertical distances from the ground surface to the shear wave velocity isosurfaces corresponding to 1.0 km/s (z1.0), 1.5 km/s (z1.5), and 2.5 km/s (z2.5) (Brocher et al., 2006). The shaking intensity at each site, in terms of the spectral acceleration at 0.3 s (Sa0.3s) and peak ground acceleration (pga), is obtained through interpolation using the kriging algorithm (Mangalathu et al., 2020) and 381 pairs of horizontal ground motion component recordings from the 2014 earthquake (Center for Engineering Strong Motion Data (CESMD), 2018).
The details of the in-person assessments (i.e., inspection date, damage descriptions, ATC-20 tag) performed by building professional volunteers were obtained from the Earthquake Engineering Research Institute (EERI, 2016) clearinghouse website. These data can be used to develop and/or validate the accuracy of different types of damage assessment models (e.g., Mangalathu and Burton, 2019; Mangalathu et al., 2020) or as inputs (direct or indirect) into post-earthquake recovery models (e.g., Kang et al., 2018). The permit information acquired from the Napa Building Department website (City of Napa, 2020b) can be used to obtain timestamps for the start and completion of specific recovery activities (i.e. permitting and repair). The associated activity durations can be used to reconstruct "observed" recovery trajectories or to calibrate simulation models (e.g. Kang et al., 2018).
Table 1. Summary of the assembled data set and sources for the 2014 South Napa earthquake
Data category | Data type | Number of data points | Data source
Sociodemographics | Percentage of English-speaking households | 3695 | US Census Bureau
 | Percentage of 25 years and older population with high-school diploma | 3695 |
 | Percentage of population that is Hispanic or Latino | 3695 |
 | Percentage of population that is Black or African American | 3695 |
 | Percentage of population that is Asian | 3695 |
 | Percentage of households without children below 18 years of age | 3695 |
 | Per capita income | 3695 |
 | Percentage of owner-occupied housing units | 3695 |
General Building Information | Number of stories | 3559 | Napa County Tax Assessor
 | Floor area | 3575 |
 | Number of units | 2909 |
 | Construction year | 2862 |
 | Building value | 3507 |
 | Number of occupants | - |
 | Occupancy type | 3030 |
 | Construction type (material) | - |
 | Lateral force resisting system | - |
Site Information | Latitude and Longitude | 3695 | EERI Clearinghouse
 | City | 3695 |
 | Basin Depth Parameters | 3695 | Brocher et al. (2006)
 | Time-averaged shear wave velocity to 30 m depth | 3695 | Wills et al. (2015)
 | Site Class | 3695 | FEMA (2003b)
2014 South Napa Earthquake Shaking Intensity Information | Joyner–Boore distance | 3695 | Mangalathu et al. (2020)
 | Spectral acceleration at a period of 0.3 s | 3695 |
 | Peak ground acceleration | 3695 |
2014 South Napa Earthquake Building Damage Information | Observed damage description | 3441 | EERI Clearinghouse
 | Inspection date | 3695 |
 | ATC-20 tag | 3695 |
2014 South Napa Earthquake Building Permit Information | Permit description | 1012 | Napa County Building Department
 | Permit type | 912 |
 | Date of application for permit | 3677 |
 | Date of permit approval | 1062 |
 | Date of completion for permit-related work | 736 |
EERI: Earthquake Engineering Research Institute; FEMA: Federal Emergency Management Agency; ATC: Applied Technology Council.
The database includes fields for the number of occupants, the construction type, and the lateral force resisting system of each building. The data associated with these fields are currently unavailable and therefore not included in the present version of the database. However, the expectation is that these three critical pieces of information will become available and be added to the database in the future.
Description of data
General information and site characteristics
Figure 1 presents a map of the Napa region showing the locations of the buildings included in the data set, which also correspond to the ones inspected and tagged (per Applied Technology Council (ATC)-20) (ATC, 1995) after the 2014 earthquake. The building markers are color-coded based on the year of construction range. Approximately 35% of the buildings are pre-1950 construction and roughly 15% are more than a century old. Only 4.2% have been constructed since the year 2000. The majority of buildings in the data set are single-family residences (67%), with the remainder being equally split between multifamily residences and commercial spaces. This, in part, explains why 69% of the buildings in the data set have a floor area that is less than 5000 ft². According to the Napa County Tax Assessor's website (City of Napa, 2020a), the average property values for single-family residential, multifamily residential, and commercial buildings are $450,000, $1.1 million, and $3.6 million, respectively. The number of stories ranges from one to four, with 96% of the buildings having one or two stories.
Figure 1. Map showing the locations of buildings in the Napa data set.
The VS30 values for the Napa building sites range from approximately 176 to 519 m/s with an average of 319 m/s. Most (72%) buildings are located on Site Class D (stiff) soil and the remainder are on Site Class C (dense soil/soft rock) soil. Only one building in the inventory is located on Site Class E (liquefiable, soft) soil. The average values for the three basin depth parameters are z1.0 = 527 m, z1.5 = 1092 m, and z2.5 = 1303 m. These values suggest that Napa Valley resides on shallower sediment deposits compared with adjacent regions such as the Central Valley (Sacramento, Fresno) (Brocher et al., 2006; Chen and Lee, 2017). This range of site conditions is expected to result in a complex seismic site response where amplification and de-amplification can lead to significant spatial variability of shaking intensity that would also depend on the magnitude and location of the earthquake event (Boore et al., 2014; Seyhan and Stewart, 2014).
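The mapping from VS30 to site class can be illustrated with a short sketch. It assumes the widely used NEHRP VS30 boundaries of 1500, 760, 360, and 180 m/s associated with FEMA 450-1; the function name and the treatment of values that fall exactly on a boundary are illustrative assumptions, and the database itself relies on the USGS application program interface rather than on this code. Site classes that depend on soil profile details rather than VS30 alone are not handled.

```python
def site_class_from_vs30(vs30_m_per_s: float) -> str:
    """Return an approximate NEHRP site class letter for a VS30 value in m/s.

    Assumed boundaries (m/s): A > 1500, B > 760, C > 360, D >= 180, E < 180.
    """
    if vs30_m_per_s <= 0:
        raise ValueError("VS30 must be positive")
    if vs30_m_per_s > 1500:
        return "A"
    if vs30_m_per_s > 760:
        return "B"
    if vs30_m_per_s > 360:
        return "C"
    if vs30_m_per_s >= 180:
        return "D"
    return "E"


# Minimum, mean, and maximum VS30 values reported for the Napa building sites.
napa_examples = {vs30: site_class_from_vs30(vs30) for vs30 in (176, 319, 519)}
```

Under these assumed boundaries, the reported minimum, mean, and maximum VS30 values fall in Site Classes E, D, and C, respectively, which is consistent with the distribution described above.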
Sociodemographics
While sociodemographic factors do not directly affect post-earthquake building recovery, unequal processes that discriminate based on these factors can affect recovery outcomes. Based on the availability of the relevant information and a review of prior studies on the factors that have been shown to be correlated with the pace of disaster recovery (e.g. Elliott, 2015; Kang et al., 2018; Zhang and Peacock, 2009), the following sociodemographic variables are included in the Napa data set:
Percentage of households where English is spoken (HHEng).
Percentage of the population that is 25 years and older that has at least a high-school diploma (PN>25+HS).
Percentage of the total population that is Hispanic or Latino (PNHisp).
Percentage of the total population that is Black or African American (PNBlack).
Percentage of the total population that is Asian (PNAsian).
Percentage of households without individuals below the age of 18 years (HH<18).
Household earnings in the past 12 months (IncAnn).
Percentage of housing units that are owner occupied (HU%Own).
Histograms showing the distribution of the sociodemographic factors at the census block level (Manson et al., 2017) are shown in Figure 2. Histograms are not included for PNBlack and PNAsian because these two demographics are not well represented in Napa County. More specifically, Asian and African American residents make up approximately 0.7% and 3% of the Napa County population, respectively (United States Census, 2020).
Shaking intensity and building damage from the 2014 South Napa earthquake
The spatial distribution of the ATC-20 tags (red, yellow, and green) assigned to each building and the epicenter of the 2014 earthquake are shown in Figure 3. While most of the severe damage appears to be concentrated in the downtown area (as evidenced by the cluster of red tags), buildings as far west as the Browns Valley District and as far east as the Shurtleff neighborhood (as evidenced by the widespread presence of yellow tags) were affected. Only 5.4% of the inspected buildings received red tags, and the percentages of green (46.8%) and yellow (47.8%) tags are approximately equal. The descriptions documented during the field inspections indicated that most of the yellow- and red-tagged buildings suffered some form of chimney damage. Although less prevalent, there was also damage to superstructure and foundation walls (e.g. cripple walls) and fireplaces, and a few buildings suffered partial or total porch collapses.
Recall that the Sa0.3s and pga at each site are determined by applying the kriging algorithm using the strong motion recordings from 381 sites. Figure 4 shows a histogram of the Sa0.3s distribution with each bin disaggregated based on the percentage of buildings assigned red, yellow, and green tags. It is evident that the severity of damage generally increases with shaking intensity. More specifically, Figure 4 shows that as Sa0.3s increases, the percentages of red and yellow tags increase while the percentage of green tags decreases.
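To make the interpolation step concrete, the following is a minimal sketch of ordinary kriging at a single building site. The article does not report the variogram model or its parameters, so the exponential semivariogram, its sill and length scale, the planar station coordinates (in km), and the Sa values are all hypothetical; the small linear solver is included only to keep the example free of external dependencies.

```python
import math


def semivariogram(h, sill=1.0, length=10.0):
    """Exponential semivariogram without a nugget (assumed model and parameters)."""
    return sill * (1.0 - math.exp(-h / length))


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ordinary_kriging(stations, values, target):
    """Estimate the value at `target` as a weighted sum of station values.

    The weights solve the ordinary kriging system, whose last row forces
    them to sum to one (the Lagrange multiplier is discarded).
    """
    n = len(stations)
    a = [[semivariogram(math.dist(p, q)) for q in stations] + [1.0] for p in stations]
    a.append([1.0] * n + [0.0])
    b = [semivariogram(math.dist(p, target)) for p in stations] + [1.0]
    weights = solve_linear_system(a, b)[:n]
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical recording stations (x, y in km) and Sa(0.3 s) values in g.
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
sa_values = [0.42, 0.61, 0.38, 0.55]
sa_at_building = ordinary_kriging(stations, sa_values, (4.0, 6.0))
```

Without a nugget term, the estimate reproduces the recorded value at a station location, which is one reason kriging is a common choice for interpolating recorded ground motions to building sites.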
Figure 2. Sociodemographic distribution: (a) HHEng, (b) PN>25+HS, (c) PNHisp, (d) HH<18, (e) IncAnn, and (f) HU%Own.
Permitting and recovery information
Figure 5 shows a map with the buildings whose permit and repair times are included in the Napa data set with the markers color-coded to reflect the relative time values associated with each variable. The permit time is computed as the number of days from the date that the building is inspected (also included in the data set) to the date that the permit was approved. The number of days between the permit approval date and the repair approval date (as documented by the building department) is taken as the repair time. The buildings that have been permitted (880) outnumber those with "officially" (i.e. certified by the building department) completed repairs (672). This is because it is fairly common for owners to perform the repairs outlined in the permit without pursuing the final signoff from the city. A much less common scenario is when the necessary permits are obtained but the repairs are never performed or are done in a manner that is inconsistent with the permit.
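The two duration definitions can be sketched as simple date arithmetic. The dates below are hypothetical and the helper name is an illustrative choice; a missing date (for example, a repair that never received a final signoff) is represented by None so that the corresponding duration is left undefined rather than guessed.

```python
from datetime import date
from typing import Optional


def days_between(start: Optional[date], end: Optional[date]) -> Optional[int]:
    """Number of days from start to end, or None if either date is missing."""
    if start is None or end is None:
        return None
    return (end - start).days


# Hypothetical record for a single building.
inspection_date = date(2014, 8, 26)
permit_approved = date(2014, 10, 15)
repair_approved = date(2015, 2, 2)  # would be None without a final signoff

# Permit time: inspection date to permit approval.
permit_time = days_between(inspection_date, permit_approved)
# Repair time: permit approval to repair approval.
repair_time = days_between(permit_approved, repair_approved)
```

For this made-up record, the permit time is 50 days and the repair time is 110 days.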
Figure 3. Map showing the distribution of ATC-20 tags for buildings in the Napa data set.
Figure 4. Histogram showing the relationship between the ATC-20 tags and shaking intensity (Sa0.3s) for buildings in the Napa data set.
In general, longer permit and repair times are correlated with more severe damage. The anecdotal evidence for this observation is the darker red colors (longer times) shown in the downtown area in Figure 5a and b. Figure 6 shows a histogram of the recovery time with the bins disaggregated based on the percentage of each ATC-20 tag. The recovery time is taken as the sum of the inspection, permit, and repair times. Further evidence of the positive correlation between the severity of damage and the permit and repair time is provided in Figure 6. More specifically, as the recovery time increases, the percentage of red- and yellow-tagged buildings in each bin increases, while the opposite is true for green-tagged buildings. In addition, the mean recovery time for green-, yellow-, and red-tagged buildings is 197, 212, and 308 days, respectively.
Figure 5. Map showing the distribution of (a) permit and (b) repair times for 880 and 672 buildings, respectively, in the Napa data set.
Figure 6. Histogram showing the relationship between the ATC-20 tags and recovery time for buildings in the Napa data set.
RDB structure
Background on RDBs
An open-source RDB is developed using MySQL (2020) to support post-earthquake damage and recovery modeling for building portfolios. RDBs are a better alternative to spreadsheets because they store and organize data more efficiently, have better visualization features, and make it easier to link interrelated fields. RDBs are especially suitable for incorporating multiple data sources, reducing data redundancy, increasing user efficiency, and facilitating data visualization.
A well-structured database containing a large number of buildings affected by multiple
earthquakes would provide a wealth of information that can be used for modeling, risk
management, and resilience planning (Zaker Esteghamati et al., 2020). The RDB developed
as part of the current study represents the initial step toward the creation of such a resource
with the inclusion of 3695 buildings affected by a single (the 2014 South Napa) earthquake.
Schema
The structure of the database is defined by a series of interconnected tables that are linked through shared fields called keys. A single table consists of a combination of keys and attributes. A primary key is a unique identifier for each record in a table, while a foreign key acts as a link between tables. The primary key is referenced by the foreign key of another table to establish a relationship between tables. An attribute is the description of table entities that takes on a type such as character, integer, or date. The structure of the database, or schema, is a combination of tables, attributes, and relationships between tables.
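The roles of primary and foreign keys can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a database server; the two tables are reduced to a few columns and the inserted rows are invented for illustration, so this is not the published schema definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE SiteInfo (
        SiteID    INTEGER PRIMARY KEY,   -- primary key: one row per site
        City      VARCHAR(45),
        SiteClass VARCHAR(1)
    );
    CREATE TABLE Building (
        BuildingID INTEGER PRIMARY KEY,  -- primary key: one row per building
        NoStories  INTEGER,
        SiteID     INTEGER,              -- foreign key: links to SiteInfo
        FOREIGN KEY (SiteID) REFERENCES SiteInfo (SiteID)
    );
    """
)

conn.execute("INSERT INTO SiteInfo VALUES (1, 'Napa', 'D')")
conn.execute("INSERT INTO Building VALUES (1, 2, 1)")

# A building that points at a site that does not exist is rejected.
try:
    conn.execute("INSERT INTO Building VALUES (2, 1, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

building_count = conn.execute("SELECT COUNT(*) FROM Building").fetchone()[0]
```

The rejected insert shows the practical benefit of declaring the relationship: the database itself prevents records that reference a nonexistent site.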
The structure of the database developed for the current study is shown in Figure 7 where the primary and foreign keys are highlighted in green and blue, respectively. The schema was developed through an iterative process to ensure efficiency and ease of access (through querying). Altogether, there are 45 attributes (variables) included in the database and they are grouped into five categories: (1) building and site properties, (2) damage assessment information, (3) recovery assessment information, (4) census-block-level demographics, and (5) earthquake properties.
Description of tables, attributes, and relationships
The attributes of each table are summarized in Tables 2 to 11. The attribute data types
include integers (INT), dates (DATE), and decimals with the precision and scale specified
(NUMERIC(P, S)). Three types of strings that differ based on their maximum length
(VARCHAR, MEDIUMTEXT, and LONGTEXT) are included as datatypes.
The building and site properties category contains the site information table (Table 3), which documents the location and soil properties unique to each building, the occupancy type table (Table 4), the construction type table (Table 5), and the general building information table (Table 2). The building table includes foreign keys from the site, occupancy, and construction type tables to reflect the one-to-many relationships (i.e. each occupancy/site/construction type is associated with many buildings). The primary key of the sociodemography table is also a foreign key in the building table because a single census-block-level sociodemographic variable value can be associated with multiple buildings.
Figure 7. Relational database schema.
Table 2. Building table (Building)
Attribute | Abbreviated attribute name | Data type | Examples
Building ID | BuildingID | INT | 1,2,3, etc.
Number of stories | NoStories | INT | 1,2,3, etc.
Floor area | FlArea | INT | 20500, etc.
Number of units | NoUnits | INT | 1,2,3, etc.
Construction year | ConstrYear | INT | 1,2,3, etc.
Building value | BuildVal | INT | 1,2,3, etc.
Site ID | SiteID | INT | 1,2,3, etc.
Occupancy ID | OccuID | INT | 1,2,3, etc.
Construction type ID | ConstTypeID | INT | 1,2,3, etc.
Sociodemography ID | SociodemID | INT | 1,2,3, etc.
Green shaded rows represent primary keys and blue shaded rows signify foreign keys.
Table 3. Site information table (SiteInfo)
Attribute | Abbreviated attribute name | Data type | Examples
Site ID | SiteID | INT | 1,2,3, etc.
Longitude | Longitude | DECIMAL (15,10) | -122.12
Latitude | Latitude | DECIMAL (15,10) | 38.123
City | City | VARCHAR (45) | Napa, etc.
VS30 | VS30 | DECIMAL (15,10) | 1.234, etc.
Basin depth Z1.0 | Basin_Depth_Z1.0 | DECIMAL (15,10) | 1.234, etc.
Basin depth Z1.5 | Basin_Depth_Z1.5 | DECIMAL (15,10) | 1.234, etc.
Basin depth Z2.5 | Basin_Depth_Z2.5 | DECIMAL (15,10) | 1.234, etc.
Site class | SiteClass | VARCHAR(1) | B, C, etc.
Green shaded rows represent primary keys.
Table 4. Occupancy table (Occupancy)
Attribute | Abbreviated attribute name | Data type | Examples
Occupancy ID | OccuID | INT | 1,2,3, etc.
Number of occupants | NoOccu | INT | 1,2,3, etc.
Occupancy type | OccuType | VARCHAR(45) | Commercial, office, etc.
Green shaded rows represent primary keys.
Table 5. Construction type table (ConstType)
Attribute | Abbreviated attribute name | Data type | Examples
Construction type ID | ConstTypeID | INT | 1,2,3, etc.
Construction type-material | ConstTypeMat | VARCHAR(45) | Wood, steel, etc.
Lateral force resisting system | LFRD | MEDIUMTEXT | Braced moment frame, etc.
Green shaded rows represent primary keys.
The damage assessment information category contains the shaking intensity (Table 6) and inspection information (Table 7) tables. The latter includes the observed damage description, the date of inspection, and the ATC-20 tag. The shaking intensity table includes the Joyner–Boore distance, Sa0.3s, and pga. The records in these tables are specific to a given building and earthquake, hence the need for the associated foreign keys.
The recovery assessment category includes the observed recovery table (Table 8), which documents the inspection, permit, and repair times. The permit information table (Table 9), which includes the permit description types and recovery-related dates, also falls under the recovery assessment category. The permit information is unique to a given building and is therefore linked through a foreign key. PermitID is a foreign key in the observed recovery table because the inspection, permit, and repair times are computed using the information contained in the permit information table.
The census-block-level demographics category (Table 10) includes information about race, education, tenure (owner or renter), age, and income of the building occupants. As noted earlier, these variables are linked to the building table through a foreign key. The earthquake properties table (Table 11) contains specific information about a given event (e.g. magnitude, epicenter location), and EarthquakeID is a foreign key in the shaking intensity, inspection, and observed recovery tables. As reflected in the one-to-many relationships, each earthquake is associated with multiple intensities, inspections, and recoveries.
Table 6. Shaking intensity information table (ShakeInt)
Attribute | Abbreviated attribute name | Data type | Examples
Shaking intensity info ID | ShakeIntID | INT | 1,2,3, etc.
Joyner–Boore distance | Rjb | DECIMAL (20,16) | 5.67, etc.
Sa0.3s | Sa0.3s | DECIMAL (20,16) | 0.654, etc.
PGA | PGA | DECIMAL (20,16) | 0.654, etc.
Building ID | BuildingID | INT | 1,2,3, etc.
Earthquake ID | EarthquakeID | INT | 1,2,3, etc.
Green shaded rows represent primary keys and blue shaded rows signify foreign keys.
Table 7. Inspection information table (Inspection)
Attribute | Abbreviated attribute name | Data type | Examples
Inspection ID | InspID | INT | 1,2,3, etc.
Observed damage description | DamaDesc | VARCHAR(1000) | Chimney damage, etc.
Inspection date | InspDate | DATE | 12/12/2012, etc.
ATC-tag | ATC_Tag | VARCHAR(45) | Green, Red, etc.
Building ID | BuildingID | INT | 1,2,3, etc.
Earthquake ID | EarthquakeID | INT | 1,2,3, etc.
Green shaded rows represent primary keys and blue shaded rows signify foreign keys.
Table 8. Observed recovery table (ObservedReco)
Attribute | Abbreviated attribute name | Data type | Examples
Observed recovery ID | ObservedRecoID | INT | 1,2,3, etc.
Inspection time | InspTime | INT | 1,2,3, etc.
Permit time | PermitTime | INT | 1,2,3, etc.
Repair time | RepairTime | INT | 1,2,3, etc.
Permit ID | PermitID | INT | 1,2,3, etc.
Earthquake ID | EarthquakeID | INT | 1,2,3, etc.
Green shaded rows represent primary keys and blue shaded rows signify foreign keys.
Table 9. Permit information table (PermitInfo)
Attribute | Abbreviated attribute name | Data type | Examples
Permit ID | PermitID | INT | 1,2,3, etc.
Permit description | PermitDesc | LONGTEXT | Repair Foundation
Permit type | PermitType | VARCHAR (45) | Foundation
Date applied | DateApplied | DATE | 12/12/2012, etc.
Date received | DateRecvd | DATE | 12/12/2012, etc.
Date completed | DateComp | DATE | 12/12/2012, etc.
Date expired | DateExpired | DATE | 12/12/2012, etc.
Building ID | BuildingID | INT | 1,2,3, etc.
Green shaded rows represent primary keys and blue shaded rows signify foreign keys.
Table 10. Sociodemography table (Sociodemography)
Attribute | Abbreviated attribute name | Data type | Examples
Sociodemography ID | SociodemID | INT | 1,2,3, etc.
HHEng | English_hh | DECIMAL (15,10) | 85.9, etc.
PN>25+HS | Diploma_hh | DECIMAL (15,10) | 85.9, etc.
PNHisp | Latino_hh | DECIMAL (15,10) | 85.9, etc.
PNBlack | AfrAmer_hh | DECIMAL (15,10) | 85.9, etc.
PNAsian | Asian_hh | DECIMAL (15,10) | 85.9, etc.
HH<18 | Young_hh | DECIMAL (15,10) | 85.9, etc.
IncAnn | Income | DECIMAL (15,10) | 250,000 etc.
HU%Own | Ownedunits | DECIMAL (15,10) | 85.9, etc.
Green shaded rows represent primary keys.
Table 11. Earthquake table (Earthquake)
Attribute | Abbreviated attribute name | Data type | Examples
Earthquake ID | EarthquakeID | INT | 1
Earthquake name | EarthquakeName | VARCHAR (45) | 2014 Napa Valley Earthquake
Earthquake date | EarthquakeDate | DATE | 8/24/2014
Epicenter longitude | EpiLong | DECIMAL (15,10) | -122.31
Epicenter latitude | EpiLat | DECIMAL (15,10) | 38.217
Magnitude | Magnitude | DECIMAL (3,2) | 6.9
Green shaded rows represent primary keys.
Illustrative examples
The RDB is made publicly available as "earthquake_recovery_db" through the DesignSafe cyberinfrastructure. In this section, four example queries, published in a Jupyter Notebook, are presented to demonstrate how targeted information can be extracted from the database in DesignSafe. Note that these are meant to serve as illustrative examples and therefore do not show the full range of information covered in the database. Figure 8 shows the script used to import the database for query through Jupyter Hub on DesignSafe.
The first query is used to generate site information for the red-tagged buildings. The
associated script and a subset of the generated data are shown in Figure 9. With this type
of query, one can examine the Vs30, basin depth, and location associated with the most
severely damaged buildings.
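The script in Figure 9 is not reproduced in the text. As a rough sketch of the kind of join
such a query involves, the example below uses Python's built-in sqlite3 module as a stand-in
for the MySQL database hosted on DesignSafe. The table names Site, Building, and Damage, every
column other than BuildingID, and all sample values are assumptions made for illustration and
do not represent the published schema or the Napa data.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Site (SiteID INTEGER PRIMARY KEY, Vs30 REAL, BasinDepth REAL,
                   Latitude REAL, Longitude REAL);
CREATE TABLE Building (BuildingID INTEGER PRIMARY KEY,
                       SiteID INTEGER REFERENCES Site(SiteID));
CREATE TABLE Damage (DamageID INTEGER PRIMARY KEY,
                     BuildingID INTEGER REFERENCES Building(BuildingID),
                     Tag TEXT);
INSERT INTO Site VALUES (1, 280.0, 0.4, 38.30, -122.29), (2, 350.0, 0.2, 38.32, -122.31);
INSERT INTO Building VALUES (1, 1), (2, 2), (3, 1);
INSERT INTO Damage VALUES (1, 1, 'Red'), (2, 2, 'Green'), (3, 3, 'Yellow');
""")


def site_properties_for_tag(connection, tag):
    """Return (BuildingID, Vs30, BasinDepth, Latitude, Longitude) rows for one tag."""
    query = """
        SELECT b.BuildingID, s.Vs30, s.BasinDepth, s.Latitude, s.Longitude
        FROM Damage AS d
        JOIN Building AS b ON b.BuildingID = d.BuildingID
        JOIN Site AS s ON s.SiteID = b.SiteID
        WHERE d.Tag = ?
        ORDER BY b.BuildingID
    """
    return connection.execute(query, (tag,)).fetchall()


red_rows = site_properties_for_tag(conn, "Red")
print(red_rows)
```

Passing the tag as a bound parameter keeps the same function usable for yellow- or green-tagged
buildings without editing the SQL string.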
Figure 10 shows the script used to query the observed recovery time information for all
buildings that have ATC-20 tags and the subsequent results. With the data generated by
this query, the effect of the level of damage on the various recovery time parameters
(inspection, permit, and repair) can be investigated.
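Once permit dates and tags have been retrieved, the recovery time parameters can be summarized
per tag with a few lines of Python. The sketch below is illustrative only: it assumes that
time-to-permit is measured from the earthquake date (EarthquakeDate in Table 11) to the permit
received date (DateRecvd in Table 9) and that repair time runs from DateRecvd to DateComp. The
records themselves are invented and are not taken from the Napa data set.

```python
from collections import defaultdict
from datetime import date
from statistics import median

EARTHQUAKE_DATE = date(2014, 8, 24)  # EarthquakeDate example from the Earthquake table

# Hypothetical records: (ATC-20 tag, permit received, repair completed). Not real data.
records = [
    ("Red", date(2014, 11, 2), date(2015, 6, 30)),
    ("Red", date(2015, 1, 15), date(2015, 12, 1)),
    ("Yellow", date(2014, 9, 20), date(2014, 12, 19)),
    ("Yellow", date(2014, 10, 5), date(2015, 2, 2)),
    ("Green", date(2014, 9, 3), date(2014, 10, 3)),
]


def median_durations_by_tag(rows, event_date):
    """Median time-to-permit and repair time (in days) for each tag."""
    permit_days = defaultdict(list)
    repair_days = defaultdict(list)
    for tag, received, completed in rows:
        permit_days[tag].append((received - event_date).days)
        repair_days[tag].append((completed - received).days)
    return {tag: (median(permit_days[tag]), median(repair_days[tag]))
            for tag in permit_days}


summary = median_durations_by_tag(records, EARTHQUAKE_DATE)
for tag, (permit, repair) in summary.items():
    print(f"{tag}: median time-to-permit {permit} d, median repair time {repair} d")
```

Medians are used here rather than means because durations of this kind tend to be skewed by a
few very long repairs; either statistic could be substituted.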
The next query is used to find the shaking intensities associated with red-tagged buildings.
The association between the extent of damage and shaking intensity can be inferred
from this type of query. The associated script and a subset of the generated data are shown
in Figure 11.
Figure 8. Jupyter Hub script used to import and query the database on DesignSafe.
Figure 9. Jupyter Hub script and results from the query seeking site properties associated with the
red-tagged buildings.
Figure 10. Jupyter Hub script and results from the query seeking the observed recovery information
and ATC-tags of all buildings.
Figure 11. Jupyter Hub script and results from the query seeking the shaking intensity information for
all red-tagged buildings.
The final query is used to find the repair times associated with specific sociodemographics
for all buildings. The script used for the DesignSafe query and part of the
generated data are shown in Figure 12. This type of data can provide insight into disparities
in the pace of recovery based on factors such as race, income, and tenure (owner or
renter occupied).
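A query of this kind has to link permit records to census-block-level attributes. The sketch
below shows one possible shape for such a join, again using sqlite3 as a stand-in for MySQL.
The PermitInfo and Sociodemography column names follow Tables 9 and 10, but the Building table,
its SociodemID foreign key, the 50% ownership threshold, and all values are assumptions made
for illustration. The output therefore says nothing about actual recovery disparities in Napa.

```python
import sqlite3

# Hypothetical schema and rows; dates are stored as ISO text so julianday() can parse them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sociodemography (SociodemID INTEGER PRIMARY KEY, Income REAL, Ownedunits REAL);
CREATE TABLE Building (BuildingID INTEGER PRIMARY KEY, SociodemID INTEGER);
CREATE TABLE PermitInfo (PermitID INTEGER PRIMARY KEY, DateRecvd TEXT, DateComp TEXT,
                         BuildingID INTEGER);
INSERT INTO Sociodemography VALUES (1, 95000, 80.0), (2, 48000, 35.0);
INSERT INTO Building VALUES (1, 1), (2, 1), (3, 2), (4, 2);
INSERT INTO PermitInfo VALUES
    (1, '2014-09-10', '2014-11-09', 1),
    (2, '2014-09-20', '2014-12-19', 2),
    (3, '2014-10-01', '2015-03-30', 3),
    (4, '2014-10-15', '2015-02-12', 4);
""")

# Mean repair duration (permit received to completed) grouped by census-block tenure mix.
query = """
    SELECT CASE WHEN s.Ownedunits >= 50 THEN 'mostly owner-occupied'
                ELSE 'mostly renter-occupied' END AS tenure,
           COUNT(*) AS n_permits,
           AVG(julianday(p.DateComp) - julianday(p.DateRecvd)) AS mean_repair_days
    FROM PermitInfo AS p
    JOIN Building AS b ON b.BuildingID = p.BuildingID
    JOIN Sociodemography AS s ON s.SociodemID = b.SociodemID
    GROUP BY tenure
    ORDER BY tenure
"""
repair_by_tenure = conn.execute(query).fetchall()
for tenure, n_permits, mean_days in repair_by_tenure:
    print(f"{tenure}: {n_permits} permits, mean repair time {mean_days:.0f} d")
```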
Summary of prior applications of the data set
The authors have already used different parts of the database in several studies. Kang et al.
(2018) used the time-to-permit and repair times for 456 buildings affected by the 2014
South Napa earthquake to validate and extend a post-earthquake recovery simulation
methodology. This data subset was used to establish an observed recovery trajectory for
the buildings. First, the following three building-level recovery states were defined: pre-
construction, during-construction (or ‘‘construction’’), and post-construction (or ‘‘com-
plete’’). The observed recovery trajectory was then constructed by assigning recovery levels
to each state and using the permit and repair times from the data subset. A time-based sto-
chastic process model was used to perform a ‘‘blind’’ simulation where a replication of the
observed trajectory was attempted without using the empirical data from the 2014 earth-
quake. This simulation was able to capture the overall shape of the observed trajectory,
including the sharp increase during the period immediately following the event and slower
pace of recovery in the later stages. However, the blind model also overpredicted the recov-
ery level by as much as a factor of 2.6. As expected, using the mean permit and repair times
from the empirical data significantly improved the replication of the observed recovery. To
generalize the time-based stochastic process model, the Random Forests (RF) algorithm
was used to link the time-to-permit and repair time to 12 predictors related to the level of
damage, general characteristics of the building (e.g. age, occupancy, property value), and
census-block-level sociodemographic variables (e.g. percentage of ethnic minorities).
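The construction of an observed recovery trajectory from permit and repair times can be
sketched as follows. The three state names come from the description above, but the recovery
levels assigned to them (0, 0.5, and 1), the averaging across buildings, and the sample
durations are assumptions made for illustration; they are not the values used by Kang et al.
(2018).

```python
# Assumed recovery level for each building-level state (illustrative values only).
RECOVERY_LEVEL = {"pre-construction": 0.0, "construction": 0.5, "complete": 1.0}


def building_state(t, permit_day, complete_day):
    """State of one building t days after the event."""
    if t < permit_day:
        return "pre-construction"
    if t < complete_day:
        return "construction"
    return "complete"


def recovery_trajectory(buildings, days):
    """Mean recovery level across all buildings at each requested day."""
    trajectory = []
    for t in days:
        levels = [RECOVERY_LEVEL[building_state(t, permit_day, complete_day)]
                  for permit_day, complete_day in buildings]
        trajectory.append(sum(levels) / len(levels))
    return trajectory


# Hypothetical buildings: (time-to-permit, time-to-permit + repair time), in days.
buildings = [(30, 120), (60, 200), (90, 400)]
days = [0, 100, 250, 400]
trajectory = recovery_trajectory(buildings, days)
print(list(zip(days, trajectory)))
```

A simulated trajectory produced by a stochastic process model could be evaluated at the same
days and compared point by point with this observed curve.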
Figure 12. Jupyter Hub script and results from querying the sociodemographics and repair times for all
buildings.
Another subset of the database has been used to explore the effectiveness of different
machine learning techniques in predicting earthquake damage to buildings (Mangalathu
et al., 2020). Classification models for predicting the ATC-20 tag were developed using
the discriminant analysis, k-nearest neighbors, decision trees, and RF algorithms and a
data set comprising 2276 buildings. The features or model inputs included seismic parameters
such as Sa(0.3 s), Rjb, and Vs30 as well as variables related to the building vulnerability,
including age, number of stories, the presence of plan irregularities, and the value (in dol-
lars) and total floor area of the building. For all four algorithms, the machine learning
model was trained using 70% of the data and evaluated using the remaining 30% (i.e. the
testing set). The RF algorithm had the best performance, predicting the ATC-tags in the
testing set with an overall accuracy of 66%. The skewed distribution of the tags (i.e. low
number of red tags relative to green and yellow) was the main challenge in developing the
model. For example, the recall (percentage of the actual tags that are correctly assigned)
of the RF red-tag predictions was only 13% compared with 52% and 79% for the green
and yellow tags, respectively. It was also noted that an RF model with 65% classification
accuracy was achieved using only Sa(0.3 s), Rjb, and the building age as the predictors.
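The gap between overall accuracy and per-class recall on a skewed data set can be seen with a
small worked example. The confusion matrix below is invented purely to illustrate the two
metrics; it does not reproduce the counts or results reported by Mangalathu et al. (2020).

```python
def recall_and_accuracy(confusion, labels):
    """Per-class recall and overall accuracy.

    confusion[i][j] is the number of buildings whose actual tag is labels[i]
    and whose predicted tag is labels[j].
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(labels)))
    recall = {label: confusion[i][i] / sum(confusion[i])
              for i, label in enumerate(labels)}
    return recall, correct / total


labels = ["green", "yellow", "red"]
# Hypothetical test-set counts with few red tags (rows: actual, columns: predicted).
confusion = [
    [80, 18, 2],
    [30, 65, 5],
    [4, 10, 6],
]
recall, accuracy = recall_and_accuracy(confusion, labels)
print(recall, round(accuracy, 3))
```

Because the red-tag row contains only 20 of the 220 hypothetical buildings, its low recall has
little effect on the overall accuracy, which is why both metrics need to be reported.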
Mangalathu and Burton (2019) also utilized a subset of the Napa data to explore the use
of text-based descriptions (the ones generated by in-person inspections) as the features in
predicting building damage. Using written descriptions from 3423 buildings, they trained a
long short-term memory (LSTM) deep learning model to predict the distribution of ATC-20
tags. The LSTM model achieved an accuracy of 86% on the testing set. Similar to the
other study (Mangalathu et al., 2020), the lowest recall was associated with the red tag pre-
dictions (63% compared with 94% and 84% for green and yellow tags, respectively).
Possible future applications and extensions of the data set
There are several additional potential future applications of the assembled database that
can be undertaken by other researchers. All of the studies described in the previous section
are based on a single data set from the 2014 South Napa earthquake. Because of this, the
findings cannot be generalized. As such, additional studies that utilize data sets that are
diverse in terms of the scale of earthquake damage and target region are much needed.
For the study by Kang et al. (2018), the effect of lifeline damage and restoration on post-
earthquake building recovery was not considered. This again points to the need for future
studies that leverage integrated data on building and lifeline damage and recovery in simula-
tion modeling. The Kang et al. (2018) study incorporated the building permit data by using
the mean observed permit and repair time values in the recovery simulation model. This
approach significantly biases the resulting model toward a single event and therefore limits
its generality. A more systematic and balanced approach would be to utilize Bayesian inference
to update the prior temporal parameters using the data from the 2014 event. The
recovery literature will also benefit from longitudinal studies that utilize data acquired
from recurring visits to affected buildings; such studies are needed to benchmark the timestamps
provided by the permit data against functional restoration. Finally, the empirical data assembled by
this study and future studies can complement or update the damage and recovery-related
recommendations that are based on expert opinion. In this regard, recovery models can be
developed using machine learning algorithms in combination with expert opinion.
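One simple form that the Bayesian updating suggested above could take is a conjugate
normal-normal update of the mean log repair time. The sketch below assumes lognormally
distributed repair times with a known dispersion, a normal prior on the mean of the log
durations (for example, from expert opinion), and a handful of invented observations. These
modeling choices and all numbers are assumptions for illustration, not a method proposed in
this article.

```python
import math


def update_normal_mean(prior_mean, prior_sd, observations, obs_sd):
    """Posterior mean and standard deviation of a normal mean with known obs_sd."""
    n = len(observations)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / obs_sd ** 2
    post_precision = prior_precision + data_precision
    sample_mean = sum(observations) / n
    post_mean = (prior_precision * prior_mean + data_precision * sample_mean) / post_precision
    return post_mean, math.sqrt(1.0 / post_precision)


# Hypothetical observed repair times in days (not taken from the Napa data).
repair_days = [60, 90, 120, 180, 150]
log_obs = [math.log(d) for d in repair_days]

# Assumed prior: median repair time of 60 days with a log-standard deviation of 0.5.
prior_mean, prior_sd = math.log(60.0), 0.5
post_mean, post_sd = update_normal_mean(prior_mean, prior_sd, log_obs, obs_sd=0.6)
print(f"posterior median repair time: {math.exp(post_mean):.0f} d")
```

The posterior mean is a precision-weighted average of the prior mean and the sample mean, so
the event-specific data shift the prior without replacing it outright.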
The studies that developed machine learning models to predict building damage utilized
data from a single event. Despite the use of the training-testing data split in the development
of these models, the extent to which they can be generalized is questionable at best. The valid-
ity of these machine learning–based damage prediction models, which have been developed
using a single-event data set, can be investigated by evaluating them using data from a
different event. For example, the model developed using the Napa data can be tested against
observations from the 1994 Northridge earthquake. The performance of such a model can
also be benchmarked against others that have been developed using multi-event data sets. As
a standalone data set, the Napa data can also be used to develop multimodal machine learn-
ing models, which combine features from different modalities (e.g. text, images, categorical or
continuous variables) into a single algorithm (Baltrusaitis et al., 2018).
Conclusion
A relational database to support post-earthquake building damage and recovery assess-
ment is developed using MySQL and made publicly available through the DesignSafe
cyberinfrastructure. In its current form, the database includes 3695 buildings impacted by
the 2014 South Napa, California, earthquake. The information provided in the database is
categorized into earthquake properties, building and site characteristics, damage and
recovery assessment (permitting) information, and census-block-level demographics. Most
of the buildings in the Napa data set are single- and multifamily dwellings with one or two
stories. Included in the building damage information are brief descriptions of the physical
impacts and the ATC-20 tags assigned during the in-person field inspections. The shaking
intensity at each site is documented in terms of peak ground acceleration and spectral
acceleration at a period of 0.3 s. The recovery-related data include the dates corresponding
to the completion of inspection, permitting, and repairs (based on the building department
certification). These data types have already been shown to be useful for developing or
evaluating the efficacy of post-earthquake damage and recovery assessment models.
There are several opportunities to expand and enhance the current database. One obvious
extension is to include additional data sets that comprise buildings affected by other earth-
quakes. The fidelity of future data sets relative to the current one could also be improved. For
example, in the Napa data, the repair duration is inferred from the dates associated with permit
acquisition and completion. However, in many instances, the repairs may be completed long
before the work is certified by the building department. In addition, while fields were included
for the building construction type, lateral force resisting system, and number of occupants, they
have not been populated in the current version of the Napa data set. Expansion of the data-
base to include buildings affected by multiple events will create opportunities for developing
more generalizable predictive models (damage and recovery). Also, data on the time it takes to
acquire recovery financing from different sources (e.g., loans, local and federal assistance)
would make a valuable addition to the database. It is also important to note that the set of
sociodemographic factors that are included in the data set is by no means comprehensive and
additional data should be assembled that better reflects the extensive social science literature on
this topic. Finally, while the current database can only accommodate data sets related to build-
ings affected by earthquakes, the overall structure can be adapted to consider other types of
natural hazards (e.g., hurricanes, floods, and wildfires) and infrastructure (e.g., lifelines).
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/
or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/
or publication of this article: This research is supported by National Science Foundation Award No.
1554714.
ORCID iDs
Mohsen Zaker Esteghamati https://orcid.org/0000-0002-2144-2938
Scott Brandenberg https://orcid.org/0000-0003-2493-592X
Chukuebuka C Nweke https://orcid.org/0000-0002-8939-571X
References
Almufti I and Willford M (2013) REDi™: Resilience-Based Earthquake Design Initiative (REDi™)
Rating System. London: Arup Group.
Applied Technology Council (ATC) (1995) Procedures for Post-Earthquake Building Safety
Evaluation Procedures (ATC-20). Redwood City, CA: ATC.
Baltrusaitis T, Ahuja C and Morency LP (2018) Multimodal machine learning: A survey and
taxonomy. IEEE Transactions on Pattern Analysis and Machine Intelligence 41: 423–443.
Boatwright J, Blair JL, Aagaard BT and Wallis K (2015) The distribution of red and yellow tags in
the City of Napa. Seismological Research Letters 86(2A): 361–368.
Boore DM, Stewart JP, Seyhan E and Atkinson GM (2014) NGA-West2 equations for predicting
PGA, PGV, and 5% damped PSA for shallow crustal earthquakes. Earthquake Spectra 30(3):
1057–1085.
Brocher TM, Aagaard BT, Simpson RW and Jachens RC (2006) The USGS 3D seismic velocity
model for northern California. Paper presented at the 2006 fall meeting (Abstract id S51B-1266),
American Geophysical Union (AGU), San Francisco, CA, 11–15 December.
Burton HV, Kang H, Miles SB, Nejat A and Yi Z (2019) A framework and case study for integrating
household decision-making into post-earthquake recovery models. International Journal of
Disaster Risk Reduction 37: 101167.
Center for Engineering Strong Motion Data (CESMD) (2018) Data for latest earthquakes. Available
at: https://www.strongmotioncenter.org (accessed on January 2021).
Chen P and Lee EJ (2017) UCVM 17.3.0 documentation. Available at: http://hypocenter.usc.edu/
research/ucvm/17.3.0/docs/index.html (accessed on January 2021).
City of Napa (2020a) City of Napa assessor parcel data. Available at: https://
www.countyofnapa.org/150/Assessor-Parcel-Data (accessed on January 2021).
City of Napa (2020b) City of Napa community development department. Available at:
https://etrakit.cityofnapa.org/etrakit/Search/permit.aspx (accessed on January 2021).
Earthquake Engineering Research Institute (EERI) (2016) 2014 South Napa earthquake data.
Available at: http://eqclearinghouse.org/map/2014-08-24-south-napa/ (accessed on January 2021).
Elliott JR (2015) Natural hazards and residential mobility: General patterns and racially unequal
outcomes in the United States. Social Forces 93(4): 1723–1747.
Federal Emergency Management Agency (FEMA) (2003a) Multi-Hazard Loss Estimation
Methodology—Earthquake Model: HAZUS MH-MR4 Technical Manual. Washington, DC:
Department of Homeland Security, FEMA.
Federal Emergency Management Agency (FEMA) (2003b) NEHRP Recommended Provisions for
Seismic Regulations for New Buildings and Other Structures (FEMA 450-1). Washington, DC:
Department of Homeland Security, FEMA.
Federal Emergency Management Agency (FEMA) (2012) Seismic Performance Assessment of
Buildings. Redwood City, CA: Applied Technology Council (ATC).
Kang H, Burton HV and Miao H (2018) Replicating the recovery following the 2014 South Napa
earthquake using stochastic process models. Earthquake Spectra 34(3): 1247–1266.
Mangalathu S and Burton HV (2019) Deep learning-based classification of earthquake-impacted
buildings using textual damage descriptions. International Journal of Disaster Risk Reduction 36:
101111.
Mangalathu S, Sun H, Nweke CC, Yi Z and Burton HV (2020) Classifying earthquake damage to
buildings using machine learning. Earthquake Spectra 36: 183–208.
Manson S, Schroeder J, Van Riper D and Ruggles S (2017) IPUMS National Historical Geographic
Information System: Version 12.0 (Database). Minneapolis, MN: University of Minnesota, p. 39.
Miles SB and Chang SE (2007) A simulation model of urban disaster recovery and resilience:
Implementation for the 1994 Northridge earthquake. Technical report MCEER-07-0014, 7
September. Buffalo, NY: Multidisciplinary Center for Earthquake Engineering Research,
University at Buffalo.
MySQL (2020) Open source database. Available at: https://www.mysql.com/ (accessed on January
2021).
Omoya M, Ero I, Zaker Esteghamati M, Burton HV, Brandenberg S and Nweke C (2021) Relational
Database for Post-Earthquake Damage and Recovery Assessment: 2014 South Napa Earthquake
(DesignSafe-CI). Available at: https://doi.org/10.17603/ds2-3nvj-4127 (accessed on January
2021).
Rathje EM, Dawson C, Padgett JE, Pinelli JP, Stanzione D, Adair A, Arduino P, Brandenberg SJ,
Cockerill T, Dey C, Esteva M, Haan FL, Hanlon M, Kareem A, Lowes L, Mock S and
Mosqueda G (2017) DesignSafe: New cyberinfrastructure for natural hazards engineering.
Natural Hazards Review 18(3): 06017001.
Seyhan E and Stewart JP (2014) Semi-empirical nonlinear site amplification from NGA-West2 data
and simulations. Earthquake Spectra 30(3): 1241–1256.
United States Census (2020) United States Census Bureau. Available at: https://www.census.gov/
quickfacts/napacountycalifornia (accessed on January 2021).
United States Geological Survey (USGS) (2020) The United States Geological Survey hazard maps.
Available at: https://earthquake.usgs.gov/ws/designmaps/asce7-16.html (accessed on January
2021).
Wills CJ, Gutierrez CI, Perez FG and Branum DM (2015) A next generation VS30 map for
California based on geology and topography. Bulletin of the Seismological Society of America
105(6): 3083–3091.
Zaker Esteghamati M, Lee J, Musetich M and Flint MM (2020) INSSEPT: An open-source
relational database of seismic performance estimation to aid with early design of buildings.
Earthquake Spectra 36: 2177–2197.
Zhang Y and Peacock WG (2009) Planning for housing recovery? Lessons learned from Hurricane
Andrew. Journal of the American Planning Association 76(1): 5–24.
Omoya et al. 21
... The emergence of xBD has demonstrably motivated researchers to develop more comprehensive datasets for pre-and post-disaster building damage assessment. One such example is the database compiled by M. Omoya et al., which encompasses information on over 3600 buildings impacted by the 2014 South Napa earthquake, including building characteristics, site attributes, seismic intensity, damage and repair details, and demographic statistics [87]. Additionally, Zhe Dong et al. collected post-disaster drone imagery following the 2021 Yunan Dali Yangbi earthquake and determined damage levels by referencing pre-disaster images from Google Earth [88]. ...
Article
Full-text available
After a disaster, ascertaining the operational state of extensive infrastructures and building clusters on a regional scale is critical for rapid decision-making and initial response. In this context, the use of remote sensing imagery has been acknowledged as a valuable adjunct to simulation model-based prediction methods. However, a key question arises: how to link these images to dependable assessment results, given their inherent limitations in incompleteness, suboptimal quality, and low resolution? This article comprehensively reviews the methods for post-disaster building damage recognition through remote sensing, with particular emphasis on a thorough discussion of the challenges encountered in building damage detection and the various approaches attempted based on the resultant findings. We delineate the process of the literature review, the research workflow, and the critical areas in the present study. The analysis result highlights the merits of image-based recognition methods, such as low cost, high efficiency, and extensive coverage. As a result, the evolution of building damage recognition methods using post-disaster remote sensing images is categorized into three critical stages: the visual inspection stage, the pure algorithm stage, and the data-driven algorithm stage. Crucial advances in algorithms pertinent to the present research topic are comprehensively reviewed, with details on their motivation, key innovation, and quantified effectiveness as assessed through test data. Finally, a case study is performed, involving seven state-of-the-art AI models, which are applied to sample sets of remote sensing images obtained from the 2024 Noto Peninsula earthquake in Japan and the 2023 Turkey earthquake. To facilitate a cohesive and thorough grasp of these algorithms in their implementation and practical application, we have deliberated on the analytical outcomes and accentuated the characteristics of each method through the practitioner’s lens. 
Additionally, we propose recommendations for improvements to be considered in the advancement of advanced algorithms.
... Post-earthquake recovery models rely on empirical data, such as duration-based parameters that define recovery activities. Omoya et al. developed a relational database to compile the recovery efforts of 3695 buildings after the 2014 Napa earthquake [44]. The relational database provides empirical data on building general topology, site, observed damage, and duration-based recovery measures (such as timestamps for initiation and completion of permitting or repair. ...
... In recent decades, the number of multi-story buildings has exponentially increased due to the urbanization and availability of modern technology and construction materials. Post-earthquake damage assessment of buildings is necessary for seismic risk mitigation and resilience planning [1]. With recent progress in signal processing and sensor technology, damage detection using data collected from the sensors attached to buildings is becoming very popular in structural health monitoring (SHM) [2,3]. ...
Article
Full-text available
The structure is said to be damaged if there is a permanent shift in the post-event natural frequency of a structure as compared with the pre-event frequency. To assess the damage to the structure, a time-frequency approach that can capture the pre-event and post-event frequency of the structure is required. In this study, to determine these frequencies, a local maximum synchrosqueezing transform (LMSST) method is employed. Through the simulation results, we have shown that the traditional methods such as the Wigner distribution, Wigner–Ville distributions, pseudo-Wigner–Ville distributions, smoothed pseudo-Wigner–Ville distribution, and synchrosqueezing transforms are not capable of capturing the pre-event and post-event frequency of the structure. The amplitude of the signal captured by sensors during those events is very small compared with the signal captured during the seismic event. Thus, traditional methods cannot capture the frequency of pre-event and post-event, whereas LMSST employed in this work can easily identify these frequencies. This attribute of LMSST makes it a very attractive method for post-earthquake damage detection. In this study, these claims are qualitatively and quantitatively substantiated by comprehensive numerical analysis.
... In recent decades the number of multi-story buildings has exponentially increased due to the urbanization and availability of modern technology and construction material. Post-earthquake damage assessment of buildings is necessary for seismic risk mitigation and resilience planning [1]. With recent progress in signal processing and sensor technology, damage detection using data collected from the sensors attached to buildings is becoming very popular in structural health monitoring (SHM) [2,3]. ...
Preprint
Full-text available
The natural frequency of buildings decreases during a strong-motion earthquake, and the structure loses its stiffness. As a result, understanding the damaging process in the structure owing to changes in structural properties is critical during a seismic excitation. The time-frequency technique can detect the damaged building’s time-varying frequency contents. Wigner distributions (WD), Wigner-Ville distributions (WVD), pseudo-Wigner-Ville distributions (PWVD), smoothed pseudo-Wigner-Ville distributions (SPWVD), and synchrosqueezing transforms (SST) have all become popular in recent years for a variety of earthquake engineering applications, including building damage detection. This study proposes the local maximum synchrosqueezing transform (LMSST) for detecting frequency shifts in buildings during strong earthquakes. The data presented in the research show that the suggested method outperforms as compared to the conventional time-frequency methods for detecting frequency shifting in earthquake-damaged structures.
... The relational database introduced in this article aims to serve as a medium for public use. Within the earthquake engineering community, relational databases have been used to organize data related to the seismic performance of buildings (Esteghamati et al., 2020), ground motion liquefaction (Brandenberg et al., 2020), post-earthquake damage and recovery of buildings (Omoya et al., 2022) and subduction ground motions (Mazzoni et al., 2022). Figure 9 is the schema for the relational database, which provides a graphical visualization of how it is organized. ...
Article
Full-text available
Historical data play a primary role in reconstructing building seismic responses and assessing damage in near-real time. While these types of data sets exist, they are often fragmented, difficult to access, and require significant manual manipulation on the part of the user to obtain useful information. This article introduces a relational database that comprises historical data from 216 buildings subjected to [Formula: see text] earthquakes, spanning a 36-year period in California. It includes comprehensive information about the events and accelerometer-equipped buildings, parameters that are often used in post-earthquake impact assessments, and time-series structural response data recorded during real earthquakes. The database is designed to facilitate incremental updating as new events occur, and the associated data becomes available. It is also paired with a Python-based tool that reduces the barrier to user access with a few simple inputs. The long-term goal is to expand the database to include additional seismic events and spur the development of similar single- or multi-hazard repositories.
... There have been a few attempts to use RDBs to structure performance data. (Omoya et al., 2022). ...
Conference Paper
Full-text available
Achieving resilient and sustainable infrastructure urges developing computational tools to explicitly consider performance objectives in all design and construction stages. The majority of critical decisions are made at earlier stages of the design. Early design can be substantially improved by incorporating quantitative methods to evaluate the consequences of these decisions. However, implementing quantitative methods poses several challenges, including imprecision of design variables and time-and effort-intensiveness of such assessments. This paper presents a modular framework to select suitable candidate structural systems, characterize their design parameters range, and communicate their expected hazard and environmental performance during their life cycles. The framework leverages a machine-learning-assisted workflow that performs mapping between crude design-and topology-related parameters and global hazard and environmental performance indicators. Next, a sequence of surrogate models with varying fidelity aids in performing the convergence-divergence cycle of early design. Lastly, a deep learning architecture with a customized loss function maps the result of simpler static analysis to the detailed description of seismic performance, linking early design to the next design stages. A case study is presented to illustrate the application of the framework to evaluate the embodied carbon and seismic-related repair cost of an inventory of 720 multi-story concrete frames with varying topologies in Charleston, South Carolina.
... Post-earthquake recovery models rely on empirical data, such as duration-based parameters that define recovery activities. Omoya et al. developed a relational database to compile the recovery efforts of 3695 buildings after the 2014 Napa earthquake [44]. The relational database provides empirical data on building general topology, site, observed damage, and duration-based recovery measures (such as timestamps for initiation and completion of permitting or repair. ...
Preprint
Full-text available
The increasing vulnerability of communities to natural hazards motivates novel design and assessment methods to ensure that the built environment performs optimally during its lifetime. The current design methodologies do not account for life-cycle impacts across multiple performance domains such as economy and environment. Therefore, low-effort and designer-centric computational methods are needed to support a multi-objective performance-based design, from the conceptual stage to design development. This study presents a framework for a holistic performance-based seismic design of buildings. The proposed framework leverages machine learning techniques to extract the implicit, and highly complex, relationship between design parameters, geometric configuration, and performance measures. At early design, data-driven surrogate models (trained on performance inventories) are used to identify candidate structural systems and their approximate design parameters. At the detailed design stage, a deep learning-based engine generates seismic risk estimates based on simpler nonlinear static analysis on the candidate systems or their equivalent low-order dynamic models. A case study illustrates the framework’s application for performance-based seismic design of multistory commercial buildings in Charleston, SC.
Article
Full-text available
Effective relief reduces damages and protects people during natural disasters, such as earthquakes. This research proposes a data-driven model based on sustainability, taking into account the pre and post-crisis simultaneously. Real data was used to validate the model in various earthquake scenarios. The study addresses questions regarding the amount and allocation of relief goods during earthquakes. This research is carried out in two phases: simulation and modeling. The purpose of the simulation phase is to estimate the number of relief goods in different scenarios. Additionally, in the modeling phase, a data-based multi-objective model is presented, considering sustainability, to minimize the lack of relief goods, the number of untreated wounded, and supply chain costs. Using the dynamic simulation system, and after designing the structure of the earthquake effects on urban infrastructure, the actions and effects of the earthquake on vital arteries are investigated in different scenarios, and scenarios with a higher degree of risk are identified. The results showed that the highest and lowest demands for relief goods were related to the “Mosha-day fault” and “North Tehran-night fault” scenarios, respectively.
Article
Full-text available
Performance-based earthquake engineering (PBEE) assessments are data-, effort and time-intensive, usually requiring a detailed structural model and limiting their integration with early design. Decades of research have produced an abundance of PBEE assessments for different structural systems and building taxonomies. The results of these PBEE studies can be assimilated to approximately represent the seismic design space for new structures and to identify possibly optimal systems with low effort. This paper introduces an open-source relational database, Inventory of Seismic Structural Evaluations, Performance Functions and Taxonomies for Buildings (INSSEPT) that contains PBEE assessment of 222 buildings from literature and is freely available to the public in a natural hazard repository. INSSEPT is organized to provide a curated building taxonomy and PBEE data to readily serve as a resource for early design or PBEE-derived regional seismic risk analysis.
Article
Full-text available
The ability to rapidly assess the spatial distribution and severity of building damage is essential to post-event emergency response and recovery. Visually identifying and classifying individual building damage requires significant time and personnel resources and can last for months after the event. This paper evaluates the feasibility of using machine learning techniques such as discriminant analysis, K-nearest neighbors, decision trees and random forests, to rapidly predict earthquake-induced building damage. Data from the 2014 Napa earthquake is used for the study where building damage is classified based on the assigned ATC-20 tag (red, yellow and green). Spectral acceleration at a period of 0.3 s, fault distance and several building specific characteristics (e.g. age, floor area, presence of plan irregularity) are used as features or predictor variables for the machine learning models. A portion of the damage data from the Napa earthquake is used to obtain the forecast model and the performance of each machine learning technique is evaluated using the remaining (test) data. It is noted that the random forest algorithm can accurately predict the assigned tags for 66% of the buildings in the test dataset.
Method
The REDi guidelines provide owners, architects and engineers a framework for resilience-based earthquake design to achieve "beyond-code" resilience objectives.
Article
Our experience of the world is multimodal - we see objects, hear sounds, feel texture, smell odors, and taste flavors. Modality refers to the way in which something happens or is experienced and a research problem is characterized as multimodal when it includes multiple such modalities. In order for Artificial Intelligence to make progress in understanding the world around us, it needs to be able to interpret such multimodal signals together. Multimodal machine learning aims to build models that can process and relate information from multiple modalities. It is a vibrant multi-disciplinary field of increasing importance and with extraordinary potential. Instead of focusing on specific multimodal applications, this paper surveys the recent advances in multimodal machine learning itself and presents them in a common taxonomy. We go beyond the typical early and late fusion categorization and identify broader challenges that are faced by multimodal machine learning, namely: representation, translation, alignment, fusion, and co-learning. This new taxonomy will enable researchers to better understand the state of the field and identify directions for future research.
Article
Correlations between geologic units and shear wave velocity form the basis of a series of maps developed over the past 15 years to estimate average shear-wave velocity to 30 m (Vs30). Wills et al.’s (2000) site-condition map for California was found to correlate with seismic amplification (Field, 2000) and was adopted as a standard depiction for many applications of seismic shaking estimates (ShakeMap for example). Wills and Clahan (2006) modified that map to show simplified geologic units and corresponding Vs30 values. Preparation of this map raised a number of questions on how best to distinguish units within younger alluvium. Wills and Gutierrez (2011) found that a simple system based on surface slope could be used to subdivide the younger alluvium into three classes that have distinct Vs30 ranges. The classes defined by slope have approximately the same variability in Vs30 as the previously defined classes, but the total number of classes is reduced and the system can be easily applied to other tectonically active areas. We have now applied the system of Wills and Gutierrez (2011) to create a new map of California using the most detailed available geologic maps. Use of more detailed geologic maps, from 1:250,000 scale to 1:24,000 for much of California, results in a much more detailed and accurate depiction of the surficial geology and, we anticipate, a more detailed and accurate depiction of seismic amplification due to the near-surface materials.
Article
Prior empirical research has demonstrated that the decisions of affected populations can significantly influence housing recovery outcomes following a natural hazard event. The current study seeks to develop an integrated post-earthquake recovery model that explicitly accounts for household decision-making. An empirical probabilistic utility-based decision model is developed using data from a survey of Los Angeles households. The results from a multinomial logistic regression showed that the time in residence, neighborhood evacuation level, physical damage to residence, duration of utility disruption and loss of access to the building, household income and earthquake insurance coverage had a statistically significant association with homeowners' decisions. For renter decision-making, only physical damage to the residence and duration of utility disruption are found to be statistically significant. In addition to household decision-making, the integrated model incorporates probabilistic building performance assessment and a discrete-state stochastic process representation of post-earthquake housing recovery. The results from a case study incorporating three Los Angeles neighborhoods (Koreatown, East Hollywood and Lomita) show that the influence of household decision-making on occupancy-based recovery trajectories is amplified as the scale of damage increases.
Article
In order to make informed decisions and have a well-coordinated response to a major hazard event, a rapid assessment of the spatial distribution and severity of building damage is needed. Towards this end, the long short-term memory (LSTM) deep learning method is applied to classify building damage based on textual descriptions of damage. The damaged state of an individual building is classified using the ATC-20 tags (red, yellow and green). The application of the LSTM approach is demonstrated using building damage descriptions recorded following the 2014 South Napa earthquake in California. The damage set, which consists of 3423 points (1552 green tags, 1674 yellow tags, 197 red tags), is divided randomly into training and test subsets. A predictive model is established during the LSTM and the performance of the model is evaluated using the test set. The LSTM model is shown to have an overall accuracy of 86% in identifying the ATC-20 tags for the test set. Despite some noted limitations, the study highlighted the overall potential of using the LSTM method to rapidly assess building cluster damage using textual information, which can be generated by engineering experts, stakeholders or social media platforms.
Article
Post-earthquake recovery models can be used as decision support tools for pre-event planning. However, due to a lack of available data, there have been very few opportunities to validate and/or calibrate these models. This paper describes the use of building damage, permitting, and repair data from the 2014 South Napa Earthquake to evaluate a stochastic process post-earthquake recovery model. Damage data were obtained for 1,470 buildings, and permitting and repair time data were obtained for a subset (456) of those buildings. A “blind” prediction is shown to adequately capture the shape of the recovery trajectory despite overpredicting the overall pace of the recovery. Using the mean time to permit and repair time from the acquired data set significantly improves the accuracy of the recovery prediction. A generalized model is formulated by establishing statistical relationships between key time parameters and endogenous and exogenous factors that have been shown to influence the pace of recovery.
Article
Natural hazards engineering plays an important role in minimizing the effects of natural hazards on society through the design of resilient and sustainable infrastructure. The DesignSafe cyberinfrastructure has been developed to enable and facilitate transformative research in natural hazards engineering, which necessarily spans across multiple disciplines and can take advantage of advancements in computation, experimentation, and data analysis. DesignSafe allows researchers to more effectively share and find data using cloud services, perform numerical simulations using high performance computing, and integrate diverse datasets so that researchers can make discoveries that were previously unattainable. This paper describes the design principles used in the cyberinfrastructure development process, introduces the main components of the DesignSafe cyberinfrastructure, and illustrates the use of the DesignSafe cyberinfrastructure in research in natural hazards engineering through various examples.
Article
The M 6.0 South Napa earthquake that occurred on 24 August 2014, strongly shook the City of Napa, damaging residential and commercial buildings from Brown’s Valley through the historic downtown. The damage to wood‐frame houses occurred largely from broken and cracked chimneys, although a number of houses were shifted on their foundations, or suffered racking or failure of cripple walls. In the downtown area, many masonry buildings, both unreinforced and retrofitted, were damaged, including the 1870 Napa courthouse. When earthquakes damage municipalities, it is important for planners and residents to understand the factors that condition this damage, because these factors can determine where damage will recur in future earthquakes. For areas subjected to strong ground motion, that is, peak accelerations exceeding 25% g and peak velocities exceeding 40 cm/s, municipal‐tagging data can provide a spatial detail of building damage that greatly exceeds what can be obtained from ground‐motion recordings or intensity surveys. In this paper, we use the red‐ and yellow‐tagging data assembled by the City of Napa to identify zones of extensive damage. We compare these zones with the pre‐1950 development of Napa, the recent alluvial geology of the Napa Valley (Witter et al., 2006), and the depth of the underlying sedimentary basin, as inferred from the Bouguer gravity anomaly (Jachens et al., 2006; Aagaard et al., 2010). The extent of the central damage zone is strongly correlated with the pre‐1950 buildings and with the underlying sedimentary basin, but poorly correlated with the alluvial geology. The City of Napa, with the assistance of structural engineers who volunteered from many areas across California, tagged and retagged damaged buildings throughout the city (Earthquake Engineering Research Institute, 2014). The tagging data provide a complete municipal reporting of prohibited (red tags) or restricted (yellow tags) access to earthquake‐damaged structures.
Some buildings received multiple …