Chapter 14
Big Data Security: Challenges, Recommendations and Solutions

Fatima-Zahra Benjelloun
Ibn Tofail University, Morocco

Ayoub Ait Lahcen
Ibn Tofail University, Morocco

DOI: 10.4018/978-1-4666-8387-7.ch014
Copyright © 2015, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
ABSTRACT
The value of Big Data is now being recognized by many industries and governments. Efficient mining of Big Data can improve the competitive advantage of companies and add value to many social and economic sectors. In fact, several governments have launched important projects with huge investments to extract the maximum benefit from Big Data. The private sector has also deployed significant efforts to maximize profits and optimize resources. However, Big Data sharing brings new information security and privacy issues. Traditional technologies and methods are no longer appropriate and lack performance when applied in a Big Data context. This chapter presents Big Data security challenges and a state of the art of the methods, mechanisms and solutions used to protect data-intensive information systems.
INTRODUCTION
The value of Big Data is now being recognized by many industries and governments. In fact, efficient mining of Big Data can improve competitive advantage and add value to many sectors (economic, social, medical, scientific research and so on).
Big Data is mainly defined by its three fundamental characteristics, the 3Vs: Velocity (data grow and change rapidly), Variety (data come in different and multiple formats) and Volume (huge amounts of data are generated every second) (Wu, Zhu, Wu, & Ding, 2014). According to (Berman, 2013), these three characteristics must coexist to confirm that a source is a Big Data source. If one of the three Vs does not apply, we cannot speak of Big Data.
(Berman, 2013) and (Katal, Wazid, & Goudar, 2013) indicate that some Big Data actors have added more Vs and other characteristics to better define it: Vision (the defined purpose of Big Data mining), Verification (processed data comply with some specifications), Validation (the purpose is fulfilled), Value (pertinent information can be extracted for the benefit of many sectors), Complexity (it is difficult to organize and analyse Big Data because of evolving data relationships) and Immutability (collected and stored Big Data can be permanent if well managed).
Besides this, some argue when defining Big Data that any huge digital data set that can no longer be collected and processed adequately with existing infrastructures and technologies is by nature Big Data.
In this chapter, we are interested in the security challenges faced in a Big Data context. We also present a state of the art of several methods, mechanisms and solutions used to protect information systems that handle large data sets.
Big Data security has many points in common with the security of traditional information systems (where data are structured). However, Big Data security requires more powerful tools, appropriate methods and advanced technologies for rapid data analysis. It also requires a new security management model that handles in parallel internal data (data produced by internal systems and processes within an organization) and external data (e.g., data collected from other companies or external web sites). Regarding these points, many questions can be raised: i) How to manage and process securely large, unstructured and heterogeneous data sets? ii) How to integrate security mechanisms into distributed platforms while ensuring a good performance level (e.g., efficient storage, rapid processing and real-time analysis)? iii) How to analyse massive data streams without compromising data confidentiality and privacy?
This chapter first presents these challenges in detail. Then, it discusses various solutions and recommendations proposed to protect data-intensive information systems.
SECURITY CHALLENGES IN BIG DATA CONTEXT
As mentioned by (Kim, Kim, & Chung, 2013), security in a Big Data context includes three main aspects: information security, security monitoring and data security. For (Lu et al., 2013), managing security in a distributed environment means ensuring Big Data management, system integrity and cyberspace security.
Generally, Big Data security aims to ensure real-time monitoring to detect vulnerabilities, security threats and abnormal behaviours; granular role-based access control; robust protection of confidential information; and the generation of security performance indicators. It supports rapid decision-making in case of a security incident. The following sections identify and explain a number of challenges to achieving these goals.
Big Data Nature
Because of Big Data velocity and huge volumes, it is difficult to protect all data. Indeed, adding security layers may slow system performance and affect dynamic analysis. Thus, access control and data protection are two “BIG” security problems (Kim et al., 2013). Furthermore, it is difficult to handle the classification and management of large, disparate digital sources. Even though the cost per GB has diminished, Big Data security requires important investments. In addition, Big Data is most of the time stored and transferred across multiple Clouds and distributed worldwide systems. Sharing data over many networks increases security risks.
The Need to Share Information
In a globalized context, business models have to face holistic competition across the world. Thus, enterprises need to build a sustainable advantage through collaboration with many entities, data monetization, and appropriate dynamic data sharing.
For data sharing, digital ecosystems are based on multiple heterogeneous platforms. Such ecosystems aim to ensure real-time data access for many partners, clients, providers and employees. They rely on multiple connections with different levels of security. Data sharing combined with advanced analytics techniques brings multiple security threats, such as the discovery of confidential information (e.g., production processes and methods) or illegal access to network traffic. In fact, by establishing relations between data extracted from different sources, it is possible to identify individuals in spite of data anonymization (e.g., by using correlation attacks, arbitrary identification, intended identification attacks, etc.).
For instance, in the health sector, massive amounts of medical data are shared between hospitals and pharmaceutical laboratories for research and analysis purposes. Such sharing may affect patients’ privacy even if all the medical records are anonymized, for instance by finding correlations between medical records and mutual health insurance data (Shin, Sahama, & Gajanayake, 2013).
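The correlation risk described above can be sketched in a few lines of Python. This is a toy linkage attack, with entirely fabricated records: the "anonymized" medical data still carry quasi-identifiers (ZIP code, birth year, sex) that also appear in a public insurance roster, so a simple join re-identifies patients.

```python
# Toy re-identification (linkage) attack. All names and records are
# fabricated for illustration; real attacks work the same way at scale.

anonymized_medical = [
    {"zip": "10115", "birth_year": 1980, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "10117", "birth_year": 1975, "sex": "M", "diagnosis": "asthma"},
]

insurance_roster = [
    {"name": "Alice Example", "zip": "10115", "birth_year": 1980, "sex": "F"},
    {"name": "Bob Example",   "zip": "10117", "birth_year": 1975, "sex": "M"},
]

def link(medical, roster):
    """Join the two data sets on the shared quasi-identifiers."""
    reidentified = []
    for rec in medical:
        for person in roster:
            if (rec["zip"], rec["birth_year"], rec["sex"]) == \
               (person["zip"], person["birth_year"], person["sex"]):
                reidentified.append((person["name"], rec["diagnosis"]))
    return reidentified

# Each match reveals a named patient's diagnosis despite "anonymization".
print(link(anonymized_medical, insurance_roster))
```

Removing direct identifiers is therefore not enough; the combination of seemingly harmless attributes is itself identifying.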
Multiple Security Requirements
In a Big Data context, one challenge is to handle information security while managing massive and rapid data streams. Thus, security tools should be flexible and easily scalable to simplify the integration of future technological evolutions and to handle changes in application requirements.
There is a need to find a balance between multiple security requirements, privacy obligations, system performance and rapid dynamic analysis on diverse large data sets (data in motion or static, private or public, local or shared, etc.).
Inadequate Traditional Solutions
Traditional security techniques, such as some types of data encryption, slow performance and are time-consuming in a Big Data context. Furthermore, they are not efficient: only small data partitions are processed for security purposes, so most of the time security attacks are detected after the damage has spread (Lu et al., 2013). Big Data platforms imply the management of various applications and multiple parallel computations. Therefore, performance is a key element for data sharing and real-time analysis in such environments.
Lack of Maturity of New Security Tools
The combination of multiple technologies may bring hidden risks that are most of the time not evaluated or underestimated. In addition, new security tools lack maturity. Thus, Big Data platforms may incorporate new security risks and vulnerabilities that are not fully assessed.
At the same time, data value is concentrated in various clusters and data centres. Those rich data mines are very attractive to commerce, governments and industry, and constitute a target for several attacks and penetrations. Furthermore, most security risks (more than a third) come from employees, partners and end-point users. Hence, it is important to deploy advanced security mechanisms to protect Big Data clusters (Ring, 2013; Jensen, 2013). Regarding this point, data owners have the responsibility to set clear security clauses and policies to be respected by outsourcers.
Data Anonymization
To ensure data privacy and security, data anonymization should be achieved without affecting system performance (e.g., real-time analysis) or data quality. However, traditional anonymization techniques are based on several iterations and time-consuming computations. Several iterations may affect data consistency and slow down system performance, especially when handling huge heterogeneous data sets. In addition, it is difficult to process and analyse anonymized Big Data (costly computations are needed).
Compatibility with Big Data Technologies
Some security techniques are incompatible with commonly used Big Data technologies like the MapReduce paradigm. To ensure the security and privacy of Big Data, it is not enough just to choose powerful technologies and security mechanisms. It is also mandatory to verify their compatibility with the organization’s Big Data requirements and existing infrastructure components (Zhao et al., 2014).
Information Reliability and Quality
The reliability of data analysis results depends on data quality and integrity (Alvaro, Pratyusa, & Sreeranga, 2013). Therefore, it is important to verify the authenticity and integrity of Big Data sources before analysing data. Since huge volumes of data sets are generated every second, it is difficult to assess the authenticity and integrity of all the various data sources.
In addition, to extract reliable and complete information from Big Data sources, analysts have to deal with incomplete and heterogeneous data streams coming from different sources in different formats. They have to filter data (e.g., eliminating noise, errors, spam, and so on). They also have to organize and contextualize data (e.g., adding geo-location data) before performing any analysis.
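The filter-then-contextualize step can be sketched as follows. This is a minimal Python illustration, not a production pipeline: the record format, the spam heuristic and the `GEO_DB` lookup table are all assumptions made for the example.

```python
# Sketch of a pre-analysis pipeline: drop noisy or incomplete records,
# then enrich the survivors with context (a hypothetical geo lookup).

RAW_STREAM = [
    {"user": "u1", "text": "great service", "ip": "41.140.0.10"},
    {"user": "u2", "text": "", "ip": "10.0.0.3"},                  # empty: noise
    {"user": "u3", "text": "BUY NOW!!! http://spam", "ip": None},  # spam-like
]

GEO_DB = {"41.140.0.10": "Morocco"}  # hypothetical IP-to-country table

def is_valid(rec):
    """Filtering step: reject empty, spam-like or IP-less records."""
    text = rec.get("text", "").strip()
    return bool(text) and "http://spam" not in text and bool(rec.get("ip"))

def contextualize(rec):
    """Contextualization step: attach a country attribute."""
    rec = dict(rec)
    rec["country"] = GEO_DB.get(rec["ip"], "unknown")
    return rec

cleaned = [contextualize(r) for r in RAW_STREAM if is_valid(r)]
print(cleaned)  # only u1 survives, now tagged with its country
```

In a real deployment these two functions would of course embody far richer rules (language detection, spam classifiers, reverse geocoding), but the shape of the pipeline is the same.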
Compliance with Security Laws, Regulations and Policies
Private organizations and government agencies have to respect many security laws and industrial standards that aim to enhance the management of digital data security and to protect confidentiality (e.g., deleting personal data when no longer used, protecting data throughout its life cycle, archiving transactions for legal purposes, citizens’ right to access and modify their data). However, some ICTs may involve entities across many countries, so enterprises have to deal with multiple laws and regulations (Tankard, 2012).
Furthermore, Big Data analytics may conflict with some privacy principles. For example, analysts can correlate many different data sets from different entities to reveal personal or sensitive data even when anonymization techniques are used. Consequently, such analysis may make it possible to identify individuals or to discover confidential information (Alvaro et al., 2013).
Need for Big Data Experts
In the era of Big Data, data analysis is a key factor in preventing and detecting security incidents. However, several surveys confirm that a number of enterprises are not aware of the importance of recruiting data scientists for advanced security analysis. In fact, to ensure Big Data security, organizations should rely on a multi-disciplinary team of data scientists, mathematicians and programmers versed in security best practices (Constantine, 2014).
Big Data Security on Social Networks
Huge amounts of photos, videos, user comments and clicks are generated on social networks (SNs). They are usually the first source of information for different entities (Sykora, Jackson, O’Brien, & Elayan, 2013).
Big Data on SNs constitutes a valuable mine for governments to better manage national security risks. Indeed, some governments analyse SN Big Data in order to monitor public opinion (e.g., voting intentions, emotions, feelings about a project or an event). They can prevent terrorist and security attacks and assess citizens’ satisfaction with public services.
In addition, dynamic analysis of SNs enables crisis committees to optimize and ensure rapid crisis management, as in disaster cases. The goal is to rapidly detect abnormal patterns and to ensure real-time monitoring of alarming events.
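One simple way to detect such abnormal patterns in a stream is a rolling z-score detector. The sketch below, with invented traffic numbers, flags a sudden burst of per-minute post counts that deviates strongly from the recent baseline; the window size and threshold are illustrative assumptions.

```python
# Minimal streaming anomaly detector: flag values far above the
# rolling mean of the last `window` observations (z-score test).
from collections import deque
from statistics import mean, pstdev

def detect_spikes(counts, window=5, threshold=3.0):
    """Return (index, count) pairs that exceed the rolling baseline."""
    history = deque(maxlen=window)
    anomalies = []
    for i, c in enumerate(counts):
        if len(history) == window:
            mu, sigma = mean(history), pstdev(history)
            if sigma > 0 and (c - mu) / sigma > threshold:
                anomalies.append((i, c))
        history.append(c)
    return anomalies

# Steady traffic, then a sudden burst (e.g., a coordinated campaign).
stream = [10, 12, 11, 9, 10, 11, 10, 95, 12, 10]
print(detect_spikes(stream))  # [(7, 95)]: the burst is flagged
```

Real SN monitoring systems combine many such signals (hashtag velocity, sentiment shifts, graph anomalies), but each individual detector often reduces to this compare-against-recent-baseline pattern.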
BIG DATA SECURITY SOLUTIONS
Nowadays, with the spread of social networks, distributed systems, multiple connections and mobile devices, the security of a Big Data information system becomes the responsibility of all actors (e.g., managers, security chiefs, auditors, end users and customers). In fact, most security threats come from inside users and employees. Thus, it is essential to raise the security awareness of all parties and to promote security best practices across all the connected entities of the digital ecosystem. It is not sufficient just to integrate security technologies; the collaboration of all actors is required to eliminate weak links in the system chain and to ensure compliance with security laws and policies.
Various security models, mechanisms and solutions exist for Big Data. However, most of them are not well known or mature, and many research projects are currently striving to enhance their performance (Mahmood & Afzal, 2013). In the following sections, we present some important ones.
Security Foundations for Big Data Projects
For any Big Data project, it is important to consider the strategic priorities related to security and to establish clear organizational guidelines for choosing the associated technologies (in terms of reliability, performance, maturity, scalability, and overall cost including maintenance). It is also important to consider the constraints related to integration, the existing infrastructure, and the available and planned budget for Big Data security management.
The goal is to ensure agility across all security systems, solutions, processes and procedures. Organizational agility is important to enable organizations to face rapid changes in security requirements: legal changes, new partners and customers, environment and market changes, technological updates and innovations, new security risks and so on.
After the establishment of security values, strategies and management models, it is important to derive and establish clear security policies, guidelines and user agreements, as well as contractual security clauses to respect when outsourcing Big Data services. All security strategies and policies should take many factors into consideration.
First of all, it is essential to establish Big Data classification and management principles guided by a long-term vision. In fact, data classification is mandatory to determine the sensitive data and valuable information to protect, and to define data owners with their security policies, requirements and responsibilities. Then, the security strategy should be based on the assessment of the security risks related to the different Big Data management processes (e.g., data generation, storage, transfer, exchange, access, deletion, modification and so on). Finally, a security level has to be determined for each data category according to the organizational strategy. In fact, (Bodei, Degano, Ferrari, Galletta, & Mezzetti, 2012) recommend identifying the data attributes to protect and to encrypt at the beginning of the system conception phase.
Furthermore, the organization has to keep track of legal changes to update its policies and procedures. To ensure continuous legal compliance, it is important to involve the legal department in the development of Big Data projects and in the upgrading of all security policies, including retention periods for personal information, access permissions, data transfer abroad, data access and exchange between stakeholders with different security requirements, integration of contractual security requirements, and data conservation or destruction.
Risk Analysis Related to Multiple Technologies
It is important to study and assess the security risks related to the mix of multiple technologies inside a Big Data platform. It is not sufficient to evaluate the security risks related to each technology used: the integration of disparate technologies for multiple purposes may bring hidden risks and unknown security threats.
In addition, with the increasing spread of the Cloud and BYOD (Bring Your Own Device), it is important to consider security threats related to distributed environments and to the use of non-normalized mobile and personal devices for professional purposes. On this point, (Ring, 2013) recommends protecting the multiple disparate end-points with an extra security layer. Furthermore, mobile devices should be normalized to fulfil organizational and industrial security standards.
Choosing Adequate Security Solutions
To enhance Big Data security, organizations rely on advanced dynamic security analysis. The goal is to extract and analyse, in real time or near real time, security events and related user actions in order to enhance online and transactional security and to prevent fraudulent attacks.
Such dynamic analysis of the generated Big Data helps to detect security incidents in a timely fashion, to identify abnormal customer behaviours, to monitor security threats, to discover known and new cyber-attack patterns and so on. Hence, dynamic analysis of Big Data enables improved prevention and rapid reactivity for good security decisions. In parallel, the analysis of the statistics generated by applications and programs allows security performance indicators to be produced and program behaviours to be monitored and secured.
(Kim et al., 2013) recommend protecting the data values instead of the data themselves. In fact, it is too difficult, and nearly impossible, to protect huge data sets. Furthermore, Big Data security analysis techniques are based on attribute information. Thus, data owners or operators have the responsibility to define, select and protect only the important attributes that they consider valuable for their use cases. To protect such important data attributes in a Big Data context, (Kim et al., 2013) present the following steps:

1. Evaluate the importance of the attributes, and compare and evaluate the correlations between them.
2. Filter and define the valuable attributes to protect.
3. Choose security mechanisms to protect the relevant attributes according to the data owner’s or the organization’s policies.
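The three steps above can be sketched as a small Python routine. The attribute names, importance scores, threshold and mechanism labels are all illustrative assumptions, not values from (Kim et al., 2013); the point is only the shape of the score-filter-protect pipeline.

```python
# Hedged sketch of the attribute-protection steps: score attributes,
# keep the valuable ones, then map each to a protection mechanism.

attribute_scores = {           # step 1: evaluated importance (0..1)
    "national_id": 0.95,
    "diagnosis":   0.80,
    "page_views":  0.10,
}

POLICY = {                     # step 3: owner/organization policy
    "national_id": "encrypt",
    "diagnosis":   "anonymize",
}

def plan_protection(scores, policy, threshold=0.5):
    """Return a protection plan for attributes above the threshold."""
    valuable = {a for a, s in scores.items() if s >= threshold}  # step 2
    # Attributes with no explicit policy fall back to access control.
    return {a: policy.get(a, "access-control") for a in valuable}

print(plan_protection(attribute_scores, POLICY))
# {'national_id': 'encrypt', 'diagnosis': 'anonymize'}
```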
Several analytical solutions are available to secure Big Data, such as the Accenture, HP, IBM, CISCO, Unisys and EADS security solutions. They bring different levels of performance and several benefits: they enable agile decision-making and rapid reaction through real-time surveillance and monitoring; they detect dynamic attacks with enhanced reliability (a low false-positive rate) thanks to the analysis of active and passive security information; and they provide full visibility of the network status and of applications’ security problems.
(Mahmood & Afzal, 2013) identify the deployment phases of Big Data analytics solutions. First of all, they recommend identifying strategic priorities and Big Data security analysis goals. Then, the organizational priorities should provide guidelines for developing a more detailed strategy for deploying the Big Data analysis platform. The purpose is to optimize the selection of security solutions according to the strategic goals, the triple constraint (overall cost, quality and duration of the implementation), the added value and the available features (performance, reliability, scalability and so on). Regarding organizational strategy, they recommend adopting centralized Big Data management for better security outcomes.
Before Big Data analysis, one prerequisite is to consider the integrity and authenticity of data sources. Indeed, Big Data sources may contain errors, noise, incomplete data, or data without context information. Consequently, it is essential to filter and prepare data and to add context data before applying analysis techniques.
Anonymization of Confidential or Personal Data
Data anonymization is a recognized technique used to protect data privacy across the Cloud and distributed systems. Several models and solutions are used to implement this technique, such as sub-tree data anonymization, t-closeness, m-invariance, k-anonymity and l-diversity.
Sub-tree techniques are based on two methods: Top-Down Specialization (TDS) and Bottom-Up Generalization (BUG). However, those methods are not scalable: they lack performance for certain anonymization parameters and cannot scale when applied to anonymize Big Data on distributed systems.
In order to improve the anonymization of valuable information extracted from large data sets, (Zhang et al., 2014) suggest a hybrid approach that combines the two anonymization techniques TDS and BUG. This approach automatically selects and applies whichever of the two techniques suits the use-case parameters. Thus, this hybrid approach provides the efficiency, performance and scalability required to anonymize huge databases. It is supported by newly adapted programs that handle the MapReduce paradigm, which reduces computation time in distributed systems and the Cloud.
Big Data processing and analysis rely most of the time on many iterations to obtain reliable and precise results, which may slow computations and the performance of security solutions. Regarding this, (Zhang et al., 2014) propose a method based on a single iteration for generalization operations. The goal is to enhance the parallelism, performance and scalability of anonymization techniques.
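To make the generalization idea concrete, here is a minimal single-pass sketch in Python: quasi-identifiers are coarsened in one scan (age to a decade band, ZIP code to a prefix), and the result is checked for k-anonymity. The fixed generalization levels and the sample records are assumptions for the illustration; TDS/BUG instead search for a good generalization level iteratively.

```python
# One-pass generalization followed by a k-anonymity check: every
# combination of generalized quasi-identifiers must cover >= k records.
from collections import Counter

def generalize(record):
    """Coarsen quasi-identifiers: age -> decade band, ZIP -> 3-digit prefix."""
    return (record["age"] // 10 * 10, record["zip"][:3])

def is_k_anonymous(records, k):
    groups = Counter(generalize(r) for r in records)
    return all(count >= k for count in groups.values())

data = [
    {"age": 34, "zip": "10115"}, {"age": 37, "zip": "10117"},
    {"age": 52, "zip": "20095"}, {"age": 58, "zip": "20099"},
]
print(is_k_anonymous(data, k=2))  # True: each (decade, prefix) group has 2
```

Because both the generalization and the group counting are per-record operations, this structure maps naturally onto a MapReduce job, which is the property the single-iteration approach exploits.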
Currently, many projects are working to develop new techniques and to improve existing ones to protect privacy. As an example, some projects are based on privacy-preservation-aware analysis and scheduling techniques for large data sets.
Data Cryptography
Data encryption is a common solution used to ensure data and Big Data confidentiality. Much research has been conducted to improve the performance and reliability of traditional techniques or to create new Big Data encryption techniques.
Unlike some traditional encryption techniques, homomorphic cryptography enables computation even on encrypted data. Consequently, this technique ensures information confidentiality while allowing useful insight to be extracted through certain analyses and computations on the encrypted data.
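The property can be demonstrated with the textbook Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts. The sketch below uses tiny fixed primes purely for illustration; keys this small offer no security whatsoever.

```python
# Toy Paillier demonstration of additive homomorphism. Never use
# such small parameters in practice; this only illustrates the math.
import math
import random

p, q = 17, 19
n = p * q                                        # public modulus
n2 = n * n
g = n + 1                                        # standard generator choice
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1), private

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)              # private key part

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:                   # r must be invertible mod n
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

c1, c2 = encrypt(20), encrypt(22)
print(decrypt((c1 * c2) % n2))  # 42: the sum was computed under encryption
```

Fully homomorphic schemes extend this to arbitrary computations, at a much higher cost; the additive case already suffices for tasks such as aggregating encrypted counters.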
Regarding this solution, (Chen & Huang, 2013) propose a platform adapted to handle MapReduce computations in the case of homomorphic cryptography. To ensure the performance of cryptographic solutions in distributed environments, (Liu et al., 2013) suggest a new key-exchange approach called CBHKE (Cloud Background Hierarchical Key Exchange). It is a secure solution that is faster than its predecessor techniques (IKE and CCBKE). It is based on an iterative strategy for Authenticated Key Exchange (AKE) through two phases (layer by layer). However, new approaches with enhanced performance are still needed to improve the encryption of large data sets on distributed systems.
Centralized Security Management
(Kasim, Hung, & Li, 2012) recommend storing data in the Cloud rather than on mobile devices. The goal is to take advantage of the Cloud’s normalized, standards-compliant infrastructure and centralized security mechanisms. Indeed, Cloud platforms are regularly updated and continuously monitored for enhanced security.
However, “zero risk” is hard to achieve. In fact, data security then rests in the hands of the Cloud outsourcers and operators. In addition, the Cloud is very attractive to attackers as it is a centralized mine of valuable data. Data owners and managers should be aware of the security risks and define clear data access policies. They have to ensure that the required security level is guaranteed when outsourcing Big Data management, storage or processing.
Furthermore, it is essential to change the traditional governance concept in which only security managers and chiefs are accountable. It is more appropriate to adopt centralized security governance to meet the challenges of securing Big Data sources in distributed environments. The organization should involve all the stakeholders connected to its ecosystem, including employees, managers, ISR, operators, users, customers, partners, suppliers, outsourcers and so on. The goal is to make all parties accountable for security management, to enhance the adoption of security best practices and to ensure compliance with standards and laws. Users should be aware of threats, regulations and policies.
As an example, partners have to respect data access and confidentiality policies. Users have to update their systems regularly and to make sure that their mobile devices respect standards and recommended security practices and regulations. They also have to avoid installing unreliable components (e.g., counterfeit software, or software without a valid license). On the other hand, the programmers, architects and designers of Big Data applications have to integrate security and privacy requirements throughout the development life cycle. Outsourcers should be made accountable for Big Data security through clear security clauses.
Data Confidentiality and Data Access Monitoring
Security threats are spreading because of the increasing data exchange over distributed systems and the Cloud. To face these challenges, (Tankard, 2012) proposes to enhance control by integrating controls at the data level and during the storage phase. In fact, it has been shown that controls at the application and system levels are not sufficient.
In addition, access controls have to be finely granulated to limit access by role and responsibilities. Many techniques exist to ensure access control and data confidentiality, such as PKI, certificates, smart cards, federated identity management and multi-factor authentication.
For example, Law Enforcement Agencies (LEAs) have launched the INDECT project in order to implement a secure infrastructure for secure data exchange between agencies and other members (Stoianov, Uruena, Niemiec, Machnik, & Maestro, 2013). The solution includes:

- A Public Key Infrastructure (PKI) with three levels (certification authority, users and machines). The PKI provides access control based on multi-factor authentication and on the security level required for each data type. For instance, access to highly confidential applications requires a valid certificate and a password.
- Federated identity management, a concept used by the INDECT platform to enhance access control and security. This type of federated management is delegated to an Identity Provider (IdP) within a monitored trust domain. It is based on two security tools, certificates and smart cards, which are used to store the user certificates issued by the PKI to encrypt and sign documents and emails.
- The INDECT Block Cipher (IBC), a new symmetric cryptography algorithm developed and used to encrypt databases, communication sessions (TLS/SSL) and VPN tunnels. The goal is to ensure a high level of data confidentiality.
- Secure communications based on the VPN and TLS/SSL protocols. These mechanisms are used to protect access to Big Data servers.
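The idea of tying the required authentication factors to the sensitivity of the requested data can be sketched as a small policy check. The level names and factor sets below are assumptions for illustration, not the actual INDECT configuration.

```python
# Sketch: required authentication factors depend on data sensitivity.
# Levels and factor sets are hypothetical examples.

REQUIRED_FACTORS = {
    "public":              set(),
    "internal":            {"password"},
    "highly_confidential": {"password", "certificate"},
}

def access_granted(level, presented_factors):
    """Grant access only if every required factor was presented."""
    return REQUIRED_FACTORS[level] <= set(presented_factors)

print(access_granted("highly_confidential", ["password"]))                 # False
print(access_granted("highly_confidential", ["password", "certificate"]))  # True
```

A real deployment would verify each factor cryptographically (certificate chains, password hashes) rather than merely checking its presence, but the per-level policy lookup is the core of the mechanism.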
Authentication mechanisms are most of the time complex and heavy to handle across distributed clusters and large data sets. For this reason, (Zhao et al., 2014) suggest a security model for G-Hadoop that integrates several security solutions. It is based on the Single Sign-On (SSO) concept, which simplifies user authentication and the computation of MapReduce functions. Thus, this model enables users to access different clusters with the same account identifier. Furthermore, privacy is protected through encrypted connections based on the SSL protocol, public key cryptography and valid certificates for authentication. Hence, this model offers efficient access control, protects against hackers and attackers (e.g., MITM attacks, version rollback and delay attacks) and denies access to fraudulent or untruthful entities.
(Mansfield-Devine, 2012) recommends involving not just security chiefs but making all end users responsible for better access control. It also suggests combining different types of controls inside multi-silo environments (e.g., archives, data loss prevention, access control and logs).
Security Surveillance and Monitoring
It is important to ensure continuous surveillance in order to detect security incidents, threats and abnormal behaviours in real time. To ensure Big Data security surveillance, some solutions are available, such as Data Loss Prevention (DLP), Security Information and Event Management (SIEM) and dynamic analysis of security events. Such solutions are based on consolidation and correlation methods applied across multiple data sources, and on contextualization tools (which add context as a data attribute to the extracted data). It is also important to conduct regular audits and to verify that users and employees respect security policies and recommended best practices.
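A SIEM-style correlation rule can be sketched in a few lines: events from several sources are consolidated, and an alert is raised only when correlated signs co-occur, here repeated failed logins followed by a large outbound transfer from the same host. The event format and thresholds are illustrative assumptions.

```python
# Minimal SIEM-style correlation: consolidate events from multiple
# sources and alert on a brute-force-then-exfiltration pattern.

events = [
    {"src": "firewall", "host": "db01",  "type": "failed_login"},
    {"src": "app",      "host": "db01",  "type": "failed_login"},
    {"src": "firewall", "host": "db01",  "type": "failed_login"},
    {"src": "netflow",  "host": "db01",  "type": "large_upload"},
    {"src": "app",      "host": "web02", "type": "failed_login"},
]

def correlate(events, max_failures=3):
    """Alert when a large upload follows >= max_failures failed logins."""
    failures = {}
    alerts = []
    for e in events:
        if e["type"] == "failed_login":
            failures[e["host"]] = failures.get(e["host"], 0) + 1
        elif e["type"] == "large_upload" and failures.get(e["host"], 0) >= max_failures:
            alerts.append(e["host"])
    return alerts

print(correlate(events))  # ['db01']: suspicious pattern on one host
```

Note that no single event is alarming on its own; the value of consolidation is precisely that the cross-source combination reveals the incident.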
310
Big Data Security
EXAMPLE: SECURITY OF
SMART GRID BIG DATA
This section presents security challenges and
solutions regarding Big Dada processed in Smart
Grid infrastructures.
Giving the growing power demand, a Smart
Grid gather and process huge data sets generated
daily (e.g., consumers’ behaviours and habits) to
ensure an efficient and cost-effective production
and distribution of electricity. Unlike traditional
electrical grid, Smart Grid is based on advanced
technologies and enables bidirectional power
flow between connected devices. However, Smart
Grid infrastructures are vulnerable and face many
security threats. In fact, data are transferred mas-
sively in this context and security attacks may have
serious and large scale impacts (e.g., regional or
national interruption in power supply, important
economic loss, disruption of public services, low
service quality in hospitals, etc.).
Hence, securing the Big Data of Smart Grid
infrastructures is fundamental to protect grid
performance, to ensure reliable coordination
between control centres and equipment, and to
enhance the safety of all system operations.
Smart Grid Security Challenges
Security challenges facing critical large-scale
infrastructures can be classified according to
various parameters. Pathan (2014) recommends
a holistic, multi-layered security approach that
deals with this issue at all grid levels (i.e.,
physical, cyber and data). In fact, the Smart Grid
environment incorporates many distributed
subsystems and end-points, including distributed
sensors and actuators, an ever-growing number
of Intelligent Electronic Devices (IEDs) with
different power requirements (e.g., electric
vehicles and smart homes), electric generators
and control applications. In addition, these
subsystems and end-points maintain multiple
bidirectional communications with each other.
This complexity increases the anomalies, human
errors and system vulnerabilities that can be
exploited by attackers. As an example, the
governor control (GC) system of a Smart Grid
ensures the steady operation of all power
generators: sensors detect the speed and
frequency deviation of the generators, and the
GC rapidly adjusts their operation. Attackers may
succeed in accessing one point of the
communication network and changing the values
recorded by the generator sensors. Such
communication intrusion and malicious
modification of data could affect the power flow
inside the grid and compromise the decision-making
process supervised by the Optimal Power
Flow (OPF).
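A minimal illustration of how tampered sensor values might be caught before they reach the OPF is a plausibility check on frequency readings. The nominal frequency, the thresholds and the sample stream below are assumed values chosen for illustration only:

```python
NOMINAL_HZ = 50.0      # assumed nominal grid frequency (60.0 in some regions)
MAX_DEVIATION = 0.5    # plausible absolute deviation band in Hz (illustrative)
MAX_STEP = 0.2         # plausible change between consecutive samples (illustrative)

def suspicious_samples(readings):
    """Return indices of frequency samples that are physically implausible,
    either in absolute value or as a jump from the previous sample. Such
    samples may indicate tampered sensor data rather than real grid behaviour."""
    flagged = []
    for i, hz in enumerate(readings):
        if abs(hz - NOMINAL_HZ) > MAX_DEVIATION:
            flagged.append(i)
        elif i > 0 and abs(hz - readings[i - 1]) > MAX_STEP:
            flagged.append(i)
    return flagged

# A reading stream in which an attacker injects an abrupt, out-of-band value.
stream = [50.01, 50.02, 49.98, 51.3, 50.00]
```

Note that the sample following the injected value is also flagged (the step back to normal is itself implausible), so an analyst would review the whole window, not a single point.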
Moreover, attackers may use Grid utilities and
smart meters to get private information, which
compromises consumer privacy (Stimmel, 2014).
Since the Smart Grid incorporates multiple
interconnected subsystems, there is a risk of
cascading failures: any attack on one subsystem
may compromise the security of the others. Thus,
it is crucial to secure the overall Smart Grid
system, including the physical components, the
cyber space, and the data generated and
transmitted through the grid networks.
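The cascading-failure risk can be sketched as reachability in a dependency graph of subsystems. The subsystem names and dependency edges below are hypothetical:

```python
from collections import deque

# Illustrative dependency graph: an edge A -> B means subsystem B depends on A,
# so compromising A may cascade to B. All subsystem names are hypothetical.
depends_on = {
    "smart_meters": ["data_aggregator"],
    "data_aggregator": ["control_centre"],
    "control_centre": ["generators", "distribution"],
    "generators": [],
    "distribution": [],
}

def cascade(start):
    """Breadth-first search over the dependency graph: return all subsystems
    whose security may be compromised by an attack on `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in depends_on.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

In this toy graph an attack on the smart meters reaches every other subsystem, which is exactly why securing only the point of entry is insufficient.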
Security Solutions for Smart Grid
Big Data security management in a Smart Grid
environment aims to ensure data confidentiality,
integrity and availability for efficient and reliable
grid operations. To increase the attack resilience
of the Smart Grid and respond to the previously
cited challenges, several solutions have been
proposed and research efforts undertaken.
Considering the cascading failure aspect and
the complexity of the Smart Grid system, it is
recommended to adopt a multi-layered approach.
This helps to secure not only the data layer but
also all the other layers: physical, network, host,
data store, application, and policies and
regulations. For instance, at the physical layer, it
is important to ensure the security of equipment,
consumer devices, substations, sensors, control
centres and so on. Concerning the cyber layer, it
is recommended to secure network
communications and eliminate weak points from
the cyber topology (e.g., poor intrusion detection
algorithms). Regarding the data layer, it is crucial
to ensure granular access control and granular
audits (Alvaro, Pratyusa, & Sreeranga, 2013).
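A granular, attribute-based access check combined with a granular audit trail could look like the following sketch; the policy rules, attribute names and subjects are invented for illustration:

```python
import time

AUDIT_LOG = []

# Illustrative attribute-based policy: each rule names the attributes a subject
# must hold to perform an action on a resource class.
POLICY = [
    {"action": "read", "resource": "meter_data",
     "required": {"role": "analyst", "clearance": "internal"}},
    {"action": "write", "resource": "meter_data",
     "required": {"role": "operator", "clearance": "restricted"}},
]

def check_access(subject, action, resource):
    """Grant access only if some policy rule's required attributes are all
    present in the subject's attributes; record every decision for audit."""
    granted = any(
        rule["action"] == action
        and rule["resource"] == resource
        and all(subject.get(k) == v for k, v in rule["required"].items())
        for rule in POLICY
    )
    AUDIT_LOG.append({"ts": time.time(), "subject": subject.get("id"),
                      "action": action, "resource": resource, "granted": granted})
    return granted

analyst = {"id": "u1", "role": "analyst", "clearance": "internal"}
intruder = {"id": "u2", "role": "guest"}
```

Because every decision, granted or denied, lands in the audit log, the same mechanism supports both granular access control and the granular audits recommended above.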
A successful security management of the Smart
Grid system should incorporate real-time security
monitoring. Big Data analytics algorithms are one
of the powerful solutions recommended to address
this issue. These algorithms drive improved
predictions and more precise analysis. They are
often based on Machine Learning techniques and
use not only traditional security logs and events,
but also performance and customers' data, to
recognize and prevent malicious behaviours
(Khorshed, Ali, & Wasimi, 2014).
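As a stand-in for such analytics, a simple statistical outlier test over one monitored metric illustrates the idea; the z-score threshold and the login-failure counts are illustrative assumptions, and a production system would use richer Machine Learning models over many features:

```python
from statistics import mean, stdev

def zscore_anomalies(values, threshold=3.0):
    """Flag observations whose z-score exceeds the threshold. A minimal
    stand-in for the Machine Learning models mentioned in the text, which
    would combine security events with performance and customer data."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Hourly login-failure counts for one account; the final spike is injected
# to simulate a brute-force attempt.
counts = [2, 3, 1, 2, 4, 2, 3, 2, 1, 3, 2, 50]
```

Even this crude model separates the injected spike from normal variation, which is the core of behaviour-based recognition of malicious activity.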
Real-time monitoring can be part of a mitigation
strategy that supports security decision making.
It assists security analysts in deciding whether a
preventive or remedial action should be taken
(e.g., changing user roles or privileges,
suspending suspicious access, correcting network
configurations). The list of actions depends on
the nature of the incident and its impact; they can
be implemented through an automatic or
semi-automatic process. Ensuring continuous
updates of such a strategy is good practice, as it
helps integrate new security solutions and laws
as well as emerging security practices and models.
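The mapping from incidents to preventive or remedial actions, with automatic versus semi-automatic execution, can be sketched as a small playbook lookup. The incident types, actions and execution modes below are hypothetical:

```python
# Illustrative mapping from (incident type, severity) to a response; whether it
# runs automatically or awaits analyst approval depends on its impact.
PLAYBOOK = {
    ("suspicious_access", "low"): ("suspend_session", "automatic"),
    ("suspicious_access", "high"): ("revoke_privileges", "semi-automatic"),
    ("misconfiguration", "low"): ("correct_configuration", "automatic"),
    ("data_exfiltration", "high"): ("isolate_host", "semi-automatic"),
}

def respond(incident_type, severity):
    """Return the planned action and its execution mode, or escalate to a
    human analyst when no playbook entry matches."""
    return PLAYBOOK.get((incident_type, severity), ("escalate_to_analyst", "manual"))
```

Keeping the strategy updated, as recommended above, amounts here to revising the playbook entries as new laws, solutions and practices emerge, without touching the dispatch logic.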
Considering the complexity and evolving nature
of cybercrime, it is important to promote
continuous commitment and timely sharing of
security information between all Smart Grid
partners, utilities, and organizations specialized
in cybercrime (Bughin, Chui, & Manyika, 2010).
It is well known that security techniques alone
are not sufficient in any industry. They have to
be backed by legal actions and regulations in
order to protect valuable data and other assets.
Therefore, security requirements, policies,
regulations and standards should be updated
regularly to address evolving security issues.
CONCLUSION
Big Data applications promise interesting
opportunities for many sectors. In fact, extracting
valuable insights and information from disparate
large data sources helps improve the competitive
advantage of organizations. For instance, the
analysis of data streams or archives (e.g., using
predictive or identification models) can help to
optimize production processes, to enhance
services with added value and to adapt them to
customers' needs. However, Big Data sharing and
analysis raise many security issues and increase
privacy threats. This chapter presented some of
the most important Big Data security challenges
and described related solutions and
recommendations. Because it is nearly impossible
to secure very large data sets completely, it is
more practical to protect the data value and its
key attributes instead of the data itself, to analyse
the security risks of combining different evolving
Big Data technologies, and to choose security
tools according to the goals of the Big Data project.
REFERENCES
Alvaro, A. C., Pratyusa, K. M., & Sreeranga, P.
R. (2013). Big data analytics for security. IEEE
Security and Privacy, 11(6), 74–76. doi:10.1109/
MSP.2013.138
Berman, J. J. (2013). Principles of big data:
Preparing, sharing, and analyzing complex infor-
mation. San Francisco, CA: Morgan Kaufmann
Publishers Inc.
Bodei, C., Degano, P., Ferrari, G. L., Galletta, L.,
& Mezzetti, G. (2012). Formalising security in
ubiquitous and cloud scenarios. In A. Cortesi, N.
Chaki, K. Saeed, & S. Wierzchon (Eds.), Computer
information systems and industrial management:
Proceedings of the 11th IFIP TC 8 International
Conference (LNCS) (Vol. 7564, pp. 1-29). Berlin,
Germany: Springer. doi:10.1007/978-3-642-
33260-9_1
Bughin, J., Chui, M., & Manyika, J. (2010). Clouds,
big data, and smart assets: Ten tech-enabled
business trends to watch. Retrieved from http://
www.mckinsey.com/insights/high_tech_tele-
coms_internet/clouds_big_data_and_smart_as-
sets_ten_tech-enabled_business_trends_to_watch
Chen, X., & Huang, Q. (2013). The data protection
of mapreduce using homomorphic encryption. In
Proceedings of the 4th IEEE International Confer-
ence on Software Engineering and Service Science
(pp. 419-421). Beijing, China: IEEE.
Constantine, C. (2014). Big data: An informa-
tion security context. Network Security, 2014(1),
18–19. doi:10.1016/S1353-4858(14)70010-8
Jensen, M. (2013). Challenges of privacy protec-
tion in big data analytics. In Proceedings of IEEE
International Congress on Big Data (pp. 235-
238). Washington, DC: IEEE Computer Society.
doi:10.1109/BigData.Congress.2013.39
Kasim, H., Hung, T., & Li, X. (2012). Data value
chain as a service framework: For enabling data
handling, data security and data analysis in the
cloud. In Proceedings of the 18th IEEE Interna-
tional Conference on Parallel and Distributed Sys-
tems (pp. 804-809). Washington, DC: IEEE Com-
puter Society. doi:10.1109/ICPADS.2012.131
Katal, A., Wazid, M., & Goudar, R. (2013). Big
data: Issues, challenges, tools and good prac-
tices. In Proceedings of the Sixth International
Conference on Contemporary Computing (pp.
404-409). Noida, India: IEEE. doi:10.1109/
IC3.2013.6612229
Khorshed, M. T., Ali, A. B., & Wasimi, S. A.
(2014). Combating Cyber Attacks in Cloud Sys-
tems Using Machine Learning. In S. Nepal &
M. Pathan (Eds.), Security, Privacy and Trust in
Cloud Systems (pp. 407–431). Berlin, Germany:
Springer. doi:10.1007/978-3-642-38586-5_14
Kim, S. H., Kim, N. U., & Chung, T. M. (2013).
Attribute relationship evaluation methodology for
big data security. In Proceedings of the Interna-
tional Conference on IT Convergence and Security
(pp. 1-4). Macao, China: IEEE. doi:10.1109/
ICITCS.2013.6717808
Liu, C., Zhang, X., Liu, C., Yang, Y., Ranjan,
R., Georgakopoulos, D., & Chen, J. (2013). An
iterative hierarchical key exchange scheme for
secure scheduling of big data applications in
cloud computing. In Proceedings of the 12th IEEE
International Conference on Trust, Security and
Privacy in Computing and Communications (pp.
9-16). Washington, DC: IEEE Computer Society.
doi:10.1109/TrustCom.2013.65
Lu, T., Guo, X., Xu, B., Zhao, L., Peng, Y., &
Yang, H. (2013). Next big thing in big data: The
security of the ict supply chain. In Proceedings of
the International Conference on Social Computing
(pp. 1066-1073). Washington, DC: IEEE Com-
puter Society. doi:10.1109/SocialCom.2013.172
Mahmood, T., & Afzal, U. (2013). Security analyt-
ics: Big data analytics for cybersecurity: A review
of trends, techniques and tools. In Proceedings
of the 2nd National Conference on Information
Assurance (pp. 129-134). Rawalpindi, Pakistan:
IEEE. doi:10.1109/NCIA.2013.6725337
Mansfield-Devine, S. (2012). Using big data to
reduce security risks. Computer Fraud & Security,
2012(8), 3–4.
Pathan, A. S. K. (2014). The state of the art in
intrusion prevention and detection. Boca Raton,
FL: Auerbach Publications. doi:10.1201/b16390
Ring, T. (2013). It’s megatrends: The security im-
pact. Network Security, 2013(7), 5–8. doi:10.1016/
S1353-4858(13)70080-1
Shin, D., Sahama, T., & Gajanayake, R. (2013).
Secured e-health data retrieval in daas and big data.
In Proceedings of the 15th IEEE international
conference on e-health networking, applications
services (pp. 255-259). Lisbon, Portugal: IEEE.
doi:10.1109/HealthCom.2013.6720677
Stimmel, C. L. (2014). Big data analytics strate-
gies for the smart grid. Boca Raton, FL: Auerbach
Publications. doi:10.1201/b17228
Stoianov, N., Uruena, M., Niemiec, M., Machnik,
P., & Maestro, G. (2012). Security Infrastructures:
Towards the INDECT System Security. Paper
presented at the 5th International Conference on
Multimedia Communication Services & Security
(MCSS), Krakow, Poland. doi:10.1007/978-3-
642-30721-8_30
Sykora, M., Jackson, T., O’Brien, A., & Elayan,
S. (2013). National security and social media
monitoring: A presentation of the emotive and
related systems. In Proceedings of the European
Intelligence and Security Informatics Confer-
ence (pp. 172-175). Uppsala, Sweden: IEEE.
doi:10.1109/EISIC.2013.38
Tankard, C. (2012). Big data security. Network
Security, 2012(7), 5–8.
Wu, X., Zhu, X., Wu, G.-Q., & Ding, W. (2014).
Data mining with big data. IEEE Transactions on
Knowledge and Data Engineering, 26(1), 97–107.
doi:10.1109/TKDE.2013.109
Zhang, X., Liu, C., Nepal, S., Yang, C., Dou, W.,
& Chen, J. (2014). A hybrid approach for scal-
able sub-tree anonymization over big data using
mapreduce on cloud. Journal of Computer and
System Sciences, 80(5), 1008–1020. doi:10.1016/j.
jcss.2014.02.007
Zhao, J., Wang, L., Tao, J., Chen, J., Sun, W.,
Ranjan, R., et al. (2014). A security framework in
g-hadoop for big data computing across distributed
cloud data centres. Journal of Computer and
System Sciences, 80(5), 994–1007.
KEY TERMS AND DEFINITIONS
Anonymization: Anonymization is the process
of protecting data privacy across information
systems. Several models and methods are used to
implement it such as: t-closeness, m-invariance,
k-anonymity and l-diversity.
Authentication: Authentication aims to verify,
with a certain level of assurance, that particular
data or entities are authentic, i.e., that data come
from their claimed source and have not been altered.
Big Data: Big Data is mainly defined by its
3Vs fundamental characteristics. The 3Vs include
Velocity (data are growing and changing in a
rapid way), Variety (data come in different and
multiple formats) and Volume (huge amount of
data is generated every second).
Confidentiality: Confidentiality is a property
that ensures that data are not disclosed to
unauthorized persons. It enforces predefined rules
while accessing the protected data.
Encryption: Encryption relies on the use
of encryption algorithms to transform data into
encrypted forms. The purpose is to make them
unreadable to those who do not possess the
decryption key(s).
Privacy: Privacy is the ability of individuals
to seclude information about themselves. In other
words, they selectively control its dissemination.
Security Management: Security management
is a part of the overall management system of an
organization. It aims to handle, implement, moni-
tor, maintain, and enhance data security.
... Big Data security shares numerous characteristics with traditional information system security (where data are structured). However, Big Data security necessitates the development of more powerful tools, proper procedures, and innovative technology for performing rapid data analysis [4]. Additionally, it necessitates a new security management paradigm that manages both internal data (data generated by an organization's internal systems and procedures) and external data (e.g., data collected from other firms or external web sites) [4]. ...
... However, Big Data security necessitates the development of more powerful tools, proper procedures, and innovative technology for performing rapid data analysis [4]. Additionally, it necessitates a new security management paradigm that manages both internal data (data generated by an organization's internal systems and procedures) and external data (e.g., data collected from other firms or external web sites) [4]. As a result of the aforementioned difficulties, it is necessary to highlight these difficulties and demonstrate how previous research has addressed this conundrum. ...
... The article discusses how a variety of data techniques, including Big Data, OLAP, large data, large data transfer, and large data privacy, are used in research and development in the subject of large research. [4] Big Data Security: Challenges, Recommendations and Solutions ...
Article
Big data is dubbed “today’s digital oil” and the “new raw resource of the twenty-first century”. BD is synonymous with the future of innovation, competition, and productivity. It can produce and find corporate value by analyzing data in ways that older methodologies could not. Regardless of its benefits, the development of big data continues to encounter various barriers, the most important of which are security and privacy concerns. As a result, this study is motivated by the need to address and evaluate big data challenges. Thus, by comparing and contrasting big data difficulties with available and potential solutions, users, developers, and businesses can find pertinent and timely responses to specific dangers, resulting in the best possible big data-based services. The objective of this essay is to highlight the inherent challenges of big data and some essential strategies for overcoming them.The purpose of this article was to extract and analyze significant works in order to contribute to the corpus of literature by emphasizing many critical difficulties in the big data domain and throwing light on how these challenges affect a range of domains, including users, sites, and business. Many issues such as data privacy, information sharing, failures in big data technologies and infrastructure, poor data quality management, managers and policymakers’ inability to learn and adapt, the absence of government policies and plans, the lack of successfully implemented big data analytics projects, and the lack of human experience are among the most frequently mentioned issues. Obstacles also exist in the areas of real-time data collection, as well as real-time data processing and visualization, among others.By combining previously identified solutions, this research addressed these concerns. The consequences for both researchers and practitioners have been discussed. 
Aiming to help scholars gain a comprehensive understanding of these issues and confirm the approaches used to address them, this study’s theoretical focus is broad. This study uses tried-and-true solutions to overcome these obstacles. Business and individuals using big data analytics systems will benefit from these solutions.
... However, this introduces multiple security threats such as the discovery of private or confidential information and unauthorised access to data at storage or data in motion. While multiple technologies and techniques have emerged, there is a need to find a balance between multiple security requirements, privacy obligations, system performance and rapid dynamic analysis on diverse large data sets [6]. ...
... Gathering, storing, searching, sharing, transferring, analysing and presenting data as per requirements are the major challenging task in big data [11]. In any big data platform, the strategic priorities related to security should be clearly defined and the guidelines for choosing the associated technologies in terms of reliability, performance, maturity, scalability and overall cost should be also clearly established to ensure that the design platform provide the necessary security mechanisms that include, among others, the anonymisation of confidential or personal data, the data cryptography, the centralised security management, the data confidentiality and data access monitoring [6]. At the same time, it should ensure their future evolutions will be easily integrated in the existing solution. ...
Preprint
Full-text available
The aviation industry as well as the industries that benefit and are linked to it are ripe for innovation in the form of Big Data analytics. The number of available big data technologies is constantly growing, while at the same time the existing ones are rapidly evolving and empowered with new features. However, the Big Data era imposes the crucial challenge of how to effectively handle information security while managing massive and rapidly evolving data from heterogeneous data sources. While multiple technologies have emerged, there is a need to find a balance between multiple security requirements, privacy obligations, system performance and rapid dynamic analysis on large datasets. The current paper aims to introduce the ICARUS Secure Experimentation Sandbox of the ICARUS platform. The ICARUS platform aims to provide a big data-enabled platform that aspires to become an 'one-stop shop' for aviation data and intelligence marketplace that provides a trusted and secure 'sandboxed' analytics workspace, allowing the exploration, integration and deep analysis of original and derivative data in a trusted and fair manner. Towards this end, a Secure Experimentation Sandbox has been designed and integrated in the ICARUS platform offering, that enables the provisioning of a sophisticated environment that can completely guarantee the safety and confidentiality of data, allowing to any interested party to utilise the platform to conduct analytical experiments in closed-lab conditions.
... According to a detailed report published by McKinsey, a world-renowned consulting firm, the impact, key technologies, and application areas of Big Data have been analyzed in depth. It is widely recognized that Big Data has the 3V characteristics of volume, variety, and real-time nature [11], and its value is crucial [12], the big data should have authenticity [13]. According to the 50th Statistical Report on China's Internet Development, as of June 2022, the number of Internet users in China reached 1.051 billion, and the Internet penetration rate reached 74.4%. ...
Article
This study examines the factors that impact deep and meaningful learning in blended learning environments and their connections. The sample included 397 college students from a university in Sichuan Province, selected through random sampling. Data was collected using a questionnaire based on Bandura's ternary interaction theory, encompassing learners, helpers, environment, and interaction dimensions. The following text should be remembered: "Hypotheses were developed based on existing literature, and a survey with established scales was created. Quantitative analysis was conducted using SPSS and AMOS software. The mean, standard deviation, Variance, skewness, and kurtosis values were within reasonable ranges. The model's latent variables showed strong convergent validity, with standardized factor loadings (SFL) ranging from 0.807 to 0.965, average Variance extracted (AVE) from 0.697 to 0.946, and composite reliability (C.R.) from 0.919 to 0.946. Model fit indices indicated acceptable fit (CMIN/DF: 2.303, NFI: 0.966, CFI: 0.980, RMSEA: 0.058, RMR: 0.008, PNFI: 0.789). The study optimized the model through path analysis, culminating in the final structural equation model (SEM)." Findings indicate (1) Learner, environmental, and interaction factors positively influence deep meaningful learning, while helper factors show a negative correlation; (2) learner, interaction, and helper factors mediate the environment's impact on deep, meaningful learning; and (3) environmental factors hold the most significant sway over helper factors, followed by interaction and learner factors. Helpers wield significant influence over learners, enhancing deep understanding. These insights guide effective, deep, meaningful learning strategies in blended learning
... According to the findings of other research, such as that conducted by Altman et al. (2018) and Benjelloun and Lahcen (2019), big data can be utilized to better comply with data privacy requirements by evaluating vast amounts of data about customers. There have been several studies that have investigated the broader influence that big data has had on the fintech business. ...
Article
Full-text available
This study explores the critical role of big data in the field of financial technology (FinTech) andits impact on financial inclusion. We conducted an extensive review of various sources, includingacademic papers, industry reports, and online content. Our research reveals that big data is keyto developing new financial products and services. It improves risk management and operationalefficiency, which in turn promotes financial inclusion. A major finding is that big data enablesdetailed analysis of customer behavior, crucial for creating inclusive financial services. However,we also identify challenges such as ensuring data privacy, security, and ethical use of algorithms.Our findings are particularly relevant for policymakers, regulators, and industry professionals.They highlight the importance of developing balanced regulations to use big data effectively andresponsibly. Overall, our study demonstrates the significant impact of big data in transformingFinTech, paving the way for a more inclusive financial environment
... Furthermore, data access control on top of these collected massive and rapidly evolving data must be properly addressed. Hence, it is also acknowledged that despite the constantly growing number of available technologies and techniques that have emerge, there is a real challenge on finding the proper balance between the effectiveness and performance of the dynamic analysis on diverse large data sets and the requirements for data integrity, data governance and data security and privacy [8]. Nevertheless, a promising opportunity arises from the latest developments and compelling features of the big data technologies to design and build a big data platform that capitalizes on these emerging offerings in order to build a novel data value chain in the aviation-related sectors that will enable data-driven innovation and collaboration across currently diversified and fragmented industry players, by effectively addressing the challenges imposed by the nature of big data and the requirements of the aviation industry's stakeholders for a trusted and secure data sharing and data analysis environment. ...
Preprint
Full-text available
The unprecedented volume, diversity and richness of aviation data that can be acquired, generated, stored, and managed provides unique capabilities for the aviation-related industries and pertains value that remains to be unlocked with the adoption of the innovative Big Data Analytics technologies. Despite the large efforts and investments on research and innovation, the Big Data technologies introduce a number of challenges to its adopters. Besides the effective storage and access to the underlying big data, efficient data integration and data interoperability should be considered, while at the same time multiple data sources should be effectively combined by performing data exchange and data sharing between the different stakeholders. However, this reveals additional challenges for the crucial preservation of the information security of the collected data, the trusted and secure data exchange and data sharing, as well as the robust data access control. The current paper aims to introduce the ICARUS big data-enabled platform that aims provide a multi-sided platform that offers a novel aviation data and intelligence marketplace accompanied by a trusted and secure analytics workspace. It holistically handles the complete big data lifecycle from the data collection, data curation and data exploration to the data integration and data analysis of data originating from heterogeneous data sources with different velocity, variety and volume in a trusted and secure manner.
... However, it is unclear if "sensitive information" or "private information" or "raw data" is equal to personal information privacy. To understand personal information privacy from a legal and ethical perspective, it is the right of an individual or group to seclude themselves, or information about themselves, and thereby express themselves selectively [8,20,93]. Similarly, privacy is seen as the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others [115]. In relation to controlling and protecting privacy, two definitions from legal literature state "Privacy, as a whole or in part, represents the control of transactions between person(s) and other(s), the ultimate aim of which is to enhance autonomy and/or to minimize vulnerability" [75] and "Privacy is to protect personal data and information related to a communication entity to be collected from other entities that are not authorized" [26]. ...
Article
Full-text available
Combining and analysing sensitive data from multiple sources offers considerable potential for knowledge discovery. However, there are a number of issues that pose problems for such analyses, including technical barriers, privacy restrictions, security concerns, and trust issues. Privacy-preserving distributed data mining techniques (PPDDM) aim to overcome these challenges by extracting knowledge from partitioned data while minimizing the release of sensitive information. This paper reports the results and findings of a systematic review of PPDDM techniques from 231 scientific articles published in the past 20 years. We summarize the state of the art, compare the problems they address, and identify the outstanding challenges in the field. This review identifies the consequence of the lack of standard criteria to evaluate new PPDDM methods and proposes comprehensive evaluation criteria with 10 key factors. We discuss the ambiguous definitions of privacy and confusion between privacy and security in the field, and provide suggestions of how to make a clear and applicable privacy description for new PPDDM techniques. The findings from our review enhance the understanding of the challenges of applying theoretical PPDDM methods to real-life use cases, and the importance of involving legal-ethical and social experts in implementing PPDDM methods. This comprehensive review will serve as a helpful guide to past research and future opportunities in the area of PPDDM.
Conference Paper
Connected objects are one of the most important vectors for the collection of personal data. With the increase in data volumes, we are observing an increase in network vulnerabilities and data breaches.Data-centric security (DCS) and its related protocols such as the NATO STANAG 4774 have become a suited approach to address diverse data protection and secure information exchange. Despite the novelty of the approach, it comes with a challenge regarding its implementation to ensure the integrity of data in real scenario. In this paper, we are evaluating the NATO STANAG 4774 protocol when securing smart home data. Then, we use Random Forest to detect cyber attacks based on malware injection. We conduct an empirical study to evaluate the performance of our approach and we show how a machine learning technique can be used to ensure the integrity of data when using a data-centric security protocol. In fact, our proposed approach has a recall of 0.781 —in other words, it correctly identifies more than 78% of all malicious data injection.
Article
Full-text available
Property marketing in Nigeria has frequently been characterised by unique challenges due to inadequate and unreliable data. In recent times, big data has presented a wide range of solutions to contemporary issues in different industries, including real estate. Notwithstanding, there is a paucity of practical studies on the role of big data in property marketing of many developing nations, including Nigeria. Hence, there is a need to investigate the role of big data in transforming property marketing in Nigeria. The aim of this study is therefore to investigate Estate Surveyors and Valuers’ (ESVs) perception of the role of big data in property marketing in Lagos, Nigeria. 82 questionnaires were administered to ESVs in the study area and 55 (67%) of them were found useful for analysis. The data were analysed using frequency, percentage, mean and relative importance index. The results revealed that out of 9 possible roles of big data in property marketing, the 2 most notable ones are that real estate firms can execute laser-focused marketing strategies; and they can match demand with supply. These outcomes distinctly disclosed the merits of adopting big data in marketing properties. The implication of these findings is that big data has the potential of reducing fraud due to the presence of sufficient and reliable data in the property market. This study concludes that big data is a worthy addition to the toolset of real estate practitioners, particularly with respect to property marketing, and wholly recommends its adoption.
Chapter
Over the last years, the impacts of the evolution of information integration, increased automation and new forms of information management are also evident in the aviation industry that is disrupted also by the latest advances in sensor technologies, IoT devices and cyber-physical systems and their adoption in aircrafts and other aviation-related products or services. The unprecedented volume, diversity and richness of aviation data that can be acquired, generated, stored, and managed provides unique capabilities for the aviation-related industries and pertains value that remains to be unlocked with the adoption of the innovative Big Data Analytics technologies. The big data technologies are focused on the data acquisition, the data storage and the data analytics phases of the big data lifecycle by employing a series of innovative techniques and tools that are constantly evolving with additional sophisticated features, while also new techniques and tools are frequently introduced as a result of the undergoing research activities. Nevertheless, despite the large efforts and investments on research and innovation, the Big Data technologies introduce also a number of challenges to its adopters. Besides the effective storage and access to the underlying big data, efficient data integration and data interoperability should be considered, while at the same time multiple data sources should be effectively combined by performing data exchange and data sharing between the different stakeholders that own the respective data. However, this reveals additional challenges related to the crucial preservation of the information security of the collected data, the trusted and secure data exchange and data sharing, as well as the robust access control on top of these data. 
This paper introduces the ICARUS big-data-enabled platform, a multi-sided platform that offers a novel aviation data and intelligence marketplace accompanied by a trusted and secure “sandboxed” analytics workspace. It holistically handles the complete big data lifecycle, from data collection, curation and exploration to the integration and analysis of data originating from heterogeneous sources with different velocity, variety and volume, in a trusted and secure manner.
Conference Paper
We survey some critical issues arising in the ubiquitous computing paradigm, in particular the interplay between context-awareness and security. We then overview a language-based approach that addresses these problems from the point of view of Formal Methods. More precisely, we briefly describe a core functional language extended with mechanisms to express adaptation to context changes, to manipulate resources and to enforce security policies. In addition, we shall outline a static analysis for guaranteeing programs to securely behave in the digital environment they are part of.
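The interplay between context-awareness and security policies described above can be illustrated with a toy enforcement check: before a program action touches a resource, every policy active in the current context is consulted. The contexts, resources and policies below are hypothetical and merely stand in for the formal, language-based mechanisms the paper develops.

```python
# Toy sketch of context-aware security-policy enforcement: each action
# on a resource is checked against all active policies for the current
# context before it is allowed to proceed. Everything here is illustrative.

class PolicyViolation(Exception):
    pass

def no_camera_in_meeting(context, action, resource):
    return not (context == "meeting" and resource == "camera")

def read_only_on_public_wifi(context, action, resource):
    return not (context == "public-wifi" and action == "write")

POLICIES = [no_camera_in_meeting, read_only_on_public_wifi]

def perform(context, action, resource):
    for policy in POLICIES:
        if not policy(context, action, resource):
            raise PolicyViolation(f"{action} on {resource} denied in {context}")
    return f"{action} {resource} ok"

print(perform("office", "read", "file"))         # allowed
try:
    perform("public-wifi", "write", "file")      # blocked by policy
except PolicyViolation as exc:
    print("blocked:", exc)
```

The static analysis outlined in the paper aims to establish such guarantees at compile time rather than by runtime checks as in this sketch.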
Conference Paper
Big Data is one of the rising IT trends, alongside cloud computing, social networking and ubiquitous computing, and it can offer beneficial scenarios in the e-health arena. In many of these scenarios, however, Big Data needs to be kept secure for a long time in order to realise its benefits, such as finding cures for infectious diseases, while preserving patients' privacy. It is therefore valuable to analyse Big Data to extract meaningful information while the data are stored in a secure manner, which makes an analysis of the various database encryption techniques essential. In this study, we simulated three technical environments: plain text, Microsoft's built-in encryption, and a custom Advanced Encryption Standard scheme using a bucket index in a Data-as-a-Service (DaaS) setting. The results showed that the custom AES-DaaS approach has faster range-query response times than the MS built-in encryption. In addition, the scalability test revealed performance thresholds determined by the physical IT resources. Therefore, for efficient Big Data management in e-health, it is important to examine these scalability limits even in a cloud computing environment. Furthermore, when designing an e-health database, both patients' privacy and system performance need to be treated as top priorities.
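The bucket-index idea behind such encrypted range queries can be sketched as follows: each record stores a coarse plaintext bucket next to its ciphertext, the server filters on buckets, and the client decrypts only the candidates and post-filters exactly. Because this is illustration only, a keyed hash stream from the standard library stands in for AES, and the key, bucket boundaries and records are all hypothetical.

```python
# Sketch of bucket-index range queries over encrypted records. A keyed
# SHA-256 stream "cipher" stands in for AES so the example stays
# stdlib-only; it is NOT secure and is for demonstration purposes.
import hashlib

KEY = b"demo-key"  # hypothetical key

def _stream(key, nonce, length):
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(value, nonce):
    data = str(value).encode()
    return bytes(a ^ b for a, b in zip(data, _stream(KEY, nonce, len(data))))

def decrypt(ct, nonce):  # XOR stream cipher: decryption mirrors encryption
    return bytes(a ^ b for a, b in zip(ct, _stream(KEY, nonce, len(ct)))).decode()

def bucket(age):
    return age // 10   # coarse bucket, stored in plaintext beside the ciphertext

# "Server" table of (bucket, nonce, ciphertext); plaintext ages never stored.
table = []
for i, age in enumerate([23, 35, 41, 58, 62]):
    nonce = i.to_bytes(4, "big")
    table.append((bucket(age), nonce, encrypt(age, nonce)))

# Range query for ages in [30, 45]: the server filters on buckets 3..4,
# the client decrypts only those candidate rows and filters exactly.
candidates = [(n, ct) for b, n, ct in table if 3 <= b <= 4]
result = sorted(a for a in (int(decrypt(ct, n)) for n, ct in candidates)
                if 30 <= a <= 45)
print(result)  # [35, 41]
```

The performance benefit the study measures comes from the server discarding most rows using only the plaintext buckets, at the cost of leaking the coarse distribution of values.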
Book
By implementing a comprehensive data analytics program, utility companies can meet the continually evolving challenges of modern grids that are operationally efficient, while reconciling the demands of greenhouse gas legislation and establishing a meaningful return on investment from smart grid deployments. Readable and accessible, Big Data Analytics Strategies for the Smart Grid addresses the needs of applying big data technologies and approaches, including Big Data cybersecurity, to the critical infrastructure that makes up the electrical utility grid. It supplies industry stakeholders with an in-depth understanding of the engineering, business, and customer domains within the power delivery market. The book explores the unique needs of electrical utility grids, including operational technology, IT, storage, processing, and how to transform grid assets for the benefit of both the utility business and energy consumers. It not only provides specific examples that illustrate how analytics work and how they are best applied, but also describes how to avoid potential problems and pitfalls. Discussing security and data privacy, it explores the role of the utility in protecting their customers' right to privacy while still engaging in forward-looking business practices. The book includes discussions of: SAS for asset management tools. The AutoGrid approach to commercial analytics. Space-Time Insight's work at the California ISO (CAISO). This book is an ideal resource for mid- to upper-level utility executives who need to understand the business value of smart grid data analytics. It explains critical concepts in a manner that will better position executives to make the right decisions about building their analytics programs. At the same time, the book provides sufficient technical depth that it is useful for data analytics professionals who need to better understand the nuances of the engineering and business challenges unique to the utilities industry.
Chapter
One of the crucial but complicated tasks in any IT networking environment, including recent cloud service deployments, is to detect cyber attacks and identify their types.
Conference Paper
The rapid growth of the Internet has brought with it an exponential increase in the type and frequency of cyber attacks. Many well-known cybersecurity solutions are in place to counteract these attacks. However, the generation of Big Data over computer networks is rapidly rendering these traditional solutions obsolete. To address this problem, corporate research is now focusing on Security Analytics, i.e., the application of Big Data Analytics techniques to cybersecurity. Analytics can assist network managers particularly in the monitoring and surveillance of real-time network streams and in the real-time detection of both malicious and suspicious (outlying) patterns. Such capabilities are envisioned to encompass and enhance all traditional security techniques. This paper presents a comprehensive survey on the state of the art of Security Analytics, i.e., its description, technology, trends, and tools. It hence aims to convince the reader of the imminent application of analytics as an unparalleled cybersecurity solution in the near future.
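The kind of real-time outlier detection mentioned above can be sketched with a deliberately simple detector that flags samples straying far from the recent mean of the stream. Production security analytics would use far more robust statistics; the traffic trace, window size and threshold here are illustrative only.

```python
# Minimal sketch of outlier detection on a network stream: flag any
# sample that strays beyond `factor` times the mean of a sliding window.
# Window, factor and the packets/sec trace are illustrative choices.
from collections import deque

def detect(stream, window=5, factor=3.0):
    """Return indices of samples that deviate strongly from recent history."""
    recent = deque(maxlen=window)
    alerts = []
    for i, x in enumerate(stream):
        if len(recent) == window:          # only judge once history is full
            mean = sum(recent) / window
            if x > factor * mean or x < mean / factor:
                alerts.append(i)
        recent.append(x)                   # update history after the decision
    return alerts

traffic = [100, 102, 98, 101, 99, 103, 100, 500, 101]  # spike at index 7
print(detect(traffic))  # [7]
```

In a deployment, each alert index would correspond to a network telemetry sample (e.g. packets per second on a link) handed to an analyst or an automated response system.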
Conference Paper
There has been increasing interest in big data and big data security with the development of network technology and cloud computing. Big data, however, is not an entirely new technology but an extension of data mining. In this paper, we describe the background of big data and data mining, outline the features of big data, and propose an attribute selection methodology for protecting its value. Extracting valuable information is the main goal of big data analysis, and it is this information that needs to be protected; the relevance between the attributes of a dataset is therefore a very important element of big data analysis. We focus on two points. First, attribute relevance in big data is a key element for extracting information, and from this perspective we study how to secure big data by protecting the valuable information inside it. Second, it is impossible to protect all of a big data set and its attributes, so we consider big data as a single object with its own attributes and assume that an attribute with higher relevance is more important than the others.
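The notion of attribute relevance can be sketched by ranking attributes by their average absolute correlation with the other attributes, so that protection effort focuses on the most informative ones. The dataset and the choice of Pearson correlation as the relevance measure are illustrative, not the paper's own methodology.

```python
# Sketch: rank attributes of a (synthetic) dataset by average absolute
# Pearson correlation with the remaining attributes, as a stand-in for
# an attribute-relevance measure guiding which columns to protect.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

dataset = {                       # synthetic, column-oriented records
    "income":    [30, 45, 60, 75, 90],
    "spending":  [28, 44, 58, 76, 88],   # tracks income closely
    "shoe_size": [42, 38, 44, 40, 43],   # mostly noise
}

def relevance(name):
    others = [k for k in dataset if k != name]
    return sum(abs(pearson(dataset[name], dataset[k])) for k in others) / len(others)

ranked = sorted(dataset, key=relevance, reverse=True)
print(ranked)  # most relevant attribute first
```

Under this toy measure, strongly inter-correlated attributes such as income and spending rank above noise-like ones, matching the intuition that higher-relevance attributes carry more of the dataset's extractable value.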
Conference Paper
MapReduce is a programming model that can handle big data efficiently, but the intermediate data of MapReduce are not well protected, and operations on ciphertexts are not supported. To achieve a trustworthy MapReduce, there is a need both to protect the data and to allow specific types of computations to be carried out on encrypted intermediate data. Traditional encryption solutions can protect the data from being divulged, but they cannot be used to compute on encrypted data; a novel encryption scheme called fully homomorphic encryption (FHE), however, can compute over encrypted data without decrypting it. In this paper, we propose a modified MapReduce framework using homomorphic encryption, in order to obtain both the confidentiality of the data and the ability to compute on them.
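As a hint of how computing on encrypted intermediate data works, the sketch below uses textbook Paillier, which is only additively homomorphic rather than fully homomorphic: the product of two ciphertexts decrypts to the sum of the plaintexts, which is exactly what a word-count reducer needs. The tiny fixed primes provide no real security and are for demonstration only.

```python
# Illustrative "reduce over encrypted counts" using textbook Paillier
# (additively homomorphic only, not FHE). The tiny primes make the
# arithmetic easy to follow but offer zero real-world security.
import math, random

p, q = 17, 19
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)               # valid because we pick g = n + 1

def encrypt(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return ((pow(c, lam, n2) - 1) // n) * mu % n

# "Map" emits an encrypted 1 per word occurrence; the untrusted
# "reduce" step multiplies ciphertexts and never sees any count.
encrypted_ones = [encrypt(1) for _ in range(3)]
total_ct = 1
for c in encrypted_ones:
    total_ct = (total_ct * c) % n2

print(decrypt(total_ct))  # 3
```

A real deployment would use a hardened library and key sizes of thousands of bits, and a fully homomorphic scheme would additionally allow multiplications on ciphertexts, which Paillier cannot do.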
Conference Paper
This paper provides an overview of the security infrastructures being deployed inside the INDECT project. These security infrastructures can be organized in five main areas: Public Key Infrastructure, Communication security, Cryptography security, Application security and Access control, based on certificates and smartcards. This paper presents the new ideas and deployed testbeds for these five areas. In particular, it explains the hierarchical architecture of the INDECT PKI, the different technologies employed in the VPN testbed, the INDECT Block Cipher (IBC) – a new cryptography algorithm that is being integrated in OpenSSL/OpenVPN libraries, and how TLS/SSL and X.509 certificates stored in smart-cards are employed to protect INDECT applications and to implement the access control of the INDECT Portal. All the proposed mechanisms have been designed to work together as the security foundation of all systems being developed by the INDECT project.
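The TLS client-certificate building block used for the INDECT Portal's access control can be sketched as a server-side context configuration in Python's standard `ssl` module: the server refuses any client that does not present a valid X.509 certificate, which is how smart-card based authentication is typically enforced at the transport layer. The certificate file names are hypothetical, so loading real key material is left commented out.

```python
# Sketch of a mutual-TLS server configuration: clients must present an
# X.509 certificate (e.g. one held on a smart card) or the handshake
# fails. File names below are hypothetical placeholders.
import ssl

context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
context.verify_mode = ssl.CERT_REQUIRED          # reject cert-less clients
context.minimum_version = ssl.TLSVersion.TLSv1_2 # disallow legacy protocols

# context.load_cert_chain("server.crt", "server.key")   # server identity
# context.load_verify_locations("portal-ca.pem")        # trusted CA roots

print(context.verify_mode == ssl.CERT_REQUIRED)  # True
```

Wrapping a listening socket with this context (`context.wrap_socket(sock, server_side=True)`) would then perform the certificate check on every incoming connection.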