Chapter 14
Big Data Security: Challenges, Recommendations and Solutions

Fatima-Zahra Benjelloun
Ibn Tofail University, Morocco

Ayoub Ait Lahcen
Ibn Tofail University, Morocco

DOI: 10.4018/978-1-4666-8387-7.ch014
Copyright © 2015, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
ABSTRACT
The value of Big Data is now being recognized by many industries and governments. Efficient mining of Big Data can improve the competitive advantage of companies and add value to many social and economic sectors. In fact, several governments have launched important projects with huge investments to extract the maximum benefit from Big Data. The private sector has also deployed significant efforts to maximize profits and optimize resources. However, Big Data sharing brings new information security and privacy issues. Traditional technologies and methods are no longer appropriate and lack performance when applied in a Big Data context. This chapter presents Big Data security challenges and a state of the art of the methods, mechanisms and solutions used to protect data-intensive information systems.
INTRODUCTION
The value of Big Data is now being recognized by many industries and governments. In fact, efficient mining of Big Data can improve competitive advantage and add value to many sectors (economic, social, medical, scientific research and so on).
Big Data is mainly defined by its three fundamental characteristics, the 3Vs: Velocity (data grow and change rapidly), Variety (data come in different and multiple formats) and Volume (huge amounts of data are generated every second) (Wu, Zhu, Wu, & Ding, 2014). According to (Berman, 2013), these three characteristics must coexist to confirm that a source is a Big Data source. If one of the three Vs does not apply, we cannot speak of Big Data.
(Berman, 2013) and (Katal, Wazid, & Goudar, 2013) indicate that some Big Data actors have added more Vs and other characteristics to better define it: Vision (the defined purpose of Big Data mining), Verification (processed data comply with some specifications), Validation (the purpose is fulfilled), Value (pertinent information can be extracted for the benefit of many sectors), Complexity (it is difficult to organize and analyse Big Data because of evolving data relationships) and Immutability (collected and stored Big Data can be permanent if well managed).
Besides this, some argue when defining Big Data that any huge digital data set that can no longer be collected and processed adequately with existing infrastructures and technologies is by nature Big Data.
In this chapter, we are interested in the security challenges faced in a Big Data context. We also present a state of the art of several methods, mechanisms and solutions used to protect information systems that handle large data sets.
Big Data security has many points in common with the security of traditional information systems (where data are structured). However, Big Data security requires more powerful tools, appropriate methods and advanced technologies for rapid data analysis. It also requires a new security management model that handles in parallel internal data (data produced by internal systems and processes within an organization) and external data (e.g., data collected from other companies or external web sites). Regarding these points, many questions can be raised: i) How to manage and process securely large, unstructured and heterogeneous data sets? ii) How to integrate security mechanisms into distributed platforms while ensuring a good performance level (e.g., efficient storage, rapid processing and real-time analysis)? iii) How to analyse massive data streams without compromising data confidentiality and privacy?
This chapter first presents these challenges in detail. Then, it discusses various solutions and recommendations proposed to protect data-intensive information systems.
SECURITY CHALLENGES IN BIG DATA CONTEXT
As mentioned by (Kim, Kim, & Chung, 2013), security in a Big Data context includes three main aspects: information security, security monitoring and data security. For (Lu et al., 2013), managing security in a distributed environment means ensuring Big Data management, system integrity and cyberspace security.
Generally, Big Data security aims to ensure real-time monitoring to detect vulnerabilities, security threats and abnormal behaviours; granular role-based access control; robust protection of confidential information; and the generation of security performance indicators. It supports rapid decision-making in case of a security incident. The following sections identify and explain a number of challenges to achieving these goals.
Big Data Nature
Because of Big Data velocity and huge volumes, it is difficult to protect all data. Indeed, adding security layers may slow system performance and affect dynamic analysis. Thus, access control and data protection are two “BIG” security problems (Kim et al., 2013). Furthermore, it is difficult to handle the classification and management of large, disparate digital sources. Even though the cost per GB has diminished, Big Data security requires important investments. In addition, Big Data is most of the time stored and transferred across multiple Clouds and distributed worldwide systems. Sharing data over many networks increases security risks.
The Need to Share Information
In a globalized context, business models have to face holistic competition across the world. Thus, enterprises need to build a sustainable advantage through collaboration with many entities, data monetization, and appropriate dynamic data sharing.
For data sharing, digital ecosystems are based on multiple heterogeneous platforms. Such ecosystems aim to ensure real-time data access for many partners, clients, providers and employees. They rely on multiple connections with different levels of security. Data sharing combined with advanced analytics techniques brings multiple security threats, such as the discovery of confidential information (e.g., production processes and methods) or illegal access to network traffic. In fact, by establishing relations between data extracted from different sources, it is possible to identify individuals in spite of data anonymization (e.g., by using correlation attacks, arbitrary identification, intended identification attacks, etc.).
For instance, in the health sector, massive amounts of medical data are shared between hospitals and pharmaceutical laboratories for research and analysis purposes. Such sharing may affect patients’ privacy even if all the medical records are anonymized, for instance by finding correlations between medical records and mutual health insurance data (Shin, Sahama, & Gajanayake, 2013).
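The correlation risk described above can be sketched in a few lines of Python. This is a toy linkage attack, with entirely fabricated records: the "anonymized" medical data still carry quasi-identifiers (ZIP code, birth year, sex) that also appear in a public insurance roster, so a simple join re-identifies patients.

```python
# Toy re-identification (linkage) attack. All names and records are
# fabricated for illustration; real attacks work the same way at scale.

anonymized_medical = [
    {"zip": "10115", "birth_year": 1980, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "10117", "birth_year": 1975, "sex": "M", "diagnosis": "asthma"},
]

insurance_roster = [
    {"name": "Alice Example", "zip": "10115", "birth_year": 1980, "sex": "F"},
    {"name": "Bob Example",   "zip": "10117", "birth_year": 1975, "sex": "M"},
]

def link(medical, roster):
    """Join the two data sets on the shared quasi-identifiers."""
    reidentified = []
    for rec in medical:
        for person in roster:
            if (rec["zip"], rec["birth_year"], rec["sex"]) == \
               (person["zip"], person["birth_year"], person["sex"]):
                reidentified.append((person["name"], rec["diagnosis"]))
    return reidentified

# Each match reveals a named patient's diagnosis despite "anonymization".
print(link(anonymized_medical, insurance_roster))
```

Removing direct identifiers is therefore not enough; the combination of seemingly harmless attributes is itself identifying.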
Multiple Security Requirements
In a Big Data context, one challenge is to handle information security while managing massive and rapid data streams. Thus, security tools should be flexible and easily scalable to simplify the integration of future technological evolutions and to handle changes in application requirements.
There is a need to find a balance between multiple security requirements, privacy obligations, system performance and rapid dynamic analysis on diverse large data sets (data in motion or static, private or public, local or shared, etc.).
Inadequate Traditional Solutions
Traditional security techniques, such as some types of data encryption, slow performance and are time-consuming in a Big Data context. Furthermore, they are not efficient: only small data partitions are processed for security purposes, so most of the time security attacks are detected after the damage has spread (Lu et al., 2013). Big Data platforms imply the management of various applications and multiple parallel computations. Therefore, performance is a key element for data sharing and real-time analysis in such environments.
Lack of Maturity of New Security Tools
The combination of multiple technologies may bring hidden risks that are most of the time not evaluated or underestimated. In addition, new security tools lack maturity. Thus, Big Data platforms may incorporate new security risks and vulnerabilities that are not fully assessed.
At the same time, data value is concentrated in various clusters and data centres. Those rich data mines are very attractive to commerce, governments and industry, and constitute a target for several attacks and penetrations. Furthermore, most security risks (more than a third) come from employees, partners and end-point users. Hence, it is important to deploy advanced security mechanisms to protect Big Data clusters (Ring, 2013; Jensen, 2013). Regarding this point, data owners have the responsibility to set clear security clauses and policies to be respected by outsourcers.
Data Anonymization
To ensure data privacy and security, data anonymization should be achieved without affecting system performance (e.g., real-time analysis) or data quality. However, traditional anonymization techniques are based on several iterations and time-consuming computations. Several iterations may affect data consistency and slow down system performance, especially when handling huge heterogeneous data sets. In addition, it is difficult to process and analyse anonymized Big Data (costly computations are needed).
Compatibility with Big Data Technologies
Some security techniques are incompatible with commonly used Big Data technologies like the MapReduce paradigm. To ensure the security and privacy of Big Data, it is not enough just to choose powerful technologies and security mechanisms. It is also mandatory to verify their compatibility with the organization’s Big Data requirements and existing infrastructure components (Zhao et al., 2014).
Information Reliability and Quality
The reliability of data analysis results depends on data quality and integrity (Alvaro, Pratyusa, & Sreeranga, 2013). Therefore, it is important to verify the authenticity and integrity of Big Data sources before analysing data. Since huge volumes of data sets are generated every second, it is difficult to assess the authenticity and integrity of all the various data sources.
In addition, to extract reliable and complete information from Big Data sources, analysts have to deal with incomplete and heterogeneous data streams coming from different sources in different formats. They have to filter data (e.g., eliminating noise, errors, spam, and so on). They also have to organize and contextualize data (e.g., adding geo-location data) before performing any analysis.
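The filter-then-contextualize step can be sketched as follows. This is a minimal Python illustration, not a production pipeline: the record format, the spam heuristic and the `GEO_DB` lookup table are all assumptions made for the example.

```python
# Sketch of a pre-analysis pipeline: drop noisy or incomplete records,
# then enrich the survivors with context (a hypothetical geo lookup).

RAW_STREAM = [
    {"user": "u1", "text": "great service", "ip": "41.140.0.10"},
    {"user": "u2", "text": "", "ip": "10.0.0.3"},                  # empty: noise
    {"user": "u3", "text": "BUY NOW!!! http://spam", "ip": None},  # spam-like
]

GEO_DB = {"41.140.0.10": "Morocco"}  # hypothetical IP-to-country table

def is_valid(rec):
    """Filtering step: reject empty, spam-like or IP-less records."""
    text = rec.get("text", "").strip()
    return bool(text) and "http://spam" not in text and bool(rec.get("ip"))

def contextualize(rec):
    """Contextualization step: attach a country attribute."""
    rec = dict(rec)
    rec["country"] = GEO_DB.get(rec["ip"], "unknown")
    return rec

cleaned = [contextualize(r) for r in RAW_STREAM if is_valid(r)]
print(cleaned)  # only u1 survives, now tagged with its country
```

In a real deployment these two functions would of course embody far richer rules (language detection, spam classifiers, reverse geocoding), but the shape of the pipeline is the same.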
Compliance with Security Laws, Regulations and Policies
Private organizations and government agencies have to respect many security laws and industrial standards that aim to enhance the management of digital data security and to protect confidentiality (e.g., deleting personal data when no longer used, protecting data throughout its life cycle, archiving transactions for legal purposes, citizens’ right to access and modify their data). However, some ICTs may involve entities across many countries, so enterprises have to deal with multiple laws and regulations (Tankard, 2012).
Furthermore, Big Data analytics may conflict with some privacy principles. For example, analysts can correlate many different data sets from different entities to reveal personal or sensitive data even when anonymization techniques are used. Consequently, such analysis may make it possible to identify individuals or to discover confidential information (Alvaro et al., 2013).
Need for Big Data Experts
In the era of Big Data, data analysis is a key factor in preventing and detecting security incidents. However, several surveys confirm that a number of enterprises are not aware of the importance of recruiting data scientists for advanced security analysis. In fact, to ensure Big Data security, organizations should rely on a multi-disciplinary team of data scientists, mathematicians and programmers versed in security best practices (Constantine, 2014).
Big Data Security on Social Networks
Huge amounts of photos, videos, user comments and clicks are generated on social networks (SNs). They are usually the first source of information for different entities (Sykora, Jackson, O’Brien, & Elayan, 2013).
Big Data on SNs constitutes a valuable mine for governments to better manage national security risks. Indeed, some governments analyse SN Big Data in order to monitor public opinion (e.g., voting intentions, emotions, feelings about a project or an event). They can prevent terrorist and security attacks and assess citizens’ satisfaction with public services.
In addition, dynamic analysis of SNs enables crisis committees to optimize and ensure rapid crisis management, as in disaster cases. The goal is to rapidly detect abnormal patterns and to ensure real-time monitoring of alarming events.
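One simple way to detect such abnormal patterns in a stream is a rolling z-score detector. The sketch below, with invented traffic numbers, flags a sudden burst of per-minute post counts that deviates strongly from the recent baseline; the window size and threshold are illustrative assumptions.

```python
# Minimal streaming anomaly detector: flag values far above the
# rolling mean of the last `window` observations (z-score test).
from collections import deque
from statistics import mean, pstdev

def detect_spikes(counts, window=5, threshold=3.0):
    """Return (index, count) pairs that exceed the rolling baseline."""
    history = deque(maxlen=window)
    anomalies = []
    for i, c in enumerate(counts):
        if len(history) == window:
            mu, sigma = mean(history), pstdev(history)
            if sigma > 0 and (c - mu) / sigma > threshold:
                anomalies.append((i, c))
        history.append(c)
    return anomalies

# Steady traffic, then a sudden burst (e.g., a coordinated campaign).
stream = [10, 12, 11, 9, 10, 11, 10, 95, 12, 10]
print(detect_spikes(stream))  # [(7, 95)]: the burst is flagged
```

Real SN monitoring systems combine many such signals (hashtag velocity, sentiment shifts, graph anomalies), but each individual detector often reduces to this compare-against-recent-baseline pattern.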
BIG DATA SECURITY SOLUTIONS
Nowadays, with the spread of social networks, distributed systems, multiple connections and mobile devices, the security of a Big Data information system becomes the responsibility of all actors (e.g., managers, security chiefs, auditors, end users and customers). In fact, most security threats come from inside users and employees. Thus, it is essential to raise the security awareness of all parties and to promote security best practices across all the connected entities of the digital ecosystem. It is not sufficient just to integrate security technologies; the collaboration of all actors is required to eliminate weak links in the system chain and to ensure compliance with security laws and policies.
Various security models, mechanisms and solutions exist for Big Data. However, most of them are not well known or mature, and many research projects are currently striving to enhance their performance (Mahmood & Afzal, 2013). In the following sections, we present some important ones.
Security Foundations for Big Data Projects
For any Big Data project, it is important to consider the strategic priorities related to security and to establish clear organizational guidelines for choosing the associated technologies (in terms of reliability, performance, maturity, scalability, and overall cost including maintenance). It is also important to consider the constraints related to integration, the existing infrastructure, and the available and planned budget for Big Data security management.
The goal is to ensure agility across all security systems, solutions, processes and procedures. Organizational agility is important to enable organizations to face rapid changes in security requirements: legal changes, new partners and customers, environment and market changes, technological updates and innovations, new security risks and so on.
After the establishment of security values, strategies and management models, it is important to derive and establish clear security policies, guidelines and user agreements, as well as contractual security clauses to respect when outsourcing Big Data services. All security strategies and policies should take many factors into consideration.
First of all, it is essential to establish Big Data classification and management principles guided by a long-term vision. In fact, data classification is mandatory to determine the sensitive data and valuable information to protect, and to define data owners with their security policies, requirements and responsibilities. Then, the security strategy should be based on the assessment of the security risks related to the different Big Data management processes (e.g., data generation, storage, transfer, exchange, access, deletion, modification and so on). Finally, a security level has to be determined for each data category according to the organizational strategy. In fact, (Bodei, Degano, Ferrari, Galletta, & Mezzetti, 2012) recommend identifying the data attributes to protect and to encrypt at the beginning of the system conception phase.
Furthermore, the organization has to keep track of legal changes to update its policies and procedures. To ensure continuous legal compliance, it is important to involve the legal department in the development of Big Data projects and in the upgrading of all security policies, including retention periods for personal information, access permissions, data transfer abroad, data access and exchange between stakeholders with different security requirements, integration of contractual security requirements, and data conservation or destruction.
Risk Analysis Related to Multiple Technologies
It is important to study and assess the security risks related to the mix of multiple technologies inside a Big Data platform. It is not sufficient to evaluate the security risks related to each technology used: the integration of disparate technologies for multiple purposes may bring hidden risks and unknown security threats.
In addition, with the increasing spread of the Cloud and BYOD (Bring Your Own Device), it is important to consider security threats related to distributed environments and to the use of non-normalized mobile and personal devices for professional purposes. On this point, (Ring, 2013) recommends protecting the multiple disparate end-points with an extra security layer. Furthermore, mobile devices should be normalized to fulfil organizational and industrial security standards.
Choosing Adequate Security Solutions
To enhance Big Data security, organizations rely on advanced dynamic security analysis. The goal is to extract and analyse, in real time or near real time, security events and related user actions in order to enhance online and transactional security and to prevent fraudulent attacks.
Such dynamic analysis of the generated Big Data helps to detect security incidents in a timely fashion, to identify abnormal customer behaviours, to monitor security threats, to discover known and new cyber-attack patterns and so on. Hence, dynamic analysis of Big Data enables improved prevention and rapid reactivity for good security decisions. In parallel, the analysis of the statistics generated by applications and programs allows security performance indicators to be produced and program behaviours to be monitored and secured.
(Kim et al., 2013) recommend protecting the data values instead of the data themselves. In fact, it is too difficult, and nearly impossible, to protect huge data sets. Furthermore, Big Data security analysis techniques are based on attribute information. Thus, data owners or operators have the responsibility to define, select and protect only the important attributes that they consider valuable for their use cases. To protect such important data attributes in a Big Data context, (Kim et al., 2013) present the following steps:

1. Evaluate the importance of the attributes, and compare and evaluate the correlations between them.
2. Filter and define the valuable attributes to protect.
3. Choose security mechanisms to protect the relevant attributes according to the data owner’s or the organization’s policies.
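The three steps above can be sketched as a small Python routine. The attribute names, importance scores, threshold and mechanism labels are all illustrative assumptions, not values from (Kim et al., 2013); the point is only the shape of the score-filter-protect pipeline.

```python
# Hedged sketch of the attribute-protection steps: score attributes,
# keep the valuable ones, then map each to a protection mechanism.

attribute_scores = {           # step 1: evaluated importance (0..1)
    "national_id": 0.95,
    "diagnosis":   0.80,
    "page_views":  0.10,
}

POLICY = {                     # step 3: owner/organization policy
    "national_id": "encrypt",
    "diagnosis":   "anonymize",
}

def plan_protection(scores, policy, threshold=0.5):
    """Return a protection plan for attributes above the threshold."""
    valuable = {a for a, s in scores.items() if s >= threshold}  # step 2
    # Attributes with no explicit policy fall back to access control.
    return {a: policy.get(a, "access-control") for a in valuable}

print(plan_protection(attribute_scores, POLICY))
# {'national_id': 'encrypt', 'diagnosis': 'anonymize'}
```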
Several analytical solutions are available to secure Big Data, such as the Accenture, HP, IBM, CISCO, Unisys and EADS security solutions. They bring different levels of performance and several benefits: they enable agile decision-making and rapid reaction through real-time surveillance and monitoring; they detect dynamic attacks with enhanced reliability (a low false-positive rate) thanks to the analysis of active and passive security information; and they provide full visibility of the network status and of applications’ security problems.
(Mahmood & Afzal, 2013) identify the deployment phases of Big Data analytics solutions. First of all, they recommend identifying strategic priorities and Big Data security analysis goals. Then, the organizational priorities should provide guidelines for developing a more detailed strategy for deploying the Big Data analysis platform. The purpose is to optimize the selection of security solutions according to the strategic goals, the triple constraint (overall cost, quality and duration of the implementation), the added value and the available features (performance, reliability, scalability and so on). Regarding organizational strategy, they recommend adopting centralized Big Data management for better security outcomes.
Before Big Data analysis, one prerequisite is to consider the integrity and authenticity of data sources. Indeed, Big Data sources may contain errors, noise, incomplete data, or data without context information. Consequently, it is essential to filter and prepare data and to add context data before applying analysis techniques.
Anonymization of Confidential or Personal Data
Data anonymization is a recognized technique used to protect data privacy across the Cloud and distributed systems. Several models and solutions are used to implement this technique, such as sub-tree data anonymization, t-closeness, m-invariance, k-anonymity and l-diversity.
Sub-tree techniques are based on two methods: Top-Down Specialization (TDS) and Bottom-Up Generalization (BUG). However, those methods are not scalable: they lack performance for certain anonymization parameters and cannot scale when applied to anonymize Big Data on distributed systems.
In order to improve the anonymization of valuable information extracted from large data sets, (Zhang et al., 2014) suggest a hybrid approach that combines the two anonymization techniques TDS and BUG. This approach automatically selects and applies whichever of the two techniques suits the use-case parameters. Thus, this hybrid approach provides the efficiency, performance and scalability required to anonymize huge databases. It is supported by newly adapted programs that handle the MapReduce paradigm, which reduces computation time in distributed systems and the Cloud.
Big Data processing and analysis rely most of the time on many iterations to obtain reliable and precise results, which may slow computations and the performance of security solutions. Regarding this, (Zhang et al., 2014) propose a method based on a single iteration for generalization operations. The goal is to enhance the parallelism, performance and scalability of anonymization techniques.
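To make the generalization idea concrete, here is a minimal single-pass sketch in Python: quasi-identifiers are coarsened in one scan (age to a decade band, ZIP code to a prefix), and the result is checked for k-anonymity. The fixed generalization levels and the sample records are assumptions for the illustration; TDS/BUG instead search for a good generalization level iteratively.

```python
# One-pass generalization followed by a k-anonymity check: every
# combination of generalized quasi-identifiers must cover >= k records.
from collections import Counter

def generalize(record):
    """Coarsen quasi-identifiers: age -> decade band, ZIP -> 3-digit prefix."""
    return (record["age"] // 10 * 10, record["zip"][:3])

def is_k_anonymous(records, k):
    groups = Counter(generalize(r) for r in records)
    return all(count >= k for count in groups.values())

data = [
    {"age": 34, "zip": "10115"}, {"age": 37, "zip": "10117"},
    {"age": 52, "zip": "20095"}, {"age": 58, "zip": "20099"},
]
print(is_k_anonymous(data, k=2))  # True: each (decade, prefix) group has 2
```

Because both the generalization and the group counting are per-record operations, this structure maps naturally onto a MapReduce job, which is the property the single-iteration approach exploits.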
Currently, many projects are working to develop new techniques and to improve existing ones to protect privacy. As an example, some projects are based on privacy-preservation-aware analysis and scheduling techniques for large data sets.
Data Cryptography
Data encryption is a common solution used to ensure data and Big Data confidentiality. Much research has been conducted to improve the performance and reliability of traditional techniques or to create new Big Data encryption techniques.
Unlike some traditional encryption techniques, homomorphic cryptography enables computation even on encrypted data. Consequently, this technique ensures information confidentiality while allowing useful insight to be extracted through certain analyses and computations on the encrypted data.
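The property can be demonstrated with the textbook Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts. The sketch below uses tiny fixed primes purely for illustration; keys this small offer no security whatsoever.

```python
# Toy Paillier demonstration of additive homomorphism. Never use
# such small parameters in practice; this only illustrates the math.
import math
import random

p, q = 17, 19
n = p * q                                        # public modulus
n2 = n * n
g = n + 1                                        # standard generator choice
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1), private

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)              # private key part

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:                   # r must be invertible mod n
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

c1, c2 = encrypt(20), encrypt(22)
print(decrypt((c1 * c2) % n2))  # 42: the sum was computed under encryption
```

Fully homomorphic schemes extend this to arbitrary computations, at a much higher cost; the additive case already suffices for tasks such as aggregating encrypted counters.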
Regarding this solution, (Chen & Huang, 2013) propose a platform adapted to handle MapReduce computations in the case of homomorphic cryptography. To ensure the performance of cryptographic solutions in distributed environments, (Liu et al., 2013) suggest a new key-exchange approach called CBHKE (Cloud Background Hierarchical Key Exchange). It is a secure solution that is faster than its predecessor techniques (IKE and CCBKE). It is based on an iterative strategy for Authenticated Key Exchange (AKE) through two phases (layer by layer). However, new approaches with enhanced performance are still needed to improve the encryption of large data sets on distributed systems.
Centralized Security Management
(Kasim, Hung, & Li, 2012) recommend storing data in the Cloud rather than on mobile devices. The goal is to take advantage of the Cloud’s normalized, standards-compliant infrastructure and centralized security mechanisms. Indeed, Cloud platforms are regularly updated and continuously monitored for enhanced security.
However, “zero risk” is hard to achieve. In fact, data security then rests in the hands of the Cloud outsourcers and operators. In addition, the Cloud is very attractive to attackers as it is a centralized mine of valuable data. Data owners and managers should be aware of the security risks and define clear data access policies. They have to ensure that the required security level is guaranteed when outsourcing Big Data management, storage or processing.
Furthermore, it is essential to change the traditional governance concept in which only security managers and chiefs are accountable. It is more appropriate to adopt centralized security governance to meet the challenges of securing Big Data sources in distributed environments. The organization should involve all the stakeholders connected to its ecosystem, including employees, managers, ISR, operators, users, customers, partners, suppliers, outsourcers and so on. The goal is to make all parties accountable for security management, to enhance the adoption of security best practices and to ensure compliance with standards and laws. Users should be aware of threats, regulations and policies.
As an example, partners have to respect data access and confidentiality policies. Users have to update their systems regularly and to make sure that their mobile devices respect standards and recommended security practices and regulations. They also have to avoid installing unreliable components (e.g., counterfeit software, or software without a valid license). On the other hand, the programmers, architects and designers of Big Data applications have to integrate security and privacy requirements throughout the development life cycle. Outsourcers should be made accountable for Big Data security through clear security clauses.
Data Confidentiality and Data Access Monitoring
Security threats are spreading because of the increasing data exchange over distributed systems and the Cloud. To face these challenges, (Tankard, 2012) proposes to enhance control by integrating controls at the data level and during the storage phase. In fact, it has been shown that controls at the application and system levels are not sufficient.
In addition, access controls have to be finely granulated to limit access by role and responsibilities. Many techniques exist to ensure access control and data confidentiality, such as PKI, certificates, smart cards, federated identity management and multi-factor authentication.
For example, Law Enforcement Agencies (LEAs) have launched the INDECT project in order to implement a secure infrastructure for secure data exchange between agencies and other members (Stoianov, Uruena, Niemiec, Machnik, & Maestro, 2013). The solution includes:

- A Public Key Infrastructure (PKI) with three levels (certification authority, users and machines). The PKI provides access control based on multi-factor authentication and on the security level required for each data type. For instance, access to highly confidential applications requires a valid certificate and a password.
- Federated identity management, a concept used by the INDECT platform to enhance access control and security. This type of federated management is delegated to an Identity Provider (IdP) within a monitored trust domain. It is based on two security tools, certificates and smart cards, which are used to store the user certificates issued by the PKI to encrypt and sign documents and emails.
- The INDECT Block Cipher (IBC), a new symmetric cryptography algorithm developed and used to encrypt databases, communication sessions (TLS/SSL) and VPN tunnels. The goal is to ensure a high level of data confidentiality.
- Secure communications based on the VPN and TLS/SSL protocols. These mechanisms are used to protect access to Big Data servers.
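The idea of tying the required authentication factors to the sensitivity of the requested data can be sketched as a small policy check. The level names and factor sets below are assumptions for illustration, not the actual INDECT configuration.

```python
# Sketch: required authentication factors depend on data sensitivity.
# Levels and factor sets are hypothetical examples.

REQUIRED_FACTORS = {
    "public":              set(),
    "internal":            {"password"},
    "highly_confidential": {"password", "certificate"},
}

def access_granted(level, presented_factors):
    """Grant access only if every required factor was presented."""
    return REQUIRED_FACTORS[level] <= set(presented_factors)

print(access_granted("highly_confidential", ["password"]))                 # False
print(access_granted("highly_confidential", ["password", "certificate"]))  # True
```

A real deployment would verify each factor cryptographically (certificate chains, password hashes) rather than merely checking its presence, but the per-level policy lookup is the core of the mechanism.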
Authentication mechanisms are most of the time complex and heavy to handle across distributed clusters and large data sets. For this reason, (Zhao et al., 2014) suggest a security model for G-Hadoop that integrates several security solutions. It is based on the Single Sign-On (SSO) concept, which simplifies user authentication and the computation of MapReduce functions. Thus, this model enables users to access different clusters with the same account identifier. Furthermore, privacy is protected through encrypted connections based on the SSL protocol, public key cryptography and valid certificates for authentication. Hence, this model offers efficient access control, protects against hackers and attackers (e.g., MITM attacks, version rollback and delay attacks) and denies access to fraudulent or untruthful entities.
(Mansfield-Devine, 2012) recommends involving not just security chiefs but making all end users responsible for better access control. It also suggests combining different types of controls inside multi-silo environments (e.g., archives, data loss prevention, access control and logs).
Security Surveillance and Monitoring
It is important to ensure continuous surveillance in order to detect security incidents, threats and abnormal behaviours in real time. To ensure Big Data security surveillance, some solutions are available, such as Data Loss Prevention (DLP), Security Information and Event Management (SIEM) and dynamic analysis of security events. Such solutions are based on consolidation and correlation methods applied across multiple data sources, and on contextualization tools (which add context as a data attribute to the extracted data). It is also important to conduct regular audits and to verify that users and employees respect security policies and recommended best practices.
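A SIEM-style correlation rule can be sketched in a few lines: events from several sources are consolidated, and an alert is raised only when correlated signs co-occur, here repeated failed logins followed by a large outbound transfer from the same host. The event format and thresholds are illustrative assumptions.

```python
# Minimal SIEM-style correlation: consolidate events from multiple
# sources and alert on a brute-force-then-exfiltration pattern.

events = [
    {"src": "firewall", "host": "db01",  "type": "failed_login"},
    {"src": "app",      "host": "db01",  "type": "failed_login"},
    {"src": "firewall", "host": "db01",  "type": "failed_login"},
    {"src": "netflow",  "host": "db01",  "type": "large_upload"},
    {"src": "app",      "host": "web02", "type": "failed_login"},
]

def correlate(events, max_failures=3):
    """Alert when a large upload follows >= max_failures failed logins."""
    failures = {}
    alerts = []
    for e in events:
        if e["type"] == "failed_login":
            failures[e["host"]] = failures.get(e["host"], 0) + 1
        elif e["type"] == "large_upload" and failures.get(e["host"], 0) >= max_failures:
            alerts.append(e["host"])
    return alerts

print(correlate(events))  # ['db01']: suspicious pattern on one host
```

Note that no single event is alarming on its own; the value of consolidation is precisely that the cross-source combination reveals the incident.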
310
Big Data Security
EXAMPLE: SECURITY OF
SMART GRID BIG DATA
This section presents security challenges and
solutions regarding Big Dada processed in Smart
Grid infrastructures.
Giving the growing power demand, a Smart
Grid gather and process huge data sets generated
daily (e.g., consumers’ behaviours and habits) to
ensure an efficient and cost-effective production
and distribution of electricity. Unlike traditional
electrical grid, Smart Grid is based on advanced
technologies and enables bidirectional power
flow between connected devices. However, Smart
Grid infrastructures are vulnerable and face many
security threats. In fact, data are transferred mas-
sively in this context and security attacks may have
serious and large scale impacts (e.g., regional or
national interruption in power supply, important
economic loss, disruption of public services, low
service quality in hospitals, etc.).
Hence, securing the Big Data of Smart Grid
infrastructures is fundamental to protect grid
performance, to ensure reliable coordination
between control centres and equipment, and to
enhance the safety of all system operations.
Smart Grid Security Challenges
Security challenges facing critical large-scale
infrastructures can be classified according to
various parameters. Pathan (2014) recommends
a holistic, multi-layered security approach that
deals with this issue at all grid levels (i.e.,
physical, cyber and data). In fact, the Smart Grid
environment incorporates many distributed
subsystems and end-points, including distributed
sensors and actuators, an ever-growing number
of Intelligent Electronic Devices (IEDs) with
different power requirements (e.g., electric
vehicles and smart homes), electric generators
and control applications. In addition, these
subsystems and end-points maintain multiple
bidirectional communications with each other.
This complexity increases the anomalies, human
errors and system vulnerabilities that can be
exploited by attackers. As an example, the
governor control (GC) system of a Smart Grid
ensures the steady operation of all power
generators: sensors detect the speed and
frequency deviation of the generators, and the
GC rapidly adjusts their operation. Attackers may
succeed in accessing one point of the
communication network and changing the values
recorded by the generator sensors. Such
communication intrusion and malicious
modification of data could affect the power flow
inside the grid and compromise the decision-making
process supervised by the Optimal Power
Flow (OPF).
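A minimal illustration of how tampered sensor values might be caught before they reach the OPF is a plausibility check on frequency readings. The nominal frequency, the thresholds and the sample stream below are assumed values chosen for illustration only:

```python
NOMINAL_HZ = 50.0      # assumed nominal grid frequency (60.0 in some regions)
MAX_DEVIATION = 0.5    # plausible absolute deviation band in Hz (illustrative)
MAX_STEP = 0.2         # plausible change between consecutive samples (illustrative)

def suspicious_samples(readings):
    """Return indices of frequency samples that are physically implausible,
    either in absolute value or as a jump from the previous sample. Such
    samples may indicate tampered sensor data rather than real grid behaviour."""
    flagged = []
    for i, hz in enumerate(readings):
        if abs(hz - NOMINAL_HZ) > MAX_DEVIATION:
            flagged.append(i)
        elif i > 0 and abs(hz - readings[i - 1]) > MAX_STEP:
            flagged.append(i)
    return flagged

# A reading stream in which an attacker injects an abrupt, out-of-band value.
stream = [50.01, 50.02, 49.98, 51.3, 50.00]
```

Note that the sample following the injected value is also flagged (the step back to normal is itself implausible), so an analyst would review the whole window, not a single point.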
Moreover, attackers may use Grid utilities and
smart meters to get private information, which
compromises consumer privacy (Stimmel, 2014).
Since the Smart Grid incorporates multiple
interconnected subsystems, there is a risk of
cascading failures: any attack on one subsystem
may compromise the security of the others. Thus,
it is crucial to secure the overall Smart Grid
system, including the physical components, the
cyber space, and the data generated and
transmitted through the grid networks.
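The cascading-failure risk can be sketched as reachability in a dependency graph of subsystems. The subsystem names and dependency edges below are hypothetical:

```python
from collections import deque

# Illustrative dependency graph: an edge A -> B means subsystem B depends on A,
# so compromising A may cascade to B. All subsystem names are hypothetical.
depends_on = {
    "smart_meters": ["data_aggregator"],
    "data_aggregator": ["control_centre"],
    "control_centre": ["generators", "distribution"],
    "generators": [],
    "distribution": [],
}

def cascade(start):
    """Breadth-first search over the dependency graph: return all subsystems
    whose security may be compromised by an attack on `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in depends_on.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

In this toy graph an attack on the smart meters reaches every other subsystem, which is exactly why securing only the point of entry is insufficient.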
Security Solutions for Smart Grid
Big Data security management in a Smart Grid
environment aims to ensure data confidentiality,
integrity and availability for efficient and reliable
grid operations. To increase the attack resilience
of the Smart Grid and respond to the previously
cited challenges, several solutions have been
proposed and research efforts undertaken.
Considering the cascading failure aspect and
the complexity of the Smart Grid system, it is
recommended to adopt a multi-layered approach.
This helps to secure not only the data layer but
also all the other layers: physical, network, host,
data store, application, and policies and
regulations. For instance, at the physical layer, it
is important to ensure the security of equipment,
consumer devices, substations, sensors, control
centres and so on. Concerning the cyber layer, it
is recommended to secure network
communications and eliminate weak points from
the cyber topology (e.g., poor intrusion detection
algorithms). Regarding the data layer, it is crucial
to ensure granular access control and granular
audits (Alvaro, Pratyusa, & Sreeranga, 2013).
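A granular, attribute-based access check combined with a granular audit trail could look like the following sketch; the policy rules, attribute names and subjects are invented for illustration:

```python
import time

AUDIT_LOG = []

# Illustrative attribute-based policy: each rule names the attributes a subject
# must hold to perform an action on a resource class.
POLICY = [
    {"action": "read", "resource": "meter_data",
     "required": {"role": "analyst", "clearance": "internal"}},
    {"action": "write", "resource": "meter_data",
     "required": {"role": "operator", "clearance": "restricted"}},
]

def check_access(subject, action, resource):
    """Grant access only if some policy rule's required attributes are all
    present in the subject's attributes; record every decision for audit."""
    granted = any(
        rule["action"] == action
        and rule["resource"] == resource
        and all(subject.get(k) == v for k, v in rule["required"].items())
        for rule in POLICY
    )
    AUDIT_LOG.append({"ts": time.time(), "subject": subject.get("id"),
                      "action": action, "resource": resource, "granted": granted})
    return granted

analyst = {"id": "u1", "role": "analyst", "clearance": "internal"}
intruder = {"id": "u2", "role": "guest"}
```

Because every decision, granted or denied, lands in the audit log, the same mechanism supports both granular access control and the granular audits recommended above.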
A successful security management of the Smart
Grid system should incorporate real-time security
monitoring. Big Data analytics algorithms are one
of the powerful solutions recommended to address
this issue. These algorithms drive improved
predictions and more precise analysis. They are
often based on Machine Learning techniques and
use not only traditional security logs and events,
but also performance and customers' data, to
recognize and prevent malicious behaviours
(Khorshed, Ali, & Wasimi, 2014).
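As a stand-in for such analytics, a simple statistical outlier test over one monitored metric illustrates the idea; the z-score threshold and the login-failure counts are illustrative assumptions, and a production system would use richer Machine Learning models over many features:

```python
from statistics import mean, stdev

def zscore_anomalies(values, threshold=3.0):
    """Flag observations whose z-score exceeds the threshold. A minimal
    stand-in for the Machine Learning models mentioned in the text, which
    would combine security events with performance and customer data."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Hourly login-failure counts for one account; the final spike is injected
# to simulate a brute-force attempt.
counts = [2, 3, 1, 2, 4, 2, 3, 2, 1, 3, 2, 50]
```

Even this crude model separates the injected spike from normal variation, which is the core of behaviour-based recognition of malicious activity.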
Real-time monitoring can be part of a mitigation
strategy that supports security decision making.
It assists security analysts in deciding whether a
preventive or remedial action should be taken
(e.g., changing user roles or privileges,
suspending suspicious access, correcting network
configurations). The list of actions depends on
the nature of the incident and its impact; they can
be implemented through an automatic or
semi-automatic process. Ensuring continuous
updates of such a strategy is good practice, as it
helps integrate new security solutions and laws
as well as emerging security practices and models.
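The mapping from incidents to preventive or remedial actions, with automatic versus semi-automatic execution, can be sketched as a small playbook lookup. The incident types, actions and execution modes below are hypothetical:

```python
# Illustrative mapping from (incident type, severity) to a response; whether it
# runs automatically or awaits analyst approval depends on its impact.
PLAYBOOK = {
    ("suspicious_access", "low"): ("suspend_session", "automatic"),
    ("suspicious_access", "high"): ("revoke_privileges", "semi-automatic"),
    ("misconfiguration", "low"): ("correct_configuration", "automatic"),
    ("data_exfiltration", "high"): ("isolate_host", "semi-automatic"),
}

def respond(incident_type, severity):
    """Return the planned action and its execution mode, or escalate to a
    human analyst when no playbook entry matches."""
    return PLAYBOOK.get((incident_type, severity), ("escalate_to_analyst", "manual"))
```

Keeping the strategy updated, as recommended above, amounts here to revising the playbook entries as new laws, solutions and practices emerge, without touching the dispatch logic.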
Considering the complexity and evolving nature
of cybercrime, it is important to promote
continuous commitment and timely sharing of
security information between all Smart Grid
partners, utilities, and organizations specialized
in cybercrime (Bughin, Chui, & Manyika, 2010).
It is well known that security techniques alone
are not sufficient in any industry. They have to
be backed by legal actions and regulations in
order to protect valuable data and other assets.
Therefore, security requirements, policies,
regulations and standards should be updated
regularly to address evolving security issues.
CONCLUSION
Big Data applications promise interesting
opportunities for many sectors. In fact, extracting
valuable insights and information from disparate
large data sources helps improve the competitive
advantage of organizations. For instance, the
analysis of data streams or archives (e.g., using
predictive or identification models) can help to
optimize production processes, to enhance
services with added value and to adapt them to
customers' needs. However, Big Data sharing and
analysis raise many security issues and increase
privacy threats. This chapter presented some of
the most important Big Data security challenges
and described related solutions and
recommendations. Because it is nearly impossible
to secure very large data sets completely, it is
more practical to protect the data value and its
key attributes instead of the data itself, to analyse
the security risks of combining different evolving
Big Data technologies, and to choose security
tools according to the goals of the Big Data project.
REFERENCES
Alvaro, A. C., Pratyusa, K. M., & Sreeranga, P.
R. (2013). Big data analytics for security. IEEE
Security and Privacy, 11(6), 74–76. doi:10.1109/
MSP.2013.138
Berman, J. J. (2013). Principles of big data:
Preparing, sharing, and analyzing complex infor-
mation. San Francisco, CA: Morgan Kaufmann
Publishers Inc.
Bodei, C., Degano, P., Ferrari, G. L., Galletta, L.,
& Mezzetti, G. (2012). Formalising security in
ubiquitous and cloud scenarios. In A. Cortesi, N.
Chaki, K. Saeed, & S. Wierzchon (Eds.), Computer
information systems and industrial management:
Proceedings of the 11th IFIP TC 8 International
Conference (LNCS) (Vol. 7564, pp. 1-29). Berlin,
Germany: Springer. doi:10.1007/978-3-642-
33260-9_1
Bughin, J., Chui, M., & Manyika, J. (2010). Clouds,
big data, and smart assets: Ten tech-enabled
business trends to watch. Retrieved from http://
www.mckinsey.com/insights/high_tech_tele-
coms_internet/clouds_big_data_and_smart_as-
sets_ten_tech-enabled_business_trends_to_watch
Chen, X., & Huang, Q. (2013). The data protection
of mapreduce using homomorphic encryption. In
Proceedings of the 4th IEEE International Confer-
ence on Software Engineering and Service Science
(pp. 419-421). Beijing, China: IEEE.
Constantine, C. (2014). Big data: An informa-
tion security context. Network Security, 2014(1),
18–19. doi:10.1016/S1353-4858(14)70010-8
Jensen, M. (2013). Challenges of privacy protec-
tion in big data analytics. In Proceedings of IEEE
International Congress on Big Data (pp. 235-
238). Washington, DC: IEEE Computer Society.
doi:10.1109/BigData.Congress.2013.39
Kasim, H., Hung, T., & Li, X. (2012). Data value
chain as a service framework: For enabling data
handling, data security and data analysis in the
cloud. In Proceedings of the 18th IEEE Interna-
tional Conference on Parallel and Distributed Sys-
tems (pp. 804-809). Washington, DC: IEEE Com-
puter Society. doi:10.1109/ICPADS.2012.131
Katal, A., Wazid, M., & Goudar, R. (2013). Big
data: Issues, challenges, tools and good prac-
tices. In Proceedings of the Sixth International
Conference on Contemporary Computing (pp.
404-409). Noida, India: IEEE. doi:10.1109/
IC3.2013.6612229
Khorshed, M. T., Ali, A. B., & Wasimi, S. A.
(2014). Combating Cyber Attacks in Cloud Sys-
tems Using Machine Learning. In S. Nepal &
M. Pathan (Eds.), Security, Privacy and Trust in
Cloud Systems (pp. 407–431). Berlin, Germany:
Springer. doi:10.1007/978-3-642-38586-5_14
Kim, S. H., Kim, N. U., & Chung, T. M. (2013).
Attribute relationship evaluation methodology for
big data security. In Proceedings of the Interna-
tional Conference on IT Convergence and Security
(pp. 1-4). Macao, China: IEEE. doi:10.1109/
ICITCS.2013.6717808
Liu, C., Zhang, X., Liu, C., Yang, Y., Ranjan,
R., Georgakopoulos, D., & Chen, J. (2013). An
iterative hierarchical key exchange scheme for
secure scheduling of big data applications in
cloud computing. In Proceedings of the 12th IEEE
International Conference on Trust, Security and
Privacy in Computing and Communications (pp.
9-16). Washington, DC: IEEE Computer Society.
doi:10.1109/TrustCom.2013.65
Lu, T., Guo, X., Xu, B., Zhao, L., Peng, Y., &
Yang, H. (2013). Next big thing in big data: The
security of the ict supply chain. In Proceedings of
the International Conference on Social Computing
(pp. 1066-1073). Washington, DC: IEEE Com-
puter Society. doi:10.1109/SocialCom.2013.172
Mahmood, T., & Afzal, U. (2013). Security analyt-
ics: Big data analytics for cybersecurity: A review
of trends, techniques and tools. In Proceedings
of the 2nd National Conference on Information
Assurance (pp. 129-134). Rawalpindi, Pakistan:
IEEE. doi:10.1109/NCIA.2013.6725337
Mansfield-Devine, S. (2012). Using big data to
reduce security risks. Computer Fraud & Security,
2012(8), 3–4.
Pathan, A. S. K. (2014). The state of the art in
intrusion prevention and detection. Boca Raton,
FL: Auerbach Publications. doi:10.1201/b16390
Ring, T. (2013). It’s megatrends: The security im-
pact. Network Security, 2013(7), 5–8. doi:10.1016/
S1353-4858(13)70080-1
Shin, D., Sahama, T., & Gajanayake, R. (2013).
Secured e-health data retrieval in daas and big data.
In Proceedings of the 15th IEEE international
conference on e-health networking, applications
services (pp. 255-259). Lisbon, Portugal: IEEE.
doi:10.1109/HealthCom.2013.6720677
Stimmel, C. L. (2014). Big data analytics strate-
gies for the smart grid. Boca Raton, FL: Auerbach
Publications. doi:10.1201/b17228
Stoianov, N., Uruena, M., Niemiec, M., Machnik,
P., & Maestro, G. (2012). Security Infrastructures:
Towards the INDECT System Security. Paper
presented at the 5th International Conference on
Multimedia Communication Services & Security
(MCSS), Krakow, Poland. doi:10.1007/978-3-
642-30721-8_30
Sykora, M., Jackson, T., O’Brien, A., & Elayan,
S. (2013). National security and social media
monitoring: A presentation of the emotive and
related systems. In Proceedings of the European
Intelligence and Security Informatics Confer-
ence (pp. 172-175). Uppsala, Sweden: IEEE.
doi:10.1109/EISIC.2013.38
Tankard, C. (2012). Big data security. Network
Security, 2012(7), 5–8.
Wu, X., Zhu, X., Wu, G.-Q., & Ding, W. (2014).
Data mining with big data. IEEE Transactions on
Knowledge and Data Engineering, 26(1), 97–107.
doi:10.1109/TKDE.2013.109
Zhang, X., Liu, C., Nepal, S., Yang, C., Dou, W.,
& Chen, J. (2014). A hybrid approach for scal-
able sub-tree anonymization over big data using
mapreduce on cloud. Journal of Computer and
System Sciences, 80(5), 1008–1020. doi:10.1016/j.
jcss.2014.02.007
Zhao, J., Wang, L., Tao, J., Chen, J., Sun, W.,
Ranjan, R., et al. (2014). A security framework in
g-hadoop for big data computing across distributed
cloud data centres. Journal of Computer and
System Sciences, 80(5), 994–1007.
KEY TERMS AND DEFINITIONS
Anonymization: Anonymization is the process
of protecting data privacy across information
systems. Several models and methods are used to
implement it such as: t-closeness, m-invariance,
k-anonymity and l-diversity.
Authentication: Authentication aims to verify,
with a certain level of assurance, that particular
data or entities are authentic, i.e., that data come
from their claimed source and have not been altered.
Big Data: Big Data is mainly defined by its
3Vs fundamental characteristics. The 3Vs include
Velocity (data are growing and changing in a
rapid way), Variety (data come in different and
multiple formats) and Volume (huge amount of
data is generated every second).
Confidentiality: Confidentiality is a property
that ensures that data are not disclosed to
unauthorized persons. It enforces predefined rules
while accessing the protected data.
Encryption: Encryption relies on the use
of encryption algorithms to transform data into
encrypted forms. The purpose is to make them
unreadable to those who do not possess the
decryption key(s).
Privacy: Privacy is the ability of individuals
to seclude information about themselves. In other
words, they selectively control its dissemination.
Security Management: Security management
is a part of the overall management system of an
organization. It aims to handle, implement, moni-
tor, maintain, and enhance data security.
... Big Data security shares numerous characteristics with traditional information system security (where data are structured). However, Big Data security necessitates the development of more powerful tools, proper procedures, and innovative technology for performing rapid data analysis [4]. Additionally, it necessitates a new security management paradigm that manages both internal data (data generated by an organization's internal systems and procedures) and external data (e.g., data collected from other firms or external web sites) [4]. ...
... However, Big Data security necessitates the development of more powerful tools, proper procedures, and innovative technology for performing rapid data analysis [4]. Additionally, it necessitates a new security management paradigm that manages both internal data (data generated by an organization's internal systems and procedures) and external data (e.g., data collected from other firms or external web sites) [4]. As a result of the aforementioned difficulties, it is necessary to highlight these difficulties and demonstrate how previous research has addressed this conundrum. ...
... The article discusses how a variety of data techniques, including Big Data, OLAP, large data, large data transfer, and large data privacy, are used in research and development in the subject of large research. [4] Big Data Security: Challenges, Recommendations and Solutions ...
Article
Big data is dubbed “today’s digital oil” and the “new raw resource of the twenty-first century”. BD is synonymous with the future of innovation, competition, and productivity. It can produce and find corporate value by analyzing data in ways that older methodologies could not. Regardless of its benefits, the development of big data continues to encounter various barriers, the most important of which are security and privacy concerns. As a result, this study is motivated by the need to address and evaluate big data challenges. Thus, by comparing and contrasting big data difficulties with available and potential solutions, users, developers, and businesses can find pertinent and timely responses to specific dangers, resulting in the best possible big data-based services. The objective of this essay is to highlight the inherent challenges of big data and some essential strategies for overcoming them.The purpose of this article was to extract and analyze significant works in order to contribute to the corpus of literature by emphasizing many critical difficulties in the big data domain and throwing light on how these challenges affect a range of domains, including users, sites, and business. Many issues such as data privacy, information sharing, failures in big data technologies and infrastructure, poor data quality management, managers and policymakers’ inability to learn and adapt, the absence of government policies and plans, the lack of successfully implemented big data analytics projects, and the lack of human experience are among the most frequently mentioned issues. Obstacles also exist in the areas of real-time data collection, as well as real-time data processing and visualization, among others.By combining previously identified solutions, this research addressed these concerns. The consequences for both researchers and practitioners have been discussed. 
Aiming to help scholars gain a comprehensive understanding of these issues and confirm the approaches used to address them, this study’s theoretical focus is broad. This study uses tried-and-true solutions to overcome these obstacles. Business and individuals using big data analytics systems will benefit from these solutions.
... However, this introduces multiple security threats such as the discovery of private or confidential information and unauthorised access to data at storage or data in motion. While multiple technologies and techniques have emerged, there is a need to find a balance between multiple security requirements, privacy obligations, system performance and rapid dynamic analysis on diverse large data sets [6]. ...
... Gathering, storing, searching, sharing, transferring, analysing and presenting data as per requirements are the major challenging task in big data [11]. In any big data platform, the strategic priorities related to security should be clearly defined and the guidelines for choosing the associated technologies in terms of reliability, performance, maturity, scalability and overall cost should be also clearly established to ensure that the design platform provide the necessary security mechanisms that include, among others, the anonymisation of confidential or personal data, the data cryptography, the centralised security management, the data confidentiality and data access monitoring [6]. At the same time, it should ensure their future evolutions will be easily integrated in the existing solution. ...
Preprint
Full-text available
The aviation industry as well as the industries that benefit and are linked to it are ripe for innovation in the form of Big Data analytics. The number of available big data technologies is constantly growing, while at the same time the existing ones are rapidly evolving and empowered with new features. However, the Big Data era imposes the crucial challenge of how to effectively handle information security while managing massive and rapidly evolving data from heterogeneous data sources. While multiple technologies have emerged, there is a need to find a balance between multiple security requirements, privacy obligations, system performance and rapid dynamic analysis on large datasets. The current paper aims to introduce the ICARUS Secure Experimentation Sandbox of the ICARUS platform. The ICARUS platform aims to provide a big data-enabled platform that aspires to become an 'one-stop shop' for aviation data and intelligence marketplace that provides a trusted and secure 'sandboxed' analytics workspace, allowing the exploration, integration and deep analysis of original and derivative data in a trusted and fair manner. Towards this end, a Secure Experimentation Sandbox has been designed and integrated in the ICARUS platform offering, that enables the provisioning of a sophisticated environment that can completely guarantee the safety and confidentiality of data, allowing to any interested party to utilise the platform to conduct analytical experiments in closed-lab conditions.
... According to a detailed report published by McKinsey, a world-renowned consulting firm, the impact, key technologies, and application areas of Big Data have been analyzed in depth. It is widely recognized that Big Data has the 3V characteristics of volume, variety, and real-time nature [11], and its value is crucial [12], the big data should have authenticity [13]. According to the 50th Statistical Report on China's Internet Development, as of June 2022, the number of Internet users in China reached 1.051 billion, and the Internet penetration rate reached 74.4%. ...
Article
This study examines the factors that impact deep and meaningful learning in blended learning environments and their connections. The sample included 397 college students from a university in Sichuan Province, selected through random sampling. Data was collected using a questionnaire based on Bandura's ternary interaction theory, encompassing learners, helpers, environment, and interaction dimensions. The following text should be remembered: "Hypotheses were developed based on existing literature, and a survey with established scales was created. Quantitative analysis was conducted using SPSS and AMOS software. The mean, standard deviation, Variance, skewness, and kurtosis values were within reasonable ranges. The model's latent variables showed strong convergent validity, with standardized factor loadings (SFL) ranging from 0.807 to 0.965, average Variance extracted (AVE) from 0.697 to 0.946, and composite reliability (C.R.) from 0.919 to 0.946. Model fit indices indicated acceptable fit (CMIN/DF: 2.303, NFI: 0.966, CFI: 0.980, RMSEA: 0.058, RMR: 0.008, PNFI: 0.789). The study optimized the model through path analysis, culminating in the final structural equation model (SEM)." Findings indicate (1) Learner, environmental, and interaction factors positively influence deep meaningful learning, while helper factors show a negative correlation; (2) learner, interaction, and helper factors mediate the environment's impact on deep, meaningful learning; and (3) environmental factors hold the most significant sway over helper factors, followed by interaction and learner factors. Helpers wield significant influence over learners, enhancing deep understanding. These insights guide effective, deep, meaningful learning strategies in blended learning
... According to the findings of other research, such as that conducted by Altman et al. (2018) and Benjelloun and Lahcen (2019), big data can be utilized to better comply with data privacy requirements by evaluating vast amounts of data about customers. There have been several studies that have investigated the broader influence that big data has had on the fintech business. ...
Article
Full-text available
This study explores the critical role of big data in the field of financial technology (FinTech) andits impact on financial inclusion. We conducted an extensive review of various sources, includingacademic papers, industry reports, and online content. Our research reveals that big data is keyto developing new financial products and services. It improves risk management and operationalefficiency, which in turn promotes financial inclusion. A major finding is that big data enablesdetailed analysis of customer behavior, crucial for creating inclusive financial services. However,we also identify challenges such as ensuring data privacy, security, and ethical use of algorithms.Our findings are particularly relevant for policymakers, regulators, and industry professionals.They highlight the importance of developing balanced regulations to use big data effectively andresponsibly. Overall, our study demonstrates the significant impact of big data in transformingFinTech, paving the way for a more inclusive financial environment
... Furthermore, data access control on top of these collected massive and rapidly evolving data must be properly addressed. Hence, it is also acknowledged that despite the constantly growing number of available technologies and techniques that have emerge, there is a real challenge on finding the proper balance between the effectiveness and performance of the dynamic analysis on diverse large data sets and the requirements for data integrity, data governance and data security and privacy [8]. Nevertheless, a promising opportunity arises from the latest developments and compelling features of the big data technologies to design and build a big data platform that capitalizes on these emerging offerings in order to build a novel data value chain in the aviation-related sectors that will enable data-driven innovation and collaboration across currently diversified and fragmented industry players, by effectively addressing the challenges imposed by the nature of big data and the requirements of the aviation industry's stakeholders for a trusted and secure data sharing and data analysis environment. ...
Preprint
Full-text available
The unprecedented volume, diversity and richness of aviation data that can be acquired, generated, stored, and managed provides unique capabilities for the aviation-related industries and pertains value that remains to be unlocked with the adoption of the innovative Big Data Analytics technologies. Despite the large efforts and investments on research and innovation, the Big Data technologies introduce a number of challenges to its adopters. Besides the effective storage and access to the underlying big data, efficient data integration and data interoperability should be considered, while at the same time multiple data sources should be effectively combined by performing data exchange and data sharing between the different stakeholders. However, this reveals additional challenges for the crucial preservation of the information security of the collected data, the trusted and secure data exchange and data sharing, as well as the robust data access control. The current paper aims to introduce the ICARUS big data-enabled platform that aims provide a multi-sided platform that offers a novel aviation data and intelligence marketplace accompanied by a trusted and secure analytics workspace. It holistically handles the complete big data lifecycle from the data collection, data curation and data exploration to the data integration and data analysis of data originating from heterogeneous data sources with different velocity, variety and volume in a trusted and secure manner.
... However, it is unclear if "sensitive information" or "private information" or "raw data" is equal to personal information privacy. To understand personal information privacy from a legal and ethical perspective, it is the right of an individual or group to seclude themselves, or information about themselves, and thereby express themselves selectively [8,20,93]. Similarly, privacy is seen as the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others [115]. In relation to controlling and protecting privacy, two definitions from legal literature state "Privacy, as a whole or in part, represents the control of transactions between person(s) and other(s), the ultimate aim of which is to enhance autonomy and/or to minimize vulnerability" [75] and "Privacy is to protect personal data and information related to a communication entity to be collected from other entities that are not authorized" [26]. ...
Article
Full-text available
Combining and analysing sensitive data from multiple sources offers considerable potential for knowledge discovery. However, there are a number of issues that pose problems for such analyses, including technical barriers, privacy restrictions, security concerns, and trust issues. Privacy-preserving distributed data mining techniques (PPDDM) aim to overcome these challenges by extracting knowledge from partitioned data while minimizing the release of sensitive information. This paper reports the results and findings of a systematic review of PPDDM techniques from 231 scientific articles published in the past 20 years. We summarize the state of the art, compare the problems they address, and identify the outstanding challenges in the field. This review identifies the consequence of the lack of standard criteria to evaluate new PPDDM methods and proposes comprehensive evaluation criteria with 10 key factors. We discuss the ambiguous definitions of privacy and confusion between privacy and security in the field, and provide suggestions of how to make a clear and applicable privacy description for new PPDDM techniques. The findings from our review enhance the understanding of the challenges of applying theoretical PPDDM methods to real-life use cases, and the importance of involving legal-ethical and social experts in implementing PPDDM methods. This comprehensive review will serve as a helpful guide to past research and future opportunities in the area of PPDDM.
Conference Paper
Connected objects are one of the most important vectors for the collection of personal data. With the increase in data volumes, we are observing an increase in network vulnerabilities and data breaches.Data-centric security (DCS) and its related protocols such as the NATO STANAG 4774 have become a suited approach to address diverse data protection and secure information exchange. Despite the novelty of the approach, it comes with a challenge regarding its implementation to ensure the integrity of data in real scenario. In this paper, we are evaluating the NATO STANAG 4774 protocol when securing smart home data. Then, we use Random Forest to detect cyber attacks based on malware injection. We conduct an empirical study to evaluate the performance of our approach and we show how a machine learning technique can be used to ensure the integrity of data when using a data-centric security protocol. In fact, our proposed approach has a recall of 0.781 —in other words, it correctly identifies more than 78% of all malicious data injection.
Article
Full-text available
Property marketing in Nigeria has frequently been characterised by unique challenges due to inadequate and unreliable data. In recent times, big data has presented a wide range of solutions to contemporary issues in different industries, including real estate. Notwithstanding, there is a paucity of practical studies on the role of big data in property marketing of many developing nations, including Nigeria. Hence, there is a need to investigate the role of big data in transforming property marketing in Nigeria. The aim of this study is therefore to investigate Estate Surveyors and Valuers’ (ESVs) perception of the role of big data in property marketing in Lagos, Nigeria. 82 questionnaires were administered to ESVs in the study area and 55 (67%) of them were found useful for analysis. The data were analysed using frequency, percentage, mean and relative importance index. The results revealed that out of 9 possible roles of big data in property marketing, the 2 most notable ones are that real estate firms can execute laser-focused marketing strategies; and they can match demand with supply. These outcomes distinctly disclosed the merits of adopting big data in marketing properties. The implication of these findings is that big data has the potential of reducing fraud due to the presence of sufficient and reliable data in the property market. This study concludes that big data is a worthy addition to the toolset of real estate practitioners, particularly with respect to property marketing, and wholly recommends its adoption.
Chapter
Over the last years, the impacts of the evolution of information integration, increased automation and new forms of information management are also evident in the aviation industry that is disrupted also by the latest advances in sensor technologies, IoT devices and cyber-physical systems and their adoption in aircrafts and other aviation-related products or services. The unprecedented volume, diversity and richness of aviation data that can be acquired, generated, stored, and managed provides unique capabilities for the aviation-related industries and pertains value that remains to be unlocked with the adoption of the innovative Big Data Analytics technologies. The big data technologies are focused on the data acquisition, the data storage and the data analytics phases of the big data lifecycle by employing a series of innovative techniques and tools that are constantly evolving with additional sophisticated features, while also new techniques and tools are frequently introduced as a result of the undergoing research activities. Nevertheless, despite the large efforts and investments on research and innovation, the Big Data technologies introduce also a number of challenges to its adopters. Besides the effective storage and access to the underlying big data, efficient data integration and data interoperability should be considered, while at the same time multiple data sources should be effectively combined by performing data exchange and data sharing between the different stakeholders that own the respective data. However, this reveals additional challenges related to the crucial preservation of the information security of the collected data, the trusted and secure data exchange and data sharing, as well as the robust access control on top of these data. 
This paper introduces the ICARUS big-data-enabled platform, a multi-sided platform that offers a novel aviation data and intelligence marketplace accompanied by a trusted and secure “sandboxed” analytics workspace. It holistically handles the complete big data lifecycle, from data collection, curation and exploration to the integration and analysis of data originating from heterogeneous sources with different velocity, variety and volume, in a trusted and secure manner.
Conference Paper
We survey some critical issues arising in the ubiquitous computing paradigm, in particular the interplay between context-awareness and security. We then overview a language-based approach that addresses these problems from the point of view of Formal Methods. More precisely, we briefly describe a core functional language extended with mechanisms to express adaptation to context changes, to manipulate resources and to enforce security policies. In addition, we shall outline a static analysis for guaranteeing programs to securely behave in the digital environment they are part of.
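The interplay between context-awareness and security policies described above can be illustrated with a toy enforcement check: before a program action touches a resource, every policy active in the current context is consulted. The contexts, resources and policies below are hypothetical and merely stand in for the formal, language-based mechanisms the paper develops.

```python
# Toy sketch of context-aware security-policy enforcement: each action
# on a resource is checked against all active policies for the current
# context before it is allowed to proceed. Everything here is illustrative.

class PolicyViolation(Exception):
    pass

def no_camera_in_meeting(context, action, resource):
    return not (context == "meeting" and resource == "camera")

def read_only_on_public_wifi(context, action, resource):
    return not (context == "public-wifi" and action == "write")

POLICIES = [no_camera_in_meeting, read_only_on_public_wifi]

def perform(context, action, resource):
    for policy in POLICIES:
        if not policy(context, action, resource):
            raise PolicyViolation(f"{action} on {resource} denied in {context}")
    return f"{action} {resource} ok"

print(perform("office", "read", "file"))         # allowed
try:
    perform("public-wifi", "write", "file")      # blocked by policy
except PolicyViolation as exc:
    print("blocked:", exc)
```

The static analysis outlined in the paper aims to establish such guarantees at compile time rather than by runtime checks as in this sketch.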
Conference Paper
Big Data is one of the rising IT trends, alongside cloud computing, social networking and ubiquitous computing, and it can offer beneficial scenarios in the e-health arena. In many of these scenarios, however, Big Data needs to be kept secure for a long time in order to realise its benefits, such as finding cures for infectious diseases, while preserving patients' privacy. It is therefore valuable to analyse Big Data to extract meaningful information while the data are stored in a secure manner, which makes an analysis of the various database encryption techniques essential. In this study, we simulated three technical environments: plain text, Microsoft's built-in encryption, and a custom Advanced Encryption Standard scheme using a bucket index in a Data-as-a-Service (DaaS) setting. The results showed that the custom AES-DaaS approach has faster range-query response times than the MS built-in encryption. In addition, the scalability test revealed performance thresholds determined by the physical IT resources. Therefore, for efficient Big Data management in e-health, it is important to examine these scalability limits even in a cloud computing environment. Furthermore, when designing an e-health database, both patients' privacy and system performance need to be treated as top priorities.
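The bucket-index idea behind such encrypted range queries can be sketched as follows: each record stores a coarse plaintext bucket next to its ciphertext, the server filters on buckets, and the client decrypts only the candidates and post-filters exactly. Because this is illustration only, a keyed hash stream from the standard library stands in for AES, and the key, bucket boundaries and records are all hypothetical.

```python
# Sketch of bucket-index range queries over encrypted records. A keyed
# SHA-256 stream "cipher" stands in for AES so the example stays
# stdlib-only; it is NOT secure and is for demonstration purposes.
import hashlib

KEY = b"demo-key"  # hypothetical key

def _stream(key, nonce, length):
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(value, nonce):
    data = str(value).encode()
    return bytes(a ^ b for a, b in zip(data, _stream(KEY, nonce, len(data))))

def decrypt(ct, nonce):  # XOR stream cipher: decryption mirrors encryption
    return bytes(a ^ b for a, b in zip(ct, _stream(KEY, nonce, len(ct)))).decode()

def bucket(age):
    return age // 10   # coarse bucket, stored in plaintext beside the ciphertext

# "Server" table of (bucket, nonce, ciphertext); plaintext ages never stored.
table = []
for i, age in enumerate([23, 35, 41, 58, 62]):
    nonce = i.to_bytes(4, "big")
    table.append((bucket(age), nonce, encrypt(age, nonce)))

# Range query for ages in [30, 45]: the server filters on buckets 3..4,
# the client decrypts only those candidate rows and filters exactly.
candidates = [(n, ct) for b, n, ct in table if 3 <= b <= 4]
result = sorted(a for a in (int(decrypt(ct, n)) for n, ct in candidates)
                if 30 <= a <= 45)
print(result)  # [35, 41]
```

The performance benefit the study measures comes from the server discarding most rows using only the plaintext buckets, at the cost of leaking the coarse distribution of values.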
Book
By implementing a comprehensive data analytics program, utility companies can meet the continually evolving challenges of modern grids that are operationally efficient, while reconciling the demands of greenhouse gas legislation and establishing a meaningful return on investment from smart grid deployments. Readable and accessible, Big Data Analytics Strategies for the Smart Grid addresses the needs of applying big data technologies and approaches, including Big Data cybersecurity, to the critical infrastructure that makes up the electrical utility grid. It supplies industry stakeholders with an in-depth understanding of the engineering, business, and customer domains within the power delivery market. The book explores the unique needs of electrical utility grids, including operational technology, IT, storage, processing, and how to transform grid assets for the benefit of both the utility business and energy consumers. It not only provides specific examples that illustrate how analytics work and how they are best applied, but also describes how to avoid potential problems and pitfalls. Discussing security and data privacy, it explores the role of the utility in protecting their customers' right to privacy while still engaging in forward-looking business practices. The book includes discussions of: SAS for asset management tools. The AutoGrid approach to commercial analytics. Space-Time Insight's work at the California ISO (CAISO). This book is an ideal resource for mid- to upper-level utility executives who need to understand the business value of smart grid data analytics. It explains critical concepts in a manner that will better position executives to make the right decisions about building their analytics programs. At the same time, the book provides sufficient technical depth that it is useful for data analytics professionals who need to better understand the nuances of the engineering and business challenges unique to the utilities industry.
Chapter
One of the crucial but complicated tasks in any IT networking environment, including recent cloud service deployments, is to detect cyber attacks and identify their types.
Conference Paper
The rapid growth of the Internet has brought with it an exponential increase in the type and frequency of cyber attacks. Many well-known cybersecurity solutions are in place to counteract these attacks. However, the generation of Big Data over computer networks is rapidly rendering these traditional solutions obsolete. To address this problem, corporate research is now focusing on Security Analytics, i.e., the application of Big Data Analytics techniques to cybersecurity. Analytics can assist network managers particularly in the monitoring and surveillance of real-time network streams and in the real-time detection of both malicious and suspicious (outlying) patterns. Such capabilities are envisioned to encompass and enhance all traditional security techniques. This paper presents a comprehensive survey on the state of the art of Security Analytics, i.e., its description, technology, trends, and tools. It hence aims to convince the reader of the imminent application of analytics as an unparalleled cybersecurity solution in the near future.
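The kind of real-time outlier detection mentioned above can be sketched with a deliberately simple detector that flags samples straying far from the recent mean of the stream. Production security analytics would use far more robust statistics; the traffic trace, window size and threshold here are illustrative only.

```python
# Minimal sketch of outlier detection on a network stream: flag any
# sample that strays beyond `factor` times the mean of a sliding window.
# Window, factor and the packets/sec trace are illustrative choices.
from collections import deque

def detect(stream, window=5, factor=3.0):
    """Return indices of samples that deviate strongly from recent history."""
    recent = deque(maxlen=window)
    alerts = []
    for i, x in enumerate(stream):
        if len(recent) == window:          # only judge once history is full
            mean = sum(recent) / window
            if x > factor * mean or x < mean / factor:
                alerts.append(i)
        recent.append(x)                   # update history after the decision
    return alerts

traffic = [100, 102, 98, 101, 99, 103, 100, 500, 101]  # spike at index 7
print(detect(traffic))  # [7]
```

In a deployment, each alert index would correspond to a network telemetry sample (e.g. packets per second on a link) handed to an analyst or an automated response system.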
Conference Paper
There has been increasing interest in big data and big data security with the development of network technology and cloud computing. Big data, however, is not an entirely new technology but an extension of data mining. In this paper, we describe the background of big data and data mining, outline the features of big data, and propose an attribute selection methodology for protecting its value. Extracting valuable information is the main goal of big data analysis, and it is this information that needs to be protected; the relevance between the attributes of a dataset is therefore a very important element of big data analysis. We focus on two points. First, attribute relevance in big data is a key element for extracting information, and from this perspective we study how to secure big data by protecting the valuable information inside it. Second, it is impossible to protect all of a big data set and its attributes, so we consider big data as a single object with its own attributes and assume that an attribute with higher relevance is more important than the others.
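The notion of attribute relevance can be sketched by ranking attributes by their average absolute correlation with the other attributes, so that protection effort focuses on the most informative ones. The dataset and the choice of Pearson correlation as the relevance measure are illustrative, not the paper's own methodology.

```python
# Sketch: rank attributes of a (synthetic) dataset by average absolute
# Pearson correlation with the remaining attributes, as a stand-in for
# an attribute-relevance measure guiding which columns to protect.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

dataset = {                       # synthetic, column-oriented records
    "income":    [30, 45, 60, 75, 90],
    "spending":  [28, 44, 58, 76, 88],   # tracks income closely
    "shoe_size": [42, 38, 44, 40, 43],   # mostly noise
}

def relevance(name):
    others = [k for k in dataset if k != name]
    return sum(abs(pearson(dataset[name], dataset[k])) for k in others) / len(others)

ranked = sorted(dataset, key=relevance, reverse=True)
print(ranked)  # most relevant attribute first
```

Under this toy measure, strongly inter-correlated attributes such as income and spending rank above noise-like ones, matching the intuition that higher-relevance attributes carry more of the dataset's extractable value.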
Conference Paper
MapReduce is a programming model that can handle big data efficiently, but the intermediate data of MapReduce are not well protected, and operations on ciphertexts are not supported. To achieve a trustworthy MapReduce, there is a need both to protect the data and to allow specific types of computations to be carried out on encrypted intermediate data. Traditional encryption solutions can protect the data from being divulged, but they cannot be used to compute on encrypted data; a novel encryption scheme called fully homomorphic encryption (FHE), however, can compute over encrypted data without decrypting it. In this paper, we propose a modified MapReduce framework using homomorphic encryption, in order to obtain both the confidentiality of the data and the ability to compute on them.
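As a hint of how computing on encrypted intermediate data works, the sketch below uses textbook Paillier, which is only additively homomorphic rather than fully homomorphic: the product of two ciphertexts decrypts to the sum of the plaintexts, which is exactly what a word-count reducer needs. The tiny fixed primes provide no real security and are for demonstration only.

```python
# Illustrative "reduce over encrypted counts" using textbook Paillier
# (additively homomorphic only, not FHE). The tiny primes make the
# arithmetic easy to follow but offer zero real-world security.
import math, random

p, q = 17, 19
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)               # valid because we pick g = n + 1

def encrypt(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return ((pow(c, lam, n2) - 1) // n) * mu % n

# "Map" emits an encrypted 1 per word occurrence; the untrusted
# "reduce" step multiplies ciphertexts and never sees any count.
encrypted_ones = [encrypt(1) for _ in range(3)]
total_ct = 1
for c in encrypted_ones:
    total_ct = (total_ct * c) % n2

print(decrypt(total_ct))  # 3
```

A real deployment would use a hardened library and key sizes of thousands of bits, and a fully homomorphic scheme would additionally allow multiplications on ciphertexts, which Paillier cannot do.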
Conference Paper
This paper provides an overview of the security infrastructures being deployed inside the INDECT project. These security infrastructures can be organized in five main areas: Public Key Infrastructure, Communication security, Cryptography security, Application security and Access control, based on certificates and smartcards. This paper presents the new ideas and deployed testbeds for these five areas. In particular, it explains the hierarchical architecture of the INDECT PKI, the different technologies employed in the VPN testbed, the INDECT Block Cipher (IBC) – a new cryptography algorithm that is being integrated in OpenSSL/OpenVPN libraries, and how TLS/SSL and X.509 certificates stored in smart-cards are employed to protect INDECT applications and to implement the access control of the INDECT Portal. All the proposed mechanisms have been designed to work together as the security foundation of all systems being developed by the INDECT project.
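The TLS client-certificate building block used for the INDECT Portal's access control can be sketched as a server-side context configuration in Python's standard `ssl` module: the server refuses any client that does not present a valid X.509 certificate, which is how smart-card based authentication is typically enforced at the transport layer. The certificate file names are hypothetical, so loading real key material is left commented out.

```python
# Sketch of a mutual-TLS server configuration: clients must present an
# X.509 certificate (e.g. one held on a smart card) or the handshake
# fails. File names below are hypothetical placeholders.
import ssl

context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
context.verify_mode = ssl.CERT_REQUIRED          # reject cert-less clients
context.minimum_version = ssl.TLSVersion.TLSv1_2 # disallow legacy protocols

# context.load_cert_chain("server.crt", "server.key")   # server identity
# context.load_verify_locations("portal-ca.pem")        # trusted CA roots

print(context.verify_mode == ssl.CERT_REQUIRED)  # True
```

Wrapping a listening socket with this context (`context.wrap_socket(sock, server_side=True)`) would then perform the certificate check on every incoming connection.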