Procedia Computer Science 108C (2017) 445–454
1877-0509 © 2017 The Authors. Published by Elsevier B.V.
Peer-review under responsibility of the scientific committee of the International Conference on Computational Science
10.1016/j.procs.2017.05.054
International Conference on Computational Science, ICCS 2017, 12-14 June 2017,
Zurich, Switzerland
Effective and Scalable Data Access Control in Onedata
Large Scale Distributed Virtual File System
Michal Wrzeszcz1,2, Lukasz Opiola1,2, Konrad Zemek2, Bartosz Kryza1, Lukasz
Dutka1, Renata Slota2, and Jacek Kitowski1,2
1Academic Computer Centre CYFRONET-AGH, University of Science and Technology, Krakow,
Poland
2AGH University of Science and Technology, Faculty of Computer Science, Electronics and
Telecommunications, Department of Computer Science, Krakow, Poland
kito@agh.edu.pl, rena@agh.edu.pl
Abstract
Nowadays, as large amounts of data are generated from experiments, satellite imagery or simulations, access to this data becomes challenging for users who need to process it further, since existing data management systems make it difficult to effectively access and share large data sets. In this paper we present an approach to enabling easy and secure collaboration, based on state-of-the-art authentication and authorization mechanisms, an advanced group/role mechanism for flexible authorization management, and support for identity mapping between local systems, as applied in an eventually consistent distributed file system called Onedata.
Keywords: big data, open data, data management, authorization, security
1 Introduction
Today, more and more research and commercial applications rely heavily on distributed access to large data sets, including data collected from physical experiments as well as data obtained through pure simulations or statistical data collected from web applications. Such data sets are created in distributed infrastructures by various organizations using heterogeneous storage systems, and are often too large to be completely transferred between data centers for processing. These issues lead to several requirements for a modern distributed large-scale data management system: transparent data access from any machine, access to large data sets without completely transferring them to the computational nodes, flexible metadata support enabling data discovery, support for single- and multi-tenant deployment, secure and easy data sharing, advanced group and role mechanisms for large groups of collaborators, support for open data publishing, and data access using standard interfaces and protocols including POSIX and CDMI (Cloud Data Management Interface) [21].
However, existing data management platforms, which are either focused on high-performance data access on a local network or are Dropbox-like solutions for desktop users, often have complex authentication and authorization mechanisms (for instance based on X.509 certificates that users must manage manually) and are difficult for smaller user communities to deploy. Furthermore, users are accustomed to accessing and managing their personal data through Cloud-based services such as Dropbox or Google Drive, yet in order to access and process these data on virtual machines or containers in the Cloud, they still have to use legacy protocols such as FTP and share data by exchanging URLs or email attachments.
In order to address these challenges, we have proposed a novel solution for global data access that gives users an experience and ease of use similar to commercial data management and file synchronization solutions, while providing means for high-performance transparent data access and ensuring security at every step, including when data are stored on systems under the control of separate organizations. We have provided the corresponding architectural design and, finally, a practical implementation of software for global data access without barriers: a distributed, eventually consistent virtual file system, Onedata [6, 17, 25].
In the next sections we briefly describe the Onedata data management platform, discuss in detail our approach to data access control in a globally distributed file system, and review related work on large scale data management in distributed infrastructures, including aspects related to data access control.
2 Data management in Onedata
One of the main issues in modern large scale data management is how to manage and efficiently share data between large, distributed user communities. Onedata addresses this challenge by implementing a globally distributed storage system divided into zones (or federations), which are created by deploying a dedicated service called Onezone. Zones enable the creation of Onedata deployments that are independent of other federations. Any organization, community or user group can deploy its own Onezone service (single-tenant mode) with a customized login page, or use a public Onedata deployment (e.g. onedata.org or datahub.egi.eu) and rely on the Onedata group mechanism for user authorization and isolation (multi-tenant mode). Storage providers can connect to selected zones to form storage federations based on heterogeneous storage backends, while still providing users with unified, transparent data access.
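The relationships above (a zone federating providers, providers supporting selected user spaces) can be sketched as a minimal data model. All class and field names here are illustrative assumptions, not Onedata's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Space:
    """A logical data container shared by a group of users."""
    name: str
    members: set = field(default_factory=set)           # user IDs

@dataclass
class Provider:
    """A storage provider registered in a zone; supports selected spaces."""
    name: str
    supported_spaces: set = field(default_factory=set)  # space names

@dataclass
class Zone:
    """A federation of providers created by one Onezone deployment."""
    name: str
    providers: list = field(default_factory=list)

    def providers_for(self, user: str, space: Space) -> list:
        """Providers through which `user` can transparently reach `space`."""
        if user not in space.members:
            return []
        return [p for p in self.providers if space.name in p.supported_spaces]

# Example: two providers in one zone, only one of which supports the space.
space = Space("climate-data", members={"alice"})
zone = Zone("onedata.org", providers=[
    Provider("provider-1", {"climate-data"}),
    Provider("provider-2", set()),
])
print([p.name for p in zone.providers_for("alice", space)])  # ['provider-1']
```

The point of the sketch is that support is per-space and per-provider: a provider that does not support a space never sees it, and a non-member gets nothing even from supporting providers.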
A typical distributed Onedata deployment is depicted in Fig. 1. The Onezone service is the main point of access for users, providing a Single Sign-On login mechanism (1) for all providers who have granted the user access to their resources. Based on the Onezone authentication and authorization decisions, Oneprovider instances running in storage providers' data centers control data access operations on the user spaces they support (2). Onezone also enables easy and secure sharing of data between users within a single zone, by means of a simple token exchange (3). Our system design also envisions support for data sharing based on trust established between different zones, for use cases requiring integration between storage federations (4). Furthermore, it is not assumed that storage providers within a single zone must trust each other: only data and metadata about users who are supported by specific providers within a zone are exchanged between the providers concerned, and data center administrators retain full control over which users will be supported (5). Once users have authenticated in the Onezone service, they can directly access their data by connecting to a selected Oneprovider service (e.g. the one closest to their computing node); moreover, thanks to the interconnection between Oneprovider services within a single federation, users have transparent access to all files available from all storage providers supporting their spaces (6). This allows users to access their
Figure 1: Overview of a typical Onedata deployment. The figure annotates: (1) different login methods; (2) limited permissions to data containers (spaces) for providers; (3) advanced cooperation of different providers' users; (4) different providers' cooperation agreements; (5) lack of trust between providers; (6) a single access point to the multiprovider environment (a PC with oneclient or a web browser; Web GUI, REST, CDMI, POSIX client); (7) small delays in permissions checking needed; (8) access to remote data on behalf of the user; (9) delegation of permissions for direct access; (10) different authentication/permissions systems.
data from any location without pre-staging, via an efficient POSIX protocol, by simply mounting them on their local machines or attaching them to Cloud virtual machines or containers. At the lowest level, a special transfer protocol called RTransfer [6] has been developed, which enables efficient replication and real-time access to remote data between data centers, as well as POSIX access for end users (8). The transparency of data access is particularly evident when running data processing jobs (including legacy applications) on remote computing nodes, which can use the native POSIX API to access and write files, while all access permissions are delegated using bearer tokens generated during the first authentication (9). Finally, an important issue in every federated data management system is allowing local site administrators to enforce full control over which users have access to which storage resources. In our solution this is achieved via a special mechanism called LUMA (Local User MApping) [16], an extensible mechanism that allows storage administrators to provide a mapping between global user identities and local, storage-specific user credentials (10).
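The LUMA idea, resolving a global Onedata identity to local storage credentials, can be sketched roughly as follows. The table contents, identifiers and function name are hypothetical; the real LUMA is an external, administrator-provided mapping service:

```python
# A minimal sketch of LUMA-style identity mapping (hypothetical data;
# the real LUMA is an extensible service configured by storage admins).

LOCAL_USER_MAP = {
    # (global user ID, storage ID) -> local POSIX credentials
    ("global-user-42", "ceph-storage-A"): {"uid": 1001, "gid": 2001},
    ("global-user-42", "nfs-storage-B"):  {"uid": 3507, "gid": 100},
}

def map_identity(global_user_id: str, storage_id: str) -> dict:
    """Resolve a global identity to local credentials for one storage.

    Raising on a missing entry models the administrator's full control:
    a user without a mapping simply cannot touch that storage backend.
    """
    try:
        return LOCAL_USER_MAP[(global_user_id, storage_id)]
    except KeyError:
        raise PermissionError(
            f"{global_user_id} has no mapping on {storage_id}")

print(map_identity("global-user-42", "ceph-storage-A"))  # {'uid': 1001, 'gid': 2001}
```

Note that the same global user maps to different uid/gid pairs on different backends, which is exactly what lets each site keep its local account scheme untouched.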
3 Data access authentication and authorization
In order to enable effective data access and sharing in large distributed user communities, the data management system has to address several issues, such as unified identity management across different storage sites, distributed authorization, and flexible group and role management. This section details the data access control mechanisms that address these issues.
3.1 Identity management
One of the main issues HPC users face when accessing data is the complexity of certificate-based authentication and authorization systems, including certificate management and renewal procedures.
Onedata utilizes the OpenID and OpenID Connect (based on OAuth 2.0) standards to provide easy and unified identity management. From the user's point of view, this simplifies registration and login, as users can sign in with one of their existing institutional or social accounts. The minimum required information is an email address, served by virtually any OpenID provider. Users can connect multiple OpenID accounts to an existing Onedata account, which gives them more login methods. Onezone serves as the account management center for users, where they can personalize their settings and authentication methods, or obtain client tokens (see 3.2) to authorize operations on their behalf across the whole system.
Internally, identity management is the responsibility of Onezone, which is the authentication and authorization center for all storage providers and users in a federation. Support for concrete OpenID providers is configurable and extensible via plugins, which makes it easy to widen the range of supported providers or to customize the available authentication methods for each Onezone instance independently. Onedata also supports basic (login/password) authentication, which is mostly targeted at system administrators or small isolated deployments. Upon registration, a new user is given a unique ID, which is used universally across the system from then on. By storing the user identifiers obtained from OpenID providers (subject ids), Onezone can easily map OpenID accounts onto the unique user ID. Later, when access to resources or files is negotiated, this ID is used for privilege verification (see section 3.3).
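The account-linking scheme described above can be sketched as a lookup from (OpenID provider, subject id) pairs to a single internal, immutable user ID. Function and variable names are illustrative, not Onezone's actual interface:

```python
import uuid
from typing import Optional

# linked_accounts: (openid_provider, subject_id) -> internal user ID
linked_accounts: dict = {}

def login(provider: str, subject_id: str, link_to: Optional[str] = None) -> str:
    """Return the internal user ID for an OpenID identity.

    The first login creates a new immutable ID; passing `link_to`
    attaches an additional OpenID account to an existing user,
    giving them another login method for the same account.
    """
    key = (provider, subject_id)
    if key in linked_accounts:
        return linked_accounts[key]
    user_id = link_to if link_to is not None else uuid.uuid4().hex
    linked_accounts[key] = user_id
    return user_id

uid = login("github", "octocat-7")             # first login creates the ID
same = login("github", "octocat-7")            # later logins map back to it
also = login("google", "12345", link_to=uid)   # linked second account
assert uid == same == also
```

Because privilege verification (section 3.3) only ever sees the internal ID, which external account the user signed in with is irrelevant to authorization decisions.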
3.2 Macaroon-based bearer tokens
Internally, the Onedata system delegates authority through the use of Macaroons [2]. Macaroons are a type of bearer credential that leverages chained MACs (message authentication codes) to allow the holder to add new caveats: contextual confinements that limit the scope or degree of the authorization. In particular, Macaroons allow adding third-party caveats that can only be satisfied by presenting a macaroon-bound proof from the specified third party. All conditions imposed on the credentials, including those added later by subsequent credential holders, are verified by the authorizing party on each authorization request.
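The chained-MAC construction can be illustrated with a stdlib-only sketch (a simplified model for first-party caveats only, not Onedata's implementation and not a full Macaroon library). Each appended caveat re-keys the signature, so any holder can attenuate a token but nobody can strip a caveat without invalidating it:

```python
import hmac
import hashlib

def _mac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

def mint(root_key: bytes, identifier: bytes):
    """Create a bare token: (identifier, caveat list, chained signature)."""
    return (identifier, [], _mac(root_key, identifier))

def attenuate(token, caveat: bytes):
    """Anyone holding the token may add a caveat; the old signature
    becomes the key for the new one, so caveats cannot be removed."""
    identifier, caveats, sig = token
    return (identifier, caveats + [caveat], _mac(sig, caveat))

def verify(root_key: bytes, token, context: dict) -> bool:
    """The minter recomputes the chain and checks every caveat holds."""
    identifier, caveats, sig = token
    expected = _mac(root_key, identifier)
    for caveat in caveats:
        key, _, value = caveat.decode().partition("=")
        if context.get(key) != value:     # simple key=value caveats
            return False
        expected = _mac(expected, caveat)
    return hmac.compare_digest(expected, sig)

root = b"onezone-root-key"
t = attenuate(mint(root, b"user-42"), b"access-type=read-only")
assert verify(root, t, {"access-type": "read-only"})
assert not verify(root, t, {"access-type": "read-write"})
```

Third-party caveats extend the same chain with an encrypted challenge that only the named third party can discharge; that part is omitted here for brevity.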
The basic use of Macaroons in Onedata resembles the OAuth model. Upon authentication in Onezone and redirection to a specific storage provider, the provider receives an authorization token in the form of a serialized Macaroon. The Macaroon is time-restricted but intended to be long-lived, and has an additional third-party caveat that requires the bearer to present proof that the user is authenticated. Before using the credentials, the storage provider first obtains this proof from Onezone. The proof is valid for a short period of time and has to be reacquired when it expires. The storage provider can interact with Onezone on the user's behalf only with both the Macaroon and a valid proof of authentication. Another use case of Macaroons in Onedata is authorizing native clients. In this case, the client is given only the long-lived token, without the authentication caveat. Note that the authentication caveat serves to ensure that the client's actions are authorized not only by the authorization server (Onezone) but also by the user. In the command-line client's case, the client is under the full and constant control of the user and thus does not require reauthentication. However, the native client connects to a storage provider, which also requires authorization with Onezone to function properly and, unlike the native client, is outside the user's control. To mitigate the risk to the user, the native client delegates its authorization to the storage provider via a Macaroon with a short expiration time, and refreshes the authorization periodically. The flow of authorization for the native client is shown in Fig. 2.
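The split between a long-lived token and a short-lived, reacquirable proof of authentication can be sketched as follows. This is a toy HMAC model with an assumed 60-second lifetime; real Onedata uses Macaroon discharge proofs, and the provider would present the proof to Onezone rather than share a key with it as done here for brevity:

```python
import hmac
import hashlib

ONEZONE_KEY = b"onezone-secret"
PROOF_TTL = 60  # seconds a proof of authentication stays valid (assumed)

def issue_proof(token_id: str, now: float) -> tuple:
    """Onezone signs (token_id, timestamp) while the user session is live."""
    ts = str(int(now)).encode()
    sig = hmac.new(ONEZONE_KEY, token_id.encode() + b"|" + ts,
                   hashlib.sha256).hexdigest()
    return (int(now), sig)

def accept(token_id: str, proof: tuple, now: float) -> bool:
    """Accept the long-lived token only alongside a fresh, matching proof.

    In the real system this check happens at Onezone; the toy model
    shares ONEZONE_KEY with the checker to keep the sketch short.
    """
    issued_at, sig = proof
    ts = str(issued_at).encode()
    good = hmac.new(ONEZONE_KEY, token_id.encode() + b"|" + ts,
                    hashlib.sha256).hexdigest()
    fresh = now - issued_at <= PROOF_TTL
    return fresh and hmac.compare_digest(good, sig)

p = issue_proof("macaroon-abc", now=1000.0)
assert accept("macaroon-abc", p, now=1030.0)      # within TTL: accepted
assert not accept("macaroon-abc", p, now=2000.0)  # expired: must reacquire
```

The effect is the one described in the text: once the user stops using the system, proofs stop being issued, and the long-lived token alone grants nothing.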
Both the third-party authentication caveat for the web-based interface and the short-lived authorization Macaroon delegated by the native client enable the whole system to work on the user's behalf while the user is actively using the system, and to revoke the authorization when the user stops using it, leaving the user's identity under their control.
Figure 2: An overview of the authorization flow for a native client of Onedata.
Macaroon-based
authorization allows for refinement of the granted access, and thus tightening of security, in future versions of the subsystem. Macaroons can also leverage asymmetric encryption to enable third parties to determine whether the credentials are valid. This mechanism could be used by the storage provider to independently verify a given credential before, or even after, using it for authorization with Onezone. For example, a Macaroon might contain an access-type=read-only caveat that would be checked by the storage provider before a write operation. Other possible refinements include restricting Macaroons to a specific space, a specific time period, or a given pool of storage providers.
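A provider-side check of such a restriction caveat might look like this sketch; the caveat syntax and operation names are assumptions extrapolated from the access-type=read-only example above:

```python
def allowed(operation: str, caveats: list) -> bool:
    """Check an operation against restriction caveats carried by a token.

    Hypothetical caveat forms, following the examples in the text:
    'access-type=read-only', 'space=<space-id>'. An access-type=read-only
    caveat blocks any mutating operation before it reaches the storage.
    """
    for caveat in caveats:
        key, _, value = caveat.partition("=")
        if key == "access-type" and value == "read-only" \
                and operation in ("write", "truncate", "unlink"):
            return False
    return True

token_caveats = ["access-type=read-only", "space=space-123"]
assert allowed("read", token_caveats)
assert not allowed("write", token_caveats)
```

Since caveats can only ever be added, a provider applying this check never grants more than the token's issuer intended, even without contacting Onezone.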
3.3 Groups and privileges mechanism
Existing data management systems tend to be either very complex, targeted at large user communities and with a steep learning curve, or very basic solutions, typically targeted at the long tail of science. In order to provide a unified data management solution which can scale from small user groups to large user communities, we have implemented a flexible nested-group mechanism. Its usability is best illustrated by the privileges system in Onedata. Privileges are fine-grained and concern the members of a specific resource, constraining the rights of those members towards the resource. For example, each member of a space can be individually granted (or revoked) the privileges to modify the space, invite new members, delete the space, or write data within the space, among others. The memberships and privileges of every user are crucial information which influences low-level decisions, e.g. whether a given user can write or read a certain file in a certain space.
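The fine-grained privilege model can be sketched as per-member privilege sets on a resource; the privilege names below are illustrative and may not match Onedata's actual privilege identifiers:

```python
# Illustrative privilege names; actual Onedata privileges may differ.
SPACE_PRIVILEGES = {"modify_space", "invite_member", "delete_space",
                    "write_data", "read_data"}

space_members = {
    # user ID -> privileges granted (and revocable) individually
    "alice": {"read_data", "write_data", "invite_member"},
    "bob":   {"read_data"},
}

def has_privilege(user: str, privilege: str) -> bool:
    """Low-level decision point: may this member act on the resource?"""
    assert privilege in SPACE_PRIVILEGES, "unknown privilege"
    return privilege in space_members.get(user, set())

assert has_privilege("alice", "write_data")
assert not has_privilege("bob", "write_data")   # bob is read-only here
assert not has_privilege("carol", "read_data")  # not a member at all
```

Every low-level file operation ultimately reduces to a membership-plus-privilege lookup of this shape, which is what makes the model cheap to evaluate at scale.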
Groups enable collaboration among users and other groups that wish to have shared access to some data sets, and that should share memberships and privileges towards other resources. For instance, a group can itself become a member of a space, in which case all members of the group inherit the group's privileges for the space. This way, by adding a single group to the space and setting the proper privileges, the administrator can effectively set the privileges of a large pool of users. To achieve diversification of privileges among members, more groups can
5
Michaƚ Wrzeszcz et al. / Procedia Computer Science 108C (2017) 445–454 449
Effective and Scalable Data Access Control . . . Wrzeszcz, Opiola, Zemek, . . .
Onedata utilizes the OpenID and OpenID Connect (based on OAuth 2.0) standards to provide easy and unified identity management. From the users' point of view, this simplifies the registration and login process, as they can use one of their existing institutional or social accounts. The minimum required information is the email address, served by virtually any OpenID provider. Users can connect multiple OpenID accounts to an already existing account in Onedata, which gives them more login methods. Onezone serves as the account management center for users, where they can personalize their settings and authentication methods, or obtain client tokens (see 3.2) to authorize operations on their behalf across the whole system.
Internally, identity management is the responsibility of Onezone, which is the authentication and authorization center for all storage providers and users in a federation. Support for concrete OpenID providers is extendable via plugins and configurable, which makes it easy to widen the range of supported providers or to customize the available authentication methods for each instance of Onezone independently. Onedata also supports basic (login/password) authentication, which is mostly targeted at system administrators or small isolated deployments. Upon registration, a new user is given a unique ID, which is used universally in the system from then on. By storing the user identifiers obtained from OpenID providers (subject ids), Onezone can easily map OpenID accounts onto the unique user ID. Later, when access to resources or files is negotiated, this ID is used for privilege verification (see section 3.3).
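The account-linking scheme described above can be sketched as a small registry keyed by (provider, subject id) pairs. The class and method names below are illustrative, not Onezone's actual API.

```python
# Sketch (not Onedata's implementation): mapping external OpenID identities,
# keyed by (provider, subject id), onto one internal user ID.
import uuid

class IdentityRegistry:
    def __init__(self):
        self._accounts = {}   # (provider, subject_id) -> internal user ID
        self._users = {}      # internal user ID -> profile dict

    def login(self, provider, subject_id, email):
        """Return the internal user ID, creating a new user on first login."""
        key = (provider, subject_id)
        if key not in self._accounts:
            user_id = uuid.uuid4().hex
            self._users[user_id] = {"email": email, "accounts": [key]}
            self._accounts[key] = user_id
        return self._accounts[key]

    def link(self, user_id, provider, subject_id):
        """Attach another OpenID account to an already existing user."""
        key = (provider, subject_id)
        self._accounts[key] = user_id
        self._users[user_id]["accounts"].append(key)

registry = IdentityRegistry()
uid = registry.login("github", "12345", "user@example.com")
registry.link(uid, "google", "67890")
# Both external accounts now resolve to the same internal ID:
assert registry.login("google", "67890", "user@example.com") == uid
```

Privilege checks then operate only on the internal ID, regardless of which external account was used to authenticate.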
3.2 Macaroon-based bearer tokens
Internally, the Onedata system delegates authority through the use of Macaroons [2]. Macaroons are a type of bearer credentials that leverage chained MACs (message authentication codes) to allow the holder to add new caveats - contextual confinements that limit the scope or degree of the authorization. In particular, Macaroons allow adding third-party caveats that can only be satisfied by presenting a macaroon-bound proof from the specified third party. All conditions imposed on the credentials - including those added later by subsequent credential holders - are verified by the authorizing party on each authorization request.
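The chained-MAC construction can be illustrated with a minimal stdlib sketch (first-party caveats only; real Macaroons also support third-party caveats, serialization and key management, and this is not the Onedata implementation):

```python
# Minimal illustration of the chained-MAC idea behind Macaroons.
import hmac, hashlib

def mac(key, msg):
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()

def mint(root_key, identifier):
    # The signature chain starts from a MAC of the identifier under the root key.
    return {"id": identifier, "caveats": [], "sig": mac(root_key, identifier)}

def add_caveat(m, caveat):
    # Any holder can attenuate the credential: the new signature is a MAC of
    # the caveat keyed with the previous signature, so caveats cannot be removed.
    return {"id": m["id"], "caveats": m["caveats"] + [caveat],
            "sig": mac(m["sig"], caveat)}

def verify(m, root_key, caveat_holds):
    # Only the authorizing party knows root_key; it replays the chain and
    # checks every caveat against the context of the current request.
    sig = mac(root_key, m["id"])
    for caveat in m["caveats"]:
        if not caveat_holds(caveat):
            return False
        sig = mac(sig, caveat)
    return hmac.compare_digest(sig, m["sig"])

root_key = b"onezone-root-key"
m = add_caveat(mint(root_key, "token-for-user-1"), "access-type = read-only")
assert verify(m, root_key, lambda c: True)
# Stripping a caveat breaks the signature chain:
stripped = {"id": m["id"], "caveats": [], "sig": m["sig"]}
assert not verify(stripped, root_key, lambda c: True)
```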
The basic use of Macaroons in Onedata resembles the OAuth model. Upon authentication in Onezone and redirection to a specific storage provider, the provider receives an authorization token in the form of a serialized Macaroon. The Macaroon is time-restricted but intended to be long-lived, and has an additional third-party caveat that requires the bearer to present proof that the user is authenticated. Before using the credentials, the storage provider first obtains the proof from Onezone. The proof is valid for a short period of time and has to be reacquired when it expires. The storage provider can interact with Onezone in the user's name only with both the Macaroon and a valid proof of authentication. Another use case of Macaroons in Onedata is authorizing native clients. In this case, the client is given only the long-lived token, without the authentication caveat. Note that the authentication caveat serves to ensure that the client's actions are authorized not only by the authorization server (Onezone) but also by the user. In the command line client's case, the client is under the full and constant control of the user and thus does not require reauthentication. However, the native client connects to a storage provider, which also requires authorization with Onezone to function properly and, unlike the native client, is outside of the user's control. To mitigate the risk to the user, the native client delegates its authorization to the storage provider via a Macaroon with a short expiration time, and refreshes the authorization periodically. The flow of authorization for the native client is shown in Fig. 2.
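The short-expiry delegation pattern can be sketched as follows, with the expiry modeled as a signed timestamp rather than a real Macaroon caveat; all names and the trust setup are illustrative:

```python
# Sketch of short-lived delegation: the native client issues a token with a
# near-future expiry and re-issues it periodically, so the storage provider's
# delegated authority lapses quickly if the client stops refreshing.
import hmac, hashlib, time

CLIENT_KEY = b"native-client-secret"

def delegate(now, ttl=60):
    expiry = str(int(now + ttl))
    sig = hmac.new(CLIENT_KEY, expiry.encode(), hashlib.sha256).hexdigest()
    return {"expires": expiry, "sig": sig}

def accept(token, now):
    expected = hmac.new(CLIENT_KEY, token["expires"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token["sig"]) and now < int(token["expires"])

t0 = time.time()
token = delegate(t0)
assert accept(token, t0 + 30)        # still fresh
assert not accept(token, t0 + 120)   # expired; authority has lapsed
token = delegate(t0 + 120)           # periodic refresh restores access
assert accept(token, t0 + 150)
```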
Figure 2: An overview of the authorization flow for a native client of Onedata.

Both the third-party authentication caveat for the web-based interface and the short-lived authorization Macaroon delegated by the native client enable the whole system to work on the user's behalf while the user is actively using the system, and to revoke the authorization when the user stops using it, leaving the user's identity under their control. Macaroon-based authorization allows for refinement of granted access, and thus tightening of security, in future versions of the subsystem. Macaroons make it possible to leverage asymmetric encryption to enable third parties to determine whether the credentials are valid. This mechanism could be used by the storage provider to independently verify a given credential before, or even after, using it for authorization with Onezone. For example, the Macaroon might contain an access-type=read-only caveat that would be checked by the storage provider before a write operation. Examples of other possible refinements include restricting Macaroons to a specific space, a specific time period or a given pool of storage providers.
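Since caveats travel in plain text inside the credential, such a local pre-check could look roughly like this; the access-type caveat follows the example in the text, while the space and provider caveat keys are hypothetical:

```python
# Sketch: a storage provider screening a credential's plaintext caveats
# locally before attempting an operation, without contacting Onezone.
def screen_request(caveats, request):
    checks = {
        "access-type": lambda v: v != "read-only" or request["op"] == "read",
        "space": lambda v: v == request["space"],
        "provider": lambda v: request["provider"] in v.split(","),
    }
    for caveat in caveats:
        key, _, value = caveat.partition("=")
        check = checks.get(key)
        if check is not None and not check(value):
            return False          # credential cannot authorize this request
    return True                   # proceed to full authorization with Onezone

req = {"op": "write", "space": "Space1", "provider": "provider-A"}
caveats = ["access-type=read-only", "space=Space1",
           "provider=provider-A,provider-B"]
assert not screen_request(caveats, req)                # write blocked locally
assert screen_request(caveats, dict(req, op="read"))   # read may proceed
assert not screen_request(caveats, dict(req, op="read", space="Space2"))
```

Note this only rejects requests early; acceptance still requires the full MAC-chain verification by the authorizing party.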
3.3 Groups and privileges mechanism
Existing data management systems tend to be either very complex solutions targeted at large user communities, with a steep learning curve, or very basic solutions, typically targeted at the long tail of science. In order to provide a unified data management solution that can scale from small user groups to large user communities, we have implemented a flexible mechanism based on nested groups. Its usability is best justified by the privileges system in Onedata. Privileges are fine-grained and concern the members of a specific resource, constraining the rights of the members towards the resource. For example, each member of a space can be individually granted (or revoked) the privileges to modify the space, invite new members, delete the space, or write data within the space (among others). The memberships and privileges of every user are crucial information, which influences low-level decisions, e.g. whether a given user can write or read a certain file in a certain space.
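The four example privileges (abbreviated M/I/W/D in Fig. 3) can be modeled as a flag set per member; the identifier names below are illustrative, not Onedata's actual privilege names:

```python
# Sketch of fine-grained per-member space privileges as a flag set.
from enum import Flag, auto

class SpacePrivilege(Flag):
    MODIFY = auto()
    INVITE_MEMBERS = auto()
    WRITE_DATA = auto()
    DELETE = auto()

# Each member of the space carries an individually granted set of rights.
members = {
    "user1": SpacePrivilege.MODIFY | SpacePrivilege.INVITE_MEMBERS
             | SpacePrivilege.WRITE_DATA,
    "user2": SpacePrivilege.INVITE_MEMBERS | SpacePrivilege.WRITE_DATA,
}

def can(user, privilege):
    return privilege in members.get(user, SpacePrivilege(0))

assert can("user1", SpacePrivilege.MODIFY)
assert not can("user2", SpacePrivilege.DELETE)
# Revoking is a set difference on the member's flags:
members["user1"] &= ~SpacePrivilege.WRITE_DATA
assert not can("user1", SpacePrivilege.WRITE_DATA)
```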
Groups enable collaboration among users and other groups that wish to have shared access to some data sets and should share memberships and privileges towards other resources. For instance, a group can itself become a member of a space. In this case, all members of the group inherit the privileges of the group for the space. This way, by adding a single group to the space and setting proper privileges, the administrator can effectively set privileges for a large pool of users. To achieve diversification of privileges among members, more groups can
be added. There are more resources besides spaces that use the privileges system, and their privileges are analogous to space privileges (modify, invite members, delete, etc.). These resources include groups, handle services and handles (for instance Digital Object Identifiers). Handle services and handles are resources connected with Open Data publications, and they also have members (users or groups) with associated privileges. Besides privileges associated with specific resources, there are also general privileges enabling special features in Onezone. They are typically granted to system administrators and include the rights to view and modify resources in the system. These privileges can be granted to a user or a group of users. The group system in Onedata is very flexible and allows for creating complicated structures of nested groups. In fact, groups can form an arbitrary graph, where cycles are allowed. Nevertheless, such an unconstrained approach has one significant pitfall: how can the privileges of a given user towards a resource be verified efficiently when the user might belong to it via a long chain of nested groups? The naive approach would be
when he might belong to it via a long chain of nested groups? The naive approach would be
to analyse the graph of relations every time. However, resources can be accessed with high
frequency (thousands of requests per second), especially because the privileges must be checked
during every file-system operation - thus an efficient solution is required. In Onedata, we
observed that the relations graph is not modified often (adding/removing relations or updating
privileges) - in fact entire organizations can run on the same group membership setup for
months, with single users joining or leaving groups occasionally. Considering this, we devised
an algorithm where the relations graph is analysed incrementally to collect information about
direct and indirect memberships and privileges, which we call effective relations and effective
privileges (see Fig. 3). The algorithm operates on a graph of entities (users, groups, spaces and other related resources).

Figure 3: Simplified entity graph with pre-calculated effective members and their privileges.

When a relation changes, a recalculation is scheduled which
analyses only the affected entities. If the effective relations of an entity have changed because of the update, all adjacent entities are analysed recursively. The process spans wider and wider until all changes have been propagated. This way, shortly after each update, we obtain a graph of entities where every entity carries pre-calculated information about all its effective relations and privileges. Thanks to this approach, verifying whether a user has a given privilege towards a resource is reduced to looking up a single record and its effective privileges. Most importantly, this ensures very low overheads at the file-system level. To enrich the functionality of groups, Onedata introduces roles - an attribute of each group defining the characteristics of
its members. There are several available roles:
- role - the simplest group type, associating members holding a certain role in arbitrary organizations,
- team - a group of members that form a team,
- unit - a group of members that belong to the same administrative unit,
- organization - a group associating multiple units (a virtual organization).
Roles allow for creating clear and orderly group structures for easier maintenance.
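The incremental recomputation of effective relations described in this section can be sketched as follows. The data model is deliberately simplified: privileges are plain string sets and are intersected along a membership chain, which is an assumption made for illustration rather than Onedata's exact semantics.

```python
# Sketch: each entity caches its effective user members; when a relation
# changes, only affected entities are recomputed and changes propagate
# outward through parent entities until the caches stabilize.
def recompute(entity, graph, cache):
    """Recompute one entity's effective users from its direct members."""
    effective = {}
    for member, privs in graph[entity]:
        if member in graph:          # a nested group: inherit its effective users
            for user, inner in cache.get(member, {}).items():
                effective[user] = effective.get(user, set()) | (privs & inner)
        else:                        # a direct user member
            effective[member] = effective.get(member, set()) | privs
    return effective

def update_relation(graph, cache, parents, changed):
    """Propagate a change: recompute the entity, then its parents if needed."""
    queue = [changed]
    while queue:
        entity = queue.pop()
        new = recompute(entity, graph, cache)
        if cache.get(entity) != new:     # spread wider only if something changed
            cache[entity] = new
            queue.extend(parents.get(entity, []))

# Space S has group G with {read, write}; G contains user u1 with full rights.
graph = {"G": [("u1", {"read", "write", "delete"})],
         "S": [("G", {"read", "write"})]}
parents = {"G": ["S"]}                   # S must be revisited when G changes
cache = {}
update_relation(graph, cache, parents, "G")
assert cache["S"]["u1"] == {"read", "write"}
graph["S"] = [("G", {"read"})]           # revoke write on the G -> S relation
update_relation(graph, cache, parents, "S")
assert cache["S"]["u1"] == {"read"}      # only the affected cache was touched
```

After propagation, a privilege check is a single dictionary lookup in the cached effective members of the resource, which matches the low-overhead property claimed above.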
3.4 Mapping user identities to local accounts
In order to enable direct mapping between global user identities registered in the Onezone service and local storage user identities, Onedata provides an extensible mechanism called Local User MApping (LUMA) [16]. It allows site administrators to provide a simple RESTful service (or use our reference implementation) which returns a mapping from the global user identity, as registered in the Onezone service, to a local user account, which can be storage-system specific. Currently, LUMA supports mapping to the following storage systems: Unix uid/gid identifiers, Amazon S3, OpenStack Swift and Ceph, but more storage systems can easily be integrated by site administrators. An example mapping returned by this service is presented below:
{
"storageId" : "a5ec372b-9f47-44e2-8d98-87d62f055a12",
"storageType" : "POSIX",
"spaceName" : "Space1",
"userDetails" : {
"name" : "User One",
"connectedAccounts" : [ ],
"alias" : "user.one",
"emailList" : [ "user@example.com"]
}
}
The LUMA mechanism also supports the Onedata feature that allows multiple external identity providers (e.g. Facebook, Google, GitHub) to be connected to a single user identity in the system, allowing users to authenticate with several identity providers depending on their context.
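For illustration, a storage provider consuming such a mapping document might process it as follows; the response body mirrors the example above, while how the LUMA endpoint is addressed is site-specific:

```python
# Parsing a LUMA mapping response (fields taken from the example in the text).
import json

luma_response = """{
  "storageId": "a5ec372b-9f47-44e2-8d98-87d62f055a12",
  "storageType": "POSIX",
  "spaceName": "Space1",
  "userDetails": {
    "name": "User One",
    "connectedAccounts": [],
    "alias": "user.one",
    "emailList": ["user@example.com"]
  }
}"""

mapping = json.loads(luma_response)
assert mapping["storageType"] == "POSIX"
assert mapping["userDetails"]["alias"] == "user.one"
# A provider would use storageId to select the backend and userDetails to
# resolve the local account (e.g. a uid/gid pair on a POSIX storage).
```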
4 Related work
Several data management solutions have emerged that try to deal with the increasing requirements of user applications in terms of large-scale data processing, several of which address the needs of scientific Grid computing infrastructures [10, 11, 13].
ownCloud [14] is an open-source framework for creating self-managed file hosting services similar to Dropbox, i.e. sync-and-share. It enables organizations to maintain full control over data location and transfers, while hiding the underlying storage infrastructure, which can be composed of multiple storage resources. The main features of ownCloud include abstracting file storage available through directory structures or WebDAV, file synchronization between various operating systems, user group administration, sharing of files using public URLs, online text editing, viewers for various file formats, and support for external Cloud storage services (e.g. Dropbox or Google Drive). The Integrated Rule-Oriented Data System (iRODS) [20] is open source data management software used to manage and take control of users' data regardless of the device used to store the data. Its main features include data discovery using a triple-based metadata catalog; support for data workflows, with a rule engine allowing any action to be initiated by any trigger on any server or client in the grid; secure collaboration; and data virtualization, allowing access to distributed storage assets under a unified namespace and freeing organizations from lock-in to single-vendor storage solutions. Distributed, parallel filesystems such
as Lustre [12] or Ceph [24] can be classified as high-performance data access solutions. They are mature and widely used solutions, designed especially for single data centres that maintain locally distributed data on multiple storage systems. Globus Connect [4] is a client-server solution allowing users and researchers to use the Globus transfer service. It simplifies the creation of Globus endpoints - the different locations where data can be moved to or from using the Globus service. It is free to install and use for users at non-profit research and education institutions.
An emerging requirement for data management systems is support for open data publishing, in particular easy integration with open access services such as DataCite [5] or OpenAIRE [19]. These services rely on established standards such as OAI-PMH [22], which enable them to integrate with the existing platforms for publication metadata harvesting, and identify datasets through globally unique handles such as DOIs [9] or PIDs. However, while these services enable discovery and identification of open data sets, they do not directly address the issue of end users accessing the underlying data. Moreover, the publication of a data set often involves publishing a URL where the data set is available, along with the DOI or PID used for its resolution.
With respect to authentication and authorization methods, access to data management systems has classically been based on X.509 certificates and their extensions for role and attribute information [23, 3]. However, several new mechanisms have evolved recently, mainly addressing the need for easy-to-use and secure single sign-on identity management and authorization. OpenID Connect [15] is a simple authentication mechanism which allows users to be identified by remote clients based on an authentication to an OIDC provider. SAML 2.0, on the other hand, is a protocol for exchanging both authentication and authorization security tokens, which can contain various authorization and identity assertions [18]. In federated data management systems, a common problem is the mapping of global user identities to local user accounts within the storage systems. So far this has been addressed using solutions such as local mapping files, which raised several administrative issues [1].
The choice of tools and systems for distributed data management is wide and diversified, but typically they offer only selected features and are not able to comprehensively address the needs of users operating in organizationally distributed environments. This is summarized in Table 1. The innovative approach of Onedata is to fulfill all the presented requirements within a single, unified platform.
Classification                            Examples            Disadvantages
File synchronization services             ownCloud, Dropbox   Limits on storage size and transfer speed
Services for fast data movement           Globus Connect      Lack of location transparency
High-performance parallel file systems    Lustre, Ceph        Centralized management
Widely distributed data storage systems   iRODS               Manual management of data location and low efficiency

Table 1: Summary of existing data management solutions
5 Conclusions
In this paper we have presented the Onedata distributed data management platform and its support for effective data access authentication and authorization in a distributed storage system.
Onedata has a very strong focus on enabling users to easily and securely access and share their data, regardless of whether they work in small teams or in large international collaborations. At the same time, Onedata ensures that storage system administrators have full control over their storage resources. Performance tests conducted in the PLGrid production environment also confirmed that Onedata offers good data access performance [25]. These features were made possible by the development of a flexible authentication and authorization mechanism based on OpenID Connect and Macaroons. Onedata is targeted at global, highly distributed environments and was developed to support a large data scale and user base. Performance scalability is achieved thanks to advanced block replication mechanisms. Files are split into blocks, and only the required blocks are replicated to the site where the data is processed. Local access to blocks ensures maximum efficiency, while the blocks are simultaneously synchronized and available globally.
The main novelty achieved by the Onedata platform in the context of data access control lies in the provision of a unified data access control mechanism for diversified types of user communities, scalable from small research groups to large international communities. Privileges can be managed easily thanks to the automatic computation of the effective group memberships and effective privileges of each user, implemented using fast lookups of the user and group graph structure to ensure low overheads irrespective of user base growth. All data access requests are independently authorized, which ensures that the data can remain secure even at the level of the underlying storage systems.
Currently, Onedata is being used in several international projects and initiatives, including PLGrid, EGI-Engage and INDIGO-DataCloud, and is used as the basis for EGI DataHub [7], a public service for provisioning large reference data sets. Recently, it has also been accepted for the second phase of the Helix Nebula Science Cloud procurement, enabling high-throughput scientific data processing on commercial Cloud infrastructures [8].
Future work will include the integration of a SAML 2.0 identity service, enabling integration with additional community identity providers, and the implementation of a P2P mechanism for establishing trust between different zones.
Acknowledgements This work has been partially funded under Horizon 2020 EU projects: INDIGO-
DataCloud (Project ID: 653549) and EGI-Engage (Project ID: 654142). RS and JK are grateful for AGH-UST
grant no. 11.11.230.124. LO is grateful for his doctoral grant at AGH-UST.
References
[1] Alfieri, R., Cecchini, R., Ciaschini, V., dell’Agnello, L., Frohner, A., Lorentey, K., and Spataro, F.
From gridmap-file to VOMS: managing authorization in a Grid environment. Future Generation
Comp. Syst., 21(4):549–558, 2005.
[2] Birgisson, A., Politz, J. G., Erlingsson, U., Taly, A., Vrable, M., and Lentczner, M. Macaroons:
Cookies with contextual caveats for decentralized authorization in the cloud. In NDSS. The
Internet Society, 2014.
[3] Chadwick, D. W., Otenko, A., and Ball, E. Role-based access control with x.509 attribute certifi-
cates. IEEE Internet Computing, 7(2):62–69, 2003.
[4] Chard, K., Pruyne, J., Blaiszik, B., Ananthakrishnan, R., Tuecke, S., and Foster, I. Globus data
publication as a service: Lowering barriers to reproducible science. In 11th IEEE International
Conference on eScience, 2015.
[5] DataCite. DataCite: helping you to find, access, and reuse research data, 2011. http://datacite.org.
[6] Dutka, L., Wrzeszcz, M., Lichoń, T., Slota, R., Zemek, K., Trzepla, K., Opiola, L., Slota, R. G., and Kitowski, J. Onedata - a step forward towards globalization of data access for computing
Michaƚ Wrzeszcz et al. / Procedia Computer Science 108C (2017) 445–454 453
Effective and Scalable Data Access Control . . . Wrzeszcz, Opiola, Zemek, . . .
as Lustre [12] or Ceph [24] can be classified as high performance data access solutions. They
are mature and widely used solutions designed especially for single data centres that maintain
locally distributed data on multiple storage systems. Globus Connect [4] is a client-server solu-
tion allowing users and researchers to use the Globus transfer service. It simplifies the way of
creating Globus endpoints - the different locations where data can be moved to or from using
the Globus service. It is free to install and use for users at non-profit research and education
institutions.
An emerging requirement from data management systems is the support for open data
publishing, in particular to enable easy integration with open access services such as DataCite
[5] or OpenAIRE [19]. These services rely on established standards such as OAI-PMH[22], which
enable them to integrate with the existing platforms for publication metadata harvesting, and
identify datasets through globally unique handles such as DOI[9] or PID. However, while these
services enable discovery and identification of open data sets, they do not address directly the
issue of accessing the underlying data by end users. Moreover, the publication of data sets
often involves publishing a URL, where the dataset is available along with the DOI or PID for
resolution of the data set.
With respect to authentication and authorization methods, classically most authentication
and authorization to data management systems has been based on X.509 certifcates and its
extensions for role and attribute information [23, 3]. However, several new mechanisms have
evolved recently, mainly addressing the need for easy to use and secure single sign on identity
management and authorization. OpenID Connect [15] is a simple authentication mechanism,
which allows users to be identified against remote clients based on an authentiation to a OIDC
provider. On the other hand, SAML 2.0, is a protocol for exchanging both authentication and
authorization security tokens which can contain various authorization and identity assertions
[18]. In federated data management systems, a common problem is mapping of global user
identities to local user accounts within the storage systems. So far this has been addressed
using such solutions as local mapping files, which raised several administrative issues [1].
The choice of tools and systems for distributed data management is wide and diversified, but
typically they offer selective features and are not able to comprehensively address the needs of
users operating in organizationally distributed environments. This is depicted in Table 1. The
innovative approach of Onedata is to fulfill all presented requirements within a single, unified
platform.
Classification Examples Disadvantages
File synchronization services ownCloud, Dropbox Limits on storage size and trans-
fer speed
Services for fast data movement Globus Connect Lack of location transparency
High-performance parallel file
systems Lustre, Ceph Centralized management
Widely distributed data storage
systems iRODS Manual management of data lo-
cation and low efficiency
Table 1: Summary of existing data management solutions
5 Conclusions
In this paper we have presented Onedata distributed data management platform and its sup-
port for effective data access authentication and authorization in a distributed storage system.
8
Effective and Scalable Data Access Control . . . Wrzeszcz, Opiola, Zemek, . . .
Onedata has a strong focus on enabling users to easily and securely access and share their
data, regardless of whether they work in small teams or in large international collaborations. At
the same time, Onedata ensures that storage system administrators retain full control over
their storage resources. Performance tests conducted in the PLGrid production environment
also confirmed that Onedata offers good data access performance [25]. These features were
made possible by the development of a flexible authentication and authorization mechanism based
on OpenID Connect and Macaroons. Onedata is targeted at global, highly distributed environments
and was developed to support large data volumes and a large user base. Performance scalability is
achieved thanks to advanced block replication mechanisms: files are split into blocks, and only
the required blocks are replicated to the site where the data is processed. Local access to blocks
ensures maximum efficiency, while the blocks are simultaneously synchronized and available globally.
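The block replication scheme described above can be sketched as follows. This is a minimal illustrative model only: the class and method names are our own, not the Onedata API, and we assume a fixed block size with on-demand transfer of missing blocks from a remote provider.

```python
# Sketch (hypothetical, not the Onedata API): a file is split into fixed-size
# blocks, and a read replicates only the blocks it actually touches.

BLOCK_SIZE = 4  # deliberately tiny block size for illustration


class ReplicatedFile:
    def __init__(self, remote_data: bytes):
        self.remote = remote_data  # authoritative copy held at a remote site
        self.local = {}            # block index -> locally replicated bytes

    def _fetch(self, idx: int) -> bytes:
        # Replicate a single block on demand from the remote provider.
        start = idx * BLOCK_SIZE
        return self.remote[start:start + BLOCK_SIZE]

    def read(self, offset: int, length: int) -> bytes:
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        for idx in range(first, last + 1):
            if idx not in self.local:  # only missing blocks are transferred
                self.local[idx] = self._fetch(idx)
        data = b"".join(self.local[i] for i in range(first, last + 1))
        skip = offset - first * BLOCK_SIZE
        return data[skip:skip + length]


f = ReplicatedFile(b"abcdefghijklmnop")
assert f.read(5, 6) == b"fghijk"      # spans blocks 1 and 2 only
assert sorted(f.local) == [1, 2]      # blocks 0 and 3 were never transferred
```

Subsequent reads of the same byte range are then served entirely from the local copies, which is what makes processing at the replicating site efficient.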
The main novelty of the Onedata platform in the context of data access control lies in
the provision of a unified data access control mechanism for diverse types of user communities,
scalable from small research groups to large international collaborations. Privileges can be
managed easily thanks to the automatic computation of each user's effective group membership
and effective privileges, implemented using fast lookups over the user and group graph structure
to ensure low overheads irrespective of user base growth. All data access requests are
independently authorized, which ensures that the data remains secure even at the level of the
underlying storage systems.
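The computation of effective group membership and effective privileges can be illustrated with a short sketch. The graph layout, privilege names, and the propagation policy (privileges flow through nested group membership, capped by what each membership edge itself grants) are assumptions made for illustration, not the actual Onedata implementation.

```python
# Illustrative sketch: a user is effectively a member of every group reachable
# through nested membership edges, and the privileges carried along each path
# are capped by the grant on every edge traversed.
from collections import deque

# group -> set of parent groups it is a (nested) member of
parents = {"devs": {"staff"}, "staff": {"org"}, "qa": {"org"}}

# privileges granted by each direct membership edge (member, group)
privileges = {
    ("alice", "devs"): {"read"},
    ("devs", "staff"): {"read", "write"},
    ("staff", "org"): {"read"},
}


def effective(user, direct_groups):
    """BFS over the group graph; returns {group: effective privilege set}."""
    eff = {}
    queue = deque((g, privileges.get((user, g), set())) for g in direct_groups)
    while queue:
        group, privs = queue.popleft()
        if group in eff and privs <= eff[group]:
            continue  # nothing new along this path; also stops cycles
        eff[group] = eff.get(group, set()) | privs
        for parent in parents.get(group, ()):
            # privileges carried into the parent are capped by what the
            # (group, parent) membership edge itself grants
            queue.append((parent, privs & privileges.get((group, parent), set())))
    return eff


result = effective("alice", ["devs"])
assert result == {"devs": {"read"}, "staff": {"read"}, "org": {"read"}}
```

Because the result depends only on the graph, it can be recomputed incrementally when an edge changes and cached for fast per-request lookups, which is how low authorization overheads can be kept independent of user base growth.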
Currently, Onedata is used in several international projects and initiatives, including
PLGrid, EGI-Engage and INDIGO-DataCloud, and serves as the basis for the EGI DataHub [7], a
public service for the provisioning of large reference data sets. Recently, it has also been
accepted for the second phase of the Helix Nebula Science Cloud procurement, for enabling
high-throughput scientific data processing on commercial cloud infrastructures [8].
Future work will include the integration of a SAML 2.0 identity service, enabling integration
with additional community identity providers, and the implementation of a P2P mechanism for
establishing trust between different zones.
Acknowledgements This work has been partially funded under Horizon 2020 EU projects: INDIGO-
DataCloud (Project ID: 653549) and EGI-Engage (Project ID: 654142). RS and JK are grateful for AGH-UST
grant no. 11.11.230.124. LO is grateful for his doctoral grant at AGH-UST.
References
[1] Alfieri, R., Cecchini, R., Ciaschini, V., dell’Agnello, L., Frohner, A., Lorentey, K., and Spataro, F.
From gridmap-file to VOMS: managing authorization in a Grid environment. Future Generation
Comp. Syst., 21(4):549–558, 2005.
[2] Birgisson, A., Politz, J. G., Erlingsson, U., Taly, A., Vrable, M., and Lentczner, M. Macaroons:
Cookies with contextual caveats for decentralized authorization in the cloud. In NDSS. The
Internet Society, 2014.
[3] Chadwick, D. W., Otenko, A., and Ball, E. Role-based access control with X.509 attribute
certificates. IEEE Internet Computing, 7(2):62–69, 2003.
[4] Chard, K., Pruyne, J., Blaiszik, B., Ananthakrishnan, R., Tuecke, S., and Foster, I. Globus data
publication as a service: Lowering barriers to reproducible science. In 11th IEEE International
Conference on eScience, 2015.
[5] DataCite. DataCite: helping you to find, access, and reuse research data, 2011.
http://datacite.org.
[6] Dutka, L., Wrzeszcz, M., Licho´n, T., Slota, R., Zemek, K., Trzepla, K., Opiola, L., Slota, R. G.,
and Kitowski, J. Onedata - a step forward towards globalization of data access for computing
infrastructures. In Koziel, S., Leifsson, L. P., Lees, M., Krzhizhanovskaya, V. V., Dongarra, J.,
and Sloot, P. M. A., editors, ICCS, volume 51 of Procedia Computer Science, pages 2843–2847.
Elsevier, 2015.
[7] EGI. EGI DataHub website, 2016. Available at http://datahub.egi.eu.
[8] HNSciCloud. Helix Nebula Science Cloud website, 2016. Available at
http://www.hnscicloud.eu/.
[9] International DOI Foundation, editor. DOI Handbook. 2012.
[10] Kapanowski, M., Slota, R., and Kitowski, J. Resource storage management model for ensuring
quality of service in the cloud archive systems. Computer Science, 15(1):3–18, 2014.
[11] Korcyl, K., Chwastowski, J., Plazek, J., and Poznanski, P. Selected issues on histograming
on GPUs. Computing and Informatics, 35(2):282–298, 2016.
[12] Lustre. Lustre website, 2016. Available at http://lustre.org/.
[13] Marco, J. et al. The interactive european grid: Project objectives and achievements. Computing
and Informatics, 27(2):161–171, 2008.
[14] Martini, B. and Choo, R. Cloud storage forensics: ownCloud as a case study. Digital Investigation,
10(4):287–299, 2013.
[15] Mladenov, V., Mainka, C., Krautwald, J., Feldmann, F., and Schwenk, J. On the security of
modern single sign-on protocols: Openid connect 1.0. CoRR, abs/1508.04324, 2015.
[16] Onedata. Local User Mapping service documentation, 2016. Available at
https://onedata.org/docs/doc/administering_onedata/luma.html.
[17] Onedata. Onedata project website, 2016. Available at http://onedata.org.
[18] Organization for the Advancement of Structured Information Standards. Security Assertion
Markup Language (SAML) v2.0, 2005.
[19] Rettberg, N. and Principe, P. Paving the way to open access scientific scholarly information:
Openaire and openaireplus. In Baptista, A. A., Linde, P., Lavesson, N., and de Brito, M. A.,
editors, International Conference on Electronic Publishing, ELPUB. IOS Press, 2012.
[20] Roblitz, T. Towards implementing virtual data infrastructures - a case study with iRODS. Com-
puter Science, 13(4):21–34, 2012.
[21] SNIA. Cloud Data Management Interface. Technical report, April 2010. Available at
http://www.snia.org/cdmi.
[22] Van de Sompel, H., Nelson, M., Lagoze, C., and Warner, S. Resource harvesting within the
OAI-PMH framework. D-Lib Magazine, 10(12), 2004.
[23] Venturi, V., Stagni, F., Gianoli, A., Ceccanti, A., and Ciaschini, V. Virtual organization man-
agement across middleware boundaries. In eScience, pages 545–552. IEEE Computer Society,
2007.
[24] Weil, S. A., Brandt, S. A., Miller, E. L., Long, D. D. E., and Maltzahn, C. Ceph: A scalable, high-
performance distributed file system. In Proceedings of the 7th Symposium on Operating Systems
Design and Implementation (OSDI), pages 307–320, 2006.
[25] Wrzeszcz, M., Trzepla, K., Słota, R., Zemek, K., Lichoń, T., Opioła, Ł., Nikolow, D., Dutka, Ł.,
Słota, R., and Kitowski, J. Metadata organization and management for globalization of data access
with Onedata. In Wyrzykowski, R. et al., editors, Parallel Processing and Applied Mathematics:
11th Intnl. Conf., PPAM 2015, Krakow, Poland, September 6-9, 2015. Revised Selected Papers,
Part I, pages 312–321, Cham, 2016. Springer International Publishing.