Platform overview with the three phases of the protocol: data-preparation phase, data-discovery phase, and data-access phase.

Source publication
Preprint
The growing number of health-data breaches, the use of genomic databases for law enforcement purposes and the lack of transparency of personal-genomics companies are raising unprecedented privacy concerns. To enable a secure exploration of genomic datasets with controlled and transparent data access, we propose a novel approach that combines crypto...

Contexts in source publication

Context 1
... proposed platform provides these privacy and security guarantees and enables data-discovery analyses and, eventually, access to individual-level genetic data through a secure protocol that synergistically combines all the above-mentioned techniques. This protocol consists of three phases that involve all the parties depicted in Figure 3. ...
Context 2
... data preparation phase consists of four steps, also represented by the sequence diagram in Figure S3. For the sake of the presentation and without loss of generality, we describe the different steps of this phase by considering only a single data provider and a single storage unit. ...

Similar publications

Article
With the advent of cloud computing, low-cost, high-capacity cloud storage has attracted people to move their data from local computers to remote facilities. People can access and share their data with others at any time, from anywhere. However, the convenience of cloud storage also comes with new problems and challenges. This paper inv...

Citations

... Moreover, blockchain technology is being used in many industries from supply chain management to the media and entertainment business. Naturally, there is interest in the capabilities of this technology in genomics as blockchain technology can establish verified and public proof of data ownership [5,6]. However, no one has yet figured out how to store a large amount of data, such as the read stack observed in genome sequencing, in a blockchain. ...
Article
There are major efforts underway to make genome sequencing a routine part of clinical practice. A critical barrier to these is achieving practical solutions for data ownership and integrity. Blockchain provides solutions to these challenges in other realms, such as finance. However, its use in genomics is stymied due to the difficulty in storing large-scale data on-chain, slow transaction speeds, and limitations on querying. To overcome these roadblocks, we developed a private blockchain network to store genomic variants and reference-aligned reads on-chain. It uses nested database indexing with an accompanying tool suite to rapidly access and analyze the data.
... At present, costs are also incurred by moving vast amounts of data around. It appears likely that this will be replaced by a growing emphasis on moving algorithms to the data rather than the present system of moving data to the algorithms [108]. As applications are generally much smaller than the data they process, this is in theory much more efficient. ...
Technical Report
Report for the European Commission on technical aspects of digital sequence information. The report adopts a sustainable social ecological systems approach to the analysis of DSI and argues that biodiversity is not free but has to be paid for. On that basis any option on DSI should generate revenue for biodiversity. The report analyses the relationship between samples and sequence data using the US NCBI Biosamples database before turning to consideration of open licences for DSI, blockchain and the possibility of infrastructure based fees from cloud computing to generate income for biodiversity. The paper concludes by arguing for a framework approach to DSI that generates revenue for investment into biodiversity from multiple income generating options.
... Recently, there has been increasing interest in blockchain technology for genome privacy and security [125–137] and, more recently, for the use of transcriptomic data in artificial intelligence (AI) models [109]. Blockchain has several key properties, including a decentralized, distributed architecture and cryptographic protocols that yield immutability, that is, data integrity and security [138]. ...
Article
The generation of functional genomics data by next-generation sequencing has increased greatly in the past decade. Broad sharing of these data is essential for research advancement but poses notable privacy challenges, some of which are analogous to those that occur when sharing genetic variant data. However, there are also unique privacy challenges that arise from cryptic information leakage during the processing and summarization of functional genomics data from raw reads to derived quantities, such as gene expression values. Here, we review these challenges and present potential solutions for mitigating privacy risks while allowing broad data dissemination and analysis. This Perspective highlights privacy issues related to the sharing of functional genomics data, including genotype and phenotype information leakage from different functional genomics data types and their summarization steps. The authors also review the techniques that will enable broad sharing and analysis while maintaining privacy.
Article
Multisite medical data sharing is critical in modern clinical practice and medical research. The challenge is to conduct data sharing that preserves individual privacy and data utility. The shortcomings of traditional privacy-enhancing technologies mean that institutions rely upon bespoke data sharing contracts. The lengthy process and administration induced by these contracts increase the inefficiency of data sharing and may disincentivize important clinical treatment and medical research. This paper provides a synthesis between two novel advanced privacy-enhancing technologies: homomorphic encryption and secure multiparty computation (defined together as multiparty homomorphic encryption). These privacy-enhancing technologies provide a mathematical guarantee of privacy, with multiparty homomorphic encryption providing a performance advantage over separately using homomorphic encryption or secure multiparty computation. We argue that multiparty homomorphic encryption fulfills legal requirements for medical data sharing under the European Union’s General Data Protection Regulation, which has set a global benchmark for data protection. Specifically, the data processed and shared using multiparty homomorphic encryption can be considered anonymized data. We explain how multiparty homomorphic encryption can reduce the reliance upon customized contractual measures between institutions. The proposed approach can accelerate the pace of medical research while offering additional incentives for health care and research institutes to employ common data interoperability standards.