A Hybrid Privacy-Preserving Deep Learning Approach for Object Classification in Very High-Resolution Satellite Images

Abstract

Deep Learning (DL) has been applied to many Remote Sensing (RS) applications with excellent performance, and it is becoming a necessary tool for the RS research community. Recently, many cloud infrastructures have been proposed to offer DL architectures as a service. However, this has opened the door to new challenges related to data privacy and security. The RS data used to train DL algorithms have several privacy requirements. Some require a high level of confidentiality, such as satellite images with high spatial resolution that relate to public security. Moreover, even when data are not confidential, they may be subject to copyright, and the owner may strictly refuse to share them. Therefore, Privacy-Preserving Deep Learning (PPDL) techniques are receiving great interest. PPDL ensures that the data used to train DL models can only be accessed by authorized users. In this study, we propose a hybrid PPDL approach for object classification in very high-resolution satellite images. We propose an encryption scheme that combines Paillier Homomorphic Encryption (PHE) and Somewhat Homomorphic Encryption (SHE). The purpose of this combination is to strengthen the encryption of satellite images while ensuring a good runtime and high object detection accuracy. The proposed image encryption method relies on the public keys of both PHE and SHE. Experiments are conducted on a set of real-world high-resolution satellite images acquired by the SPOT6 and SPOT7 satellites, from which we extract a dataset of 28,776 image patches. We considered four different CNN architectures, namely ResNet50, InceptionV3, DenseNet169, and MobileNetV2. The results show that the loss in accuracy after applying our encryption algorithm ranges from 2% to 3.5%, with the best validation accuracy on the encrypted dataset reaching 92%.
REPUBLIC OF TUNISIA
MINISTRY OF HIGHER EDUCATION AND SCIENTIFIC RESEARCH
UNIVERSITY OF JENDOUBA
FACULTY OF LAW, ECONOMIC AND MANAGEMENT SCIENCES OF JENDOUBA
Master Thesis
Presented to obtain the Master's degree in Data, Knowledge and Distributed Systems (Computer Science)
by
Manel Khazri Khlifi
A Hybrid Privacy-Preserving Deep Learning Approach for Object Classification in Very High-Resolution Satellite Images
Defended on 06 May 2022, in front of the jury composed of:
Dr. Sami Zghal President FSJEGJ, Jendouba
Dr. Mokhtar Sellami Examiner FSJEGJ, Jendouba
Pr. Imed Riadh Farah Supervisor ISAMM, Manouba
Dr. Wadii Boulila Co-Supervisor ISAMM, Manouba
Keywords: Privacy-Preserving Deep Learning; Deep Learning; Remote Sensing; Privacy-Preservation; Convolutional Neural Network; Homomorphic Encryption; Paillier Homomorphic Encryption; Somewhat Homomorphic Encryption.
Dedication
With deep gratitude, I dedicate this modest work to the people for whom, whatever words I choose, I could never fully express my sincere love.
A special feeling of gratitude to my loving parents, ***Habib*** and ***Zohra***, whose words of encouragement and support never stopped over the years.
To my precious sisters ***Olfa*** and ***Nawel***, who never ceased to advise, embolden, and support me. May God protect them and pave their way with happiness and luck.
To my lovely brother ***Bassem***, who always finds a way to bring joy and contentment to the family.
To everyone I have known so far: cousins, neighbors, and all my friends. Many thanks for their immense love and support.
To all the people I love and those who love me.
Manel Khazri Khlifi
Acknowledgement
First and foremost, I thank God for always looking after me. I am grateful for everything sent my way to support this study.
I praise God... Thanks be to God...
I would like to express my gratitude to my thesis supervisor, Pr. Imed Riadh Farah, for his support, his innumerable discussions, and his encouragement.
I would like to thank my co-supervisor, Dr. Wadii Boulila, for his leadership, time, guidance, support, availability, and advice throughout the master's program.
Finally, I would like to acknowledge the people I met at the RIADI laboratory of ENSI.
I am also thankful to the referees and the other members of my master thesis defense jury for their corrections, remarks, and relevant questions.
My thanks and appreciation also go to all the people who have helped me with their skills.
Manel Khazri Khlifi
Table of Contents
Abstract
Dedication
Acknowledgement
List of Figures
List of Tables
List of Algorithms
List of Abbreviations
1 Chapter 1: Introduction
1.1 Introduction
1.2 Problem Statement
1.3 Research Motivation
1.4 Research Objectives
1.5 Significance and Impact of Research
1.6 Research Contribution
1.7 Structure of this Research
2 Chapter 2: Background and Literature Review
2.1 Introduction
2.2 Remote Sensing
2.2.1 Remote Sensing Imagery Characteristics
2.3 Privacy-Preserving Machine Learning
2.3.1 Deep Learning
2.3.2 Convolutional Neural Network
2.3.3 Transfer Learning
2.3.4 Privacy-preserving Deep Learning
2.4 Comparison Between PPDL Techniques
2.5 Related Works
2.6 Discussion
2.7 Conclusion
3 Chapter 3: Research Methodology
3.1 Introduction
3.2 Hybrid approach to Encrypt for Satellite Images Privacy
3.2.1 Paillier Homomorphic Encryption Schemes
3.2.2 Somewhat Homomorphic Encryption Schemes
3.2.3 Hybrid Encryption schemes
3.2.4 Image matrix encrypted by the proposed hybrid approach
3.2.5 Pre-trained models
3.3 Conclusion
4 Chapter 4: Experimental Results and Analysis
4.1 Introduction
4.2 Study regions and dataset
4.2.1 Study region
4.2.2 Dataset
4.3 Experimental Set-Up
4.4 Metrics
4.4.1 Accuracy
4.4.2 Precision
4.4.3 Recall
4.4.4 F1_score
4.5 Results
4.5.1 Image Encryption
4.5.2 Data augmentation
4.5.3 Application of transfer learning models
4.5.4 Evaluation based on Security Parameters
4.5.5 Discussion
4.6 Conclusion
5 Chapter 5: Conclusion and Future Works
5.1 Conclusion
5.2 Future Works
A Appendix Figures
References
List of Figures
1-1 Memory's Structure
2-1 Gamut of selected Data Science techniques: Artificial Intelligence (AI), Machine Learning (ML), Deep Learning (DL) and Artificial Neural Networks (ANN)
2-2 CNN Fundamental Architecture with main layers
2-3 Different Approaches of PPDL
2-4 Homomorphic Encryption Steps
2-5 Classes and Properties of Homomorphic encryption
2-6 Representation of the secret sharing between n parties.
2-7 Secure Multi-Party Computation.
2-8 Yao's Garbled Circuit scheme.
2-9 Differential Privacy
2-10 Dimensionality Reduction (DR)
3-1 Proposed Hybrid Approach
3-2 Encryption pixel value steps
4-1 Study Area
4-2 Sample of images
4-3 Dataset Splitting
4-4 Four Land cover images and their encrypted images
4-5 Model architecture used for training on the plain and encrypted datasets.
4-6 Evolution of the validation accuracy during the training of plain and encrypted images for the 4 CNN architectures.
4-7 Confusion matrices obtained for the 4 different CNN architectures (from top to bottom: ResNet50, InceptionV3, DenseNet169, MobileNetV2), on the validation set, for both the plain (left) and encrypted (right) datasets.
4-8 Summary of the results of the four CNN models. The relative FPS corresponds to the number of frames per second divided by the maximum obtained value (632 for MobileNetV2). The color of inner sectors (representing algorithms) corresponds to the average colors of outer sectors belonging to them. The lighter the color, the better the results.
A-1 InceptionV3 model's last layers.
A-2 MobileNetV2 model's last layers.
A-3 ResNet50 model's last layers.
A-4 DenseNet169 model's last layers.
A-5 Sample of training output MobileNetV2
List of Tables
2-1 Summary and Explanation of CNN's layers
2-2 Descriptions of Transfer learning Methods
2-3 Advantages And Drawbacks of PPDL Techniques
2-4 Comparison of Studies about hybrid PPML Techniques
3-1 Layer stacking added on each model
4-1 Number of samples of each land cover type.
4-2 Performance of the 4 CNN models on the validation set of plain and encrypted data.
4-3 Comparison of proposed approach and (Alkhelaiwi et al., 2021) for security parameters.
List of Algorithms
1: key-generation(p, q)
2: Encryption(M, pk)
3: Decryption(c, sk)
4: SH.keyGenerate(λ)
5: SH.Encryption(M, pk)
6: SH.Decryption(c, sk)
7: HybridKeyGenerate(λ, p, q)
8: HybridEncryption(λ, k1, M, k2)
List of Abbreviations
Abbreviation Definition
AI Artificial Intelligence
ANN Artificial Neural Networks
CC Correlation Coefficient
C Ciphertext
CNN Convolutional Neural Network
DA Data Augmentation
DL Deep Learning
DP Differential Privacy
DR Dimensionality Reduction
DT Decision Trees
FHE Fully Homomorphic Encryption
FL Federated Learning
GC Garbled Circuit
GPU Graphics Processing Unit
H Height
HE Homomorphic Encryption
M Plaintext
ML Machine Learning
MSE Mean Square Error
NN Neural Networks
PCA Principal Component Analysis
PHE Partially Homomorphic Encryption
PK Public Key
PP Privacy-Preserving
PPDL Privacy-Preserving Deep Learning
PPML Privacy-Preserving Machine Learning
PSNR Peak Signal-to-Noise Ratio
PT Plaintext
RAM Random Access Memory
RS Remote Sensing
RSA Rivest, Shamir, and Adleman
SHE Somewhat Homomorphic Encryption
SK Secret Key
SMPC/SMC Secure Multi-Party Computation
SS Secret Sharing
SSIM Structural SIMilarity index
SVM Support Vector Machines
TL Transfer Learning
UACI Unified Average Changing Intensity
VHR Very High Resolution
W Width
Chapter 1: Introduction
1.1 Introduction
Satellite images are invaluable for many Remote Sensing (RS) applications (Bakaeva & Le, 2022; Ferchichi et al., 2017; Pan et al., 2022). Governments and commercial companies alike rely heavily on them. The quality and viewpoint of these images vary widely and depend on the satellite altitude, the camera sensor, and the targeted RS application (Xiao et al., 2004).
In addition, useful information is extracted from satellite data through satellite image classification, which plays an essential role in this field. The images themselves are acquired by remote sensing, which can be described as a tool that gathers data about objects on the Earth's surface without coming into contact with them (Ayadi et al., 2022; Dhingra & Kumar, 2019).
Remote sensing imagery provides important data that support scientific research in its experiments and applications (Chebbi et al., 2016). In particular, Very High Resolution (VHR) satellite images cover large areas of the Earth and contain massive amounts of data and information (Hajjaji et al., 2021). They are among the most valuable types of images for information extraction. Processing these images requires an architecture that can capture their feature representation well. In recent years, Deep Learning techniques have proven highly efficient in many tasks (e.g., speech recognition, medical imagery, agriculture, etc.) (Al-Sarem et al., 2021; Atitallah et al., 2020; Boulila, Alzahem, et al., 2021). Since their appearance in the machine learning area, they have demonstrated an attractive capability to learn the patterns present in the data. They work in a data-driven mode without hand-crafted features. Many architectures and models have been introduced for the different tasks encountered in remote sensing, such as classification of ground images, prediction of environmental characteristics, land cover mapping, detection of natural changes and human activities on the ground, information fusion, public safety, urban life enhancement, and data building and prediction (Yuan et al., 2020). Classification of ground images remains one of the most targeted tasks, because the need to automatically classify ground surfaces into classes that humans can understand is ubiquitous.
1.2 Problem Statement
The main practical constraint of DL algorithms is the computing resources they need, especially when working with large datasets. Standard desktop and laptop computers fall short because of the GPU and RAM capacity required. As an alternative, many cloud infrastructures have been designed in recent years to let users train and test DL algorithms remotely from any computer. The motivation is to free users from any environment setup so that they can focus only on the code. This approach is becoming increasingly popular in the machine learning community. However, it requires the client to upload the data to the cloud for the training and testing phases. This raises issues of privacy, confidentiality, data protection, and copyright (Boulemtafes et al., 2020; Tanuwidjaja et al., 2019).
To address these problems, researchers have developed various techniques to protect the confidentiality of useful and sensitive information. Among them, Privacy-Preserving Deep Learning offers several approaches to ensuring the privacy of such information. Nowadays, ML is used in many fields, such as speech recognition, image analysis, medicine, and agriculture, which highlights the progress of these approaches. DL in particular has been used in many applications in these areas because of its capacity to learn from big data. DL is also able to make accurate predictions without visual information.
Some datasets are highly confidential and must not be disclosed, such as images of military areas or images showing private information. Other datasets may not be confidential but take considerable effort to collect, so the user may be reluctant to share them. In most cases, users who upload their datasets to the cloud refuse to disclose them, although their reasons may differ. This explains the need to adopt Privacy-Preserving Deep Learning (PPDL).
1.3 Research Motivation
One main approach to PPDL is to adopt Homomorphic Encryption (HE) under two conditions. The first condition is that the data are encrypted locally on the user's machine before being sent to the server for training. The second condition is that the server never decrypts them. This applies to the encrypted data during both the training and inference phases. Thus, the data are never shared in any exploitable format, and the data in their original format exist only on the user's machine. Two main HE techniques already exist in the literature: PHE and SHE. So far, these two techniques have been used separately, without being combined, and each has its pros and cons. For the first time in the literature, we propose an architecture that combines them into a hybrid approach between PHE and SHE. We then apply this hybrid approach to the classification of VHR RS images.
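The additive property that lets a server work on data it never decrypts can be illustrated with a toy sketch. The code below is textbook Paillier encryption with deliberately tiny primes (61 and 53); it is not the hybrid PHE/SHE scheme proposed in this thesis, and the function names `keygen`, `encrypt`, and `decrypt` are chosen here purely for illustration. It only shows that two encrypted pixel values can be added, and one can be scaled, without access to the plaintexts.

```python
import math
import random

def keygen(p, q):
    """Toy Paillier key generation from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                    # simple, commonly used choice of g
    mu = pow(lam, -1, n)         # modular inverse of lam; valid for g = n + 1
    return (n, g), (lam, mu)

def encrypt(pk, m, rng):
    """Encrypt integer m (0 <= m < n) under public key pk."""
    n, g = pk
    n2 = n * n
    while True:                  # draw r coprime to n
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pk, sk, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pk
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

pk, sk = keygen(61, 53)
rng = random.Random(0)           # fixed seed for reproducibility; NOT secure
n2 = pk[0] ** 2

c1 = encrypt(pk, 200, rng)       # two pixel values encrypted on the client
c2 = encrypt(pk, 100, rng)

# The server multiplies ciphertexts, which adds the hidden plaintexts.
print(decrypt(pk, sk, (c1 * c2) % n2))   # 300
# Raising a ciphertext to a constant multiplies the hidden plaintext by it.
print(decrypt(pk, sk, pow(c1, 3, n2)))   # 600
```

Results are only correct modulo n, so a real deployment would use far larger primes and a cryptographically secure random source; the small values here merely keep the arithmetic easy to follow.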
1.4 Research Objectives
The principal objectives of our study can be summarized as follows:
• Propose an efficient technique that leverages PPDL-based techniques to ensure the privacy of big satellite images that will be processed in the public cloud.
• Perform experiments to evaluate the performance and efficiency of the proposed technique using a real-world dataset.
1.5 Significance and Impact of Research
With the rapid growth of satellite and network technologies, the processing and transfer of VHR satellite images over the internet and distributed cloud environments require strong protection. This raises a new challenge: guaranteeing the security of sensitive satellite images that are collected and processed by cloud service providers during machine learning training, and protecting them from adversaries who may extract sensitive information and use it for illegitimate purposes. Therefore, the biggest obstacle to exploiting big image data is guaranteeing its privacy and security when it is transmitted over the internet and the cloud.
PPDL methods are currently used to protect images that are uploaded to the public cloud for DL processing. This study aims to use a hybrid technique that combines partially homomorphic encryption (PHE) with somewhat homomorphic encryption (SHE) for big satellite images, and to support other researchers in the field in investigating the results of this method on such images.
1.6 Research Contribution
Our approach is the first work in the literature to carry out hybrid PPDL on satellite image data. In particular, the main contributions of our study can be summarized as follows:
• We introduce a hybrid PPDL approach that combines PHE and SHE for satellite image classification. This combination improves the security of encrypted images. Using only PHE, as in the work of Alkhelaiwi et al. (2021), can lead to many security issues (Xiong et al., 2018). Integrating SHE into the encryption scheme makes the encrypted images more robust while maintaining excellent computational complexity and runtime, thanks to the shorter bit length of SHE.
• To evaluate its efficiency, we apply the proposed hybrid PPDL approach to several DL-based CNN models, namely ResNet50, InceptionV3, DenseNet169, and MobileNetV2.
• Several experiments on real-world satellite image datasets are conducted to evaluate the performance of the proposed approach in terms of accuracy, runtime, and security.
1.7 Structure of this Research
This research is organized as follows:
Chapter 1: Introduces this research and gives a comprehensive description of the problem statement. It then presents the research motivation and objectives, followed by the significance, impact, and contribution of the research.
Chapter 2: Describes Remote Sensing (RS) and its characteristics. It then presents the background of PPML/PPDL techniques and the fundamental architecture of CNNs, followed by a comparison of PPDL techniques. Finally, it reviews and discusses the related works.
Chapter 3: Presents the proposed approach, with a detailed description of the hybrid encryption scheme. It then presents the pre-trained models, their goals, and the layers of the new classifier.
Chapter 4: Describes the study regions and the dataset, then the hardware and software used in the experiments. It then details the results of our experiments and discusses the performance of the proposed method.
Chapter 5: Concludes this research and outlines future works.
Figure 1-1: Memory's Structure
Chapter 2: Background and Literature Review
2.1 Introduction
This chapter presents Remote Sensing (RS) and its characteristics, followed by the different approaches to privacy-preserving machine learning, including the basic architecture of CNNs, together with the advantages and drawbacks of each method. Finally, it reviews and discusses the related works.
2.2 Remote Sensing
In Remote Sensing (RS), "Remote" means distant or at a large distance, and "Sensing" means perceiving or scanning. RS can be broadly described as the science that collects and interprets spatial information about the Earth's surface. In other words, RS is a way of collecting information about an object (or an area or event) without any physical contact with it (Coops & Tooke, 2017; Sanderson, 2010). In recent years, RS has become an essential tool in many areas such as ecology, geography, geomatics, and resource monitoring, providing important information on the spatial patterns of the Earth's surface derived from images captured from the air (Coops & Tooke, 2017). Remotely sensed data are an important tool in natural resource management, which is why the capability of RS to observe and monitor land surfaces and environmental conditions has expanded greatly over the last several years (Boulila et al., 2017; Boulila et al., 2009; Boulila et al., 2010; Ferchichi et al., 2017; Sanderson, 2010).
2.2.1 Remote Sensing Imagery Characteristics
The data collected by each satellite sensor are described by three key characteristics: the size of a single pixel, the overall spatial extent, and the time interval between acquisitions (Sanderson, 2010).
• Spatial resolution: the area of the Earth seen instantaneously by the sensor, that is, the ground size of one pixel in the satellite image.
• Spectral resolution: the number and width of the spectral bands that the satellite sensor can measure.
• Temporal resolution: also called repeatability, the time between two scans of the same area of the Earth's surface, which depends on the satellite's orbit.
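As a small worked example of how spatial and temporal resolution translate into numbers, the sketch below computes the ground extent of an image patch and the number of acquisitions in a period. The patch size (256 x 256 pixels), pixel size (1.5 m), and revisit interval (10 days) are hypothetical values chosen for illustration; they are not taken from the dataset of this thesis.

```python
def ground_footprint_m(width_px, height_px, pixel_size_m):
    """Ground extent in metres covered by an image with square pixels."""
    return width_px * pixel_size_m, height_px * pixel_size_m

def acquisitions_in_period(period_days, revisit_days):
    """Acquisitions of one area in a period, counting a first pass on day 0."""
    return period_days // revisit_days + 1

# Hypothetical patch: 256 x 256 pixels at 1.5 m per pixel.
print(ground_footprint_m(256, 256, 1.5))   # (384.0, 384.0)
# Hypothetical sensor revisiting the same area every 10 days, over 30 days.
print(acquisitions_in_period(30, 10))      # 4
```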
2.3 Privacy-Preserving Machine Learning
Today's Machine Learning (ML) builds on almost a century of progress in science and technology. It is a field of study that enables computers to learn from data without being explicitly programmed, with applications in many areas such as video game playing and artificial intelligence (Al-Rubaie & Chang, 2019). Some of its most significant advancements were under-appreciated in the past. Nonetheless, the regular use of ML/DL does not protect data privacy. That is why researchers are looking for privacy-preserving ways of training and testing ML/DL models.
Privacy-preservation (PP) is among the most important subjects of study in data security, and it has become a serious concern with the increasing awareness of personal data protection in recent years (Dhanalakshmi & Sankari, 2014).
Privacy-preserving machine learning is among the most promising application domains for cryptographically secured computation. Training an ML model requires a large dataset that may include confidential information, especially private data. In addition, the model parameters should only be accessible to the model owner. Protecting the privacy of both data owners and model owners is therefore essential when implementing privacy-preserving machine learning protocols (Zapechnikov, 2020).
2.3.1 Deep Learning
DL is a young subfield of Artificial Intelligence that focuses on artificial neural networks, which consist of nodes (analogous to cell bodies) that interact with other nodes through connections (analogous to axons and dendrites). DL is thus a branch of ML, which is itself a part of AI, as shown in Figure 2-1 (Choi et al., 2020; Gruson et al., 2019; Tanuwidjaja et al., 2020). It covers the range of ML models that represent data at multiple levels of abstraction through several processing layers (Pritt & Chern, 2017).
Figure 2-1: Gamut of selected Data Science techniques: Artificial Intelligence (AI), Machine Learning (ML), Deep Learning (DL) and Artificial Neural Networks (ANN)
2.3.2 Convolutional Neural Network
The literature offers multiple DL architectures for data processing, and the CNN is among the most widely used for image-based data (Boulemtafes et al., 2020).
CNNs are DL models used for prediction, classification, and regression (Marcano et al., 2019). They are a specialized kind of feedforward Artificial Neural Network (ANN) designed to extract elaborate features from the input data. CNNs are regarded as the best choice for image classification, a task that often involves sensitive information.
A CNN extracts high-level features from the raw data automatically, and these learned features can surpass human-designed ones. For instance, in fields such as image segmentation and recognition, this class of deep learning models has brought significant improvements (Feng et al., 2020).
2.3.2.1 Convolutional Neural Network Architecture
A CNN's fundamental architecture includes five types of layers: input, convolutional, pooling (subsampling), fully connected, and output layers. A CNN is characterized by its numerous convolutional layers (You et al., 2019).
The stack of CNN layers transforms an input layer (the input to classify) into an output layer containing the label scores. Except for the input layer, every neuron in a layer is the output of a function applied to neurons of the previous layer. Only a few well-defined types of layers are generally used: fully connected, convolutional, activation, and pooling layers (Chabanne et al., 2017). The CNN layers are described in Table 2-1.
Figure 2-2: CNN Fundamental Architecture with main layers.
Table 2-1: Summary and Explanation of CNN's layers

| Layer | Explanation |
|---|---|
| Convolutional | The convolutional layer (CL) is the first layer, and it convolves the input (Pandya et al., 2019). It extracts several features from the input image. The CL transforms the image into a matrix. Its output, called a feature map, is generated by applying filters to the input data (image). The next layer uses this feature map because it carries information about the image. |
| Pooling | The pooling layer is another building block of a CNN. It is employed to reduce the size of the feature maps. It therefore lowers the number of parameters to learn and the amount of computation performed in the network. |
| Fully Connected | The fully connected layer is usually located before the output layer and forms the last few layers of a CNN architecture. Neurons in this layer hold full connections to all neurons of the output layer. |
| Activation | The activation layer is a non-linear function. It applies a mathematical operation to the output of the convolutional layer. Numerous functions are used, such as the Rectified Linear Unit (ReLU), sigmoid, and tanh (Tanuwidjaja et al., 2020). |
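To make the roles of the convolutional, activation, and pooling layers concrete, the following is a minimal pure-Python sketch. The 5x5 image, the 2x2 filter, and the function names are illustrative assumptions, not part of the models used in this work; real CNNs learn their filters and operate on multi-channel tensors.

```python
# Illustrative sketch only: toy image and filter values are assumptions.

def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation producing a feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(fmap):
    """Activation layer: element-wise max(0, x)."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Pooling layer: non-overlapping size x size max pooling."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]

image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 0],
    [2, 1, 0, 1, 1],
    [0, 3, 1, 0, 2],
]
kernel = [[1, -1], [-1, 1]]  # hypothetical hand-written filter

feature_map = conv2d(image, kernel)   # 4x4 feature map
activated = relu(feature_map)         # negative responses set to 0
pooled = max_pool(activated)          # 2x2 map after pooling
print(pooled)                         # [[3, 0], [4, 2]]
```

The pooling step reduces the 4x4 feature map to 2x2, which illustrates how pooling lowers the amount of computation in the following layers.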
2.3.3 Transfer Learning
Transfer learning (TL) is a DL method that transfers knowledge acquired on one task to solve another, which helps prevent overfitting (Kandel & Castelli, 2020). Collecting a large amount of data for classification with a CNN model is difficult, and in real-world settings the training and testing data are hard to match; transfer learning provides the prior knowledge needed to address these problems (Khamparia et al., 2020). TL is used with models trained on both small and large datasets (Ng et al., 2015; Serra et al., 2018). The pre-trained models used for transfer learning in this work are DenseNet169, MobileNetV2, InceptionV3, and ResNet50, which are described in Table 2-2.
Table 2-2: Descriptions of Transfer Learning Methods

| TL method | Input size | Base dataset | Description |
|---|---|---|---|
| DenseNet169 (Huang et al., 2017) | 224x224x3 | ImageNet | DenseNet-169 is one of the Densely Connected Convolutional Network models proposed by Huang et al. This CNN architecture has a convolution and pooling layer at the start, 3 transition layers, 4 dense blocks, and a classification layer at the end. Within a dense block, the input of each sub-block is the concatenation of the feature maps of all preceding sub-blocks. This design helps mitigate the vanishing gradient problem and reduces the number of parameters. |
| MobileNetV2 (Sandler et al., 2018) | 224x224x3 | ImageNet | MobileNetV2 was introduced by Sandler et al. in 2018. It is a neural network architecture designed for mobile devices and resource-constrained environments. Its goal is to reduce the number of operations and the memory requirement while maintaining the same accuracy in vision models for mobile devices. The first layer is a full convolution layer with 32 filters, followed by 19 residual bottleneck layers. |
| InceptionV3 (Szegedy et al., 2016) | 299x299x3 | ImageNet | Inception was introduced with GoogLeNet in 2014; Inception-V3 followed in 2015, with an architecture of 48 layers. Parameter reduction in this pre-trained model is achieved through factorized convolutional layers. |
| ResNet50 (Mahmood et al., 2020) | 224x224x3 | ImageNet | ResNet50 is a variant of the ResNet architecture. It is made up of 5 stages, each with a convolution block and an identity block. Each convolution block is composed of three convolution layers, and each identity block is also composed of three convolution layers. The pre-trained network can classify images into 1000 object categories. |
2.3.4 Privacy-preserving Deep Learning
A variety of PP techniques are used to address data privacy in DL models (during the training and testing phases). These techniques can be organized into three groups: cryptographic approaches, perturbation approaches, and hybrid techniques. They are summarized in Figure 2-3.
Figure 2-3: Different Approaches of PPML
2.3.4.1 Privacy Preserving Cryptographic Approaches
Cryptography is the science of securing information, here during ML training and testing. It protects the large amount of data that ML requires. The following subsections describe each cryptographic technique used to preserve privacy in deep learning models.
A. Homomorphic Encryption (HE)
Homomorphic Encryption (HE) is a cryptographic technique that can secure communication between multiple parties. Encryption and decryption of the message are performed by different parties (Wood et al., 2020). HE cryptosystems allow computation directly on encrypted data: the users' data are encrypted into ciphertexts on which arithmetic operations can be performed, and decrypting the result yields the same value as applying those operations to the plaintexts (Domingo-Ferrer et al., 2019). Because HE enables the secure transport, storage, and processing of encrypted data, the financial and medical industries are among the fields that demonstrate the success of this approach (Ogburn et al., 2013). Owing to its ability to compute on encrypted data and the safety and privacy it guarantees to the data owner, HE has many applications in fields such as healthcare, medical applications, the financial sector, forensic applications, social networking advertisements, and smart vehicles, where preserving users' confidentiality is of paramount importance (Shrestha & Kim, 2019).
Homomorphic Encryption Steps
HE schemes comprise four steps, namely Key Generation, Encryption, Evaluation, and Decryption. They are explained as follows (Kaaniche et al., 2020; Parmar et al., 2014) and illustrated in Figure 2-4.
Figure 2-4: Homomorphic Encryption Steps.
Key Generation: the client generates the public parameters and the key pair (PK, SK), i.e., the public key and the secret key.
Encryption: the client generates the ciphertext C from the plaintext M using PK and sends C to the server.
Evaluation: the server evaluates a function on C, using the PK.
Decryption: the client recovers the original text M by decrypting the evaluation result using the SK.
In the inference phase (Shrestha & Kim, 2019):
1) The user data are encrypted and stored in the cloud.
2) The user sends a request to the cloud server.
3) The HE-encrypted data are fed to the model on the cloud server, which does not need to know their contents and sends the encrypted result back to the user.
With the secret key, the user can decrypt the result. Thus, the data's security and privacy are preserved.
Homomorphic Encryption Classes
According to the allowed set of mathematical operations, HE is subdivided into three groups: PHE, SHE, and Fully Homomorphic Encryption (FHE) (Domingo-Ferrer et al., 2019; Shrestha & Kim, 2019). Figure 2-5 presents these subclasses of HE and their properties, which are described below:
Figure 2-5: Classes and Properties of Homomorphic Encryption
Partially Homomorphic Encryption: a PHE scheme supports only a single arithmetic operation (either addition or multiplication) on the encrypted message, and this operation can be applied an unlimited number of times.
Somewhat Homomorphic Encryption: an SHE scheme supports both arithmetic operations (addition and multiplication) on the encrypted message, but only a limited number of times.
Fully Homomorphic Encryption: an FHE scheme supports both additions and multiplications on the encrypted message an unbounded number of times.
B. Secret Sharing (SS)
Secret Sharing (SS) is one of the main approaches to protecting essential data from being lost, destroyed, or falling into the wrong hands. The first SS schemes were proposed by Shamir and Blakley in 1979 (Pang & Wang, 2005). Shamir's Secret Sharing distributes a secret among several shares such that fewer than k shares reveal no information about the original secret (Zhou et al., 2020). A Secret Sharing scheme is composed of a pair of functions: Share and Reconstruct. The Reconstruct function requires k shares generated by the Share function (Zhou et al., 2020).
Share: a randomized algorithm that takes a secret $s$, a threshold $th$, and a number of shares $n$ as inputs and produces $n$ shares of $s$, with $th \le n$:
$S(s, th, n) \rightarrow \{s_0, s_1, \ldots, s_n\}$ (2.1)
Reconstruct: a deterministic algorithm that takes the threshold $th$ and a subset of $m$ shares as inputs and recovers the secret $s$, with $m \ge th$.
Figure 2-6 illustrates the division of a secret among several shareholders, who must cooperate to reconstruct it.
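The Share and Reconstruct functions can be sketched with Shamir's polynomial construction. This is a minimal illustration under stated assumptions: the prime field `PRIME`, the share indices 1..n, and the function names are choices made here for the example and are not taken from the cited works.

```python
# Illustrative sketch only: field size and naming are assumptions.
import secrets

PRIME = 2**61 - 1  # a Mersenne prime, chosen here as the field modulus

def share(secret, threshold, n):
    """Split `secret` into n shares; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def poly(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = share(123456789, threshold=3, n=5)
print(reconstruct(shares[:3]))  # 123456789
```

Any three of the five shares recover the secret, while two shares alone are consistent with every possible secret in the field.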
Figure 2-6: Representation of the secret sharing between n parties. From (n.A, 2021)
C. Secure Multi-Party Computation(SMPC)
Secure multi-party computation is a sub-domain of cryptography that allows confidential computation over sensitive data. It corresponds to the computation of a function by several participants in a distributed environment. It is used in various applications, such as cloud computing, mobile computing, and the Internet of Things, to guarantee security and privacy (Zhang et al., 2018).
Figure 2-7: Secure Multi-Party Computation. From (Zhao et al., 2019)
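As a small illustration of the idea, the sketch below computes a sum over private inputs with additive secret sharing, one common SMPC building block. The protocol choice, the modulus, and the function names are assumptions made for this example; it is simulated in one process and ignores networking and malicious parties.

```python
# Illustrative sketch only: a single-process simulation of a secure sum.
import secrets

MODULUS = 2**64  # all arithmetic is done modulo this value (an assumption)

def additive_shares(value, n_parties):
    """Split `value` into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_inputs):
    """Each party shares its input; only sums of shares are ever combined."""
    n = len(private_inputs)
    # distributed[i][j] is the share that party i sends to party j.
    distributed = [additive_shares(v, n) for v in private_inputs]
    # Party j adds up the shares it received and publishes only that partial sum.
    partial_sums = [sum(distributed[i][j] for i in range(n)) % MODULUS
                    for j in range(n)]
    return sum(partial_sums) % MODULUS

print(secure_sum([12, 30, 7]))  # 49
```

No party sees another party's input, only uniformly random shares and the final result.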
D. Garbled Circuit (GC)
The garbled circuit (GC) was introduced by Yao in 1986. It is one of the methods for secure multi-party computation. It requires only a constant number of communication rounds, but every function must be represented as a Boolean circuit. This PP method is used when two parties jointly evaluate a function over their private data. Oblivious transfer can be used to share the circuit between the sender and the receiver of a message: it is used to transfer the garbled circuit obtained from Alice's conversion of the function and Bob's result for the required function (Al-Rubaie & Chang, 2019; Qayyum et al., 2020; Tanuwidjaja et al., 2020).
Figure 2-8: Yao's Garbled Circuit scheme. From (Ambrosin et al., 2017)
2.3.4.2 Perturbation Approaches
Perturbation approaches are based on modifying input values to maintain individual record
confidentiality. They are subdivided into two main techniques.
A. Differential Privacy (DP)
DP was formalized by C. Dwork in 2006 (Tanuwidjaja et al., 2020). A randomized algorithm protects the data by adding perturbation to datasets. The main principle of differential privacy is that, for any dataset, the algorithm's output is essentially unaffected by the inclusion or exclusion of any individual record. A randomized algorithm $A$ provides $\varepsilon$-differential privacy if, for all datasets $D_1$ and $D_2$ differing in at most one element, and all $S \subseteq \mathrm{Range}(A)$,
$\Pr[A(D_1) \in S] \le \exp(\varepsilon)\,\Pr[A(D_2) \in S]$ (2.2)
where $\varepsilon$ is a privacy parameter (Qayyum et al., 2020; Tanuwidjaja et al., 2020).
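One standard way to satisfy this definition for a counting query is the Laplace mechanism, sketched below. The mechanism, the toy label list, and the function names are assumptions introduced for illustration; the text above does not prescribe a specific mechanism.

```python
# Illustrative sketch only: Laplace mechanism for a count query (sensitivity 1).
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng):
    """Return a noisy count; adding or removing one record changes the
    true count by at most 1, so the noise scale is 1 / epsilon."""
    true_count = sum(1 for rec in records if predicate(rec))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
patches = ["road", "building", "road", "vegetation", "road"]  # toy labels
noisy = private_count(patches, lambda label: label == "road", 0.5, rng)
print(noisy)  # a value scattered around the true count of 3
```

A smaller `epsilon` gives stronger privacy but noisier answers, which is the accuracy trade-off discussed for DP later in this chapter.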
Figure 2-9: Differential Privacy¹
¹ https://www.winton.com/research/using-differential-privacy-to-protect-personal-data
B. Dimensionality Reduction (DR)
Dimensionality reduction (DR) reduces the number of variables under consideration in the feature space by extracting a set of principal variables using lossy encoding.
Figure 2-10: Dimensionality Reduction (DR)²
2.3.4.3 Hybrid Approaches
The use of a single PP method can be insufficient to guarantee data privacy. Hence, hybrid approaches combine several Privacy-Preserving Machine Learning (PPML) techniques to enhance data protection.
Privacy-preserving federated learning (Truex et al., 2019) is a combination of SMPC and DP. For training, this approach uses Decision Trees (DT), Convolutional Neural Networks (CNN), and linear Support Vector Machines (SVM). This method guarantees data protection against adversaries and collusion threats.
The other hybrid approaches are explained in Section 2.5.
2.4 Comparison Between PPDL Techniques
A growing number of privacy-preserving methodologies are used to secure sensitive information in machine learning. This section presents the different privacy-preserving approaches developed in this field, with their advantages and drawbacks. They are summarized in Table 2-3.
In general, HE plays a meaningful role in guaranteeing the confidentiality of sensitive information. When HE is used to secure sensitive data, all information is preserved without any loss. However, it is computationally expensive and suffers from bandwidth and latency issues. One of its subclasses, FHE, is still inefficient in many cases. Compared with HE, SMPC can protect against computationally strong adversaries, and it is less expensive and less computationally complex than FHE. However, this approach incurs significant communication overhead. The major drawback of DP is the loss of information when large datasets are used. DR reduces both storage space and execution time. Finally, SS is more computationally complex.
² https://medium.com/analytics-vidhya/importance-of-dimensionality-reduction-d6a4c7289b92
Table 2-3: Advantages and Drawbacks of PPDL Techniques

| Methodology | Advantages | Drawbacks | References |
|---|---|---|---|
| Homomorphic Encryption (HE) | Inference is performed on encrypted data: the model owner has no access to the client's private information and therefore cannot leak or abuse it; higher standard of protection for sensitive data; no loss of information; FHE supports any type of operation. | Computationally expensive, which affects running time; bandwidth and latency concerns; PHE and SHE are limited to specific types of calculations; increases the total cost of ownership; FHE is still inefficient and under experimentation. | (Byun, 2019; Lopardo A et al., (n.d); Medhi, 2019) |
| Secure Multi-Party Computation (SMPC) | No need for a trusted third party; sensitive information is not revealed to any party; inference is performed on encrypted data; the parties get only the resulting analysis or model; protects against computationally powerful adversaries; less computationally costly and complex than FHE. | Computationally intensive; significant communication overhead; assumptions must be made about the proportion of malicious colluding parties in the computation. | (A Lapardo et al., 2020; Medhi, 2019) |
| Differential Privacy (DP) | Formal mathematical proof; privacy guarantee; the user can set a suitable level of safety. | When datasets are bulky, noise and loss of information may occur. | (Medhi, 2019) |
| Dimensionality Reduction (DR) | Reduced storage space and execution time; the suppression of multicollinearity improves the interpretation of the ML model parameters; reducing data to very low dimensions such as 2D or 3D makes it easier to visualize. | Partial data loss; PCA fails when the mean and covariance are not sufficient to specify datasets. | ((n.A), (n.d); Singh, 2020) |
| Secret Sharing (SS) | Provides the best efficiency; individual shares can be easily modified without changing other shares; shares can be modified while keeping the same secret; more than one share can be supplied per person. | Computationally complex. | (Al-Rubaie & Chang, 2019; Kasar, 2016) |
2.5 Related works
We are interested in hybrid approaches related to PPML; this section therefore reviews the existing works and compares our research with them.
Approaches based on SMPC and DP
Truex et al. (Truex et al., 2019) developed a technique that combines DP and SMPC. When the number of parties is large and each holds a comparatively small amount of data, the use of DP leads to low accuracy. At the same time, the use of SMPC leaves the system vulnerable in the inference phase. To mitigate these drawbacks, Truex et al. built an enhanced Federated Learning (FL) system combining DP and SMPC. This combination is scalable, ensures protection against adversarial threats, and builds models with high accuracy. To guarantee privacy without sacrificing accuracy, the authors developed protocols based on this hybrid method. The approach allows several ML models to be trained in an FL fashion under various trust scenarios.
Another hybrid approach was suggested by Chase et al. (Chase et al., 2017), who built a private collaborative framework for machine learning combining SMPC and DP. This method uses DP for privacy and takes advantage of neural networks for the machine learning part.
Approaches based on HE and SS
Chen et al. (Chen et al., 2021) suggested a hybrid approach based on homomorphic encryption and secret sharing. Using HE alone poses potential security risks, while using SS alone has efficiency issues in the case of high-dimensional sparse features. The amount of transaction data between individual users and large merchants raises a data isolation problem. To meet the requirements of robustness and interpretability, the CAESAR implementation merges HE and SS to construct a Secure Large-Scale Sparse Logistic Regression Model (SLSSLRM) that achieves both efficiency and security.
Approaches based on all properties of HE
El Makkaoui et al. (El Makkaoui et al., 2016) studied hybrid approaches supporting all homomorphic properties. They considered the subcategory of homomorphic encryption methods that use a limited number of operations (PHE) and developed a new hybrid scheme that preserves the algebraic structure of a ring homomorphism and offers robustness against confidentiality attacks.
Approaches based on SMPC and functional encryption
In Xu et al. (Xu et al., 2019), the authors proposed an approach entitled HybridAlpha. It is a method for PP Federated Learning that uses an SMPC protocol and functional encryption. The authors consider the disclosure of training data through the parameters exchanged during training and through the resulting model. HybridAlpha was evaluated by using the FL procedure to train a CNN on the MNIST dataset while minimizing the training time and the volume of data exchanged.
Remote sensing data
In (Alkhelaiwi et al., 2021), Alkhelaiwi et al. proposed using PHE for satellite image encryption. The authors evaluated their approach on a satellite image dataset, using plain and encrypted images. They developed a CNN model that extracts the features and predicts the outcome using several transfer learning models. In this research, the data owner encrypts the data and then transmits them to the cloud server. All pre-trained CNN models achieved high classification accuracy. To the best of our knowledge, this is the first and only work that proposed using PPDL in the context of RS. The main limitations of this work are the secret key, which an adversarial user can steal, and the execution time, which is driven by the number of bits used to generate the key.
2.6 Discussion
The previous works demonstrate the ability of hybrid PPML techniques to preserve the confidentiality of the data processed by ML. DL is one of the most widely used families of models in ML. Based on the limitations of the related work (Alkhelaiwi et al., 2021), the goal of our study is to propose a hybrid encryption scheme that ensures better security for encrypted images while reducing the execution time of image encryption and maintaining good accuracy of the DL algorithms. Table 2-4 compares the different hybrid PPDL approaches according to the application domain, PPDL method, ML model, and dataset.
Table 2-4: Comparison of Studies about Hybrid PPML Techniques.

| Reference | Domain of application | PPML methods | ML models | Dataset |
|---|---|---|---|---|
| (Truex et al., 2019) | Nursery school applications; gray-scale images; feature selection challenge | SMPC, DP | Decision Trees (DT); Convolutional Neural Networks (CNN); linear Support Vector Machines (SVM) | Nursery dataset from the UCI Machine Learning Repository; MNIST dataset; gisette dataset used for the NIPS 2003 Feature Selection Challenge |
| (Xu et al., 2019) | Gray-scale images | SMPC (with functional encryption) | CNN | MNIST dataset |
| (Chen et al., 2021) | Industry; sparse features | HE, SS | Logistic regression | Real-world dataset |
| (El Makkaoui et al., 2016) | - | HE | - | - |
| (Chase et al., 2017) | Gray-scale images | SMPC, DP | Small NN (3-layer); large NN (4-layer) | MNIST dataset |
| (Alkhelaiwi et al., 2021) | Vegetation, road, bare soil and urban images | HE | CNN | Satellite dataset |
| This proposed research | Building, vegetation, bare soil and road images | HE | CNN | Satellite dataset |
2.7 Conclusion
This chapter presented Remote Sensing (RS) and its characteristics. It also reviewed several privacy-preserving machine learning techniques and gave the background on the fundamental architecture of CNNs. In addition, it set out the advantages and drawbacks of each method and discussed the related works. The next chapter describes the proposed approach and presents the pre-trained models and their goals.
Chapter 3: Research Methodology
Contents
3.1 Introduction
3.2 Hybrid approach to Encrypt for Satellite Images Privacy
3.2.1 Paillier Homomorphic Encryption Schemes
3.2.2 Somewhat Homomorphic Encryption Schemes
3.2.3 Hybrid Encryption schemes
3.2.4 Image matrix encrypted by the proposed hybrid approach
3.2.5 Pre-trained models
3.3 Conclusion
3.1 Introduction
Cryptography plays a crucial role in the security of our daily data. However, the algorithms used in practice have particular limitations, including the impossibility of modifying encrypted data. Homomorphic encryption is a solution that allows third-party services to process digital data securely.
In this chapter, we explain the use of two different homomorphic encryption approaches to encrypt satellite images and present the research design of our methodology, which combines these two approaches. We then detail each approach through its algorithms. After that, we describe the proposed hybrid approach, including its subsections, which is used to secure satellite image features and classify them. The last section before the conclusion presents the pre-trained models and their goals, and proposes the layers of our new classifier used to predict the classes of encrypted images.
3.2 Hybrid approach to Encrypt for Satellite Images Privacy
The architecture of the proposed hybrid approach is illustrated in Figure 3-1. In this hybrid privacy-preserving deep learning approach, CNN models are trained and tested on data encrypted by PHE and SHE to guarantee the security of the data. Both techniques (partially homomorphic encryption and somewhat homomorphic encryption), like PPDL techniques in general, preserve the image features, allowing the CNN layers to learn these features and achieve good accuracy on images that may be incomprehensible to users. This approach is applied to satellite images, which contain massive amounts of information used in many applications (surveillance, monitoring, defense, security, etc.). These data can be secured effectively with the proposed technique.
To encrypt this type of data, the client applies our hybrid approach based on PHE and SHE on the client side, encrypting the data with two different public keys (key generation is presented in Algorithm 7, which builds on Algorithm 1 and Algorithm 4). The CNN-based model is then trained on the encrypted cipher-images.
After the CNN model is trained, it is tested on new encrypted images, encoded with the same hybrid approach used in the training phase. The cloud server thus operates directly on cipher-images. Data privacy is preserved during both training and testing, and unauthorized participants cannot decode the data when this kind of infrastructure, which addresses the resource constraints of CNNs, is used.
Homomorphic Encryption is a PPDL approach that can ensure the security of satellite images. Its advantage over other PPDL approaches is that it retains all image information. This technique is subdivided into three subclasses: partially homomorphic encryption (PHE), somewhat homomorphic encryption (SHE), and fully homomorphic encryption (FHE). Each subclass of HE differs from the others and has its specific characteristics and advantages. In this research, we combine two of them, PHE and SHE.
First, PHE places no limit on the number of operations (Acar et al., 2018). Partially HE is used in a few applications, according to their needs and the type of problem to be solved. Because it uses only one kind of operation (addition or multiplication), PHE schemes are generally more efficient than SHE and FHE (Mattsson, 2021), since they do not require much computational overhead (Morris, 2013). In contrast, the FHE scheme is inefficient in practice, and SHE allows some operations only a bounded number of times (Acar et al., 2018). However, homomorphic cryptosystems are vulnerable to malware, particularly additively homomorphic schemes such as PHE (Morris, 2013). Compared with PHE, SHE can reduce the computational complexity and shorten the computation time (thanks to the shorter bit-length of SHE in the encrypted domain) (Xiong et al., 2018). Given these security and computational-complexity issues, we combine PHE with SHE to ensure data privacy and decrease the computational complexity.
Our hybrid approach builds on these two approaches (each with its own algorithms and parameters), described in Sections 3.2.1 and 3.2.2.
Figure 3-1: Proposed Hybrid Approach
3.2.1 Paillier Homomorphic Encryption Schemes
Partially Homomorphic Encryption (PHE) schemes support only one arithmetic operation (addition or multiplication) on ciphertexts. RSA is one example of multiplicatively homomorphic encryption (Morris, 2013; Paillier, 1999). Given the RSA encryption function, the decryption function, and two plaintext messages, the two ciphertexts can be multiplied together and the product can still be decrypted. The decrypted result has the same value as the product of the two plaintexts.
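The multiplicative property of RSA can be checked on toy numbers. The sketch below uses unpadded textbook RSA with small primes chosen for this example only (they are far too small to be secure); the function names are illustrative.

```python
# Illustrative sketch only: textbook RSA with toy parameters (assumptions).
p, q = 61, 53
n = p * q                      # modulus, 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

def rsa_encrypt(m):
    return pow(m, e, n)

def rsa_decrypt(c):
    return pow(c, d, n)

m1, m2 = 12, 25
# Multiply the two ciphertexts without ever decrypting them individually.
c_prod = (rsa_encrypt(m1) * rsa_encrypt(m2)) % n
print(rsa_decrypt(c_prod))     # 300, i.e. (m1 * m2) mod n
```

The property holds because $(m_1^e)(m_2^e) = (m_1 m_2)^e \bmod n$, so the product of ciphertexts is a valid encryption of the product of plaintexts.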
The Paillier scheme is an asymmetric additive PHE scheme: it offers the same kind of property as RSA, but for addition (additive homomorphism). In this scheme, $(n, g)$ is the public key, where $n$ is the product of two large primes $p$ and $q$ that have the same binary length, and $g$ is an element of $\mathbb{Z}^*_{n^2}$ whose order is a multiple of $n$. $p$ and $q$ are also used as a secret key. $\varphi(n) = (p-1)(q-1)$ has an inverse modulo $n$. $\varphi(n)$ is denoted by $\lambda$ and its inverse $\varphi(n)^{-1} \bmod n$ by $\mu$; together they are used as the secret key (sk) (Muhammad et al., 2018). The three steps of the PHE scheme are key generation, encryption, and decryption, presented in Algorithms 1, 2, and 3, respectively (Muhammad et al., 2018; Paillier, 1999).
Algorithm 1: key-generation(p, q)
Input: two large prime numbers $p, q \in \mathbb{P}$, selected randomly and independently of each other
Output: two different keys: the encryption key, which is the public key $(n, g)$, and the decryption key, which is the secret key $(\lambda, \mu)$
1: if length($p$) equals length($q$)
1.1: Calculate $n = pq$ and $g = n + 1$
1.2: Compute $\lambda = \varphi(n)$, where $\varphi(n)$ is the Euler totient function and $\varphi(n) = (p-1)(q-1)$
1.3: Let $\mu = \varphi(n)^{-1} \bmod n$

Algorithm 2: Encryption(M, pk)
Input: $M$ is a plaintext that is less than $n$, where $M \in \mathbb{Z}_n$
Output: $c$ is the ciphertext, where $c \in \mathbb{Z}^*_{n^2}$
1: Take $r < n$, where $\gcd(r, n) = 1$ and $r$ is a random value in $\mathbb{Z}^*_{n^2}$
2: $c = g^M \cdot r^n \bmod n^2$

Algorithm 3: Decryption(c, sk)
Input: $c$ is a ciphertext
Output: $M$ is the plaintext
1: $M = L(c^{\lambda} \bmod n^2) \cdot \mu \bmod n$, where $c < n^2$ and $L(x) = (x - 1)/n$
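Algorithms 1 to 3 can be sketched in a few lines of Python. This is an illustrative version under stated assumptions: the toy primes 61 and 53 are far too small for real use, the function names are chosen here, and `decrypt` also receives the public key so that it can read $n$.

```python
# Illustrative sketch only: Paillier with g = n + 1 and toy primes (assumptions).
import math
import secrets

def keygen(p, q):
    """Algorithm 1: returns the public key (n, g) and the secret key (lam, mu)."""
    assert p.bit_length() == q.bit_length(), "p and q must have equal bit length"
    n = p * q
    g = n + 1
    lam = (p - 1) * (q - 1)        # lambda = phi(n)
    mu = pow(lam, -1, n)           # mu = phi(n)^-1 mod n (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(m, pk):
    """Algorithm 2: c = g^m * r^n mod n^2 with a random r coprime to n."""
    n, g = pk
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # r in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c, pk, sk):
    """Algorithm 3: M = L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) / n."""
    n, _ = pk
    lam, mu = sk
    l_value = (pow(c, lam, n * n) - 1) // n
    return (l_value * mu) % n

pk, sk = keygen(61, 53)
c1, c2 = encrypt(258, pk), encrypt(220, pk)
# Additive homomorphism: multiplying ciphertexts adds the plaintexts.
c_sum = (c1 * c2) % (pk[0] ** 2)
print(decrypt(c_sum, pk, sk))  # 478
```

Multiplying two ciphertexts modulo $n^2$ yields an encryption of the sum of the plaintexts, which is the additive property the Paillier scheme provides.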
To address PHE drawbacks, we combine it with the SHE scheme described below.
3.2.2 Somewhat Homomorphic Encryption Schemes
An SHE scheme is a subclass of Homomorphic Encryption that supports both multiplicative and additive homomorphisms for a limited number of operations. The DGHV scheme, an asymmetric cryptosystem, was proposed in 2010 as the second FHE scheme (Hariss et al., 2017; Kulynych, 2015). This scheme relies on the homomorphic property but supports only a limited number of operations; it therefore provides the properties of SHE. Several parameters are necessary for the construction of this scheme. They control the number of integers in the public key and the bit-lengths of the different integers as functions of the security parameter $\lambda$. In particular, $\eta$ denotes the bit-length of the secret key, $\gamma$ the bit-length of the integers in the public key, $\rho$ the bit-length of the noise, and $\tau$ the number of integers in the public key (Coron et al., 2012; Pisa et al., 2012; Van Dijk et al., 2010; Yi et al., 2014). The scheme is described by three algorithms (Algorithms 4, 5, and 6), detailed below:
Algorithm 4: SH.keyGenerate(λ)
Input: $\lambda$ is the security parameter
Output: public key $pk = (x_0, x_1, \ldots, x_\tau)$ and private key $sk = p$
1: Choose randomly the private key $p$, an odd number with $p \in [2^{\eta-1}, 2^{\eta})$ and $\eta = \lambda^2$
2: Generate an array of integers $q_i \in \mathbb{Z}$, $q_i \in [0, 2^{\gamma})$, $q_i \neq p$, with $\gamma = \lambda^5$
3: Choose randomly $r_i \in \mathbb{Z}$ with $r_i \in (-2^{\rho}, 2^{\rho})$, where $\rho = 2\lambda$ and $i = 0, \ldots, \tau$ with $\tau = \gamma + \lambda$
4: Define $x_i = p\,q_i + r_i$
5: Relabel the $x_i$ so that $x_0$ is the largest; $x_0$ must be odd, and the remainder $x_0 \bmod p$ must be even

Algorithm 5: SH.Encryption(M, pk)
Input: $M$ is the message to encode
Output: $c$ is the encrypted message
1: Take a random subset $S \subseteq \{0, 1, \ldots, \tau\}$
2: Generate a random integer $r \in (-2^{\rho}, 2^{\rho})$
3: $c = (M + 2r + 2\sum_{i \in S} x_i) \bmod x_0$

Algorithm 6: SH.Decryption(c, sk)
Input: $c$ is the encrypted message
Output: $M$ is the original message
1: Calculate $M = (c \bmod sk) \bmod 2$
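Algorithms 4 to 6 can be sketched as follows for single-bit messages. This is a toy illustration, not the parameterization above: the constants `ETA`, `GAMMA`, `RHO`, and `TAU` are small values chosen here so that the example runs instantly, $x_0$ is built directly to be the largest element rather than relabeled, and `c mod p` is taken as a centered remainder in $(-p/2, p/2]$, as is usual for DGHV.

```python
# Illustrative sketch only: DGHV over the integers with toy, insecure sizes.
import secrets

ETA, GAMMA, RHO, TAU = 24, 40, 4, 10   # hypothetical toy parameters

def rand_noise():
    """Uniform integer in (-2^RHO, 2^RHO)."""
    return secrets.randbelow(2 ** (RHO + 1) - 1) - (2 ** RHO - 1)

def keygen():
    """Algorithm 4: secret odd p; public x_i = p*q_i + r_i with x_0 largest."""
    p = secrets.randbits(ETA - 1) | (1 << (ETA - 1)) | 1   # odd, ETA bits
    q_max = 2 ** GAMMA // p
    q0 = (secrets.randbelow(q_max // 2) + q_max // 2) | 1  # odd multiplier
    r0 = 2 * (secrets.randbelow(2 ** RHO - 1) - (2 ** (RHO - 1) - 1))  # even
    x0 = p * q0 + r0          # odd, and its centered remainder mod p is even
    xs = [p * (secrets.randbelow(q0 - 1) + 1) + rand_noise() for _ in range(TAU)]
    return [x0] + xs, p

def encrypt(bit, pk):
    """Algorithm 5: c = (M + 2r + 2 * sum of a random subset of x_i) mod x_0."""
    subset = [x for x in pk if secrets.randbits(1)]
    return (bit + 2 * rand_noise() + 2 * sum(subset)) % pk[0]

def decrypt(c, sk):
    """Algorithm 6: M = (c mod p) mod 2, using the centered remainder."""
    rem = c % sk
    if rem > sk // 2:
        rem -= sk
    return rem % 2

pk, sk = keygen()
c0, c1 = encrypt(0, pk), encrypt(1, pk)
print(decrypt(c0 + c1, sk), decrypt(c0 * c1, sk))  # 1 0  (XOR and AND of the bits)
```

Adding ciphertexts corresponds to XOR of the bits and multiplying them to AND. Each operation increases the hidden noise, which is why only a limited number of operations is possible before decryption fails.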
3.2.3 Hybrid Encryption schemes
This section details our novel hybrid encryption scheme. It is built by combining selected properties of PHE and SHE, which are described in detail in the sections above. Each of PHE and SHE has its own algorithms and parameters. The proposed approach is a vital mechanism for maintaining the confidentiality of any sensitive information contained in satellite images. It ensures better encryption and therefore increases the privacy of satellite images, which is useful in many sensitive-data applications.
The development of this new scheme reuses many of the original functions of the two HE classes. It also employs two different public keys, one for PHE and the other for SHE, generated by their respective key-generation functions.
Let the three parameters $(\lambda, p, q)$ be the inputs of HybridKeyGenerate, where $\lambda$ is the parameter used to generate the SHE key and $p$ and $q$ are used to generate the PHE key.
Let the four-tuple $(\lambda, k_1, M, k_2)$ be the input of the HybridEncryption algorithm, where $\lambda$ is the security parameter, $k_2$ is the public key of the SHE scheme, $k_1$ is the public key of PHE given by $(n, g)$, and $M$ is the plaintext.
The hybrid homomorphic encryption method is organized into several algorithms that depend on the underlying homomorphic encryption schemes.
Algorithm 7: HybridKeyGenerate(λ, p, q)
Input: λ is the secret parameter, and p and q are two large primes with the same length in binary representation.
Output: two different public keys, one generated by the PHE scheme (Algorithm 1) and the other by the SHE scheme (Algorithm 4).
1: K1 = keyGeneration(p, q)
2: K2 = SH.keyGenerate(λ)
Algorithm 8: HybridEncryption(λ, k1, M, k2)
Input: M is the plain message to encrypt
Output: c is the encrypted message, with $c \in \mathbb{Z}_{n^2}$
Take r less than n, where gcd(r, n) = 1 and r is a random value in $\mathbb{Z}_{n^2}$
1: Generate a random subset $S \subseteq \{0, 1, \ldots, \tau\}$
2: Compute: $c = \left(M + 2r + 2\sum_{i \in S} x_i\right) \bmod n^2$
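The encryption step of Algorithm 8 can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the thesis code: the list `xs` stands for the SHE public-key elements x_i and is filled with hypothetical values, the modulus n = 17 is the toy value used in the numerical application of this section, and `hybrid_encrypt_fixed` is a helper introduced here so that r and PK can be supplied explicitly.

```python
import math
import random

def hybrid_encrypt(m, xs, n, rng):
    """Sketch of Algorithm 8: c = (M + 2r + 2 * sum_{i in S} x_i) mod n^2."""
    # Draw r < n with gcd(r, n) = 1, as required by the algorithm.
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    subset = [x for x in xs if rng.random() < 0.5]  # random subset S of the x_i
    return (m + 2 * r + 2 * sum(subset)) % (n * n)

def hybrid_encrypt_fixed(m, r, pk_sum, n):
    """Deterministic variant: r and PK = sum_{i in S} x_i are given by the caller."""
    return (m + 2 * r + 2 * pk_sum) % (n * n)

rng = random.Random(0)
xs = [7, 11, 21]   # hypothetical x_i values, for illustration only
n = 17             # toy modulus from the numerical application
c = hybrid_encrypt(200, xs, n, rng)
print(0 <= c < n * n)                        # True: c lies in Z_{n^2}
print(hybrid_encrypt_fixed(258, 53, 39, n))  # 153
```

The randomized function shows where r and S enter; the deterministic helper makes it possible to reproduce a ciphertext from known r and PK.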
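Let $c_1$ and $c_2$ be the two ciphertexts resulting from HybridEncryption($M_1$) and HybridEncryption($M_2$), respectively, $r_1$ and $r_2$ two random values in $\mathbb{Z}_{n^2}$, and $PK_1$ and $PK_2$ two different public keys. We show that the properties of Homomorphic Encryption are preserved in the hybrid approach.
Firstly, add two ciphertexts, where $c = c_1 + c_2$, $c = (M + 2r + pPK) \bmod n^2$, and $PK = \sum_{i \in S} x_i$. We can see that:
$$
\begin{aligned}
c \bmod n^2 &= (c_1 + c_2) \bmod n^2 \\
&= \left((M_1 + 2r_1 + pPK_1) \bmod n^2 + (M_2 + 2r_2 + pPK_2) \bmod n^2\right) \bmod n^2 \\
&= \left[\left[(M_1 + 2r_1 + pPK_1) + (M_2 + 2r_2 + pPK_2)\right] \bmod n^2\right] \bmod n^2 \\
&= \left[\left[(M_1 + M_2) + 2(r_1 + r_2) + p(PK_1 + PK_2)\right] \bmod n^2\right] \bmod n^2 \\
&= \left[(M + 2r + pPK) \bmod n^2\right] \bmod n^2
\end{aligned}
$$
with $M = (M_1 + M_2)$, $r = (r_1 + r_2)$, and $PK = (PK_1 + PK_2)$.
Therefore,
$$
\text{HybridEncryption}(M) \bmod n^2 = \left(\text{HybridEncryption}(M_1) + \text{HybridEncryption}(M_2)\right) \bmod n^2 \tag{3.1}
$$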
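A numerical application:
Let $M_1 = 258$, $r_1 = 53$, $PK_1 = 39$, $M_2 = 220$, $r_2 = 101$, $PK_2 = 30$, and $n = 17$, where $p = 2$, $\gcd(53, 17) = 1$, and $\gcd(101, 17) = 1$.
$$
\begin{aligned}
\text{HybridEncryption}(M_1) &= (M_1 + 2r_1 + pPK_1) \bmod n^2 \\
&= (258 + 2 \times 53 + 2 \times 39) \bmod 17^2 \\
&= 442 \bmod 289 \\
&= 153
\end{aligned}
$$
$$
\begin{aligned}
\text{HybridEncryption}(M_2) &= (M_2 + 2r_2 + pPK_2) \bmod n^2 \\
&= (220 + 2 \times 101 + 2 \times 30) \bmod 17^2 \\
&= 482 \bmod 289 \\
&= 193
\end{aligned}
$$
$$
\begin{aligned}
\text{HybridEncryption}(M) \bmod n^2 &= \left[\left[(M_1 + M_2) + 2(r_1 + r_2) + p(PK_1 + PK_2)\right] \bmod n^2\right] \bmod n^2 \\
&= \left[\left[(258 + 220) + 2(53 + 101) + 2(39 + 30)\right] \bmod 17^2\right] \bmod 17^2 \\
&= \left[\left[478 + 308 + 138\right] \bmod 289\right] \bmod 289 = 57
\end{aligned}
$$
$$
\left(\text{HybridEncryption}(M_1) + \text{HybridEncryption}(M_2)\right) \bmod n^2 = \left[(153 + 193) \bmod 17^2\right] \bmod 17^2 = (346 \bmod 289) \bmod 289 = 57
$$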
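The additive identity (3.1) and its numerical application can be re-checked with a short script. This is only a sanity check under the assumption that r and PK are supplied explicitly; the helper name `enc` is introduced here for illustration.

```python
import random

def enc(m, r, pk_sum, n, p=2):
    """c = (M + 2r + p*PK) mod n^2, with r and PK given explicitly."""
    return (m + 2 * r + p * pk_sum) % (n * n)

n = 17
c1 = enc(258, 53, 39, n)
c2 = enc(220, 101, 30, n)
combined = enc(258 + 220, 53 + 101, 39 + 30, n)
print(c1, c2)                          # 153 193
print(combined, (c1 + c2) % (n * n))   # 57 57

# The same identity on random inputs (values chosen arbitrarily).
rng = random.Random(0)
for _ in range(1000):
    m1, m2 = rng.randrange(256), rng.randrange(256)
    r1, r2 = rng.randrange(1, 200), rng.randrange(1, 200)
    k1, k2 = rng.randrange(1, 100), rng.randrange(1, 100)
    assert enc(m1 + m2, r1 + r2, k1 + k2, n) == (enc(m1, r1, k1, n) + enc(m2, r2, k2, n)) % (n * n)
```

The random loop passes because reduction modulo n^2 commutes with integer addition, which is exactly the argument of the derivation above.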
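Secondly, when taking the product of two ciphertexts:
$$
\begin{aligned}
c \bmod n^2 &= (c_1 \times c_2) \bmod n^2 \\
&= \left[(M_1 + 2r_1 + pPK_1) \bmod n^2 \times (M_2 + 2r_2 + pPK_2) \bmod n^2\right] \bmod n^2 \\
&= \left[\left[(M_1 + 2r_1 + pPK_1) \times (M_2 + 2r_2 + pPK_2)\right] \bmod n^2\right] \bmod n^2 \\
&= \big[\big[M_1 \times M_2 + M_1 \times 2r_2 + M_1 \times pPK_2 + 2r_1 \times M_2 + 2r_1 \times 2r_2 + 2r_1 \times pPK_2 \\
&\qquad + pPK_1 \times M_2 + pPK_1 \times 2r_2 + pPK_1 \times pPK_2\big] \bmod n^2\big] \bmod n^2 \\
&= \Big[\Big[(M_1 \times M_2) + 2(M_1 \times r_2 + M_2 \times r_1 + 2 \times r_1 \times r_2) \\
&\qquad + (pPK_1 \times pPK_2)\Big(1 + \frac{M_1 + 2r_1}{pPK_1} + \frac{M_2 + 2r_2}{pPK_2}\Big)\Big] \bmod n^2\Big] \bmod n^2 \\
&= \Big[\Big[(M_1 \times M_2) + 2(M_1 \times r_2 + M_2 \times r_1 + 2 \times r_1 \times r_2) \\
&\qquad + (PK_1 \times PK_2)\Big[p^2\Big(1 + \frac{M_1 + 2r_1}{pPK_1} + \frac{M_2 + 2r_2}{pPK_2}\Big)\Big]\Big] \bmod n^2\Big] \bmod n^2 \\
&= \left[(M + 2r + pPK) \bmod n^2\right] \bmod n^2
\end{aligned}
$$
with $M = M_1 \times M_2$, $r = (M_1 \times r_2 + M_2 \times r_1 + 2 \times r_1 \times r_2)$, $PK = PK_1 \times PK_2$, and
$$
p = p^2\Big(1 + \frac{M_1 + 2r_1}{pPK_1} + \frac{M_2 + 2r_2}{pPK_2}\Big) \bmod n^2
$$
We deduce the following formula:
$$
\text{HybridEncryption}(M) \bmod n^2 = \left(\text{HybridEncryption}(M_1) \times \text{HybridEncryption}(M_2)\right) \bmod n^2 \tag{3.2}
$$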
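A numerical application:
We know that HybridEncryption($M_1$) = 153 and HybridEncryption($M_2$) = 193.
$$
\begin{aligned}
\text{HybridEncryption}(M) \bmod n^2 &= \Big[\Big[(M_1 \times M_2) + 2(M_1 \times r_2 + M_2 \times r_1 + 2 \times r_1 \times r_2) \\
&\qquad + (PK_1 \times PK_2)\Big[p^2\Big(1 + \frac{M_1 + 2r_1}{pPK_1} + \frac{M_2 + 2r_2}{pPK_2}\Big)\Big]\Big] \bmod n^2\Big] \bmod n^2 \\
&= \Big[\Big[(258 \times 220) + 2(258 \times 101 + 220 \times 53 + 2 \times 53 \times 101) \\
&\qquad + (39 \times 30)\Big[2^2\Big(1 + \frac{258 + 2 \times 53}{2 \times 39} + \frac{220 + 2 \times 101}{2 \times 30}\Big)\Big]\Big] \bmod 17^2\Big] \bmod 17^2 \\
&= \left[\left[56760 + 96848 + 59436\right] \bmod 289\right] \bmod 289 \\
&= \left[213044 \bmod 289\right] \bmod 289 \\
&= 51
\end{aligned}
$$
$$
\left(\text{HybridEncryption}(M_1) \times \text{HybridEncryption}(M_2)\right) \bmod n^2 = (153 \times 193) \bmod 17^2 = 29529 \bmod 289 = 51
$$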
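The arithmetic of this numerical application can be reproduced exactly. The sketch below only re-evaluates the numbers given in the text; it uses Python's `fractions.Fraction` so that the two quotients in the expanded form are computed without rounding. The variable names are chosen here for illustration.

```python
from fractions import Fraction

n, p = 17, 2
M1, r1, PK1 = 258, 53, 39
M2, r2, PK2 = 220, 101, 30

c1 = (M1 + 2 * r1 + p * PK1) % n**2
c2 = (M2 + 2 * r2 + p * PK2) % n**2

# Third term of the expanded product, evaluated with exact fractions.
third = PK1 * PK2 * p**2 * (1 + Fraction(M1 + 2 * r1, p * PK1) + Fraction(M2 + 2 * r2, p * PK2))
expanded = M1 * M2 + 2 * (M1 * r2 + M2 * r1 + 2 * r1 * r2) + third

print(c1, c2)                  # 153 193
print(third, expanded)         # 59436 213044
print(int(expanded) % n**2)    # 51
print((c1 * c2) % n**2)        # 51
```

Both routes give 51, matching the values reported above.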
The two formulas derived above show that the proposed hybrid approach has homomorphic properties. This hybrid approach therefore offers more security than PHE or SHE used alone, while decreasing computation time thanks to the use of a shorter bit length.
The proposed hybrid approach is a new cryptographic scheme that addresses the limitations of the PHE and SHE schemes. When an image is encrypted with this approach, its privacy and features are preserved. In effect, users can upload their dataset to the cloud without fearing that sensitive information will be exposed to the cloud server. Hybrid privacy-preserving deep learning has the same advantage as homomorphic encryption, which allows CNN models to learn image features.
3.2.4 Image matrix encrypted by the proposed hybrid approach
Each image is represented as a two-dimensional array, or matrix, of dimensions M × N. The image matrix is therefore an arrangement of pixel values, each with its own position.
The steps for encrypting an image are as follows. First, the image is converted to a matrix. Then, each pixel value is encrypted by applying the approach described in Section 3.2.3. Finally, the new image is generated under the same name as the plain image. The plain image matrix below is an example of an original image matrix; its encrypted counterpart is obtained by applying our approach to each pixel value.
$$
\text{Plain image matrix} =
\begin{pmatrix}
P_{11} & P_{12} & P_{13} & P_{14} & P_{15} & P_{16} & \cdots & P_{1N} \\
P_{21} & P_{22} & P_{23} & P_{24} & P_{25} & P_{26} & \cdots & P_{2N} \\
P_{31} & P_{32} & P_{33} & P_{34} & P_{35} & P_{36} & \cdots & P_{3N} \\
\vdots & & & & & & & \vdots \\
P_{M1} & P_{M2} & P_{M3} & P_{M4} & P_{M5} & P_{M6} & \cdots & P_{MN}
\end{pmatrix}
$$
The encrypted pixel value is given by Equation 3.4, and the process of encrypting this value is depicted in Figure 3-2, where P is a pixel value.
$$
c = \Big(P + 2r + 2\sum_{i \in S} x_i\Big) \bmod n^2 \tag{3.4}
$$
Taking M = 256 and N = 256 (256 × 256 images) as an example, the encrypted image is obtained by applying the hybrid approach to each pixel, so that each plain pixel value has a corresponding encrypted pixel value. For instance, $c_{11}$ is the encrypted value of $P_{11}$. Thus, each entry of the cipher matrix is the encrypted value of the corresponding entry of the plain image matrix.
Figure 3-2: Steps for encrypting a pixel value
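$$
\text{Cipher matrix} =
\begin{pmatrix}
c_{1,1} & c_{1,2} & c_{1,3} & c_{1,4} & c_{1,5} & c_{1,6} & \cdots & c_{1,256} \\
c_{2,1} & c_{2,2} & c_{2,3} & c_{2,4} & c_{2,5} & c_{2,6} & \cdots & c_{2,256} \\
c_{3,1} & c_{3,2} & c_{3,3} & c_{3,4} & c_{3,5} & c_{3,6} & \cdots & c_{3,256} \\
\vdots & & & & & & & \vdots \\
c_{256,1} & c_{256,2} & c_{256,3} & c_{256,4} & c_{256,5} & c_{256,6} & \cdots & c_{256,256}
\end{pmatrix}
$$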
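The pixel-by-pixel procedure that maps the plain matrix to the cipher matrix can be sketched as follows. This is an illustration under assumptions, not the thesis code: the image is a plain nested list, the modulus `n` and the values `xs` (standing for the x_i of the SHE public key) are small hypothetical numbers, and fresh randomness is drawn for every pixel.

```python
import math
import random

def encrypt_pixel(pixel, xs, n, rng):
    """Equation 3.4 for one pixel value P: c = (P + 2r + 2 * sum_{i in S} x_i) mod n^2."""
    while True:
        r = rng.randrange(1, n)          # r < n ...
        if math.gcd(r, n) == 1:          # ... with gcd(r, n) = 1
            break
    subset_sum = sum(x for x in xs if rng.random() < 0.5)  # sum over a random subset S
    return (pixel + 2 * r + 2 * subset_sum) % (n * n)

def encrypt_image(plain, xs, n, rng):
    """Apply encrypt_pixel to every entry, keeping the M x N layout."""
    return [[encrypt_pixel(p, xs, n, rng) for p in row] for row in plain]

rng = random.Random(42)
n = 61 * 53                                    # hypothetical small modulus
xs = [rng.randrange(1, n) for _ in range(8)]   # hypothetical x_i values
plain = [[0, 64, 128], [192, 255, 17]]         # a 2 x 3 "image"
cipher = encrypt_image(plain, xs, n, rng)
print(len(cipher), len(cipher[0]))             # 2 3
```

The cipher matrix keeps the shape of the plain matrix, and every entry lies in the range [0, n^2).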
Once the new (encrypted) satellite image is generated, it is unintelligible, even to the owner of the data, although all its properties are conserved without any loss of information. Hence, the main advantage of this approach is that DL methods can still learn the image features.
3.2.5 Pre-trained models
The main objective of this research is to ensure the privacy of satellite images when using Deep Learning as a Service (public DL models). We therefore chose pre-trained models to test the performance of the proposed hybrid encryption when applied to the satellite image dataset.
Pre-trained models (PTMs) are models developed by researchers or large companies and stored with their weights after training. In general, these models are very deep neural networks: they have a large number of layers and, as another significant feature, they are trained on very large datasets (e.g., ImageNet). These networks therefore facilitate the solution of image classification problems. PTMs demonstrate powerful generalizability to images beyond the ImageNet dataset through transfer learning (TL).
In practice, this amounts to fine-tuning the PTMs. In (Petrovska et al., 2018), the authors detail three different ways in which this is achieved:
- Feature extraction: A pre-trained network is used as a feature extraction mechanism. The output layer is removed and, for the new dataset, the rest of the network is used as a fixed feature extractor.
- Use the architecture of the PTM: Only the architecture of the pre-trained network is reused. All weights are initialized randomly, and the network is trained on the dataset of interest.
- Train some layers while freezing others: The PTM is partially trained. The weights of the lowest layers of the model remain unchanged, whereas the upper layers of the model are re-trained.
The transfer learning models used in this research to evaluate our approach are described in Chapter 2, Section 2.3.3. To adapt these models to our dataset, we freeze 100 layers, remove the last dense layer, and add our own layers, which can learn increasingly complex shapes. In this way, the models can adapt to an encrypted image dataset that they never saw during their original training.
Table 3-1 lists the newly added layers (convolutional, ReLU activation function, dropout, global average pooling, fully connected, Softmax activation function). Each model was trained on encrypted images. The class prediction is provided by the Softmax activation function placed in the final fully connected layer.
Table 3-1: Layer stacking added on each model
| Layers | Output Shape | Dropout Rate |
|---|---|---|
| Convolutional | 128 × 128 × 3 | - |
| Activation (ReLU) | - | - |
| Dropout | - | 0.2 |
| Global Average Pooling | 128 × 1 | - |
| Fully Connected | 4 | - |
| Activation (Softmax) | - | - |
3.3 Conclusion
The proposed hybrid privacy-preserving deep learning approach is the first cryptosystem developed on the basis of fundamental properties of two Homomorphic Encryption approaches. On the one hand, it aims to overcome the limitations of PHE and SHE at the same time. On the other hand, it aims to secure satellite images when they are uploaded to the cloud. It is a type of cryptography that can open the door to further research and to use in other areas.
This chapter first described the two related PPDL approaches, each of which has specific characteristics. It then detailed the proposed hybrid PPDL, which comprises two algorithms (Algorithms 7 and 8), and derived two equations with their demonstration and numerical application. After that, the image encryption steps were presented. The last section of this chapter explained the pre-trained models and their goals.
The following chapter presents the study regions, the dataset description, the experimental results, and their analysis. It describes the steps used to evaluate the proposed approach with several metrics and compares the results obtained on the cipher and plain datasets when they are used to train the pre-trained models.
Chapter 4: Experimental Results and Analysis
Contents
4.1 Introduction
4.2 Study regions and dataset
4.2.1 Study region
4.2.2 Dataset
4.3 Experimental Set-Up
4.4 Metrics
4.4.1 Accuracy
4.4.2 Precision
4.4.3 Recall
4.4.4 F1_score
4.5 Results
4.5.1 Image Encryption
4.5.2 Data augmentation
4.5.3 Application of transfer learning models
4.5.4 Evaluation based on Security Parameters
4.5.5 Discussion
4.6 Conclusion
4.1 Introduction
This chapter presents the experiments. We first describe the study regions and the dataset. Next, we present the experimental hardware and software. Then, we detail the results of our experiments.
4.2 Study regions and dataset
This section describes the study areas and the dataset, and provides the distribution of the dataset between the training and validation stages.
4.2.1 Study region
Figure 4-1 depicts the study area, which covers seven cities and regions in Saudi Arabia. The first city is Al Madinah, located in western Saudi Arabia; it is the second holiest city in Islam after Mecca and has a significant cultural and historical heritage. The second is Riyadh, the capital of Saudi Arabia, which is considered one of the fastest-growing areas in the Middle East. The third is Jeddah, the second-largest city in the western region; it lies at the foot of the lower Hijaz Mountains, on the Red Sea coast. The fourth is Al Qassim, the wealthiest region per capita in the country, the seventh most populated area in Saudi Arabia, and the fifth most densely populated region. The fifth is Al Qatif, an urban area and one of the ancient territories of Eastern Arabia. The sixth is Hail, an agricultural area located in north-western Saudi Arabia. Finally, the seventh is Dammam, the sixth most populated city in Saudi Arabia, the capital of the Eastern Province, and the center of the Saudi oil industry.
Figure 4-1: Study area
4.2.2 Dataset
This section presents the description and the splitting (between training and validation for the TL models) of the dataset used in our experiments.
4.2.2.1 Dataset Description
In this study, experiments are performed on several real-world high-resolution satellite images acquired by two SPOT (Satellite pour l'Observation de la Terre) satellites, SPOT6 and SPOT7. These images have 2-m resolution multispectral bands and a 0.5-m resolution panchromatic band. The satellite images used in this study were corrected for radiometry, sensor distortions, and acquisition effects. Additionally, these images were orthorectified to eliminate perspective effects on the ground. To prepare our dataset, the satellite images of the seven regions are split into non-overlapping blocks of 256x256. A semantic segmentation based on previous works (Boulila, 2019; Boulila, Ghandorh, et al., 2021; Ghandorh et al., 2022) was performed to extract four land cover types from these images, namely building, vegetation, road, and bare soil. Images of each land cover type are grouped into a separate folder. Figure 4-2 shows sample images, where white indicates the land cover category of interest and black denotes the other categories.
Figure 4-2: Sample of images. Panels: Building (green buildings, industries, and modern constructions); Vegetation (trees, agricultural areas, and vegetation areas); Bare soil (desert, open land, and sandy soil); Road (large, private, and low-capacity roads).
4.2.2.2 Dataset Splitting
Experiments are carried out on a dataset comprising 28,776 satellite images of size 224x224 pixels, each containing only one land cover type. As shown in Figure 4-3, the dataset is divided into 90% for training and 10% for validation. Table 4-1 gives the number of images in each category.
Table 4-1: Number of samples of each land cover type.
| Land Cover Type | No. of Training Samples | No. of Validation Samples |
|---|---|---|
| BareSoil | 6784 | 754 |
| Building | 7541 | 838 |
| Road | 5411 | 601 |
| Vegetation | 6162 | 685 |
Figure 4-3: Dataset splitting
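The counts in Table 4-1 can be cross-checked against the stated dataset size and the 90%/10% split. The snippet below is a small consistency check; the dictionary layout is chosen here for illustration.

```python
# (training, validation) counts copied from Table 4-1
samples = {
    "BareSoil": (6784, 754),
    "Building": (7541, 838),
    "Road": (5411, 601),
    "Vegetation": (6162, 685),
}

train_total = sum(train for train, _ in samples.values())
val_total = sum(val for _, val in samples.values())
total = train_total + val_total

print(train_total, val_total, total)       # 25898 2878 28776
print(round(100 * val_total / total, 1))   # 10.0 (percent used for validation)
```

The totals add up to the 28,776 images mentioned in the text, and the validation share is 10% overall as well as, to within rounding, for each class.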
4.3 Experimental Set-Up
Experiments are run on a server with a 64-bit operating system, an x64-based processor, a
2.30GHz Intel(R) Xeon(R) Gold 5218 CPU, and 512 GB RAM. The server has eight NVIDIA
Quadro RTX 8000 GPUs, each with 48 GB of memory, and runs Ubuntu 18.04. Python 3.7 is
used to program the DL networks. The Keras 2.6 library and the TensorFlow-GPU 2.3 backend
were both used.
4.4 Metrics
We assess the performance of our proposed hybrid PPDL by applying several transfer learning models to the satellite dataset described above, before (plain images) and after encryption. The effectiveness of our PPDL method is demonstrated using several metrics, namely accuracy, precision, recall, and F1_score. These metrics are based on the following four quantities, defined for each class C:
- TP (True Positives): Number of images correctly predicted as belonging to class C.
- TN (True Negatives): Number of images of other classes correctly predicted as not belonging to class C.
- FP (False Positives): Number of images wrongly predicted as belonging to class C.
- FN (False Negatives): Number of images of class C predicted as belonging to other classes.
4.4.1 Accuracy
In classification problems, accuracy is the most commonly used metric for evaluating performance and comparing models. It is the ratio of the number of correctly classified images to the total number of images in the dataset:
$$
Acc = \frac{TP + TN}{TP + TN + FP + FN} \tag{4.1}
$$
The accuracy is measured on the whole dataset, while the three following metrics are measured against a specific class C.
4.4.2 Precision
Precision is the ratio of the number of images correctly classified as class C to the number of images predicted as belonging to class C:
$$
Precision = \frac{TP}{TP + FP} \tag{4.2}
$$
4.4.3 Recall
Recall is the ratio of the number of images correctly classified as class C to the number of images actually belonging to class C:
$$
Recall = \frac{TP}{TP + FN} \tag{4.3}
$$
4.4.4 F1_score
The F1 score is the harmonic mean of precision and recall:
$$
F1 = \frac{2 \times Precision \times Recall}{Precision + Recall} \tag{4.4}
$$
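Equations 4.1 to 4.4 can be computed directly from a confusion matrix. The sketch below assumes the convention `cm[i][j]` = number of images of true class i predicted as class j; the function names and the 2-class matrix are invented for illustration and are not results from this study.

```python
def class_counts(cm, k):
    """Return (TP, TN, FP, FN) for class k."""
    total = sum(map(sum, cm))
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                    # class-k images predicted as another class
    fp = sum(row[k] for row in cm) - tp     # other images predicted as class k
    return tp, total - tp - fn - fp, fp, fn

def class_metrics(cm, k):
    """Accuracy (4.1), precision (4.2), recall (4.3) and F1 (4.4) for class k."""
    tp, tn, fp, fn = class_counts(cm, k)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

def weighted_average(cm):
    """Precision, recall and F1 averaged with the number of images per class as weights."""
    total = sum(map(sum, cm))
    sums = [0.0, 0.0, 0.0]
    for k, row in enumerate(cm):
        _, p, r, f = class_metrics(cm, k)
        for idx, value in enumerate((p, r, f)):
            sums[idx] += value * sum(row) / total
    return tuple(sums)

cm = [[8, 2], [1, 9]]   # invented confusion matrix
print(class_metrics(cm, 0))
print(weighted_average(cm))
```

The weighted average mirrors the class-size weighting used for the summary table of the four CNN models later in this chapter.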
4.5 Results
In this section, we evaluate the performance of our hybrid PPDL. Our experiments are split into three parts. In the first part, we present the encrypted images. In the second part, we analyze the results of the different DL models after training on plain and encrypted data, using the metrics described above. In the last part, we use several security parameters to evaluate the encrypted images in terms of security.
4.5.1 Image Encryption
This part shows some of the images encrypted with the proposed scheme, in which no information can be seen. Figure 4-4 illustrates a sample of original and encrypted images. The encrypted images are obtained using the proposed hybrid encryption algorithm (Algorithm 8), which was shown to be efficient and secure. The CNN model is then trained and tested without visual information. The original image is encrypted with a hybrid public key (both public keys), generated by the HybridKeyGenerate algorithm (Algorithm 7).
Figure 4-4: Four land cover images (bare soil, building, road, and vegetation) and their encrypted images. Rows: original image, hybrid public key, encrypted image.
4.5.2 Data augmentation
Data augmentation (DA) refers to a series of techniques used for enlarging the training dataset without gathering more data. Most methods either add slightly modified copies of existing data or create synthetic data. When training ML models, the augmented data acts as a regularizer and reduces overfitting (Hernández-García & König, 2018; Shorten & Khoshgoftaar, 2019). DA techniques include rotation, horizontal and vertical shift, and zoom. These techniques improve the efficiency of convolutional neural networks (Kassani & Kassani, 2019).
For this study, we used the following data augmentation strategies:
- A rotation range of 90 degrees
- A zoom range and a shear range of 20%
- A brightness range of [0.2, 1.0]
- A shift range of 20% in height and width
- A horizontal flip and a vertical flip
4.5.3 Application of transfer learning models
The goal of this section is to evaluate the performance of transfer learning models on both plain and encrypted satellite images. Four different transfer learning models are considered, namely DenseNet169, MobileNetV2, InceptionV3, and ResNet50, all pre-trained on ImageNet.
Figure 4-5 depicts the model architecture used. We froze the first 100 layers of each pre-trained model, removed their final dense layers, and replaced them with a convolutional layer (128 filters, 3x3 kernel, 1x1 stride, ReLU activation function), a dropout layer (rate 0.2), a global average pooling layer (output shape 128x1), and a fully connected layer (128 inputs and 4 output neurons, with a softmax activation function). We trained each model for 200 epochs on both the plain and encrypted datasets.
Figure 4-5: Model architecture used for training on the plain and encrypted datasets.
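The architecture described above can be sketched with the Keras functional API. This is a hedged illustration, not the code used in the experiments: MobileNetV2 is picked as one of the four backbones, the function name `build_classifier` is introduced here, and the convolution padding (Keras default), optimizer, and loss are assumptions that the text does not specify.

```python
import tensorflow as tf

def build_classifier(weights="imagenet", num_classes=4, frozen_layers=100):
    """Pre-trained backbone without its top, plus the added layers described in the text."""
    base = tf.keras.applications.MobileNetV2(
        include_top=False, weights=weights, input_shape=(224, 224, 3))
    for layer in base.layers[:frozen_layers]:
        layer.trainable = False                      # freeze the first 100 layers
    x = tf.keras.layers.Conv2D(128, (3, 3), strides=(1, 1), activation="relu")(base.output)
    x = tf.keras.layers.Dropout(0.2)(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)  # one value per filter: 128 features
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs=base.input, outputs=outputs)

# weights=None avoids downloading the ImageNet weights in this sketch;
# the experiments in the text start from ImageNet weights.
model = build_classifier(weights=None)
# Optimizer and loss are assumptions made for the sake of a complete example.
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
print(model.output_shape)
```

Swapping `MobileNetV2` for `DenseNet169`, `InceptionV3`, or `ResNet50` from `tf.keras.applications` gives the other three variants, each with the same added head.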
Figure 4-6 shows the evolution of the validation accuracy during training for these four models on both the plain and encrypted datasets. The models are slower to converge on the encrypted dataset, but the gap in accuracy is gradually reduced. InceptionV3 converges faster than the three other models on both datasets. This suggests that its pre-trained weights on ImageNet are incidentally closer to the optimal weight values for our dataset. Figure 4-7 shows the confusion matrices between the real and predicted classes for each model on the plain and encrypted datasets. For the four models, the encryption introduced some confusion between the road and vegetation classes (between 14% and 22% of misclassifications), while these two classes were well distinguished on the plain dataset (misclassification rate between 0.2% and 2%). For ResNet50 and InceptionV3, the encryption also led to higher confusion between the bare soil and building classes (misclassification rates of 15% and 6%, respectively, compared to 7% and 3% on the plain images). This suggests that the encryption process alters the distance between class features. Nevertheless, the loss in overall accuracy for the four models between the plain and encrypted datasets remains limited (from 2.0 to 3.5 percentage points). This shows an acceptable trade-off between the usefulness and the confidentiality of the data.
Figure 4-6: Evolution of the validation accuracy during the training of plain and encrypted images for the 4 CNN architectures.
Figure 4-7: Confusion matrices obtained for the 4 different CNN architectures (from top to bottom: ResNet50, InceptionV3, DenseNet169, MobileNetV2), on the validation set, for both the plain (left) and encrypted (right) datasets.
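Table 4-2: Performance of the 4 CNN models on the validation set of plain and encrypted data. Precision, recall, and F1-score are weighted averages over the 4 classes.
| CNN model | Precision (Plain) | Precision (Enc) | Precision (Loss) | Recall (Plain) | Recall (Enc) | Recall (Loss) | F1-score (Plain) | F1-score (Enc) | F1-score (Loss) | Nb. of param. | Nb. of operations (FLOPs) | Inference time per image (ms) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ResNet50 | 88.3% | 84.5% | 3.8% | 86.0% | 84.0% | 2.0% | 85.8% | 83.9% | 1.9% | 25.9M | 51.8M | 1.9 |
| InceptionV3 | 92.8% | 91.1% | 1.7% | 92.0% | 89.3% | 2.7% | 92.1% | 89.0% | 3.1% | 24.1M | 48.3M | 1.8 |
| DenseNet169 | 96.0% | 93.1% | 2.9% | 95.5% | 92.0% | 3.5% | 95.6% | 91.8% | 3.8% | 14.6M | 29.1M | 2.3 |
| MobileNetV2 | 94.1% | 90.6% | 3.5% | 93.6% | 88.4% | 5.2% | 93.7% | 88.5% | 5.2% | 3.7M | 7.4M | 1.6 |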
Table 4-2 reports several performance metrics on the validation set for both plain and encrypted images. Precision, recall, and F1-score are averaged, with the number of images in each class as weights. On all three metrics, and for both plain and encrypted datasets, DenseNet169 shows the highest performance, while ResNet50 shows the lowest. Nonetheless, this comes at the cost of a lower inference speed for DenseNet169 (44% and 21% slower than MobileNetV2 and ResNet50, respectively). MobileNetV2 has the fastest inference speed thanks to its reduced architecture, designed to run on mobile devices with limited computing capabilities. The inference speed does not depend on the type of images (plain or encrypted) as long as they have the same input size (224x224). The inference time depends on the number of floating-point operations (FLOPs) and on other factors, such as how the operations of each CNN architecture are parallelized on the GPU. This explains why DenseNet169 (which has the largest number of layers among the four networks) requires fewer operations but a longer inference time. On the other hand, the maximum loss in average precision, recall, and F1-score when moving from the plain to the encrypted images is 3.8, 5.2, and 5.2 percentage points, respectively, which is still an acceptable range, especially for applications where data privacy is critical.
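The relative slowdowns and maximum losses quoted in this paragraph follow from the values in Table 4-2. The snippet below recomputes them; the dictionary names are chosen here for illustration.

```python
# Inference time per image (ms), from Table 4-2
inference_ms = {"ResNet50": 1.9, "InceptionV3": 1.8, "DenseNet169": 2.3, "MobileNetV2": 1.6}

def slowdown_percent(slow, fast):
    """How much longer (in %) one image takes with `slow` than with `fast`."""
    return 100 * (inference_ms[slow] / inference_ms[fast] - 1)

print(round(slowdown_percent("DenseNet169", "MobileNetV2")))  # 44
print(round(slowdown_percent("DenseNet169", "ResNet50")))     # 21

# (plain, encrypted) weighted averages in %, from Table 4-2
scores = {
    "precision": {"ResNet50": (88.3, 84.5), "InceptionV3": (92.8, 91.1),
                  "DenseNet169": (96.0, 93.1), "MobileNetV2": (94.1, 90.6)},
    "recall": {"ResNet50": (86.0, 84.0), "InceptionV3": (92.0, 89.3),
               "DenseNet169": (95.5, 92.0), "MobileNetV2": (93.6, 88.4)},
    "f1": {"ResNet50": (85.8, 83.9), "InceptionV3": (92.1, 89.0),
           "DenseNet169": (95.6, 91.8), "MobileNetV2": (93.7, 88.5)},
}
max_loss = {metric: round(max(plain - enc for plain, enc in models.values()), 1)
            for metric, models in scores.items()}
print(max_loss)  # {'precision': 3.8, 'recall': 5.2, 'f1': 5.2}
```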
Figure 4-8 summarizes the performance of the four CNN models in terms of accuracy
and speed and the precision, recall, and F1-score per class. It is also clear that DenseNet169
provides the best overall performance on both plain and encrypted images, except for inference
speed, while MobileNetV2 offers a good trade-off between accuracy and speed. We notice that
the classes are not equally affected by the encryption process, as shown in Figure 4-8.
Figure 4-8: Summary of the results of the four CNN models. The relative FPS corresponds to the number of frames per second divided by the maximum obtained value (632 for MobileNetV2). The color of inner sectors (representing algorithms) corresponds to the average colors of outer sectors belonging to them. The lighter the color, the better the results.
4.5.4 Evaluation based on Security Parameters
This section explains each of the parameters used to evaluate our proposed hybrid approach and presents a comparison with the PHE scheme used in (Alkhelaiwi et al., 2021).
A. Correlation coefficient
The correlation coefficient (CC) is the most fundamental method used to determine the similarity between two images. The CC takes values between -1.0000 and 1.0000: -1.0000 indicates a perfect negative correlation, 1.0000 indicates a perfect positive correlation, and 0.0000 indicates no linear association between the two images (Ganti, 2020) (Xue et al., 2021).
$$
CC = \frac{\frac{1}{N}\sum_{j=1}^{N}\left(x_j - E(x)\right)\left(y_j - E(y)\right)}{\sqrt{\frac{1}{N^2}\sum_{j=1}^{N}\left(x_j - E(x)\right)^2 \sum_{j=1}^{N}\left(y_j - E(y)\right)^2}} \tag{4.5}
$$
where $E(x) = \frac{1}{N}\sum_{j=1}^{N} x_j$, $x$ and $E(x)$ refer to the plain image, and $y$ and $E(y)$ refer to the encrypted image. The CC between a plain image and its corresponding encrypted image from the satellite image dataset is close to 0. This result shows the efficiency of our proposed encryption scheme: the features of the plain and encrypted images have no similarity.
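A minimal sketch of the correlation coefficient follows, assuming both images are given as flattened lists of pixel values of equal length; the function name and the small test vectors are illustrative only.

```python
import math

def correlation_coefficient(x, y):
    """Correlation coefficient between two equally long pixel sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov / math.sqrt(var_x * var_y)

print(correlation_coefficient([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
print(correlation_coefficient([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0
```

For a plain image and a well-encrypted version of it, the value is expected to be close to 0 rather than close to either extreme.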
B. Entropy
Entropy measures the unpredictability and randomness of information; it is a degree of uncertainty. A large entropy value therefore indicates a higher level of security when assessing image encryption. Usually, when this value is very close to the ideal value of 8, the encryption is considered safe against brute-force attacks. The entropy is calculated as follows (Ahmad & Hwang, 2016):
$$
H[X] = -\sum_{i=0}^{n} P(x_i)\log_2 P(x_i) \tag{4.6}
$$
where $P(x_i)$ is the probability of the symbol $x_i$. The values presented in Table 4-3 show that an information entropy attack cannot deduce any information about the plaintext image.
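The entropy of an image can be estimated from the histogram of its pixel values. The sketch below assumes 8-bit pixels given as a flat list; the function name is chosen here for illustration.

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Entropy in bits, with P(x_i) estimated as the relative frequency of each value."""
    total = len(pixels)
    return sum(-(count / total) * math.log2(count / total)
               for count in Counter(pixels).values())

print(shannon_entropy(list(range(256))))   # 8.0: every 8-bit value equally likely
print(shannon_entropy([128] * 256))        # 0.0: a constant image carries no uncertainty
```

A uniform distribution over the 256 gray levels reaches the ideal value of 8 mentioned above, which is why values close to 8 are read as a sign of good encryption.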
C. Mean Square Error
The Mean Square Error (MSE) is the cumulative squared error between the pixels of the plain and cipher images. A larger MSE value indicates a greater difference between the plain image and the processed image, while a low value indicates a small error (Ahmad & Hwang, 2016). Its formula is given by:
$$
MSE = \frac{1}{W \times H}\sum_{i=0}^{W-1}\sum_{j=0}^{H-1}\left(O(i,j) - C(i,j)\right)^2 \tag{4.7}
$$
where $O(i,j)$ and $C(i,j)$ are the pixel values of the original and encrypted images at grid position $(i,j)$, respectively, and $W$ and $H$ are the width and the height of the original image.
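Equation 4.7 translates into a short function. The sketch assumes both images are nested lists of the same size; the function name and the tiny example images are illustrative.

```python
def mean_square_error(original, cipher):
    """Average of the squared pixel differences between two images of equal size."""
    height, width = len(original), len(original[0])
    squared = sum((o - c) ** 2
                  for row_o, row_c in zip(original, cipher)
                  for o, c in zip(row_o, row_c))
    return squared / (width * height)

print(mean_square_error([[1, 2], [3, 4]], [[1, 2], [3, 6]]))          # 1.0
print(mean_square_error([[0, 0], [0, 0]], [[255, 255], [255, 255]]))  # 65025.0
```

The second call shows the largest possible value for 8-bit images, 255 squared, reached when every pixel differs by the full dynamic range.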
D. Peak signal-to-noise ratio
The peak signal-to-noise ratio (PSNR) measures the ratio between the original and cipher images. This metric can be used as a security evaluation parameter between plain and cipher images (Ahmad & Hwang, 2016; Lu et al., 2022). A lower PSNR value indicates that the encrypted image differs strongly from the original image, which is desirable in an encryption scheme, unlike a higher PSNR value. It is given by
$$
PSNR = 20\left(\log_{10}(I_{max}) - \log_{10}\sqrt{MSE}\right) \tag{4.8}
$$
where Imax = (W H) × 255 denotes the maximum gray scale value of a plaintext image (W is the width and H is the height), and the Mean Square Error (MSE) is defined in Equation 4.7.
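The PSNR can be computed from an MSE value as sketched below. The sketch uses the common definition 20·log10(I_max / sqrt(MSE)) and assumes a peak value of 255 for 8-bit images; both the function name and this choice of peak value are assumptions made here.

```python
import math

def psnr(mse, i_max=255.0):
    """Peak signal-to-noise ratio in dB for a given mean square error."""
    return 20 * math.log10(i_max / math.sqrt(mse))

print(round(psnr(100.0), 2))     # 28.13
print(round(psnr(2.1236e4), 2))  # 4.86
```

Under this assumption, an MSE of 2.1236 × 10^4 gives a PSNR of about 4.86 dB; the larger the MSE between plain and cipher images, the lower the PSNR.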
E. Structural similarity index
The Structural SIMilarity index (SSIM) is used to measure the similarity between plain and ciphered images. SSIM is a method for image quality evaluation. Its value lies between -1 and 1, but in most cases it falls in the interval [0, 1]. The lower the SSIM value, the more effective the encryption method (Dosselmann & Yang, 2011). The SSIM is defined as:
$$
SSIM = [l(x,y)]^{\alpha}\,[c(x,y)]^{\beta}\,[s(x,y)]^{\gamma} \tag{4.9}
$$
where $x$ and $y$ represent the plain and encrypted images, respectively, and $l$, $c$, and $s$ are the luminance, contrast, and structure comparison functions.
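A simplified, single-window version of Equation 4.9 can be sketched as follows. The sketch rests on several assumptions that are not taken from the text: exponents α = β = γ = 1, the commonly used constants C1 = (0.01·L)² and C2 = (0.03·L)² with C3 = C2/2 (L being the dynamic range), and statistics computed over the whole image instead of local windows.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equally long pixel sequences (simplified)."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    # With alpha = beta = gamma = 1 and C3 = C2 / 2, the product l * c * s reduces to:
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

plain = [0, 50, 100, 150, 200, 250]
inverted = [255 - p for p in plain]
print(global_ssim(plain, plain))      # 1.0 for identical images
print(global_ssim(plain, inverted))   # negative: the structure is reversed
```

Library implementations average this quantity over many local windows, so their values for real images will differ from this global variant.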
To evaluate the proposed hybrid encryption scheme, the five parameters described above are used. The most important challenge in this work is to enhance the encryption scheme in PPDL. Table 4-3 shows that the proposed encryption method provides better encryption than the work proposed by (Alkhelaiwi et al., 2021) with regard to the five security parameters mentioned earlier. The correlation coefficient (0.0039) between an original image and its corresponding encrypted image shows the efficiency of our proposed hybrid encryption (the best-encrypted images have a CC value close to 0), which guarantees that the visual features are masked in the encrypted image. The ideal entropy value is close to 8. The entropy values of both studies in Table 4-3 are close to 8, but the entropy of the images encrypted by our approach is 7.9701, which is 0.0105 higher than that of the images encrypted in the other work. The MSE of the proposed approach exceeds that of (Alkhelaiwi et al., 2021) by $0.0574 \times 10^4$, which indicates that our encryption is more secure than that of the other work. The lower PSNR in Table 4-3 likewise indicates that the images encrypted by the hybrid approach are more secure than the other cipher images. Finally, the lowest SSIM value corresponds to the most secure encrypted images: the SSIM of our encrypted images is 0.0012, which is 0.0006 lower than that of the (Alkhelaiwi et al., 2021) encrypted images.
Table 4-3: Comparison of proposed approach and (Alkhelaiwi et al., 2021) for security parameters.
| Security parameters | Proposed Approach (encrypted images) | (Alkhelaiwi et al., 2021) (encrypted images) |
|---|---|---|
| CC | 0.0039 | -0.0041 |
| Entropy | 7.9701 | 7.9596 |
| MSE | $2.181 \times 10^4$ | $2.1236 \times 10^4$ |
| PSNR | 4.2234 | 4.8601 |
| SSIM | 0.0012 | 0.0018 |
4.5.5 Discussion
With the emergence of using DL as a Service (DLaas), any unauthorized user can access
almost all information that can be sensitive. This opens the door to several types of threats, and
possible misuses, which must be addressed correctly. To ensure the privacy and confidentiality
of the data and restrict access to sensitive information, we can use cryptographic approaches.
This paper reviewed various PPDL methods to highlight their advantages and drawbacks.
The proposed PPDL approach is based on two classes of HE methods and combines features
of both PHE and SHE. While preserving data confidentiality and denying access to the
unencrypted data (original images), the hybrid encryption scheme allows operations to be
applied directly to the encrypted images without decrypting them. The performance of the
proposed PPDL method can be assessed from two sides: decision accuracy and data security.
For the first, the proposed hybrid method is evaluated using four different pre-trained CNN
models. The loss in accuracy when training the models on encrypted images rather than plain
images is limited (less than 3.5%), which is acceptable for applications where security and
privacy are critical. On the security side, combining PHE and SHE improves the security of
the encrypted images according to five security parameters: correlation coefficient (CC),
entropy, mean square error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity
index (SSIM). It is also worth mentioning that the proposed hybrid method achieved an
encryption runtime of 21.2 s per image for a key size of 128 bits. This is an important
criterion, especially when working with large training datasets.
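To make the security parameters discussed above concrete, the following minimal sketch computes four of them (entropy, MSE, PSNR, and CC) for images given as flat lists of 8-bit pixel values. It is an illustration under stated assumptions, not the evaluation code used in this work: the function names are hypothetical, CC is taken to be the Pearson correlation between two pixel sequences, PSNR assumes a peak value of 255, and SSIM is omitted because it needs a windowed computation that is better left to an image-processing library.

```python
import math
import random
from collections import Counter


def entropy(pixels):
    """Shannon entropy in bits per pixel (8.0 is the maximum for 8-bit data)."""
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())


def mse(a, b):
    """Mean square error between two equally sized pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    m = mse(a, b)
    return math.inf if m == 0 else 10 * math.log10(peak ** 2 / m)


def correlation(a, b):
    """Pearson correlation coefficient between two pixel sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: a smooth "plain" image and a noise-like "cipher" image.
    plain = [(i // 16) % 256 for i in range(4096)]
    cipher = [rng.randrange(256) for _ in plain]
    print("entropy(cipher):", entropy(cipher))
    print("MSE(plain, cipher):", mse(plain, cipher))
    print("PSNR(plain, cipher):", psnr(plain, cipher))
    print("CC(plain, cipher):", correlation(plain, cipher))
```

For a well-encrypted image one would expect the entropy to approach 8, the MSE to be high, the PSNR to be low, and the CC to be close to zero, which is the direction in which the values in Table 4-3 are read.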
4.6 Conclusion
The first section of this chapter described the studied regions and the dataset. It then
presented the experimental set-up and the metrics used to evaluate the performance of our
approach. Finally, the chapter reported and discussed the results of all the experiments
conducted in this research.
This study has three experimental steps. The first step is to encrypt the images using a
hybrid public key, where one key comes from PHE and the other from SHE. The second step is
to apply four transfer learning models, namely DenseNet169, MobileNetV2, InceptionV3, and
ResNet50, to the encrypted and plain images and to analyze the results using several metrics.
The final step is to evaluate the security of our encrypted images based on several security
parameters and to compare our hybrid privacy-preserving deep learning approach with the
privacy-preserving deep learning approach used in (Alkhelaiwi et al., 2021). The conclusion of
this research and directions for future work are detailed in the next chapter.
Chapter 5: Conclusion and Future Works
The goal of this chapter is to summarize this thesis and to outline future directions that
build on our proposed approach.
5.1 Conclusion
DL is increasingly offered as a service in many applications due to its low usage cost and the
flexibility of a wide range of DL tools and development solutions. However, DLaaS presents
several security and privacy threats. PPDL is, among others, an important tool to help
ensure the security and privacy of sensitive data. In this study, we proposed a hybrid approach
based on PHE and SHE. Experiments were conducted on a real-world RS dataset and
show good performance in terms of decision accuracy, data security, and runtime. This
helps to popularize the adoption of DLaaS without compromising data privacy.
The literature review is split into two parts. The first part defines remote sensing and describes
the characteristics of remote sensing images. The second part presents the background of
privacy-preserving machine learning, covering Deep Learning, Convolutional Neural
Networks, Transfer Learning, and an overview of the privacy-preserving deep learning (PPDL)
approaches existing in the literature. Table 2-3 then summarises the advantages and
drawbacks of each PPDL technique. Afterward, related works based on hybrid PPDL
approaches are described, and Table 2-4 compares them with our proposed work.
In the methodology chapter, we presented the proposed hybrid PPDL architecture with its
different steps. To preserve data privacy, we explained the reasons for combining two
Homomorphic Encryption approaches. The proposed approach is based on some properties of
PHE and SHE, which are described in detail in the methodology chapter. Our hybrid approach
has two stages: the first generates the public keys (Algorithm 7) and the second generates
the cipher-text (Algorithm 8). We then derived two equations (the addition and the
multiplication of two cipher values generated by the proposed hybrid approach)
and defined the pre-trained models used to assess our approach. The goal of the
experimental results chapter is to assess the performance of the proposed approach using
four pre-trained models, namely ResNet50, InceptionV3, DenseNet169, and MobileNetV2.
The assessment involves three steps. The first step is to encrypt the images
using the hybrid PPDL approach. The second step is to evaluate performance using different
metrics (accuracy, precision, recall, and F1-score); our study shows that the loss in accuracy
between plain and encrypted images is less than 3.5%.
The last step is to evaluate the privacy of the images using various security parameters,
whose values are compared with those of (Alkhelaiwi et al., 2021).
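Paillier encryption, one of the two building blocks summarized above, is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of their plaintexts. The sketch below illustrates this property with textbook Paillier (Paillier, 1999). It is not the hybrid PHE and SHE scheme of Algorithms 7 and 8; it is a toy example that assumes tiny primes (61 and 53), the common simplification g = n + 1, and hypothetical function names, and it is far too small to be secure.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key generation from two distinct primes (assumes g = n + 1)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    # With g = n + 1, L(g^lam mod n^2) equals lam mod n, so mu is its inverse mod n.
    mu = pow(lam, -1, n)  # modular inverse via pow requires Python 3.8+
    return (n, g), (lam, mu)


def encrypt(pub, m, rng=random):
    """Encrypt an integer 0 <= m < n with fresh randomness r coprime to n."""
    n, g = pub
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts m1 + m2 mod n."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def scale_encrypted(pub, c, k):
    """Multiplication by a plaintext constant: c^k encrypts k * m mod n."""
    n, _ = pub
    return pow(c, k, n * n)


if __name__ == "__main__":
    pub, priv = keygen(61, 53)
    c1, c2 = encrypt(pub, 120), encrypt(pub, 35)  # e.g. two pixel intensities
    print(decrypt(pub, priv, add_encrypted(pub, c1, c2)))
    print(decrypt(pub, priv, scale_encrypted(pub, c1, 3)))
```

Because textbook Paillier supports only addition of ciphertexts and multiplication by a plaintext constant, multiplying two encrypted values needs an additional mechanism, which is the motivation given in this thesis for combining it with SHE.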
5.2 Future Works
Several extensions can be considered in future work. First, we plan to extend the
proposed hybrid approach to other fields, such as medical image analysis, in order to safeguard
critical patient data. In addition, we aim to train DL models on datasets that contain both
plain and encrypted images, so that the resulting models can classify objects in any image,
whether plain or encrypted. Finally, we plan to propose other encryption schemes suited to
real-time applications, which requires preserving the security of the encrypted data while
reducing the encryption runtime.
Appendix Figures
InceptionV3 Model Summary
Figure A-1: InceptionV3 model’s last layers.
MobileNetV2 Model Summary
Figure A-2: MobileNetV2 model’s last layers.
ResNet50 Model Summary
Figure A-3: ResNet50 model’s last layers.
DenseNet169 Model Summary
Figure A-4: DenseNet169 model’s last layers.
Training output MobileNetV2
Figure A-5: Sample of Training output MobileNetV2
References
A Lapardo, A. B. et al. (2020). What is Secure Multi-party Computation? Retrieved March
3, 2021, from https://medium.com/pytorch/what-is-secure-multi-party-computation-8c875fb36ca5
Acar, A., Aksu, H., Uluagac, A. S., & Conti, M. (2018). A survey on homomorphic encryption
schemes: Theory and implementation. ACM Computing Surveys (CSUR),51(4), 1–35.
Ahmad, J., & Hwang, S. O. (2016). A secure image encryption scheme based on chaotic maps
and affine transformation. Multimedia Tools and Applications,75(21), 13951–13976.
Alkhelaiwi, M., Boulila, W., Ahmad, J., Koubaa, A., & Driss, M. (2021). An efficient approach
based on privacy-preserving deep learning for satellite image classification. Remote
Sensing,13(11), 2221.
Al-Rubaie, M., & Chang, J. M. (2019). Privacy-preserving machine learning: Threats and
solutions. IEEE Security & Privacy,17(2), 49–58.
Al-Sarem, M., Alsaeedi, A., Saeed, F., Boulila, W., & AmeerBakhsh, O. (2021). A novel hybrid
deep learning model for detecting COVID-19-related rumors on social media based on
LSTM and concatenated parallel cnns. Applied Sciences,11(17), 7940.
Ambrosin, M., Braca, P., Conti, M., & Lazzeretti, R. (2017). Odin: Obfuscation-based
privacy-preserving consensus algorithm for decentralized information fusion in smart device
networks. ACM Transactions on Internet Technology (TOIT),18(1), 1–22.
Atitallah, S. B., Driss, M., Boulila, W., & Ghézala, H. B. (2020). Leveraging deep learning
and iot big data analytics to support the smart cities development: Review and future
directions. Computer Science Review,38, 100303.
Ayadi, Z., Boulila, W., Farah, I. R., Leborgne, A., & Gançarski, P. (2022). Resolution methods
for constraint satisfaction problem in remote sensing field: A survey of static and
dynamic algorithms. Ecological Informatics, 101607.
Bakaeva, N., & Le, M. T. (2022). Determination of urban pollution islands by using remote
sensing technology in moscow, russia. Ecological Informatics,67, 101493.
Boulemtafes, A., Derhab, A., & Challal, Y. (2020). A review of privacy-preserving techniques
for deep learning. Neurocomputing,384, 21–45.
Boulila, W. (2019). A top-down approach for semantic segmentation of big remote sensing
images. Earth Science Informatics,12(3), 295–306.
Boulila, W., Alzahem, A., Almoudi, A., Afifi, M., Alturki, I., & Driss, M. (2021).
A deep learning-based approach for real-time facemask detection. 2021 20th
IEEE International Conference on Machine Learning and Applications (ICMLA),
1478–1481.
Boulila, W., Ayadi, Z., & Farah, I. R. (2017). Sensitivity analysis approach to model epistemic
and aleatory imperfection: Application to land cover change prediction model. Journal
of computational science,23, 58–70.
Boulila, W., Farah, I. R., Ettabaa, K. S., Solaiman, B., & Ghézala, H. B. (2009). Improving
spatiotemporal change detection: A high level fusion approach for discovering uncertain
knowledge from satellite image databases. Icdm,9, 222–227.
Boulila, W., Farah, I. R., Ettabaa, K. S., Solaiman, B., & Ghézala, H. B. (2010). Spatio-temporal
modeling for knowledge discovery in satellite image databases. CORIA, 35–49.
Boulila, W., Ghandorh, H., Khan, M. A., Ahmed, F., & Ahmad, J. (2021). A novel CNN-LSTM-
based approach to predict urban expansion. Ecological Informatics,64, 101325.
Byun, H. (2019). The Advantages and Disadvantages of Homomorphic Encryption.
Retrieved February 28, 2021, from https://blog.openmined.org/what-is-homomorphic-encryption/
https://baffle.io/blog/the-advantages-and-disadvantages-of-homomorphic-encryption/
Chabanne, H., de Wargny, A., Milgram, J., Morel, C., & Prouff, E. (2017). Privacy-preserving
classification on deep neural network. IACR Cryptol. ePrint Arch.,2017, 35.
Chase, M., Gilad-Bachrach, R., Laine, K., Lauter, K., & Rindal, P. (2017). Private collaborative
neural network learning. Cryptology ePrint Archive.
Chebbi, I., Boulila, W., & Farah, I. R. (2016). Improvement of satellite image classification:
Approach based on hadoop/mapreduce. 2016 2nd International Conference on
Advanced Technologies for Signal and Image Processing (ATSIP), 31–34.
Chen, C., Zhou, J., Wang, L., Wu, X., Fang, W., Tan, J., Wang, L., Liu, A. X., Wang, H., &
Hong, C. (2021). When homomorphic encryption marries secret sharing: Secure large-
scale sparse logistic regression and applications in risk control. Proceedings of the 27th
ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2652–2662.
Choi, R. Y., Coyner, A. S., Kalpathy-Cramer, J., Chiang, M. F., & Campbell, J. P. (2020).
Introduction to machine learning, neural networks, and deep learning. Translational
Vision Science & Technology,9(2), 14–14.
Coops, N. C., & Tooke, T. R. (2017). Introduction to remote sensing. Learning landscape
ecology (pp. 3–19). Springer.
Coron, J.-S., Naccache, D., & Tibouchi, M. (2012). Public key compression and modulus
switching for fully homomorphic encryption over the integers. Annual International
Conference on the Theory and Applications of Cryptographic Techniques, 446–464.
Dhanalakshmi, M., & Sankari, E. S. (2014). Privacy preserving data mining techniques-survey.
International Conference on Information Communication and Embedded Systems
(ICICES2014), 1–6.
Dhingra, S., & Kumar, D. (2019). A review of remotely sensed satellite image classification.
International Journal of Electrical & Computer Engineering (2088-8708),9(3).
Domingo-Ferrer, J., Farras, O., Ribes-González, J., & Sánchez, D. (2019). Privacy-preserving
cloud computing on sensitive data: A survey of methods, products and challenges.
Computer Communications,140, 38–60.
Dosselmann, R., & Yang, X. D. (2011). A comprehensive assessment of the structural similarity
index. Signal, Image and Video Processing,5(1), 81–91.
El Makkaoui, K., Beni-Hssane, A., & Ezzati, A. (2016). Can hybrid homomorphic encryption
schemes be practical? 2016 5th International Conference on Multimedia Computing
and Systems (ICMCS), 294–298.
Feng, S.-H., Xu, J.-Y., & Shen, H.-B. (2020). Artificial intelligence in bioinformatics:
Automated methodology development for protein residue contact map prediction.
Biomedical information technology (pp. 217–237). Elsevier.
Ferchichi, A., Boulila, W., & Farah, I. R. (2017). Propagating aleatory and epistemic
uncertainty in land cover change prediction process. Ecological informatics,37, 24–37.
Ganti, A. (2020). Correlation coefficient. Corp. Financ. Account,9, 145–152.
Ghandorh, H., Boulila, W., Masood, S., Koubaa, A., Ahmed, F., & Ahmad, J. (2022). Semantic
segmentation and edge detection—approach to road detection in very high resolution
satellite images. Remote Sensing,14(3), 613.
Gruson, D., Helleputte, T., Rousseau, P., & Gruson, D. (2019). Data science, artificial
intelligence, and machine learning: Opportunities for laboratory medicine and the value
of positive regulation. Clinical biochemistry,69, 1–7.
Hajjaji, Y., Boulila, W., & Farah, I. R. (2021). An improved tile-based scalable distributed
management model of massive high-resolution satellite images. Procedia Computer
Science,192, 2931–2942.
Hariss, K., Chamoun, M., & Samhat, A. E. (2017). On DGHV and BGV fully homomorphic
encryption schemes. 2017 1st Cyber Security in Networking Conference (CSNet), 1–9.
Hernández-García, A., & König, P. (2018). Data augmentation instead of explicit
regularization. arXiv preprint arXiv:1806.03852.
Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected
convolutional networks. Proceedings of the IEEE conference on computer vision and
pattern recognition, 4700–4708.
Kaaniche, N., Laurent, M., & Belguith, S. (2020). Privacy enhancing technologies for solving
the privacy-personalization paradox: Taxonomy and survey. Journal of Network and
Computer Applications, 102807.
Kandel, I., & Castelli, M. (2020). Transfer learning with convolutional neural networks for
diabetic retinopathy image classification. a review. Applied Sciences,10(6), 2021.
Kasar, N. (2016). Image secret sharing using Shamir’s Algorithm. Retrieved March 15, 2021,
from https://fr.slideshare.net/NikitaKasar/image-secret-sharing-using-shamirs-algorithm-59670385
Kassani, S. H., & Kassani, P. H. (2019). A comparative study of deep learning architectures on
melanoma detection. Tissue and Cell,58, 76–83.
Khamparia, A., Gupta, D., de Albuquerque, V. H. C., Sangaiah, A. K., & Jhaveri, R. H. (2020).
Internet of health things-driven deep learning system for detection and classification of
cervical cells using transfer learning. The Journal of Supercomputing, 1–19.
Kulynych, B. (2015). Symmetric somewhat homomorphic encryption over the integers.
Proceedings of Ukrainian scientific conference of young scientists on Mathematics and
Physics, 1–12.
Lopardo A, F. T. et al. (n.d.). What is Homomorphic Encryption? Retrieved February
28, 2021, from https://blog.openmined.org/what-is-homomorphic-encryption/
Lu, Q., Yu, L., & Zhu, C. (2022). Symmetric image encryption algorithm based on a new
product trigonometric chaotic map. Symmetry,14(2), 373.
Mahmood, A., Bennamoun, M., An, S., Sohel, F., & Boussaid, F. (2020). Resfeats:
Residual network based features for underwater image classification. Image and Vision
Computing,93, 103811.
Marcano, N. J. H., Moller, M., Hansen, S., & Jacobsen, R. H. (2019). On fully homomorphic
encryption for privacy-preserving deep learning. 2019 IEEE Globecom Workshops (GC
Wkshps), 1–6.
Mattsson, U. (2021). Security and Performance of Homomorphic Encryption. Retrieved
August 26, 2021, from
https://www.globalsecuritymag.com/Security-and-Performance-of,20210601,112333.html
Medhi, B. (2019). Privacy-Preserving Computation Techniques FHE from Ziroh Labs.
Retrieved February 28, 2021, from
https://medium.com/@bhaskarmedhi/privacy-preserving-computation-techniques-fhe-from-ziroh-labs-8814e88044a
Morris, L. (2013). Analysis of partially and fully homomorphic encryption. Rochester Institute
of Technology, 1–5.
Muhammad, K., Sugeng, K. A., & Murfi, H. (2018). Machine learning with partially
homomorphic encrypted data. Journal of Physics: Conference Series,1108(1), 012112.
n.A. (2021). Implementing Shamir’s secret sharing scheme in Python. Retrieved February
28, 2021, from
https://www.geeksforgeeks.org/implementing-shamirs-secret-sharing-scheme-in-python/
n.A. (n.d.). What is Dimensionality Reduction Techniques, Methods, Components. Retrieved
March 3, 2021, from https://data-flair.training/blogs/dimensionality-reduction-tutorial/
Ng, H.-W., Nguyen, V. D., Vonikakis, V., & Winkler, S. (2015). Deep learning for emotion
recognition on small datasets using transfer learning. Proceedings of the 2015 ACM on
international conference on multimodal interaction, 443–449.
Ogburn, M., Turner, C., & Dahal, P. (2013). Homomorphic encryption. Procedia Computer
Science,20, 502–509.
Paillier, P. (1999). Public-key cryptosystems based on composite degree residuosity classes.
International conference on the theory and applications of cryptographic techniques,
223–238.
Pan, X., Jiang, J., & Xiao, Y. (2022). Identifying plants under natural gas micro-leakage stress
using hyperspectral remote sensing. Ecological Informatics,68, 101542.
Pandya, M. D., Shah, P. D., & Jardosh, S. (2019). Medical image diagnosis for disease
detection: A deep learning approach. U-healthcare monitoring systems (pp. 37–60).
Elsevier.
Pang, L.-J., & Wang, Y.-M. (2005). A new (t, n) multi-secret sharing scheme based on shamir’s
secret sharing. Applied Mathematics and Computation,167(2), 840–848.
Parmar, P. V., Padhar, S. B., Patel, S. N., Bhatt, N. I., & Jhaveri, R. H. (2014). Survey of various
homomorphic encryption algorithms and schemes. International Journal of Computer
Applications,91(8).
Petrovska, B., Atanasova-Pacemska, T., & Stojanovic, I. (2018). Classification of small sets
of images with pre-trained neural networks. International Journal of Engineering and
Manufacturing,8(4), 40–55.
Pisa, P. S., Abdalla, M., & Duarte, O. C. M. B. (2012). Somewhat homomorphic
encryption scheme for arithmetic operations on large integers. 2012 Global Information
Infrastructure and Networking Symposium (GIIS), 1–8.
Pritt, M., & Chern, G. (2017). Satellite image classification with deep learning. 2017 IEEE
Applied Imagery Pattern Recognition Workshop (AIPR), 1–7.
Qayyum, A., Qadir, J., Bilal, M., & Al-Fuqaha, A. (2020). Secure and robust machine learning
for healthcare: A survey. IEEE Reviews in Biomedical Engineering,14, 156–180.
Sanderson, R. (2010). Introduction to remote sensing. New Mexico State University, 25–6.
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L.-C. (2018). Mobilenetv2:
Inverted residuals and linear bottlenecks. Proceedings of the IEEE conference on
computer vision and pattern recognition, 4510–4520.
Serra, E., Sharma, A., Joaristi, M., & Korzh, O. (2018). Unknown landscape identification
with CNN transfer learning. 2018 IEEE/ACM International Conference on Advances in
Social Networks Analysis and Mining (ASONAM), 813–820.
Shorten, C., & Khoshgoftaar, T. M. (2019). A survey on image data augmentation for deep
learning. Journal of Big Data,6(1), 1–48.
Shrestha, R., & Kim, S. (2019). Integration of iot with blockchain and homomorphic
encryption: Challenging issues and opportunities. Advances in computers
(pp. 293–331). Elsevier.
Singh, P. (2020). Dimensionality Reduction Approaches. Retrieved March 3, 2021, from
https://towardsdatascience.com/dimensionality-reduction-approaches-8547c4c44334
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the inception
architecture for computer vision. Proceedings of the IEEE conference on computer
vision and pattern recognition, 2818–2826.
Tanuwidjaja, H. C., Choi, R., Baek, S., & Kim, K. (2020). Privacy-preserving deep
learning on machine learning as a service—a comprehensive survey. IEEE Access,8,
167425–167447.
Tanuwidjaja, H. C., Choi, R., & Kim, K. (2019). A survey on deep learning techniques for
privacy-preserving. International Conference on Machine Learning for Cyber Security,
29–46.
Truex, S., Baracaldo, N., Anwar, A., Steinke, T., Ludwig, H., Zhang, R., & Zhou, Y. (2019). A
hybrid approach to privacy-preserving federated learning. Proceedings of the 12th ACM
Workshop on Artificial Intelligence and Security, 1–11.
Van Dijk, M., Gentry, C., Halevi, S., & Vaikuntanathan, V. (2010). Fully homomorphic
encryption over the integers. Annual international conference on the theory and
applications of cryptographic techniques, 24–43.
Wood, A., Najarian, K., & Kahrobaei, D. (2020). Homomorphic encryption for machine
learning in medicine and bioinformatics. ACM Computing Surveys (CSUR),53(4),
1–35.
Xiao, Y., Lim, S. K., Tan, T. S., & Tay, S. C. (2004). Feature extraction using very high
resolution satellite imagery. IGARSS 2004. 2004 IEEE International Geoscience and
Remote Sensing Symposium,3.
Xiong, L., Dong, D., Xia, Z., & Chen, X. (2018). High-capacity reversible data hiding for
encrypted multimedia data with somewhat homomorphic encryption. IEEE Access,6,
60635–60644.
Xu, R., Baracaldo, N., Zhou, Y., Anwar, A., & Ludwig, H. (2019). Hybridalpha: An efficient
approach for privacy-preserving federated learning. Proceedings of the 12th ACM
Workshop on Artificial Intelligence and Security, 13–23.
Xue, X., Jin, H., Zhou, D., & Zhou, C. (2021). Medical image protection algorithm based on
deoxyribonucleic acid chain of dynamic length. Frontiers in Genetics,12, 266.
Yi, X., Paulet, R., & Bertino, E. (2014). Homomorphic encryption. Homomorphic encryption
and applications (pp. 27–46). Springer.
You, W., Shen, C., Wang, D., Chen, L., Jiang, X., & Zhu, Z. (2019). An intelligent deep feature
learning method with improved activation functions for machine fault diagnosis. IEEE
Access,8, 1975–1985.
Yuan, Q., Shen, H., Li, T., Li, Z., Li, S., Jiang, Y., Xu, H., Tan, W., Yang, Q., Wang, J., et al.
(2020). Deep learning in environmental remote sensing: Achievements and challenges.
Remote Sensing of Environment,241, 111716.
Zapechnikov, S. (2020). Privacy-preserving machine learning as a tool for secure personalized
information services. Procedia Computer Science,169, 393–399.
Zhang, S., Bamakan, S. M. H., Qu, Q., & Li, S. (2018). Learning for personalized medicine:
A comprehensive review from a deep learning perspective. IEEE reviews in biomedical
engineering,12, 194–208.
Zhao, C., Zhao, S., Zhao, M., Chen, Z., Gao, C.-Z., Li, H., & Tan, Y.-a. (2019). Secure
multi-party computation: Theory, practice and applications. Information Sciences,476,
357–372.
Zhou, J., Luo, X., Shen, Q., & Xu, Z. (2020). Information and communications security: 21st
international conference, ICICS 2019, beijing, china, december 15–17, 2019, revised
selected papers (Vol. 11999). Springer Nature.
61
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
In the present work, a neotype chaotic product trigonometric map (PTM) system is proposed. We demonstrate the chaotic characteristics of a PTM system by using a series of complexity criteria, such as bifurcation diagrams, Lyapunov exponents, approximate entropy, permutation entropy, time-series diagrams, cobweb graphs, and NIST tests. It is proved that the PTM system has a wider chaotic parameter interval and more complex chaotic performance than the existing sine map system. In addition, a novel PTM based symmetric image encryption scheme is proposed, in which the key is related to the hash value of the image. The algorithm realizes the encryption strategy of one-graph-one-key, which can resist plaintext attack. A two-dimensional coordinate traversal matrix for image scrambling and a one-dimensional integer traversal sequence for image pixel value transformation encryption are generated by the pseudo-random integer generator (PRING). Security analysis and various simulation test results show that the proposed image encryption scheme has good cryptographic performance and high time efficiency.
Article
Full-text available
Road detection technology plays an essential role in a variety of applications, such as urban planning, map updating, traffic monitoring and automatic vehicle navigation. Recently, there has been much development in detecting roads in high-resolution (HR) satellite images based on semantic segmentation. However, the objects being segmented in such images are of small size, and not all the information in the images is equally important when making a decision. This paper proposes a novel approach to road detection based on semantic segmentation and edge detection. Our approach aims to combine these two techniques to improve road detection, and it produces sharp-pixel segmentation maps, using the segmented masks to generate road edges. In addition, some well-known architectures, such as SegNet, used multi-scale features without refinement; thus, using attention blocks in the encoder to predict fine segmentation masks resulted in finer edges. A combination of weighted cross-entropy loss and the focal Tversky loss as the loss function is also used to deal with the highly imbalanced dataset. We conducted various experiments on two datasets describing real-world datasets covering the three largest regions in Saudi Arabia and Massachusetts. The results demonstrated that the proposed method of encoding HR feature maps effectively predicts sharp segmentation masks to facilitate accurate edge detection, even against a harsh and complicated background.
Article
Full-text available
Spreading rumors in social media is considered under cybercrimes that affect people, societies , and governments. For instance, some criminals create rumors and send them on the internet, then other people help them to spread it. Spreading rumors can be an example of cyber abuse, where rumors or lies about the victim are posted on the internet to send threatening messages or to share the victim's personal information. During pandemics, a large amount of rumors spreads on social media very fast, which have dramatic effects on people's health. Detecting these rumors manually by the authorities is very difficult in these open platforms. Therefore, several researchers conducted studies on utilizing intelligent methods for detecting such rumors. The detection methods can be classified mainly into machine learning-based and deep learning-based methods. The deep learning methods have comparative advantages against machine learning ones as they do not require pre-processing and feature engineering processes and their performance showed superior enhancements in many fields. Therefore, this paper aims to propose a Novel Hybrid Deep Learning Model for Detecting COVID-19-related Rumors on Social Media (LSTM-PCNN). The proposed model is based on a Long Short-Term Memory (LSTM) and Concatenated Parallel Convolutional Neural Networks (PCNN). The experiments were conducted on an ArCOV-19 dataset that included 3157 tweets; 1480 of them were rumors (46.87%) and 1677 tweets were non-rumors (53.12%). The findings of the proposed model showed a superior performance compared to other methods in terms of accuracy, recall, precision, and F-score.
Article
Full-text available
Satellite images have drawn increasing interest from a wide variety of users, including business and government, ever since their increased usage in important fields ranging from weather, forestry and agriculture to surface changes and biodiversity monitoring. Recent updates in the field have also introduced various deep learning (DL) architectures to satellite imagery as a means of extracting useful information. However, this new approach comes with its own issues, including the fact that many users utilize ready-made cloud services (both public and private) in order to take advantage of built-in DL algorithms and thus avoid the complexity of developing their own DL architectures. However, this presents new challenges to protecting data against unauthorized access, mining and usage of sensitive information extracted from that data. Therefore, new privacy concerns regarding sensitive data in satellite images have arisen. This research proposes an efficient approach that takes advantage of privacy-preserving deep learning (PPDL)-based techniques to address privacy concerns regarding data from satellite images when applying public DL models. In this paper, we proposed a partially homomorphic encryption scheme (a Paillier scheme), which enables processing of confidential information without exposure of the underlying data. Our method achieves robust results when applied to a custom convolutional neural network (CNN) as well as to existing transfer learning methods. The proposed encryption scheme also allows for training CNN models on encrypted data directly, which requires lower computational overhead. Our experiments have been performed on a real-world dataset covering several regions across Saudi Arabia. The results demonstrate that our CNN-based models were able to retain data utility while maintaining data privacy. 
Security parameters such as correlation coefficient (−0.004), entropy (7.95), energy (0.01), contrast (10.57), number of pixel change rate (4.86), unified average change intensity (33.66), and more are in favor of our proposed encryption scheme. To the best of our knowledge, this research is also one of the first studies that applies PPDL-based techniques to satellite image data in any capacity.
Article
Full-text available
Current image encryption algorithms have various deficiencies in effectively protecting medical images with large storage capacity and high pixel correlation. This article proposed a new image protection algorithm based on the deoxyribonucleic acid chain of dynamic length, which achieved image encryption by DNA dynamic coding, generation of DNA dynamic chain, and dynamic operation of row chain and column chain. First, the original image is encoded dynamically according to the binary bit from a pixel, and the DNA sequence matrix is scrambled. Second, DNA sequence matrices are dynamically segmented into DNA chains of different lengths. After that, row and column deletion operation and transposition operation of DNA dynamic chain are carried out, respectively, which made DNA chain matrix double shuffle. Finally, the encrypted image is got after recombining DNA chains of different lengths. The proposed algorithm was tested on a list of medical images. Results showed that the proposed algorithm showed excellent security performance, and it is immune to noise attack, occlusion attack, and all common cryptographic attacks.
Article
Monitoring environmental evolutions, one of the most crucial axes on which sustainable development is based, requires the knowledge of information on the observed geographical scenes. Due to the continuous technological developments in the remote sensing field, the data sources increase exponentially over time, and the information contained in satellite images becomes increasingly rich. These characteristics make the extraction, processing and resolution of such data a complicated task that varies according to the situation. It is, therefore, necessary to develop tools adapted to satellite images interpretation and analysis problems. In this context, the constraint satisfaction problem (CSP) seems to be one of the methods to solve these problems. Despite some challenges, the CSP approach has performed well and proven its value in various fields. The effectiveness of CSP is based on two key points: first, the definition of constraints, and second, the choice of resolution methods. Therefore, a synthesis document covering CSP resolution methods, both static and dynamic, becomes relevant and necessary. This paper represents the first comprehensive review of CSP. We begin by listing CSP methods, detailing their principles, and presenting the corresponding algorithms. We then illustrate the execution process of CSP methods using examples applied in the remote sensing domain for each one. Following this, we present a complete comparative study arranged according to key characteristics, which is intended to guide researchers to help them select the most appropriate resolution method for any given context. Finally, we present a set of challenges and future directions designed to suggest and drive further research in this promising field.
Article
Natural gas is an important clean energy source. The demand for, and consumption of, natural gas have been increasing in recent years. Slight natural gas leakage can occur during transportation, which can have a negative impact on the environment, economy, and safety. However, it is relatively difficult to directly detect natural gas microleakage. Hyperspectral remote sensing technology is useful for analyzing the spectral characteristics of vegetation near leakage areas, thereby indirectly obtaining leakage information. In this study, a field experiment was designed to simulate natural gas leakage from an underground pipeline and gas stress on three plant species. The canopy spectral reflectance of the vegetation throughout the growth period of the plants was collected and analyzed. Variational mode decomposition was then used to decompose the spectra. Based on the stress distance (SD) and intrinsic mode functions, it was found that the second intrinsic mode function, with a decomposition scale of 32, was sensitive to gas stress. According to the results of SD, the bands (616 and 829 nm) sensitive to natural gas stress for the three plant species were extracted, and the variational mode decomposition index (VMDI) was constructed. The Jeffries–Matusita distance (JMD) was used to quantitatively evaluate the VMDI index and three indices were used to evaluate the ability to recognize stress. It was found that the index proposed in this study could identify stressed wheat and grass one week earlier than other indices and could better identify stressed vegetation throughout the phenological cycle (JMD > 1.8). The results show that the proposed index can be used as a reliable method to identify natural gas-stressed plants, and that hyperspectral technology is promising for detecting the location of natural gas leaks from underground pipelines.
Article
Pollution of the atmosphere with harmful substances is currently the most dangerous form of degradation of the natural environment in Russia. The particular environmental situation and the emerging environmental problems in some areas of the Russian Federation are caused by local natural conditions and by the nature of the impacts from industry, transport, utilities, and agriculture (the specifics of enterprises, their capacity, location, and technologies used). As a rule, the magnitude of air pollution depends on the degree of urbanization and anthropogenic transformation of the territory and on the climatic conditions that determine the potential for atmospheric pollution. During high-temperature technological processes, the smallest aerosol particles (0.5..0.10 μm) are formed; they are poorly captured by gas purification plants and can migrate in the atmosphere over considerable distances. Larger particles (2.5 μm and above) are formed by the mechanical breakdown of solid particles and enter the atmosphere through wind erosion, dust from dirt roads, and the abrasion of vehicle tires. Suspended particles with a diameter of no more than 2.5 μm (PMX) are the most destructive to health, since they penetrate and are deposited deep in human lungs. These micron-scale particles, suspended in the air, consist of a complex mixture of large and small, solid and liquid particles of both inorganic and organic substances. The boundary between the two fractions is usually taken to be particles with a diameter of 2.5 μm (PM2.5). This study sought to build a model for determining fine dust (PM2.5) in the Moscow air environment using Landsat 8 OLI satellite image channels and data on PM2.5 concentrations obtained by weather stations in the city. In addition, a correlation analysis was carried out to determine a regression model for studying the dispersion of fine dust in the city.
The results are presented as a map of PM2.5 concentration in Moscow, supporting management decisions and environmental policy-making in urban planning.
Article
The amount of remote sensing (RS) data has increased at an unexpected scale, due to the rapid progress of earth observation and the growth of satellite RS and sensor technologies. Traditional relational databases have reached their limits in meeting the needs of high-resolution, large-scale RS Big Data management. As a result, massive RS data management is currently one of the most pressing topics. To address this problem, this paper describes a distributed architecture for big RS data storage based on a unified metadata file, a pyramid model, and a Hilbert curve for data composition and indexing using NoSQL databases (i.e., Apache HBase). A Hadoop-based framework on the AzureInsight cloud platform is designed to manage massive RS data in a parallel and distributed way. Experimental results indicate that the method has the potential to overcome the weaknesses of traditional methods. The proposed model is suitable for massive high-resolution image data management.
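The abstract does not describe how the Hilbert curve index is built. As a rough illustration only, the classic iterative conversion from a tile position (x, y) on an n x n grid to its distance along the Hilbert curve can be sketched as follows. Using the result as a tile key is an assumption here, and the function name is hypothetical.

```python
def hilbert_index(n, x, y):
    """Map tile (x, y) on an n x n grid to its Hilbert curve distance.

    n must be a power of two, and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        # Add the offset of the quadrant that contains the tile.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip so the sub-quadrant is in canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Order the tiles of a 4 x 4 pyramid level along the curve.
tiles = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda t: hilbert_index(4, t[0], t[1]))
print(tiles[:4])
```

The property that makes such an index attractive for tiled imagery is locality: tiles with consecutive index values are always neighbors on the grid, so a range scan over keys tends to touch spatially close tiles.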