Copyright © 2018 Ria. Elin Thomas et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
International Journal of Engineering & Technology, 7 (2.33) (2018) 990-993
International Journal of Engineering & Technology
Website: www.sciencepubco.com/index.php/IJET
Research paper
Image anonymization using clustering with pixelization
Ria. Elin Thomas 1 *, Sharmila K. Banu 2, B. K. Tripathy 3
1 Masters Student, Scope, Vellore Institute of Technology, Vellore
2 Assistant Professor, Scope, Vellore Institute of Technology, Vellore
3 Senior Professor, Scope, Vellore Institute of Technology, Vellore
*Corresponding author E-mail: riaelinthomas@gmail.com
Abstract
With the increasing use of images to express opinions, feelings and one's self on social media and other websites, privacy concerns become an issue. The need to anonymize a person's face, or other aspects presented in an image, for legal or personal reasons has sometimes been overlooked. Pixelization is a common technique used for anonymizing images. However, it has proved to be unreliable, as the images can be restored using de-pixelization techniques. Clustering is usually used with images for image segmentation. When used in combination with pixelization, it proves to be an effective way to anonymize images. In this paper, the authors investigate the shortcomings of using only pixelization and show how the use of clustering can improve the chances of anonymizing effectively.
Keywords: Fuzzy C-Means Clustering; Image Anonymization; Pixelization; Privacy.
1. Introduction
Social media is one of the platforms through which millions of images are uploaded to the World Wide Web on a daily basis. However, many of these are images of people or things whose exposure may compromise privacy or lead to legal issues. Pixelization and blurring are some of the techniques that have been used to avoid these issues. With the increase in sophisticated face detection, face recognition, and image restoration algorithms and technologies, the pixelization and blurring techniques have failed to achieve their purpose.
A framework [1] was defined which helps to assess privacy protection solutions for video surveillance. Two face recognition algorithms were assessed, namely Linear Discriminant Analysis (LDA) and Principal Components Analysis (PCA). The CSU Face Identification Evaluation System was used to compare the two face recognition algorithms. These algorithms were applied to images that had been altered using privacy protection techniques such as pixelization, Gaussian blur, and scrambling. The authors concluded that pixelization and blurring were ineffective for the purpose of privacy protection. However, they observed that scrambling techniques proved to be more effective than the former two. Another framework [2] was proposed to protect privacy during crowd movement analysis. The face image data are first anonymized before being sent to the data center. In the data center, the authors proposed to use eigenface recognition, where the pattern matching is done in a low-dimensional space. The anonymization is done by k-member clustering applied to the facial feature vectors. The authors concluded that although this did ensure privacy, the obtained knowledge was not as reliable, since anonymization led to information loss.
The authors in [3] investigated the applicability of fuzzy clustering-based k-anonymization for crowd movement analysis. Since fuzzy clustering has a multi-cluster assignment feature, it helps to reduce information loss. The fuzzy clustering was applied to the eigenface features that would be sent to the data center. The authors concluded that with an appropriate fuzziness degree, fuzzy clustering-based k-anonymization had an advantage over normal k-member clustering. In [4], the authors conducted a user study on the effects of anonymization techniques, including different pixelization techniques. 103 users participated in the study and were asked to verify whether the people in the obscured images could be identified and whether their actions could be recognized.
A system [5] was designed to protect privacy in medical images. DICOM (Digital Imaging and Communications in Medicine) image information and the oncology patient records are anonymized there. Anonymization of the data is done by implementing policies based on the type of user accessing the system.
The authors in [6] deal with preserving the privacy of data providers while combining their data in a shared database for data mining problems. An m-privacy algorithm and a multiparty computational protocol are used for anonymization, instead of the general k-anonymity and l-diversity approaches to privacy preservation.
A machine-learning model [7] was proposed for anonymizing DICOM images. The model consisted of image preprocessing, classification algorithms, and de-identification. The authors concluded that with an RBM and a Random Forest classifier they attained 94% precision, recall, and F1-score. Privacy protection filters like pixelization, Gaussian blur, blackening, etc. were analyzed by a framework [8] that detects the presence of a filter and classifies the type of filter that was used, along with its strength. An appropriate tool was then used to reverse the anonymization process, which revealed that the filters were unable to achieve their purpose of protecting the identity of the people.
The authors of [9] reviewed the different methods used to achieve image anonymization, such as blurring, pixelization, and chaos cryptography, to verify whether these methods are reversible. They evaluated the methods based on security and intelligibility. The methods were categorized as transform-domain or pixel-level, based on the approach each technique uses to anonymize the image. The authors in [10] performed an attack on the pixelization technique applied to video streams. This was done by first taking the average of two frames and then applying a maximum-a-posteriori method to recover the image.
The paper is organized as follows: Section II explains the two algorithms that are used for anonymizing the image, i.e., pixelization and clustering. Section III shows how the algorithms were put together to implement the anonymization algorithm, and Section IV shows the results that were obtained. Section V concludes the paper.
2. Concepts
In this section, we provide some of the concepts to be used in the
paper.
2.1. Pixelization
Pixelization is a strategy to make parts of a picture difficult for the human eye to recognize by artificially diminishing the picture resolution. It is achieved by splitting the image into non-overlapping M*M squares, where M is user-defined. The pixels within each square are replaced by the average value of that square.
$$I_p(x, y) = \frac{1}{n^2}\sum_{i=0}^{n-1}\sum_{j=0}^{n-1} I\!\left(\left\lfloor \tfrac{x}{n}\right\rfloor n + i,\ \left\lfloor \tfrac{y}{n}\right\rfloor n + j\right) \qquad (1)$$
where x and y are the pixel coordinates and n is the block size.
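As an illustrative sketch (not the authors' code), the block averaging of Eq. (1) can be written in a few lines of NumPy; the function name and parameters below are our own:

```python
import numpy as np

def pixelize(image, n):
    """Replace every non-overlapping n x n block of a 2-D grayscale
    array with the mean intensity of that block (Eq. (1))."""
    h, w = image.shape
    out = image.astype(float).copy()
    for y in range(0, h, n):
        for x in range(0, w, n):
            block = image[y:y + n, x:x + n]       # edge blocks may be smaller
            out[y:y + n, x:x + n] = block.mean()  # replace the block by its average
    return out.astype(image.dtype)
```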
Pixelization has been used for various purposes such as censorship and anonymization. It is a very common way to anonymize faces, number plates, gestures that are deemed unfit to be shown to the public through media, and so on. However, through various de-pixelization techniques such as bicubic interpolation [11] and cubic convolution interpolation [12], it has been shown that this technique cannot be relied upon for effective anonymization.
2.2. Fuzzy c-means clustering
Clustering has been defined as the process of putting similar elements into groups such that elements in different groups are less alike than those within a single group. It has been noted that we human beings study elements of the universe in clusters. This saves reasoning time: if elements are in a single cluster, it becomes easier to study the characteristics of one element and extend them to the other elements of the cluster, so whatever knowledge is obtained for a single element need not be derived again and again [18].
The first clustering algorithm introduced was hard C-means, and the outputs of this algorithm are non-overlapping by nature. In real life, however, we often require the clusters to be overlapping; i.e., an element can belong to more than one cluster to degrees lying in the interval [0, 1]. The clustering techniques which generate such clusterings are called uncertainty-based clustering algorithms. Several such models are available in the literature. Each of them depends on one of the uncertainty-based models such as fuzzy sets, rough sets, intuitionistic fuzzy sets or soft sets, or on their hybrid models such as fuzzy rough sets, rough fuzzy sets, intuitionistic fuzzy rough sets or rough intuitionistic fuzzy sets. The outputs of these algorithms are fuzzy sets in the case of fuzzy C-means, rough sets in the case of rough C-means, and so on.
The property of a fuzzy set is that elements have graded membership values instead of crisp membership, in which an element either belongs to a cluster or does not. Graded membership leads to partial belongingness of elements to the clusters. Similarly, the concept of intuitionistic fuzzy sets associates two functions with the set: one is called the membership function and the other the non-membership function. In the case of fuzzy sets these two functions also exist, but the non-membership function is just the one's complement of the membership function, so it has no separate existence. In the case of intuitionistic fuzzy sets, however, the sum of the membership and non-membership values of any element lies in the interval [0, 1]. Hence, in intuitionistic fuzzy C-means the clusters are such that the elements have both membership and non-membership values. This leads to an important notion called the hesitation function, which captures the uncertainty of belongingness of an element. The intuitionistic fuzzy C-means algorithm has been developed and studied in [19], [21].
Similarly, in the case of rough C-means, the clusters are rough sets. The rough set notion captures uncertainty through the boundary region of a set. The basic definition of a rough set depends upon an equivalence relation defined over the universe. This is because, according to the definition of knowledge introduced by Pawlak, the originator of the notion of rough sets, human knowledge depends upon the ability to classify objects in a domain. The classes of a classification are disjoint subsets of the universe whose union is the whole universe. The equivalence relations defined over a universe likewise decompose it into disjoint classes, so classifications and equivalence relations are interchangeable notions. For mathematical reasons, Pawlak therefore used equivalence relations to define rough sets. He introduced the notions of the lower approximation and the upper approximation of a set with respect to an equivalence relation, which approximate the set from below and above in the sense that the set contains its lower approximation and is contained in its upper approximation. Obviously, when the lower and upper approximations become identical we get a crisp set. Otherwise, the difference between the upper approximation and the lower approximation is termed the boundary of the set, and it is the region containing the uncertain elements.
It was believed in the beginning that fuzzy sets and rough sets were competing models, and people even tried to establish the superiority of one model over the other. However, Dubois and Prade established in 1990 that, far from being competitive, the models complement each other. Going a step further, they combined the two models to propose the hybrid rough fuzzy and fuzzy rough models. It has been established since then that the hybrid models are more efficient than their individual components. It is worth noting that further such models have been proposed in the form of rough intuitionistic fuzzy sets and intuitionistic fuzzy rough sets, and, more importantly, C-means algorithms based on them have been proposed, studied for clustering data, and applied to image segmentation [16, 17]. Several applications of these algorithms can be found in recent works [15, 20, 22]. However, in this paper we restrict our study to the fuzzy C-means algorithm.
Definition 2.2.1: A fuzzy set A defined over a universe W is determined by its membership function $m_A$ such that $m_A : W \to [0, 1]$. Thus, for any element $e$ in W, $m_A(e)$ assumes a value lying in $[0, 1]$.
The notion of a fuzzy set is an extension of the crisp set in the sense that every crisp set is associated with a function called its characteristic function. When $m_A$ assumes only the values zero or one, it reduces to a characteristic function and the corresponding fuzzy set reduces to a crisp set.
Definition 2.2.2: An intuitionistic fuzzy set A over a universe W is determined by the membership and non-membership functions $m_A$ and $n_A$, with $m_A, n_A : W \to [0, 1]$, such that for any element $e$ in W, $m_A(e) + n_A(e) \in [0, 1]$.
So, the hesitation function $h_A$ is such that for all $e$ in W, $h_A(e) = 1 - \{m_A(e) + n_A(e)\}$.
When $n_A(e) = 1 - m_A(e)$, the intuitionistic fuzzy set reduces to a fuzzy set; here the hesitation function becomes the zero function.
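For illustration only, the membership, non-membership and hesitation values of Definitions 2.2.1 and 2.2.2 can be tabulated as below; the numeric grades are invented examples, not data from the paper:

```python
# Fuzzy set over W = {e1, e2, e3}: one membership grade per element.
m_A = {"e1": 0.75, "e2": 0.25, "e3": 1.0}   # illustrative values only

# Intuitionistic fuzzy set: membership plus non-membership with
# m_A(e) + n_A(e) <= 1; the remainder is the hesitation h_A(e).
n_A = {"e1": 0.125, "e2": 0.5, "e3": 0.0}
h_A = {e: 1.0 - (m_A[e] + n_A[e]) for e in m_A}
# h_A == {'e1': 0.125, 'e2': 0.25, 'e3': 0.0}; if n_A = 1 - m_A, every h_A(e) is 0.
```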
Definition 2.2.3: Let A be a set over a universe W and P be an equivalence relation over W. Let us denote the equivalence class generated by P over W containing an element $e$ as $[e]_P$. Then we denote the lower approximation and upper approximation of A with respect to P by $L_P(A)$ and $U_P(A)$ respectively, and define them as
$$L_P(A) = \{\, e \in W \mid [e]_P \subseteq A \,\} \quad \text{and} \quad U_P(A) = \{\, e \in W \mid [e]_P \cap A \neq \emptyset \,\}.$$
When $L_P(A) \neq U_P(A)$, we say that A is rough with respect to P; otherwise A is said to be P-definable.
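A minimal sketch of Definition 2.2.3, with the universe, the equivalence classes of P, and the target set A chosen purely for illustration:

```python
# Equivalence classes of P partition the universe W = {1, ..., 6}.
classes = [{1, 2}, {3, 4, 5}, {6}]
A = {2, 3, 4, 5}

lower = set().union(*(c for c in classes if c <= A))   # classes fully inside A
upper = set().union(*(c for c in classes if c & A))    # classes that meet A
print(lower, upper)  # {3, 4, 5} {1, 2, 3, 4, 5} -> lower != upper, so A is rough w.r.t. P
```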
Fuzzy c-means clustering was first introduced by Dunn in 1973 [13]. Several authors have tried to improve the algorithm over the years since then. However, the algorithm used at present is the version enhanced by Bezdek in 1984 [14]. In the algorithm proposed by Bezdek, the concept of a fuzzifier is used: a real number 'm' (denoted p in Eq. (2) below) which can take a range of values. In his objective function Dunn had used 2 as the value of 'm'. It was noted later that the value of 'm' should lie in the interval [1.5, 2.5], and the ideal value of 'm' is taken to be 2. So, although Bezdek extended the objective function of Dunn, the ideal case is the special objective function used by Dunn. Fuzzy c-means addresses situations where a data point can belong to two different clusters, thus providing the "fuzzy" effect to the traditional k-means clustering algorithm. It is based on the minimization of the following objective function:
$$J_p = \sum_{i=1}^{N} \sum_{j=1}^{C} m_{ij}^{\,p} \, \| a_i - c_j \|^2 \qquad (2)$$

where $a_i$ is the i-th data point in the dataset, $m_{ij}$ is the degree of membership of $a_i$ in cluster j, $c_j$ is the centre of the cluster, $\|\cdot\|$ is any standard measure of similarity between a data point and the cluster centre, and p is a real number greater than 1.
The membership function $m_{ij}$ is computed as follows:

$$m_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{\| a_i - c_j \|}{\| a_i - c_k \|} \right)^{\frac{2}{p-1}}} \qquad (3)$$

The cluster centre $c_j$ is computed as follows:

$$c_j = \frac{\sum_{i=1}^{N} m_{ij}^{\,p} \, a_i}{\sum_{i=1}^{N} m_{ij}^{\,p}} \qquad (4)$$

The whole process iterates until it meets the stopping criterion, i.e.,

$$\max_{ij} \left| m_{ij}^{(k+1)} - m_{ij}^{(k)} \right| < \epsilon \qquad (5)$$

where ϵ is between 0 and 1, and k is the number of iterations.
The algorithm is composed of the following steps:
1) Initialize the membership matrix $M = [m_{ij}]$, $M^{(0)}$.
2) At the k-th step: calculate the centre vectors $C^{(k)} = [c_j]$ using $M^{(k)}$.
3) Update $M^{(k)}$ to $M^{(k+1)}$.
4) If $\| M^{(k+1)} - M^{(k)} \| < \epsilon$ then STOP; otherwise return to step 2.
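A compact NumPy sketch of the update loop in Eqs. (2)-(5); the function signature, the default fuzzifier p = 2, and the random initialization are our own choices, not specified in the paper:

```python
import numpy as np

def fuzzy_c_means(data, c, p=2.0, eps=1e-5, max_iter=100, seed=0):
    """data: (N, d) array of points; c: number of clusters; p: fuzzifier > 1.
    Returns (centers, membership matrix M) following Eqs. (3)-(5)."""
    rng = np.random.default_rng(seed)
    M = rng.random((data.shape[0], c))
    M /= M.sum(axis=1, keepdims=True)                     # step 1: random M(0), rows sum to 1
    for _ in range(max_iter):
        Mp = M ** p
        centers = (Mp.T @ data) / Mp.sum(axis=0)[:, None]  # step 2 / Eq. (4)
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)                        # guard against zero distance
        inv = dist ** (-2.0 / (p - 1.0))
        M_new = inv / inv.sum(axis=1, keepdims=True)       # step 3 / Eq. (3)
        if np.abs(M_new - M).max() < eps:                  # step 4 / Eq. (5)
            return centers, M_new
        M = M_new
    return centers, M
```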
3. Implementation
Python packages were used to implement the anonymization. The Pillow package was used to resize the image, applying the nearest-neighbour interpolation sampling filter. The GeoTIFF package was used to read the image, and the gdal package converts it into a grayscale image by taking only one layer of the raster band. The Pillow package was then used to convert it into a NumPy array, which is given as input to the fuzzy c-means clustering module.
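A simplified end-to-end sketch of the pipeline described above, using only Pillow and NumPy (the paper also reads GeoTIFF rasters through gdal, which is omitted here); the file names, block size, and number of clusters are hypothetical, and pixelize() and fuzzy_c_means() refer to the sketches in Section 2:

```python
import numpy as np
from PIL import Image

# Read the image, convert to grayscale, and downsample with nearest-neighbour resampling.
img = Image.open("face.jpg").convert("L")                     # hypothetical input file
img = img.resize((img.width // 2, img.height // 2), Image.NEAREST)

# Pixelize (Eq. (1)) and then cluster the pixel intensities with fuzzy c-means;
# each pixel is replaced by the centre of its highest-membership cluster.
pixelized = pixelize(np.array(img), n=8)
pixels = pixelized.reshape(-1, 1).astype(float)
centers, M = fuzzy_c_means(pixels, c=4)
clustered = centers[M.argmax(axis=1)].reshape(pixelized.shape)

Image.fromarray(clustered.astype(np.uint8)).save("anonymized.png")
```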
4. Results
In this section we take a specific image of a human being in a specific pose, generate the pixelized image from it, and then apply clustering to generate the segmented image.
Fig. 1: Original Image.
Fig. 2: Pixelized Image.
Fig. 3: Clustered Image.
The original image (Fig. 1) is pixelized (Fig. 2) and then clustered (Fig. 3) using fuzzy c-means clustering to obscure the image against face recognition algorithms. The obscured image is now more anonymized than when only pixelized.
5. Conclusion
With pixelization being used as a common method to obscure images, the authors have proposed to use a clustering algorithm along with pixelization to better secure images against de-pixelization methods and other techniques that may give away the identities of the people in a picture. Since the clustering algorithm uses complex numerical expressions, it becomes almost impossible to de-cluster the image, thus providing more security for maintaining the privacy of the image. This approach can be extended to videos, such as surveillance footage, where certain identities need to be anonymized before publishing.
References
[1] Dufaux, F., & Ebrahimi, T. (2010, July). A framework for the validation of privacy protection solutions in video surveillance. In Multimedia and Expo (ICME), 2010 IEEE International Conference on (pp. 66-71). IEEE.
[2] Honda, K., Omori, M., Ubukata, S., & Notsu, A. (2015, June). A privacy-preserving crowd movement analysis by k-member clustering of face images. In Informatics, Electronics & Vision (ICIEV), 2015 International Conference on (pp. 1-5). IEEE.
[3] Honda, K., Omori, M., Ubukata, S., & Notsu, A. (2015, November). A study on fuzzy clustering-based k-anonymization for privacy preserving crowd movement analysis with face recognition. In Soft Computing and Pattern Recognition (SoCPaR), 2015 7th International Conference of (pp. 37-41). IEEE.
[4] Birnstill, P., Ren, D., & Beyerer, J. (2015, August). A user study on anonymization techniques for smart video surveillance. In Advanced Video and Signal Based Surveillance (AVSS), 2015 12th IEEE International Conference on (pp. 1-6). IEEE.
[5] Shahbaz, S., Mahmood, A., & Anwar, Z. (2013, December). SOAD: Securing oncology EMR by anonymizing DICOM images. In Frontiers of Information Technology (FIT), 2013 11th International Conference on (pp. 125-130). IEEE.
[6] Indhumathi, R., & Priya, S. M. (2014). Data Preserving By Anonymization Techniques for Collaborative Data Publishing. International Journal of Innovative Research in Science, Engineering and Technology, 3(1), 358-363.
[7] Monteiro, E., Costa, C., & Oliveira, J. L. (2015, August). A machine learning methodology for medical imaging anonymization. In Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE (pp. 1381-1384). IEEE.
[8] Ruchaud, N., & Dugelay, J. L. (2016). Automatic Face Anonymization in Visual Data: Are we really well protected? Electronic Imaging, 2016(15), 1-7.
[9] Pantoja, C., Arguedas, V. F., & Izquierdo, E. Anonymization and De-identification of Personal Surveillance Visual Information: A Review.
[10] Cavedon, L., Foschini, L., & Vigna, G. (2011, August). Getting the Face behind the Squares: Reconstructing Pixelized Video Streams. In WOOT (pp. 37-45).
[11] Keys, R. (1981). Cubic convolution interpolation for digital image processing. IEEE Transactions on Acoustics, Speech, and Signal Processing, 29(6), 1153-1160.
[12] Dong, W., Zhang, L., Shi, G., & Wu, X. (2011). Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization. IEEE Transactions on Image Processing, 20(7), 1838-1857.
[13] Dunn, J. C. (1973). A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters.
[14] Bezdek, J. C., Ehrlich, R., & Full, W. (1984). FCM: The fuzzy c-means clustering algorithm. Computers & Geosciences, 10(2-3), 191-203.
[15] Tripathy, B. K., & Swarnalatha, P. (2014). A Comparative Study of RIFCM with Other Related Algorithms from Their Suitability in Analysis of Satellite Images using Other Supporting Techniques. Kybernetes, 43(1), 53-81.
[16] Tripathy, B. K., Bhargav, R., Tripathy, A., Verma, E., Kumar, R., & Swarnalatha, P. (2013). Rough Intuitionistic Fuzzy C-Means Algorithm and a Comparative Analysis. COMPUTE '13, Aug 22-24, Vellore, Tamil Nadu, India. ACM 978-1-4503-2545-5/13/08.
[17] Tripathy, B. K., & Bhargav, R. (2013). Kernel Based Rough-Fuzzy C-Means. PReMI, ISI Calcutta, December, LNCS 8251, pp. 148-157.
[18] Swarnalatha, P., Tripathy, B. K., Nithin, P. L., & Ghosh, D. (2014). Cluster Analysis Using Hybrid Soft Computing Techniques. CNC-2014 International Conference on Network and Power Engineering, Proceedings of Fifth CNC-2014, pp. 516-524.
[19] Tripathy, B. K., Basu, A., & Govel, S. (2014). Image segmentation using spatial intuitionistic fuzzy C-means clustering. Proceedings of IEEE ICCIC 2014, pp. 878-882.
[20] Tripathy, B. K., & Mittal, D. (2015). Efficiency Analysis of Kernel Functions in Uncertainty Based C-Means Algorithms. International Conference on Advances in Computing, Communications and Informatics (ICACCI 2015), Article number 7275709, pp. 807-813.
[21] Tripathy, B. K., Deepthi, P. H., & Mittal, D. (2016). Hadoop with Intuitionistic Fuzzy C-means for clustering in Big Data. Advances in Intelligent Systems and Computing, Volume 438, pp. 599-610.
[22] Tripathy, B. K., Goyal, A., & Chowdhury, R. (2017). MMeNR: Neighborhood Rough Set Theory Based Algorithm for Clustering Heterogeneous Data. International Conference on Inventive Communication and Computational Technologies (ICICCT 2017), pp. 323-328.