Image anonymization using clustering with pixelization

July 2018
International Journal of Engineering and Technology 7(2.33):990-993

Authors:

Sharmila Banu Kather

VIT University

B.K. Tripathy

VIT University

permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

International Journal of Engineering & Technology, 7 (2.33) (2018) 990-993

Website: www.sciencepubco.com/index.php/IJET

Research paper

Image anonymization using clustering with pixelization

Ria. Elin Thomas 1 *, Sharmila K. Banu 2, B. K. Tripathy 3

1 Masters Student, Scope, Vellore Institute of Technology, Vellore

2 Assistant Professor, Scope, Vellore Institute of Technology, Vellore

3 Senior Professor, Scope, Vellore Institute of Technology, Vellore

*Corresponding author E-mail: riaelinthomas@gmail.com

Abstract

With the increasing use of images to express opinions, feelings, and identity on social media and other websites, privacy has become a concern. The need to anonymize a person's face, or other aspects of an image, for legal or personal reasons is sometimes overlooked. Pixelization is a common technique for anonymizing images. However, it has been shown to be unreliable, because pixelized images can be restored using de-pixelization techniques. Clustering is usually applied to images for image segmentation. When used in combination with pixelization, it proves to be an effective way to anonymize images. In this paper, the authors investigate the drawbacks of using pixelization alone and show how adding clustering can improve the chances of anonymizing effectively.

Keywords: Fuzzy C-Means Clustering; Image Anonymization; Pixelization; Privacy.

1. Introduction

Social media is one of the platforms through which millions of images are uploaded to the World Wide Web every day. However, many of these images show people or things in ways that may violate privacy or lead to legal issues. Pixelization and blurring are among the techniques that have been used to avoid these issues. With the rise of sophisticated face detection, face recognition, and image restoration algorithms, pixelization and blurring have failed to achieve their purpose.

A framework [1] was defined to assess privacy protection solutions for video surveillance. Two face recognition algorithms were assessed, namely Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA), and the CSU Face Identification Evaluation System was used to compare them. The algorithms were applied to images altered using privacy protection techniques such as pixelization, Gaussian blur, and scrambling. The authors concluded that pixelization and blurring were ineffective for privacy protection, whereas scrambling proved more effective than either. Another framework [2] was proposed to protect privacy during crowd movement analysis. The face image data are anonymized before being sent to the data center. In the data center, the authors propose using eigenface recognition, in which pattern matching is done in a low-dimensional space. The anonymization is done by k-member clustering applied to the facial feature vectors. The authors concluded that although this ensured privacy, the knowledge obtained was less reliable, since anonymization led to information loss.

The authors of [3] investigated the applicability of fuzzy clustering-based k-anonymization to crowd movement analysis. Because fuzzy clustering allows multi-cluster assignment, it helps reduce information loss. Fuzzy clustering was applied to the eigenface features to be sent to the data center. The authors concluded that, with an appropriate degree of fuzziness, fuzzy clustering-based k-anonymization had an advantage over ordinary k-member clustering. In [4], the authors conducted a user study on the effects of anonymization techniques, including different pixelization techniques. The 103 participants were asked to judge whether people in the obscured images could be identified and whether their actions could be recognized.

A system [5] was designed to protect privacy in medical images. DICOM (Digital Imaging and Communications in Medicine) image information and oncology patient records are anonymized by applying policies based on the type of user accessing the system.

The authors of [6] address the problem of preserving the privacy of data providers when their data are combined in a database for data mining. An m-privacy algorithm and a multiparty computation protocol are used for anonymization instead of general k-anonymity and l-diversity.

A machine-learning model [7] was proposed for anonymizing DICOM images. The model consisted of image preprocessing, classification algorithms, and de-identification. The authors reported that, with an RBM and a Random Forest classifier, they attained 94% precision, recall, and F1-score. Privacy protection filters such as pixelization, Gaussian blur, and blackening were analyzed by a framework [8] that detects the presence of a filter and classifies both the type of filter used and its strength. An appropriate tool was then used to reverse the anonymization, which revealed that the filters failed to achieve their purpose of protecting people's identities.

The authors of [9] reviewed methods used for image anonymization, such as blurring, pixelization, and chaos cryptography, to determine whether these methods are reversible. They evaluated the methods on security and intelligibility. The methods were categorized as transform-domain or pixel-level, according to how each technique anonymizes the image. The authors of [10] performed an attack on pixelization applied to video streams, first taking the average of two frames and then applying a maximum a posteriori method to recover the image.

The paper is organized as follows: Section 2 explains the two algorithms used for anonymizing the image, i.e., pixelization and clustering. Section 3 shows how the algorithms were combined to implement the anonymization, and Section 4 shows the results obtained. Section 5 concludes the paper.

2. Concepts

In this section, we provide some of the concepts to be used in the

paper.

2.1. Pixelization

Pixelization is a strategy for making parts of a picture difficult for the human eye to recognize by artificially reducing the picture resolution. It is achieved by splitting the image into non-overlapping n × n squares, where the block size n is user-defined. Every pixel within a square is replaced by the average value of that square.

$$I_p(x, y) = \frac{1}{n^2} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} I\left(n\left\lfloor \frac{x}{n} \right\rfloor + i,\; n\left\lfloor \frac{y}{n} \right\rfloor + j\right) \tag{1}$$

where x and y are the pixel coordinates and n is the block size.

Pixelization has been used for various purposes such as censorship and anonymization. It is a very common way to anonymize faces, number plates, gestures deemed unfit to show to the public through media, and so on. However, de-pixelization techniques such as bicubic interpolation [11] and cubic convolution interpolation [12] have shown that this technique cannot be relied upon for effective anonymization.
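
The block averaging of Eq. (1) can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function name `pixelize` and the handling of partial edge blocks (averaged over the pixels they actually contain) are assumptions made here, since the text does not specify them.

```python
import numpy as np


def pixelize(image: np.ndarray, n: int) -> np.ndarray:
    """Replace each non-overlapping n x n block of a 2-D image by its mean.

    Edge blocks smaller than n x n are averaged over the pixels they
    contain (an assumption; the paper does not cover this case).
    """
    if n < 1:
        raise ValueError("block size n must be a positive integer")
    height, width = image.shape
    out = np.empty((height, width), dtype=np.float64)
    for top in range(0, height, n):
        for left in range(0, width, n):
            block = image[top:top + n, left:left + n]
            # Every pixel of the block receives the block average.
            out[top:top + n, left:left + n] = block.mean()
    return out
```

For a 4 × 4 input and n = 2, each of the four 2 × 2 blocks is replaced by its own mean, so the output contains at most four distinct values. Larger n discards more detail.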

2.2. Fuzzy c-means clustering

Clustering is the process of grouping similar elements such that elements in different groups are less alike than those within a single group. Human beings tend to study the elements of the universe in clusters. This saves reasoning time: if elements are in a single cluster, the characteristics of one element can be studied and extended to the other elements of the cluster, so knowledge obtained for individual elements need not be derived again and again [18].

The first clustering algorithm introduced was hard C-means, whose output clusters are non-overlapping by nature. In real life, however, most cases require the clusters to overlap; that is, an element can belong to more than one cluster, with degrees of membership lying in the interval [0, 1]. Clustering techniques that generate such clusterings are called uncertainty-based clustering algorithms. Several such algorithms are available in the literature. Each depends on one of the uncertainty-based models, such as fuzzy sets, rough sets, intuitionistic fuzzy sets, or soft sets, or on one of their hybrid models, such as fuzzy rough sets, rough fuzzy sets, intuitionistic fuzzy rough sets, or rough intuitionistic fuzzy sets. The outputs of these algorithms are fuzzy sets in the case of fuzzy C-means, rough sets in the case of rough C-means, and so on. The defining property of a fuzzy set is that elements have graded membership values instead of crisp membership, under which an element either belongs to a cluster or does not. Graded membership leads to partial belongingness of elements to a cluster.

Similarly, the concept of intuitionistic fuzzy sets associates two functions with the set: the membership function and the non-membership function. In the case of fuzzy sets these two functions also exist, but the non-membership function is just the one's complement of the membership function, so it has no separate existence. In the case of intuitionistic fuzzy sets, however, the sum of the membership and non-membership values of any element lies in the interval [0, 1]. Hence, in intuitionistic fuzzy C-means, the elements of each cluster have both membership and non-membership values. This leads to an important notion called the hesitation function, which captures the remaining uncertainty about the belongingness of an element. The intuitionistic fuzzy c-means algorithm has been developed and studied in [19], [21].

Similarly, in the case of rough C-means, the clusters are rough sets. The rough set notion rests on uncertainty being captured by the boundary region of a set. The basic definition of a rough set, however, depends on an equivalence relation defined over the universe. This is because Pawlak, the originator of the notion of rough sets, defined human knowledge as the ability to classify objects in a domain. A classification consists of disjoint subsets of the universe whose union is the whole universe. An equivalence relation defined over a universe also decomposes the universe into disjoint classes, so classifications and equivalence relations are interchangeable notions. For mathematical reasons, Pawlak therefore used equivalence relations to define rough sets. He introduced the upper approximation and lower approximation of a set with respect to an equivalence relation, which approximate the set from above and below: the set is included in its upper approximation and contains its lower approximation. When the lower and upper approximations are identical, we get a crisp set. Otherwise, the difference between the upper approximation and the lower approximation is termed the boundary of the set and is the region containing the uncertain elements.

In the beginning, the fuzzy set and rough set models were regarded as competing models, and attempts were even made to establish the superiority of one over the other. However, Dubois and Prade established in 1990 that, far from being competitive, the two models complement each other. Going a step further, they combined the two models to propose the hybrid rough fuzzy and fuzzy rough models. It has since been established that the hybrid models are more efficient than their individual components. Many such models have been proposed, including rough intuitionistic fuzzy sets and intuitionistic fuzzy rough sets. More importantly, the corresponding C-means algorithms have been proposed and studied for clustering data and have been applied to image segmentation [16, 17]. Several applications of these algorithms can be found in some recent works [15, 20, 22]. In this paper, however, we restrict our study to the fuzzy C-means algorithm.

Definition 2.2.1: A fuzzy set A defined over a universe W is determined by its membership function $m_A$ such that

$$m_A : W \to [0, 1].$$

Thus, for any element e in W, $m_A(e)$ assumes a value lying in [0, 1].

The notion of a fuzzy set is an extension of the crisp set, in the sense that every crisp set is associated with a function called its characteristic function. When the membership function $m_A$ assumes only the values zero or one, it reduces to a characteristic function and the corresponding fuzzy set reduces to a crisp set.

Definition 2.2.2: An intuitionistic fuzzy set A over a universe W is determined by its membership and non-membership functions $m_A$ and $n_A$ such that

$$m_A, n_A : W \to [0, 1]$$

and, for any element e from W,

$$m_A(e) + n_A(e) \in [0, 1].$$

So, the hesitation function $h_A$ is such that, for all e in W,

$$h_A(e) = 1 - \{m_A(e) + n_A(e)\}.$$

When $n_A(e) = 1 - m_A(e)$, the intuitionistic fuzzy set reduces to a fuzzy set. Here the hesitation function becomes the zero function.
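
As a small worked illustration of the hesitation function, consider the two cases below. The numerical values are hypothetical and were chosen only to show the arithmetic.

```latex
% Intuitionistic case: membership and non-membership do not sum to 1.
m_A(e) = 0.6, \quad n_A(e) = 0.3
\;\Longrightarrow\;
h_A(e) = 1 - (0.6 + 0.3) = 0.1

% Ordinary fuzzy case: n_A(e) = 1 - m_A(e), so no hesitation remains.
m_A(e) = 0.6, \quad n_A(e) = 0.4
\;\Longrightarrow\;
h_A(e) = 1 - (0.6 + 0.4) = 0
```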

Definition 2.2.3: Let A be a set over a universe W and P be an equivalence relation over W. Let us denote the equivalence class generated by P over W that contains an element e by $[e]_P$. Then we denote the lower approximation and upper approximation of A with respect to P by $L(A)$ and $U(A)$ respectively, and define them as

$$L(A) = \{e \in W \mid [e]_P \subseteq A\}$$

and

$$U(A) = \{e \in W \mid [e]_P \cap A \neq \emptyset\}.$$

When $L(A) \neq U(A)$, we say that A is rough with respect to P; otherwise, it is said to be P-definable.
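
The lower and upper approximations of Definition 2.2.3 can be computed directly from the partition that an equivalence relation induces. The following is a minimal sketch; the function name `approximations` and the example partition are hypothetical and serve only as illustration.

```python
from typing import Hashable, Iterable, Set, Tuple


def approximations(
    classes: Iterable[Set[Hashable]], target: Set[Hashable]
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Return the (lower, upper) approximations of `target`.

    `classes` is the partition of the universe W induced by the
    equivalence relation P, given as disjoint sets.
    """
    lower: Set[Hashable] = set()
    upper: Set[Hashable] = set()
    for eq_class in classes:
        if eq_class <= target:   # the class is contained in A
            lower |= eq_class
        if eq_class & target:    # the class intersects A
            upper |= eq_class
    return lower, upper


# Hypothetical universe {1, ..., 6} split into three equivalence classes.
partition = [{1, 2}, {3, 4}, {5, 6}]
lower, upper = approximations(partition, {1, 2, 3})
boundary = upper - lower  # the uncertain elements
```

Here the set {1, 2, 3} is rough: the class {3, 4} only partly overlaps it, so the elements 3 and 4 form the boundary. A set that is a union of whole classes, such as {1, 2, 5, 6}, has identical lower and upper approximations.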

Fuzzy c-means clustering was first introduced by Dunn in 1973 [13]. Several authors have tried to improve the algorithm since then; the version in use at present is the one enhanced by Bezdek in 1984 [14]. The algorithm proposed by Bezdek uses the concept of a fuzzifier, a real number that can take a range of values. In his objective function, Dunn had used the value 2 for the fuzzifier. It was noted later that suitable values of the fuzzifier lie in the interval [1.5, 2.5], with 2 taken as the ideal value. So, although Bezdek extended the objective function of Dunn, the ideal case is the special objective function used by Dunn. Fuzzy c-means addresses situations in which a data point can belong to more than one cluster, thus adding a "fuzzy" effect to the traditional k-means clustering algorithm. It is based on minimizing the following objective function, in which the fuzzifier appears as the exponent p:

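$$J_p = \sum_{i=1}^{N} \sum_{j=1}^{C} m_{ij}^{p} \, \lVert a_i - c_j \rVert^2 \tag{2}$$

where N is the number of data points, C is the number of clusters, $a_i$ is the ith data point in the dataset, $m_{ij}$ is the degree of membership of $a_i$ in cluster j, $c_j$ is the center of cluster j, $\lVert \cdot \rVert$ is any norm expressing the similarity between a data point and the center of a cluster, and p is a real number greater than 1.

The membership $m_{ij}$ is computed as follows:

$$m_{ij} = \frac{1}{\sum_{l=1}^{C} \left( \dfrac{\lVert a_i - c_j \rVert}{\lVert a_i - c_l \rVert} \right)^{\frac{2}{p-1}}} \tag{3}$$

The cluster center $c_j$ is computed as follows:

$$c_j = \frac{\sum_{i=1}^{N} m_{ij}^{p} \, a_i}{\sum_{i=1}^{N} m_{ij}^{p}} \tag{4}$$

The whole process iterates until it meets the stopping criterion, i.e.,

$$\max_{i,j} \left| m_{ij}^{(k+1)} - m_{ij}^{(k)} \right| < \epsilon \tag{5}$$

where ϵ is between 0 and 1, and k is the iteration step.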

The algorithm is composed of the following steps:

1) Initialize the membership matrix M = [mij] as M(0).

2) At the kth step, calculate the center vectors C(k) = [cj] using M(k).

3) Update M(k) to M(k+1).

4) If || M(k+1) - M(k)|| < ε then STOP; otherwise return to step 2.
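
The four steps above can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the clustering module used by the authors: the names `fuzzy_c_means` and `cluster_image`, the random initialization, the Euclidean norm, the defaults for `eps` and `max_iter`, and the choice to render each pixel with the center of its highest-membership cluster are all assumptions made here.

```python
import numpy as np


def fuzzy_c_means(data, n_clusters, p=2.0, eps=1e-5, max_iter=300, seed=0):
    """Sketch of fuzzy c-means with fuzzifier p (Euclidean norm assumed).

    data has shape (N, d). Returns (centers, memberships) with shapes
    (n_clusters, d) and (N, n_clusters).
    """
    data = np.asarray(data, dtype=np.float64)
    rng = np.random.default_rng(seed)
    # Step 1: random initial memberships, each row summing to 1.
    m = rng.random((data.shape[0], n_clusters))
    m /= m.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        # Step 2: centers are membership-weighted means of the data.
        w = m ** p
        centers = (w.T @ data) / w.sum(axis=0)[:, None]
        # Step 3: update memberships from distances to every center.
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        inv = dist ** (-2.0 / (p - 1.0))
        m_new = inv / inv.sum(axis=1, keepdims=True)
        # Step 4: stop once the largest membership change is below eps.
        converged = np.max(np.abs(m_new - m)) < eps
        m = m_new
        if converged:
            break
    return centers, m


def cluster_image(image, n_clusters, **kwargs):
    """Replace every pixel by the center of its highest-membership cluster."""
    pixels = np.asarray(image, dtype=np.float64).reshape(-1, 1)
    centers, m = fuzzy_c_means(pixels, n_clusters, **kwargs)
    return centers[m.argmax(axis=1), 0].reshape(np.shape(image))
```

With a grayscale image stored as a 2-D array, `cluster_image(image, 3)` would return an image containing at most three gray levels. The number of clusters is a free parameter that the text does not fix.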

3. Implementation

Python packages were used to implement the anonymization. The Pillow package was used to resize the image, with the nearest-neighbor interpolation sampling filter applied. The GeoTIFF package was used to read the image, and the gdal package converted it into a grayscale image by taking only one layer of the raster band. Pillow was then used to convert the image into a NumPy array, which was given as input to the fuzzy c-means clustering module.
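
A resize-based pixelization of the kind described above might look as follows with Pillow. This is only a sketch under assumptions: the helper name `pixelize_with_pillow` is hypothetical, the `Image.BOX` filter for the shrinking step is chosen here so that each block is averaged, and a synthetic gradient stands in for the GeoTIFF image that the authors read with gdal, a step not reproduced here.

```python
import numpy as np
from PIL import Image


def pixelize_with_pillow(img: Image.Image, block: int) -> Image.Image:
    """Pixelize by shrinking, then enlarging with nearest-neighbor sampling."""
    gray = img.convert("L")  # single-band grayscale
    small_size = (max(1, gray.width // block), max(1, gray.height // block))
    # Image.BOX averages the source pixels that map to each output pixel.
    small = gray.resize(small_size, resample=Image.BOX)
    # Nearest-neighbor upscaling repeats each value as a flat square.
    return small.resize(gray.size, resample=Image.NEAREST)


# Synthetic 8 x 8 gradient standing in for a real photograph.
source = Image.fromarray((np.arange(64, dtype=np.uint8) * 4).reshape(8, 8))
pixelized = pixelize_with_pillow(source, block=4)

# NumPy array of the kind that would be passed to a clustering module.
pixels = np.asarray(pixelized, dtype=np.float64)
```

The resulting array has the same shape as the input but consists of flat 4 × 4 squares, and it can be reshaped into a column of pixel intensities for clustering.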

4. Results

In this section, we take an image of a person in a specific pose, generate the pixelized image from it, and then apply clustering to generate the segmented image.

Fig. 1: Original Image.

Fig. 2: Pixelized Image.

Fig. 3: Clustered Image.

The original image (Fig. 1) is pixelized (Fig. 2) and then clustered (Fig. 3) using fuzzy c-means clustering to obscure the image against face recognition algorithms. The resulting image is more strongly anonymized than the image obtained by pixelization alone.

5. Conclusion

With pixelization being a common method for obscuring images, the authors have proposed using a clustering algorithm along with pixelization to better secure images against de-pixelization methods and other techniques that may give away the identities of the people in a picture. Since the clustering algorithm uses complex numerical expressions, it becomes almost impossible to de-cluster the image, thus providing more security for maintaining the privacy of the image. This approach can be extended to videos, such as surveillance footage, in which certain identities need to be anonymized before publication.

References

[1] Dufaux, F., & Ebrahimi, T. (2010, July). A framework for the validation of privacy protection solutions in video surveillance. In Multimedia and Expo (ICME), 2010 IEEE International Conference on (pp. 66-71). IEEE.

[2] Honda, K., Omori, M., Ubukata, S., & Notsu, A. (2015, June). A privacy-preserving crowd movement analysis by k-member clustering of face images. In Informatics, Electronics & Vision (ICIEV), 2015 International Conference on (pp. 1-5). IEEE.

[3] Honda, K., Omori, M., Ubukata, S., & Notsu, A. (2015, November). A study on fuzzy clustering-based k-anonymization for privacy preserving crowd movement analysis with face recognition. In Soft Computing and Pattern Recognition (SoCPaR), 2015 7th International Conference of (pp. 37-41). IEEE.

[4] Birnstill, P., Ren, D., & Beyerer, J. (2015, August). A user study on anonymization techniques for smart video surveillance. In Advanced Video and Signal Based Surveillance (AVSS), 2015 12th IEEE International Conference on (pp. 1-6). IEEE.

[5] Shahbaz, S., Mahmood, A., & Anwar, Z. (2013, December). SOAD: Securing oncology EMR by anonymizing DICOM images. In Frontiers of Information Technology (FIT), 2013 11th International Conference on (pp. 125-130). IEEE.

[6] Indhumathi, R., & Priya, S. M. (2014). Data Preserving By Anonymization Techniques for Collaborative Data Publishing. International Journal of Innovative Research in Science, Engineering and Technology, 3(1), 358-363.

[7] Monteiro, E., Costa, C., & Oliveira, J. L. (2015, August). A machine learning methodology for medical imaging anonymization. In Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE (pp. 1381-1384). IEEE.

[8] Ruchaud, N., & Dugelay, J. L. (2016). Automatic Face Anonymization in Visual Data: Are we really well protected? Electronic Imaging, 2016(15), 1-7.

[9] Pantoja, C., Arguedas, V. F., & Izquierdo, E. Anonymization and De-identification of Personal Surveillance Visual Information: A Review.

[10] Cavedon, L., Foschini, L., & Vigna, G. (2011, August). Getting the Face behind the Squares: Reconstructing Pixelized Video Streams. In WOOT (pp. 37-45).

[11] Keys, R. (1981). Cubic convolution interpolation for digital image processing. IEEE Transactions on Acoustics, Speech, and Signal Processing, 29(6), 1153-1160.

[12] Dong, W., Zhang, L., Shi, G., & Wu, X. (2011). Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization. IEEE Transactions on Image Processing, 20(7), 1838-1857.

[13] Dunn, J. C. (1973). A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters.

[14] Bezdek, J. C., Ehrlich, R., & Full, W. (1984). FCM: The fuzzy c-means clustering algorithm. Computers & Geosciences, 10(2-3), 191-203.

[15] B. K. Tripathy and P. Swarnalatha: A Comparative Study of RIFCM with Other Related Algorithms from Their Suitability in Analysis of Satellite Images using Other Supporting Techniques, Kybernetes, vol. 43, no. 1, (2014), pp. 53-81.

[16] B. K. Tripathy, R. Bhargav, A. Tripathy, E. Verma, Raj Kumar and P. Swarnalatha: Rough Intuitionistic Fuzzy C-Means Algorithm and a Comparative Analysis, COMPUTE'13, Aug 22-24, Vellore, Tamil

[17] B. K. Tripathy and R. Bhargav: Kernel Based Rough-Fuzzy C-Means, PReMI, ISI Calcutta, December, LNCS 8251, (2013), pp. 148-157.

[18] Swarnalatha P, Tripathy B. K., Nithin, P. L. and D. Ghosh: Cluster Analysis Using Hybrid Soft Computing Techniques, CNC-2014 International Conference of Network and Power Engineering, Proceedings of Fifth CNC-2014, pp. 516-524.

[19] B. K. Tripathy, Avik Basu and Sahil Govel: Image segmentation using spatial intuitionistic fuzzy C-means clustering, Proceedings of the IEEE ICCIC 2014, (2014), pp. 878-882.

[20] B. K. Tripathy and D. Mittal: Efficiency Analysis of Kernel Functions in Uncertainty Based C-Means Algorithms, International Conference on Advances in Computing, Communications and Informatics, ICACCI 2015, Article number 7275709, (2015), pp. 807-813.

[21] B. K. Tripathy, Deepthi P. H. and Dishant Mittal: Hadoop with Intuitionistic Fuzzy C-means for clustering in Big Data, Advances in Intelligent Systems and Computing, Volume 438, (2016), pp. 599-610.

[22] B. K. Tripathy, Akarsh Goyal and Rahul Chowdhury: MMeNR: Neighborhood Rough Set Theory Based Algorithm for Clustering Heterogeneous Data, International Conference on Inventive Communication and Computational Technologies (ICICCT 2017), (2017), pp. 323-328.

Toward Privacy Preservation Using Clustering Based Anonymization: Recent Advances and Future Research Outlook

Article

Full-text available

Jan 2022

With the continuous increase in avenues of personal data generation, privacy protection has become a hot research topic resulting in various proposed mechanisms to address this social issue. The main technical solutions for guaranteeing a user’s privacy are encryption, pseudonymization, anonymization, differential privacy (DP), and obfuscation. Despite the success of other solutions, anonymization has been widely used in commercial settings for privacy preservation because of its algorithmic simplicity and low computing overhead. It facilitates unconstrained analysis of published data that DP and the other latest techniques cannot offer, and it is a mainstream solution for responsible data science. In this paper, we present a comprehensive analysis of clustering-based anonymization mechanisms (CAMs) that have been recently proposed to preserve both privacy and utility in data publishing. We systematically categorize the existing CAMs based on heterogeneous types of data (tables, graphs, matrixes, etc.), and we present an up-to-date, extensive review of existing CAMs and the metrics used for their evaluation. We discuss the superiority and effectiveness of CAMs over traditional anonymization mechanisms. We highlight the significance of CAMs in different computing paradigms, such as social networks, the internet of things, cloud computing, AI, and location-based systems with regard to privacy preservation. Furthermore, we present various proposed representative CAMs that compromise individual privacy, rather than safeguarding it. Besides, this article provides an extended knowledge (e.g., key assertion(s), strengths, weaknesses, clustering methods used in the anonymization process, and %age improvements in quantitative results) about each technique that provides a clear view of how much this topic has been investigated thus far, and what are the research gaps that seek pertinent solutions in the near future. 
Finally, we discuss the technical challenges of applying CAMs, and we suggest promising opportunities for future research. To the best of our knowledge, this is the first work to systematically cover current CAMs involving different data types and computing paradigms.

Multi-Scale, Class-Generic, Privacy-Preserving Video

Article

Full-text available

May 2021

In recent years, high-performance video recording devices have become ubiquitous, posing an unprecedented challenge to preserving personal privacy. As a result, privacy-preserving video systems have been receiving increased attention. In this paper, we present a novel privacy-preserving video algorithm that uses semantic segmentation to identify regions of interest, which are then anonymized with an adaptive blurring algorithm. This algorithm addresses two of the most important shortcomings of existing solutions: it is multi-scale, meaning it can identify and uniformly anonymize objects of different scales in the same image, and it is class-generic, so it can be used to anonymize any class of objects of interest. We show experimentally that our algorithm achieves excellent anonymity while preserving meaning in the visual data processed.

MMeNR: Neighborhood Rough Set Theory Based Algorithm for Clustering Heterogeneous Data

Conference Paper

Full-text available

Mar 2017

In recent times enumerable number of clustering algorithms have been developed whose main function is to make sets of objects having almost the same features. But due to the presence of categorical data values, these algorithms face a challenge in their implementation. Also some algorithms which are able to take care of categorical data are not able to process uncertainty in the values and so have stability issues. Thus handling categorical data along with uncertainty has been made necessary owing to such difficulties. So, in 2007 MMR [1] algorithm was developed which was based on basic rough set theory. MMeR [2] was proposed in 2009 which surpassed the results of MMR in taking care of categorical data. It has the capability of handling heterogeneous data but only to a limited extent because it is based on classical rough set model. In this paper, we generalize the MMeR algorithm with neighbourhood relations and make it a neighbourhood rough set model which we call MMeNR (Min Mean Neighborhood Roughness). It takes care of the heterogeneous data and also the uncertainty associated with it. Standard data sets have been used to gauge its effectiveness over the other methods.

A User Study on Anonymization Techniques for Smart Video Surveillance

Conference Paper

Full-text available

Aug 2015

A key mechanism of privacy-aware smart video surveillance is anonymization of video data. We conducted a user study with a response of 103 participants in order to investigate which pixel operations are suitable for protecting persons' identities while, at the same time, allowing a human operator to recognize persons' activities i.e., preserving the utility of the video data. Regarding the activities in the data set, namely stealing, fighting, and dropping a bag, our data does not approve the common hypothesis that privacy and utility of video data are necessarily trade-off.

Image segmentation using spatial intuitionistic fuzzy C means clustering

Article

Full-text available

Sep 2015

A fuzzy algorithm is presented for image segmentation of 2D gray scale images whose quality have been degraded by various kinds of noise. Traditional Fuzzy C Means (FCM) algorithm is very sensitive to noise and does not give good results. To overcome this problem, a new fuzzy c means algorithm was introduced [1] that incorporated spatial information. The spatial function is the sum of all the membership functions within the neighborhood of the pixel under consideration. The results showed that this approach was not as sensitive to noise as compared to the traditional FCM algorithm and yielded better results. The algorithm we have proposed adds an intuitionistic approach in the membership function of the existing spatial FCM (sFCM). Intuitionistic refers to the degree of hesitation that arises as a consequence of lack of information and knowledge. Proposed method is comparatively less hampered by noise and performs better than existing algorithms.

A comparative study of RIFCM with other related algorithms from their suitability in analysis of satellite images using other supporting techniques

Article

Full-text available

Jan 2014
KYBERNETES

Purpose – The purpose of this paper is to provide a way to analyze satellite images using various clustering algorithms and refined bitplane methods with other supporting techniques to prove the superiority of RIFCM. Design/methodology/approach – A comparative study has been carried out using RIFCM with other related algorithms from their suitability in analysis of satellite images with other supporting techniques which segments the images for further process for the benefit of societal problems. Four images were selected dealing with hills, freshwater, freshwatervally and drought satellite images. Findings – The superiority of the proposed algorithm, RIFCM with refined bitplane towards other clustering techniques with other supporting methods clustering, has been found and as such the comparison, has been made by applying four metrics (Otsu (Max-Min), PSNR and RMSE (40%-60%-Min-Max), histogram analysis (Max-Max), DB index and D index (Max-Min)) and proved that the RIFCM algorithm with refined bitplane yielded robust results with efficient performance, reduction in the metrics and time complexity of depth computation of satellite images for further process of an image. Practical implications – For better clustering of satellite images like lands, hills, freshwater, freshwatervalley, drought, etc. of satellite images is an achievement. Originality/value – The existing system extends the novel framework to provide a more explicit way to analyze an image by removing distortions with refined bitplane slicing using the proposed algorithm of rough intuitionistic fuzzy c-means to show the superiority of RIFCM.

Getting the face behind the squares: reconstructing pixelized video streams

Conference Paper

Aug 2011

Pixelization is a technique to make parts of an image impossible to discern by the human eye by artificially decreasing the image resolution. Pixelization, like other forms of image censorship, is effective at hiding parts of an image that might be offensive to the viewer. However, pixelization is also often used to achieve anonymity, for example to make the features of a person's face unrecognizable or the defining characteristics of cars and buildings unidentifiable. This use of pixelization is somewhat effective in the case of still images, even though it is open to dictionary attacks. However, when used in videos, pixelization might be vulnerable to full reconstruction attacks. In this paper, we describe an attack against the anonymization of videos through pixelization. We develop an approach that, given a pixelized video, reconstructs the image being pixelized so that the human eye can clearly identify the object being protected. We implemented our approach and tested it against both artificial and real-world videos. The results of our experiments show that, in many cases, video pixelization does not provide sufficient guarantees of anonymity.
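As a rough illustration of what "artificially decreasing the image resolution" means, the sketch below replaces every tile of a grayscale image with the tile's mean intensity. Block averaging is one common way to pixelize and is an assumption here, as are the function name `pixelize`, the `block` parameter, and the representation of the image as a list of rows.

```python
def pixelize(image, block):
    """Return a copy of a grayscale image (list of rows of numbers) in
    which every block x block tile is replaced by the tile's mean.
    Tiles at the right and bottom edges may be smaller than `block`."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            mean = sum(image[r][c] for r in rows for c in cols) / (len(rows) * len(cols))
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


# A 4x4 image pixelized with 2x2 blocks keeps only four distinct values.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
coarse = pixelize(image, 2)
```

Each output tile still carries the average of the pixels it hides, so some information about the original survives; that residual information is what reconstruction attacks of the kind described in this abstract rely on.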

Automatic Face Anonymization in Visual Data: Are we really well protected?

Article

Feb 2016

With the proliferation of digital visual data in diverse domains (video surveillance, social networks, media, etc.), privacy concerns increase. Obscuring faces in images and videos is one option to preserve privacy while keeping a certain level of quality and intelligibility of the video. The most popular filters are blackener (black masking), pixelization and blurring. Even if these filters appear efficient at first sight in terms of human perception, we demonstrate in this article that as soon as the category and the strength of the filter used to obscure faces can be (automatically) identified, there exist in the literature powerful ad hoc approaches able to partially cancel the impact of such filters with regard to automatic face recognition. Hence, evaluation is expressed in terms of the face recognition rate associated with clean, obscured and de-obscured face images. Figure 1: Respectively, "20 minutes", a French magazine using a pixelization filter; "crimes", a French program using a blurring filter; and Street View by Google using a blurring filter.

A privacy-preserving crowd movement analysis by k-member clustering of face images

Conference Paper

Jun 2015

Crowd movement analysis is an important issue in social design. This paper studies a machine learning approach to crowd movement estimation through face image recognition. Although high-performance face recognition is a powerful tool for individual authentication with surveillance camera images in public spaces, the use of personal information is often avoided for fear of privacy violations. In this paper, a privacy-preserving framework for crowd movement analysis is proposed that considers k-anonymization of face image features. k-anonymity is a quantitative measure of security in data mining and is expected to enhance the utility of personal information. An experimental result demonstrates the applicability of the secure framework in capturing crowd movement characteristics even if individual features are k-anonymized so that each individual is indistinguishable from k − 1 others.
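The idea of k-anonymizing feature vectors can be sketched as follows: group the vectors into clusters of at least k members and publish each cluster's centroid in place of its members. This is a simplified greedy grouping written for illustration, not the clustering procedure used in the paper; the names `k_member_clusters`, `centroid` and `anonymize`, and the toy 2D "features", are assumptions.

```python
import math


def centroid(points, cluster):
    """Mean feature vector of the points whose indices are in `cluster`."""
    dim = len(points[0])
    return tuple(sum(points[i][d] for i in cluster) / len(cluster)
                 for d in range(dim))


def k_member_clusters(points, k):
    """Greedily group point indices into clusters of at least k members."""
    if len(points) < k:
        raise ValueError("need at least k points")
    remaining = list(range(len(points)))
    clusters = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        # Take the k - 1 unassigned points closest to the seed.
        remaining.sort(key=lambda i: math.dist(points[seed], points[i]))
        clusters.append([seed] + remaining[:k - 1])
        remaining = remaining[k - 1:]
    # Fewer than k points are left: attach each to its nearest cluster.
    for i in remaining:
        nearest = min(clusters,
                      key=lambda c: math.dist(points[i], centroid(points, c)))
        nearest.append(i)
    return clusters


def anonymize(points, k):
    """Replace every point by the centroid of its cluster."""
    result = [None] * len(points)
    for cluster in k_member_clusters(points, k):
        center = centroid(points, cluster)
        for i in cluster:
            result[i] = center
    return result


features = [(0, 0), (0, 1), (10, 10), (10, 11), (0, 2)]
released = anonymize(features, k=2)
```

After this step every released vector is shared by at least k records, so no individual can be told apart from the other members of the same cluster on the basis of these features alone.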

A study on fuzzy clustering-based k-anonymization for privacy preserving crowd movement analysis with face recognition

Conference Paper

Nov 2015

A machine learning methodology for medical imaging anonymization

Conference Paper

Aug 2015

Privacy protection is a major requirement for the complete success of EHR systems, becoming even more critical in collaborative scenarios, where data is shared among institutions and practitioners. While textual data can be easily de-identified, patient data in medical images requires a more elaborate approach. In this work we present a solution for sensitive word identification in medical images based on a combination of two machine-learning models, achieving an F1-score of 0.94. Three experts evaluated the system performance. They analyzed the output of the present methodology and categorized the studies in three groups: studies that had their sensitive words removed (true positive), studies with complete patient identity (false negative) and studies with mistakenly removed data (false positive). The experts were unanimous regarding the relevance of the present tool in collaborative medical environments, as it may improve the exchange of anonymized patient data between institutions.

SOAD: Securing oncology EMR by anonymizing DICOM images

Conference Paper

Dec 2013

Given prevailing healthcare industry requirements, demand for electronic medical records (EMRs) has increased, both to provide better healthcare to patients and to provide convenient access to EMRs. Healthcare providers are keen to move EMRs to the cloud. The cloud computing paradigm offers a shared environment for EMRs; however, along with its advantages it brings many challenges, such as the security and privacy of medical data. Our system provides appropriate management of oncology patient records and protects the privacy of patients' textual and DICOM (Digital Imaging and Communications in Medicine) image information by anonymizing them. We have identified PDATA (Personal Data) and CDATA (Clinical Data) from the DICOM images retrieved from a PACS (Picture Archiving and Communication System) server. Role-based policies have been implemented and stored in the database, and are used for the anonymization of PDATA.
