Image Mining Framework and Techniques: A Review

April 2015
International Journal of Image Mining In-press(1)

DOI:10.1504/IJIM.2015.070028

Authors:

Sayan Chakraborty

Techno India College Of Technology

Wahiba Abdessalem

Institut Supérieur de Gestion de Tunis

Nilanjan Dey

Techno International New Town

Sukanya Banerjee

Heritage Institute of Technology

Image mining refers to a data mining technique where images are used as data. It supports a large field of applications like medical diagnosis, agriculture, industrial work, space research and obviously the educational field. The image mining technique can extract knowledge and exciting patterns which are not stored in the database by analyzing the images using various tools. The new era of advanced technology and high storage capability supports the growth of large and detailed image databases. This review paper presents a detailed view of the existing research works in the area of image mining and also summarises the different techniques used. http://www.inderscience.com/info/ingeneral/forthcoming.php?jcode=ijim

Int. J. Image Mining, Vol. 1, No. 1, 2015 45

Image mining framework and techniques: a review

Nilanjan Dey

Department of Computer Science and Engineering,

Bengal College of Engineering and Technology,

Durgapur, West Bengal, India

Email: dey.nilanjan@ymail.com

Wahiba Ben Abdessalem Karâa

Department of CSE,

High Institute of Management,

Le Bardo, Tunis, Tunisia

Email: wahiba.abdessalem@isg.rnu.tn

Sayan Chakraborty*

Department of Computer Science and Engineering,

Bengal College of Engineering and Technology,

Durgapur, West Bengal, India

Email: sayan.cb@gmail.com

*Corresponding author

Sukanya Banerjee

Department of Computer Science and Engineering,

JIS College of Engineering,

Kalyani, West Bengal, India

Email: banerjee.sukanya7@gmail.com

Mohammed A.M. Salem

Faculty of Computer and Information Sciences,

Ain Shams University,

Abbassia, Cairo, Egypt

Email: salem@cis.asu.edu.eg

Ahmad Taher Azar

Faculty of Computers and Information,

Benha University,

Qalyubia, Egypt

Email: ahmad_t_azar@ieee.org

Abstract: Image mining refers to a data mining technique where images are

used as data. It supports a large field of applications like medical diagnosis,

agriculture, industrial work, space research and obviously the educational field.

The image mining technique can extract knowledge and exciting patterns

which are not stored in the database by analysing the images using various

tools. The new era of advanced technology and high storage capability supports

the growth of large and detailed image databases. This review paper presents a

detailed view of the existing research works in the area of image mining and

also summarises the different techniques used.

Keywords: image mining; object recognition; image classification; image

indexing and retrieval; association rule mining; image clustering; image mining

frameworks; neural network.

Reference to this paper should be made as follows: Dey, N., Karâa, W.B.A.,

Chakraborty, S., Banerjee, S., Salem, M.A.M. and Azar, A.T. (2015)

‘Image mining framework and techniques: a review’, Int. J. Image Mining,

Vol. 1, No. 1, pp.45–64.

Biographical notes: Nilanjan Dey is an Assistant Professor in the Department

of Computer Science and Engineering in Bengal College of Engineering and

Technology, Durgapur, West Bengal, India. He is a PhD Scholar of Jadavpur

University, Electronics and Telecommunication Engineering Department,

Kolkata, India and also holds an honorary position of Visiting Scientist at

Global Biomedical Technologies Inc., CA, USA. He is the Managing Editor of

International Journal of Image Mining (IJIM), Inderscience (ISSN 2055–6039)

and is the Regional Editor Asia of International Journal of Intelligent

Engineering Informatics (IJIEI), Inderscience (ISSN 1758–8723). His

research interests include: medical imaging, soft computing, data mining,

machine learning, information hiding, security, computer aided diagnosis, and

atherosclerosis. He has applied for a patent, has four books (including three

edited books), 12 book chapters and almost 100 international conference and

journal papers.

Wahiba Ben Abdessalem Karâa completed her PhD in Computer Science,

University of Paris, VII Jussieu, France. Currently, she is an Assistant

Professor at the High Institute of Management of Tunis, Dept. of Computer

Science Applied to Management, Tunisia. She has more than 50 research

papers in various reputed journals and conferences.

Sayan Chakraborty is currently working as an Assistant Professor in

Department of Computer Science and Engineering, Bengal College of

Engineering and Technology, Durgapur, India. He completed his MTech and

BTech from Department of CSE, JIS College of Engineering, Kalyani,

West Bengal, India. He has around 28 research papers in various international

journals and conferences. Biomedical image processing, watermarking and

meta-heuristics are the key areas which he is currently working on.

Sukanya Banerjee is an MTech Scholar in Department of Computer Science

and Engineering, JIS College of Engineering and Technology, Kalyani, Nadia.

She completed her BTech from M.C.E.T., at Murshidabad, West Bengal. She

has two research papers in international journals and conferences. She is

currently working on medical image processing and watermarking.

Mohammed A.M. Salem is an Assistant Professor in the Faculty of Computer

and Information Sciences, Ain Shams University in Cairo, Egypt since 2009.

He received his PhD degree from Humboldt-University in Berlin, Germany in

November 2008, in the topic of multiresolution image segmentation. His

research interests include: robot vision, multimedia, signal processing,

multiresolution analysis, and wavelet transform. He has published a book and

more than 40 papers in the fields of image/video processing.

He has gained teaching experience and supervised several Master’s and PhD

students through his positions at Ain-Shams University in Cairo and

Humboldt-University in Berlin. He has taught mathematics courses

for undergraduate students and courses on image processing and visual servoing

for postgraduate students.

Ahmad Taher Azar received his MSc in System Dynamics (2006) and PhD in

Adaptive Neuro-Fuzzy Systems (2009) from the Faculty of Engineering, Cairo

University (Egypt). He is currently an Assistant Professor in the Faculty

of Computers and Information, Benha University, Egypt. He is the

Editor-in-Chief of two journals published by IGI Global, USA, including the

International Journal of System Dynamics Applications (IJSDA). He is an

Associate Editor of IEEE Trans. Neural Networks and Learning Systems. He

has worked in the areas of system dynamics, soft computing and modelling in

biomedicine and has authored/co-authored over 80 research publications.

1 Introduction

Image mining aims to extract, from raw image data, relationships and patterns which are

not explicitly stored in the database. Image mining is a well-structured technique based on

data mining, artificial intelligence, machine learning, image retrieval, image processing,

computer vision, databases etc. Image mining’s capability of discovering useful image

patterns opens various research fields to new frontiers. Mining large collections of images,

and combined data mining of large collections of images with associated alphanumeric

data are the two important themes of image mining. The main reason behind the

increasing popularity of image mining is its capability to infer knowledge from the image

data automatically. Raw images or image sequences with low-level pixel representation are

processed efficiently and effectively to extract the high-level objects and their

relationships from those images. Image mining is still at the experimental stage. It can be

considered as an efficient hybridisation of image processing and data mining concepts to

extract useful knowledge. In the framework proposed by Zhang et al. (2001) for image

mining, information is extracted at four levels:

a pixel level

b object level

c semantic concept level

d pattern and knowledge level.

High-dimensional indexing and retrieval techniques were used to maintain the data flow

between the various levels. The increasing amount of image data, with its unique

features, also made the data mining task more demanding. Various application domains

(Ping and Yueshun, 2009) of image mining include natural scene recognition, remote

sensing, weather forecasting, criminal investigation, image segmentation, image

watermarking (Dey et al., 2012a, 2012b, 2011) etc.

One of the well-known techniques (Pal et al., 2013; Dey et al., 2013a, 2013b, 2013e)

that has been implemented in image processing (Dey et al., 2012c, 2012d, 2015) for

information security is watermarking (Bhattacharya et al., 2012; Dey et al., 2014, 2013c).

Watermarking refers (Dey et al., 2012e, 2012f; Chakraborty et al., 2012) to hiding or

embedding a message inside a signal (Dey et al., 2012g, 2012h), image or video. In

image watermarking (Dey et al., 2012i, 2012j; Chakraborty et al., 2013), a message or

other data is embedded inside an image. This technique is known as watermark

embedding (Dey et al., 2013c). Once the watermarked image has been retrieved,

the watermark is extracted from it in order to recover the

hidden information. Watermarking can be reversible or non-reversible. Image

watermarking (Dey et al., 2013d) can also be categorised into blind and non-blind

watermarking (Dey et al., 2012k). Image mining

differs from the traditional operations of image processing and computer vision in the

way it works on images: image mining operates on large collections of

images, whereas most of the previously mentioned techniques work on a single image.

The main aim of image mining is to extract relevant information and significant

patterns from existing image databases and related alphanumeric data.

The important activities in image mining (Cosa et al., 2002) are searching for and retrieving

images from the image database, based on their features and their similarity to a given input

query image. There are several image mining tools available, such as iARM, CAViz, web

image-gathering task, SVM classifier, B2S, DisIClass, MetaSEEk, PLSA, fully

automated age estimation engine (Devsena et al., 2011), QBIC, Photobook, SWIM,

Virage, Visualseek, Netra, MARS and so on. Nowadays, image mining has become

an important research topic due to its applications in various areas such as medical imaging,

weather forecasting, management of earth’s resources, forest fires, criminal investigation

etc.

Image mining provides a framework for using images stored in raw format in the

database, which cannot be used directly. To use them in high-level modelling they must

be processed first. An image mining technique is considered good if it

fully supports user interaction while retrieving patterns and knowledge from a

huge collection of images (Bach et al., 1996). The following functions are

performed in image mining: image storage, image processing, feature

extraction, image indexing and retrieval, and pattern and knowledge discovery. The two

kinds of image mining framework are:

1 function driven framework, which focuses on the different component modules and

their functionalities

2 information driven framework, which provides a hierarchical structure of levels and

the data needed at each level.

Our present work provides an overall review of existing image mining frameworks and

techniques, and further describes their attributes, features, advantages, disadvantages etc.

Section 2 provides the literature review of the previous works done in this area. Various

image mining techniques are illustrated in Section 3. Section 4 presents an overview of

the image mining frameworks. The paper concludes in Section 5.

2 Review on image mining

Fay et al. (2003) developed a system for multisensory image fusion and interactive

mining. This system was dependent on neural models of colour vision processing,

learning and pattern recognition. They also had add-on modules which performed

image conditioning, image fusion, extraction of context features and interactive image

mining. All of these modules were combined together to create a work flow which

enabled a user to create vector products of foundation features (e.g., roads, rivers, and

forests). They also highlighted the target detections from raw multisensory or

multispectral imagery. In this image mining technique, multispectral imagery modified

by simulated environmental conditions was not addressed. In order to process large

amounts of remote sensing image data, Daschiel et al. (2005) developed a prototype

of an information mining system. It consisted of an online interface and an

offline part. The offline part dealt with the generation of features relating to image

mining, such as data reduction, compression, unsupervised content indexing and the absorption

of catalogue entries.

Using various image mining techniques, users can collect information on demand from the

vast amount of data present on the WWW. The digitised

images obtained from the web are relevant to the real world. Sometimes the retrieved

results differ from the real-world images sought, owing to the classification/recognition

techniques used. Morsillo et al. (2008) proposed a technique which provided more

accurate visualisation of objects by reducing noise in the search results. This model combined both

generative and discriminative elements to perform an efficient retrieval of web images. It

worked successfully with a semi-supervised machine learning technique. Zhan et al. (2009)

derived the relation between the two main characteristics of a web image, i.e., visual

feature cluster and keyword, using multi-mode association rules.

Image classification is an active research area in image mining. There exist several

mining algorithms for retrieving information from the web. The quality of an

algorithm can be judged by how well it semantically extracts the images from

the database. Zhu et al. (2009) developed a nonlinear algorithm which performed

classification based on the distance between the training and test manifolds. It also

reduced the dimensionality and complexity. Later, Zhan et al. (2009) introduced a search

technique which helped to capture the sensitive Markov stationary feature (C-MSF)

after obtaining information from the relevant images. It represented a random walk with

restart (RWR) algorithm on images in which spatial co-occurrence and information were

integrated. They were transformed into a classified form with the help of an SVM

classifier.

Image mining is of great help in magnetic resonance imaging (MRI) of the human brain in

the medical field. In neurology and neurocognitive studies, the clustering of the

Corpus Callosum (Fatma, 2012) in the midsagittal section is a very critical task because of the size

and structure of the Corpus Callosum. The size and structure of the Corpus Callosum (Elsayed

et al., 2010; Fatma, 2012) also vary depending upon age, sex, neurodegenerative

diseases and lateralised behaviour of different people. The segmentation of the Corpus

Callosum during MR imaging of the brain is a very complex task in image mining.

The method proposed by Rajendran and Madheswaran (2010a) was capable of

detecting a tumour from CT scan images of the brain by removing all other

inconsistencies from the images. The reported accuracy of tumour detection

using this method was better than that of the other techniques considered. Sheela and Shanthi (2007)

devised an image mining technique to distinguish normal and abnormal images of the brain,

which led to the identification of brain diseases from the MRI of abnormal tissues. Rajendran

and Madheswaran (2010b) developed a system that successfully detected brain tumours from

CT scan images by pre-processing, feature extraction, association rule mining and a hybrid

classifier. In this study, using median filtering and the Canny edge detection method, the

pre-processing stage extracted the edge features of the scanned images.

In association rule mining, the frequent pattern tree (FP-tree) detects various patterns

occurring in the CT scan images. It was reported to provide more accurate results than

the other classification methods considered. The hybrid method combines both

mining approaches, which enhances the efficiency compared to traditional methods.

Mohan and Kannan (2010) provided a system that classified and retrieved images by

colour. It made the process fully interactive for the user. This technique used several steps

for gathering information: colour image classification, pre-processing, pre-clustering,

texture feature extraction, similarity comparison and selection of the neighbouring target

image. Dubey (2010) introduced a technique based on colour histogram and image

texture. The result images were generated after querying the image database.

Images can be differentiated on the basis of colour distribution by the histogram method.

However, images retrieved by a global colour histogram have a similar colour distribution

but may not be semantically related to the query image. In this regard, Silakari et al.

(2009) developed a system which used colour moments and block truncation coding

(BTC) for extracting features from the image database. For clustering the image database,

the K-map clustering algorithm was used. Such methods may be applied in

different colour spaces, such as RGB, HSV and others.
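The limitation of the global colour histogram can be illustrated with a minimal Python sketch. The four-pixel toy images, the `bins_per_channel` parameter and the function names are illustrative assumptions and are not taken from any of the reviewed systems; histogram intersection is used here as one common similarity measure.

```python
from collections import Counter

def colour_histogram(pixels, bins_per_channel=4):
    """Normalised global colour histogram of a list of (r, g, b) pixels (0-255)."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical colour distributions."""
    return sum(min(h1.get(b, 0.0), h2.get(b, 0.0)) for b in set(h1) | set(h2))

RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

# Toy "images" given as flat pixel lists (hypothetical data).
image_a = [RED, RED, BLUE, BLUE]      # red block followed by blue block
image_b = [BLUE, RED, BLUE, RED]      # same colours, different spatial layout
image_c = [GREEN, GREEN, GREEN, RED]  # mostly different colours

sim_ab = histogram_intersection(colour_histogram(image_a), colour_histogram(image_b))
sim_ac = histogram_intersection(colour_histogram(image_a), colour_histogram(image_c))
print(sim_ab, sim_ac)
```

Because the histogram discards pixel positions, `image_a` and `image_b` are indistinguishable even though their layouts differ, which is why colour histograms are usually combined with texture, shape or spatial features.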

3 Image mining techniques

The techniques which were used by early image miners prior to the invention of a

suitable framework include pattern recognition, image indexing and retrieval, image

classification, image clustering, association rule mining, and neural networks. The

following is a survey of these techniques. The techniques are classified according to five levels of

information and the associated image or data mining operations. These levels (from top

to bottom) are:

a knowledge extraction level

b patterns and inter-image relations level

c semantic concept level

d region, objects, or visual patterns level

e pixel level.

3.1 Object recognition

One of the key areas of image mining is object recognition, which operates on data at the

patterns and inter-image relations level. It finds objects in the image that correspond to the

real world by processing the provided object models. It is also known as supervised

labelling. The system has four parts:

a feature detector

b model database

c hypothesiser

d hypothesis verifier.

Jeremy and Bonet (2000) proposed a system to find a specific known object

in the image, which applied image processing operations on the set of ‘characteristic

maps’. Burl et al. (1999) employed learning techniques to generate recognisers

automatically. In this work, classified examples were used to capture the domain

knowledge implicitly. Later, Gibson et al. (2001) developed an optimal

FFT-based mosaicing algorithm to find common patterns in images. The results of this

work showed that the system worked well on various kinds of images.

3.2 Image retrieval

Image retrieval refers to the process of retrieving a particular image from a large database

using data mining. Retrieval of images in image mining (Tahoun et al., 2005) is done

based on some requirement specification. There are three levels of requirement

specification, and the complexity increases with the level:

a level 1 retrieves images based on basic features such as texture,

colour, shape or the spatial location of image elements

b level 2 retrieves images by derived logical features, such as

individual objects or persons in the images

c level 3 retrieves images by abstract attributes, which involves high-level

reasoning in order to obtain the meaning of the objects or scenes illustrated.

Kazman and Kominek (1993) introduced three query schemas to retrieve image

information. They were

a query by description

b query by associate attributes

c query by image content.

Query by associate attributes refers to tailoring the conventional table

structure to meet the needs of images. Query by description means

a method that stores a description along with each image, through which the user can

locate the images of interest. The image description is often referred to as a label or keyword.

With the emergence of large-scale image repositories, the problems of vocabulary and

non-scalability caused by manual operation have become more pronounced. Hence,

content-based image retrieval (CBIR) was proposed to overcome these difficulties.

IBM’s QBIC system (Flickner et al., 1995) could retrieve images by any

combination of colour, texture and shape as well as text keyword. This system may be

one of the most popular among image content retrieval frameworks. It uses

R*-tree indexes to improve efficiency. Image retrieval operates on data at the semantic concept

level, the region, objects and visual patterns level, and the pixel level.

3.3 Image indexing

Apart from focusing on the information requirements at various levels, it is also

important to support the retrieval of image data with a fast and efficient

indexing scheme. However, the image database to be searched is typically large and the

feature vectors of images are of high dimension (of the order of 10²), which increases the

search complexity. To reduce this complexity, dimensionality reduction or high-dimensional

indexing can be used. Image indexing handles data and images at the region, objects

and visual patterns level. Reducing the dimensions can be accomplished using two

well-known methods:

a the singular value decomposition (SVD) update algorithm

b clustering.

The best way to reduce complexity, however, is to perform appropriate multi-dimensional

indexing after dimension reduction, using an index that supports non-Euclidean (Rui and

Huang, 1997) similarity measures.

Lin et al. (1994) introduced an efficient technique of colour indexing for retrieving

similar data. In this work, the search time increased as the size of the database

increased. Tan et al. (2001) proposed a multi-level nested R-tree index which

retrieved the structure efficiently and effectively. By profiling the retrieval process, it

helped in selecting an appropriate technique and in designing new ones. This process

helped to evaluate the performance of colour-spatial retrieval techniques, which led to the

selection of a suitable technique.

3.4 Image classification and clustering

Image classification and clustering refer to methods of arranging images into

groups, which may be done in a supervised or unsupervised way. In supervised

classification, the problem is to classify a newly encountered image given a collection

of pre-classified images. In unsupervised classification (or image

clustering), unlabelled images of similar type are

grouped together without any prior knowledge, which leads to cluster generation. Clustering images based on their

content is an important and equally challenging task for inferring information from a huge

collection of images. This technique is more focused on the levels of inter-image

relations, semantics in an image, and regions. However, it may also operate on

large raw data.

Uehara et al. (2001) described a method of grouping a set of images based on their

low-level visual features. This method also used a binary Bayesian classifier which

classified the vacation images into indoor and outdoor categories. The existing statistical

parameters were updated using an unsupervised technique for a maximum likelihood (ML)

classifier. As a result, a new image lacking corresponding statistical parameters demanded

the analysis of a corresponding training set.

Wang and Li (1997) proposed an image-based classification method of objectionable

websites (IBCOW), which classified websites to detect whether a website was

objectionable, based on its image content. Image clustering is an early stage of the mining process.

Important attributes for clustering are the texture, colour and shape of a particular image.

They can be used separately or in combination. Several clustering techniques

are available, such as partition-based algorithms, hierarchical clustering algorithms,

mixture-resolving and mode-seeking algorithms, nearest neighbour clustering, fuzzy

clustering, evolutionary clustering approaches etc. Following image clustering, the abstract

features of each cluster can be recognised by a domain expert.
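As a minimal illustration of a partition-based algorithm, the following Python sketch clusters images with k-means on their mean colour. The feature values, the choice of two clusters and the deterministic initial centroids are assumptions made for this example only; none of the reviewed papers prescribes this exact procedure.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_vector(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(features, initial_centroids, max_iterations=100):
    """Plain k-means: returns one cluster label per feature vector and the centroids."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each image joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(f, centroids[c]))
            for f in features
        ]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = mean_vector(members)
    return labels, centroids

# Hypothetical mean (R, G, B) colour of six images: three reddish, three bluish.
mean_colours = [
    (200, 30, 40), (220, 20, 30), (210, 40, 35),
    (20, 30, 200), (30, 40, 220), (25, 35, 210),
]
labels, centroids = k_means(mean_colours, [mean_colours[0], mean_colours[3]])
print(labels, centroids)
```

In practice the feature vector would combine colour, texture and shape descriptors, and the resulting clusters would then be inspected by a domain expert as described above.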

3.5 Association rule mining

Ordonez and Omiecinski (1998) discussed an algorithm for mining association

rules from images. This algorithm reduced I/O and CPU overhead and operated on data or images at the

region, objects and visual pattern level. They also built the data mining system on top

of a CBIR system. This algorithm first segmented images into blobs, then identified and

labelled the objects present in the images. Later, similarity measurement was done on those

images. A similarity value of one indicated a perfect match on all

desired features, whereas a value of zero referred to the worst match

possible on those desired features. To help interpret the association rules, this process also

provided auxiliary images with the identified objects.

A data mining algorithm was applied to produce object association rules. Priyatharshini

and Chitrakala (2013) described a method of using association rules for image

retrieval. According to this method, for each query image, all association rules which

used the query image as the antecedent (A) must be found. The consequents (B) were the

candidate images for the retrieval procedure. Afterwards, those candidate images were

ranked according to their confidence value. The method also noted that the support

value of rule A ⇒ B is no smaller than that of A ⇒ C if B is a subset of C. If the candidate

image set was empty or contained fewer images than required, the

system picked several images randomly from the database, which would give every image

a chance to establish association rules. Deshpande (2011) presented a data mining

technique for finding image content-based association rules. The purpose of this

experiment was to study the feasibility of data mining approaches based on image

content.
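The retrieval procedure described above (find rules with the query image as antecedent, rank the consequents by confidence, and fall back to random images when too few candidates exist) can be sketched in Python as follows. This is only an illustration under stated assumptions: the transactions are taken to be sets of image identifiers that occurred together (for example in one user session), and the function name, thresholds and data are hypothetical rather than taken from the cited paper.

```python
import random

def rank_candidates(transactions, query, min_support=0.2, top_n=3, seed=0):
    """Rank images B by the confidence of the rule {query} => {B}."""
    n = len(transactions)
    with_query = [t for t in transactions if query in t]
    ranked = []
    if with_query:
        candidates = set().union(*with_query) - {query}
        for image in candidates:
            both = sum(1 for t in with_query if image in t)
            support = both / n                   # P(query and image)
            confidence = both / len(with_query)  # P(image | query)
            if support >= min_support:
                ranked.append((image, support, confidence))
    # Highest confidence first; ties are broken by image identifier.
    ranked.sort(key=lambda item: (-item[2], item[0]))
    result = [image for image, _, _ in ranked[:top_n]]
    # Fallback: pad with randomly chosen images if too few candidates exist.
    if len(result) < top_n:
        pool = sorted(set().union(*transactions) - {query} - set(result))
        rng = random.Random(seed)
        result += rng.sample(pool, min(top_n - len(result), len(pool)))
    return ranked, result

# Hypothetical transaction database of co-occurring images.
sessions = [
    {"img1", "img2", "img3"},
    {"img1", "img2"},
    {"img1", "img4"},
    {"img2", "img3"},
    {"img5"},
]
ranked, retrieved = rank_candidates(sessions, "img1")
print(retrieved)
```

Here `img2` ranks first because it appears in two of the three transactions that contain the query image, while a query for `img5`, which co-occurs with nothing, is answered entirely by the random fallback.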

The frequent itemsets discovered iteratively by the traditional association rule algorithm

needed a large amount of computation. This issue demanded a simpler approach for image

mining (Jain et al., 2013). Thus, the technique of image mining (Banda et al., 2014; Chen

and Mei, 2014) was divided into four important phases: image pre-processing, feature

extraction, conversion of image database to transaction database, and applying

association rule mining (Wang et al., 2014; Herold et al., 2011; Khodaskar and Ladhake,

2014) to this transaction database. The proposed new association rule algorithm

(Deshpande, 2010) reduced the number of scans required by the Apriori algorithm. This algorithm

was described in four steps. In the first step, the transaction database was transformed into a

Boolean matrix. In the second step, the frequent 1-itemset L1 was generated. In the third step, the

Boolean matrix was pruned by deleting some rows and columns. In the last step,

the frequent k-itemsets Lk (k > 1) were generated.
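The four steps can be sketched in Python as below. The source does not state the exact pruning rules, so this sketch assumes two standard ones: a column is dropped when its item is not part of any frequent itemset of the previous size, and a row is dropped when it contains fewer than k retained items. The function name, the feature labels and the `min_count` threshold are illustrative assumptions.

```python
from itertools import combinations

def boolean_matrix_apriori(transactions, min_count):
    """Frequent itemsets per size k, found on a Boolean transaction matrix."""
    # Step 1: transform the transaction database into a Boolean matrix.
    items = sorted(set().union(*transactions))
    matrix = [[item in t for item in items] for t in transactions]

    # Step 2: frequent 1-itemsets L1 from the column sums.
    col_count = {i: sum(row[i] for row in matrix) for i in range(len(items))}
    frequent_cols = [i for i in range(len(items)) if col_count[i] >= min_count]
    frequent = {1: {(items[i],): col_count[i] for i in frequent_cols}}

    k = 2
    current = [(i,) for i in frequent_cols]
    while current:
        # Step 3: prune columns of infrequent items and rows with < k ones.
        cols = sorted({i for combo in current for i in combo})
        rows = [row for row in matrix if sum(row[i] for i in cols) >= k]
        # Step 4: join (k-1)-itemsets into candidates and count them by
        # AND-ing the corresponding columns over the remaining rows.
        candidates = {
            tuple(sorted(set(a) | set(b)))
            for a, b in combinations(current, 2)
            if len(set(a) | set(b)) == k
        }
        level = {}
        for combo in sorted(candidates):
            count = sum(all(row[i] for i in combo) for row in rows)
            if count >= min_count:
                level[combo] = count
        if not level:
            break
        frequent[k] = {tuple(items[i] for i in combo): c for combo, c in level.items()}
        current = list(level)
        k += 1
    return frequent

# Hypothetical transaction database: one set of feature labels per image.
feature_transactions = [
    {"red", "round", "smooth"},
    {"red", "round"},
    {"red", "smooth"},
    {"blue", "round"},
    {"red", "round", "smooth"},
]
frequent_itemsets = boolean_matrix_apriori(feature_transactions, min_count=3)
print(frequent_itemsets)
```

The original transactions are read only once, when the matrix is built; every later level works on the in-memory matrix, which is the sense in which such a scheme reduces database scans.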

4 Image mining framework

There are two different frameworks of image mining (Datcu and Seidel, 2000):

1 function driven framework

2 information driven framework.

Most of the existing image mining system architectures fall under the function driven

framework. However, the function driven framework is not a generalised framework. It can

be application oriented or organisation oriented. Datcu and Seidel (2000) introduced a

function driven framework for an intelligent satellite mining system. The function driven

framework for the multimedia miner was proposed by Zaiane and Han (1998). The

advantage of this framework was that it could organise and clarify the different tasks to be

performed in image mining; however, it was unable to differentiate the levels of

information representation needed to perform meaningful image mining. This drawback

was addressed in the information driven framework.

Zhang et al. (2001) provided an information driven framework for image mining,

representing different levels of information. This framework had four levels:

a pixel level

b object level

c semantic concept level

d pattern and knowledge level.

Pixel level was the lowest level in any image mining system. It worked with the raw

information about the image, such as image pixels and basic image features such as

colour, texture, and shape. It was capable of answering queries like ‘retrieve the image

with red colour’. But it could not solve queries such as ‘retrieve the image of girl’. Object

level was capable of retrieving the images for such queries. It dealt with object

information based on the primitive features in the pixel level. Object recognition assigned

correct labels to a single region or set of regions. But still it could not retrieve images for

queries such as ‘image with sad faces’. The third level, the semantic concept level, generated

high-level semantic concepts from the known objects to answer such queries. These three

levels were useful for retrieving information from the images to be mined. The framework thus supported the

entire information requirement within image mining.

4.1 Function-driven frameworks

Datcu and Seidel (2000) proposed an intelligent satellite mining system that had two

modules:

a a pre-processing, data acquisition and archiving system, which was responsible for

extracting information from the images, storing the raw images in a database, and retrieving

images

b an image mining system to help users understand the detailed information of an

image and detect relevant information.

Similarly, the MultiMediaMiner (Zaiane and Han, 1998) included four major

components:

a an image excavator, which retrieved images and videos from the existing multimedia

database

b a pre-processor, which extracted the features of the images and stored the processed

data in the database

c a search kernel, which generated results for a query from the image and

video database

d the discovery modules, such as the classifier, associator and characteriser, which performed

image information mining routines to discover the underlying patterns and

knowledge within the images.

4.2 Information-driven frameworks

The function-driven framework performed the image mining by clear decomposition and

neat arrangement of different roles and tasks. But it was unable to describe the

information representation at different levels which was needed before any mining task.

Zhang et al. (2001) proposed an information-driven system which dealt with this

problem. The key attributes of this system were:

a the pixel level, the lowest level, which consisted of raw image information such as image pixels and primitive image features such as texture, colour and shape

b the object level, which dealt with object or region information based on the primitive features of the pixel level

c the semantic concept level, which created high-level semantic concepts from the known objects of the knowledge domain

d the pattern and knowledge level, which extracted patterns and knowledge from the domain-related alphanumeric data and the logical concepts obtained from the image data.

Table 1 Image mining review in tabular form

Sl. no. | Year of pub. | Paper title | Authors | Short description
1 | 1993 | Information organization in multimedia resources | Kazman and Kominek (1993) | Introduced three query schemas to retrieve image information: (a) query by description, (b) query by associate attributes and (c) query by image content. Query by associate attributes referred to the technique of tailoring the conventional table structure to fulfil image needs.
2 | 1994 | The TV-tree: an index structure for high-dimensional data | Lin et al. (1994) | Introduced an efficient colour indexing technique to retrieve similar data. The search time increased as the size of the database increased.
3 | 1995 | Query by image and video content: the QBIC system | Flickner et al. (1995) | Proposed IBM’s QBIC system, which could retrieve images by any combination of colour, texture and shape as well as text keywords. It may be one of the most popular image content retrieval frameworks. It used R*-tree indexes to improve efficiency.
4 | 1995 | A scheme for visual features based image retrieval | Zhang and Zhong (1995) | Proposed the use of self-organising map (SOM) neural nets as the tool for constructing the tree indexing structure. The advantages of SOM were its unsupervised learning ability and dynamic clustering nature.
5 | 1996 | Image search engine: an open framework for image management | Bach et al. (1996) | Proposed a system that performed the following image mining functions: image storage, image processing, feature extraction, image indexing and retrieval, and pattern and knowledge discovery. The two frameworks of image mining were: (1) the function-driven framework, which focused on the different component modules and their functionalities; (2) the information-driven framework, which provided a hierarchical structure of levels and the data needed at each level.
6 | 1996 | Automatic detection of diabetic retinopathy using an artificial neural network: a screening tool | Gardner and Keating (1996) | Applied an artificial neural network (ANN) to image mining, which provided an automated approach to fundus image analysis by computer. This improved the efficiency of image assessment by offering an immediate classification of the patient’s fundus at the time of image acquisition.
7 | 1997 | System for screening objectionable images using Daubechies’ wavelets and color histograms | Wang and Li (1997) | Proposed an image-based classification of objectionable websites (IBCOW) method, which classified websites as objectionable or not based on image content.
8 | 1997 | Image retrieval: past, present and future | Rui and Huang (1997) | Performed appropriate multi-dimensional indexing after dimension reduction, which provided non-Euclidean similarity measures.

9 | 1998 | Image mining: a new approach for data mining | Ordonez and Omiecinski (1998) | Proposed a mining algorithm that reduced I/O and CPU overhead. This led to a new data mining system built on top of a content-based image retrieval (CBIR) system.
10 | 1998 | Mining multimedia data | Zaiane and Han (1998) | Proposed a function-driven framework for the multimedia miner. The framework could organise and clarify the different tasks to be performed in image mining, but it remained unable to differentiate the levels of image information representation needed to perform meaningful mining.
11 | 1999 | Mining for image content | Burl et al. (1999) | Proposed learning techniques to generate recognisers automatically. Classified examples were used to capture the domain knowledge implicitly.
12 | 2000 | Image preprocessing for rapid selection in pay attention mode | Jeremy and Bonet (2000) | Proposed a system to find a specific known object in an image.
13 | 2000 | Image information mining: exploration of image content in large archives | Datcu and Seidel (2000) | Proposed an intelligent satellite mining system that included a pre-processing, data acquisition and archiving system, which was required to extract the information from the image, database of raw images, and retrieval of image.
14 | 2001 | Retrieving similar shapes effectively and efficiently, multimedia tools and applications | Tan et al. (2001) | Proposed a multi-level nested R-tree index that retrieved the structure efficiently and effectively. It helped to select an appropriate technique and also to design new techniques for the retrieval process.
15 | 2001 | A computer-aided visual exploration system for knowledge discovery from images | Uehara et al. (2001) | Devised a method of grouping a set of images based on low-level visual features. The method also used a binary Bayesian classifier, which classified vacation images into indoor and outdoor categories.
16 | 2001 | Intelligent mining in image databases, with applications to satellite imaging and web search, data mining and computational intelligence | Gibson et al. (2001) | Developed an optimal FFT-based mosaicing algorithm to find common patterns in images and showed that it worked well on various kinds of images.
17 | 2001 | An information-driven framework for image mining | Zhang et al. (2001) | Proposed a new information-driven framework, which addressed the problems of the function-driven framework.
18 | 2002 | Image mining by content | Conc et al. (2002) | Proposed a technique which helped to extract vital information from any image.
19 | 2003 | Image fusion & mining tools for a COTS environment | Fay et al. (2003) | Proposed a system for multisensory image fusion and interactive mining. The system was dependent on neural models of colour vision processing, learning and pattern recognition.
20 | 2005 | ARIRS: association rule based image retrieval system | Yi et al. (2005) | Introduced a technique that used association rules for image retrieval.

21 | 2005 | Information mining in remote sensing image archives: system evaluation | Daschiel and Datcu (2005) | Developed a prototype information mining system to process large amounts of remote sensing image data. It consisted of both an online interface and an offline part.
22 | 2007 | Image mining techniques for classification and segmentation of brain MRI data | Sheela and Shanthi (2007) | Devised an image mining technique to distinguish normal from abnormal brain images in order to identify brain diseases from the MRI of abnormal tissues.
23 | 2008 | Mining the web for visual concepts | Morsillo et al. (2008) | Proposed a technique which provided more accurate visualisation of objects by reducing search noise. The model combined generative and discriminative elements to perform efficient retrieval of web images.
24 | 2009 | Multi-modal mining in web image retrieval computational intelligence and industrial applications | Zhan (2009) | Derived the relation between the two main characteristics of a web image, i.e., visual feature cluster and keyword, using multi-mode association rules.
25 | 2009 | Image classification approach based on manifold learning in web image mining | Zhu et al. (2009) | Developed an improved nonlinear algorithm which solved the classification problem based on the distance between the training and test manifolds. It also reduced dimensionality and complexity.
26 | 2009 | Web image mining using concept sensitive Markov stationary features | Zhang et al. (2009) | Introduced a search technique built around the concept-sensitive Markov stationary feature (C-MSF), obtained after gathering information from the relevant images. It represented a random walk with restart (RWR) algorithm on images where spatial co-occurrence and information were integrated. They were transformed into a classified form with the help of an SVM classifier.
27 | 2009 | Color image clustering using block truncation algorithm | Silakari et al. (2009) | Developed a system which used colour moments and block truncation coding (BTC) to extract features from the image database. The K-means clustering algorithm was used to cluster the image database.
28 | 2010 | Novel fuzzy association rule image mining algorithm for medical decision support system | Rajendran and Madheswaran (2010b) | Introduced a search technique built around the concept-sensitive Markov stationary feature (C-MSF), obtained after gathering information from the relevant images. It represented a random walk with restart (RWR) algorithm on images where spatial co-occurrence and information were integrated. They were transformed into a classified form with the help of an SVM classifier.
29 | 2010 | Hybrid medical image classification using association rule mining with decision tree algorithm | Rajendran and Madheswaran (2010a) | Proposed a method by which a tumour could be detected from the CT scan report of the brain by removing all other inconsistencies from the image report. The accuracy of tumour detection with this method was reported to be much better than that of other techniques.

30 | 2010 | Color image classification and retrieval using image mining techniques | Mohan and Kannan (2010) | Proposed a new technique of colour image classification and retrieval to improve user interaction with image retrieval systems by fully exploiting the similarity information.
31 | 2010 | Image mining using content based image retrieval system | Dubey (2010) | Devised a technique based on colour histogram and image texture. The resulting image was generated after querying the necessary images.
32 | 2011 | Association rule mining based on image content | Deshpande (2011) | Presented a data mining technique for finding image content based association rules. A feasibility study of data mining based on image content was carried out.
33 | 2011 | An experiential survey on image mining tools, techniques and applications | Devasena et al. (2011) | Provided fully automated age estimation engine QBIC, Photobook, Swim, Virage, Visualseek, Netra, MARS etc.
34 | 2005 | Robust content-based image retrieval system using multiple features representations | Tahoun et al. (2005) | Compared combinations of wavelet-based representations of the texture feature and the colour feature, with and without the colour layout feature.
35 | 2013 | Image mining for image retrieval using hierarchical k-means algorithm | Jain et al. (2013) | Performed image retrieval using the K-means algorithm.
36 | 2014 | Big data new frontiers: mining, search and management of massive repositories of solar image data and solar events | Banda et al. (2014) | Used solar image data and events, and described the process of applying big data methodologies.
37 | 2014 | Mining weakly labeled web facial images for search-based face annotation | Wang et al. (2014) | Proposed an image mining method that used weakly labelled facial images for face annotation.
38 | 2014 | Mining mid-level features for image classification | Fernando et al. (2014) | Proposed an image classification technique using mid-level features.
39 | 2014 | Toward dynamic scene understanding by hierarchical motion pattern mining | Song et al. (2014) | Used hierarchical motion pattern mining to understand dynamic scenes.
40 | 2011 | Multivariate image mining | Herold et al. (2011) | Discussed multivariate image mining.
41 | 2014 | Image mining: an overview of current research | Khodaskar and Ladhake (2014) | Illustrated the current research status of the image mining domain.
42 | 2014 | Mining frequent items in data stream using time fading model | Chen and Mei (2014) | Introduced a new method for mining frequent items in a data stream.

The four levels were generalised further into two layers: the lower layer comprised the pixel level and the object level, while the upper layer comprised the semantic concept level and the pattern and knowledge level. The lower layer held raw and extracted image information and performed image processing, image analysis and recognition. Operations such as semantic concept generation and knowledge discovery from the image database were performed by the upper layer. The main difference between the two layers was that the upper-layer information was more logical and meaningful than the lower-layer information.
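The four levels and their grouping into two layers can be illustrated with a short Python sketch. It is a toy illustration under stated assumptions rather than the system of Zhang et al. (2001): the names `Level`, `layer_of`, `pixel_level`, `object_level`, `semantic_level`, `pattern_level`, `mine` and `IMAGES` are hypothetical, images are tiny greyscale grids, each row is treated as one region, and the brightness rule that produces a concept is invented purely for demonstration.

```python
from collections import Counter
from enum import Enum


class Level(Enum):
    PIXEL = 1
    OBJECT = 2
    SEMANTIC_CONCEPT = 3
    PATTERN_AND_KNOWLEDGE = 4


def layer_of(level):
    """Lower layer: pixel and object levels; upper layer: the other two."""
    return "lower" if level in (Level.PIXEL, Level.OBJECT) else "upper"


def pixel_level(image):
    """Raw pixels plus one primitive feature (mean intensity)."""
    flat = [v for row in image for v in row]
    return {"pixels": image, "mean_intensity": sum(flat) / len(flat)}


def object_level(pixel_info, threshold=128):
    """Treat each row as one region and label it by brightness (toy rule)."""
    return ["bright" if sum(row) / len(row) >= threshold else "dark"
            for row in pixel_info["pixels"]]


def semantic_level(regions):
    """Map known objects to a high-level concept (hypothetical rule)."""
    if regions and regions[0] == "bright" and regions[-1] == "dark":
        return "daytime landscape"
    return "unknown"


def pattern_level(records):
    """Count co-occurrences of (concept, alphanumeric tag) pairs."""
    return Counter(records)


# Hypothetical data: image id -> (greyscale pixel grid, alphanumeric tag).
IMAGES = {
    "a": ([[200, 220], [40, 30]], "outdoor"),
    "b": ([[210, 190], [20, 60]], "outdoor"),
    "c": ([[10, 20], [30, 40]], "indoor"),
}


def mine(images):
    """Pass every image up through the four levels and collect patterns."""
    records = []
    for pixels, tag in images.values():
        concept = semantic_level(object_level(pixel_level(pixels)))
        records.append((concept, tag))
    return pattern_level(records)


print(mine(IMAGES))
```

In this sketch the lower-layer functions only handle pixel values and region labels, while the upper-layer functions work with concepts and their association with alphanumeric data, which is the sense in which upper-layer information is more logical and meaningful.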

5 Comparative study

This paper discussed and compared different image mining techniques and frameworks. The overview of these techniques and frameworks helped to establish a comparative study among the existing image mining methods. This paper also provided a comparison of the image mining methods (Fernando et al., 2014) in tabular form (see Table 1). However, no method or framework was established as superior to the others. The paper focused on the different attributes of the various methods rather than on claiming that one was superior to the others.

6 Conclusions

The image mining task on image datasets mainly deals with classification, clustering and/or the mining of knowledge from images using association rules and neural networks. It can be used to group images from remote sensing, the world wide web and medical diagnosis, to retrieve images efficiently, or to extract hidden, meaningful information that is not explicitly available from the image sources. Hence, this review can help in selecting an appropriate image mining technique among the available techniques. This paper remains a pilot study and requires further validation. Future work may include discussion of new image mining methods and updated frameworks, and their comparison with the previously discussed methods.

References

Bach, J.R., Fuller, C., Gupta, A., Hampapur, A., Horowitz, B., Humphrey, R., Jain, R. and Virage

CFS (1996) ‘Image search engine: an open framework for image management’, Storage and

Retrieval for Image and Video Databases, SPIE, Vol. 6, No. 76, pp.76–87.

Banda, J.M., Schuh, M.A., Angryk, R.A., Pillai, G.K. and McInerney, P. (2014) ‘Big data new

frontiers: mining, search and management of massive repositories of solar image data and

solar events’, Advances in Intelligent Systems and Computing, Vol. 241, No. 2, pp.151–158.

Bhattacharya, T., Dey, N. and Chaudhuri, S.R.B. (2012) ‘A session based multiple image hiding

technique using DWT and DCT’, International Journal of Computer Applications, Vol. 38,

No. 5, pp.18–21.

Burl, M.C., Fowlkes, C. and Rowden, J. (1999) ‘Mining for image content’, Systemics,

Cybernetics, and Informatics and Information Systems: Analysis and Synthesis, pp.126–134,

Orlando, FL.

Chakraborty, S., Maji, P., Pal, A.K., Biswas, D. and Dey, N. (2012) ‘Reversible color image

watermarking using trigonometric functions’, International Conference on Electronic

Systems, Signal Processing and Computing Technologies, Nagpur, pp.105–110.

Chakraborty, S., Samanta, S., Mukherjee, A., Dey, N. and Chaudhuri, S.S. (2013) ‘Particle swarm

optimization based parameter optimization technique in medical information hiding’, IEEE

International Conference on Computational Intelligence and Computing Research (ICCIC),

pp.1–6.

Chen, L. and Mei, Q. (2014) ‘Mining frequent items in data stream using time fading model’,

Information Sciences, Vol. 257, pp.54–69.

Conc, A., Mathias, E. and Castro, M.M. (2002) ‘Image mining by content’, Expert Systems with

Applications, Vol. 23, No. 4, pp.377–383.

Das, P., Munshi, R. and Dey, N. (2012) ‘Alattar’s method based reversible watermarking technique

of EPR within heart sound in wireless telemonitoring’, Intellectual Property Rights And

Patent Laws, IPRPL-2012, Jadavpur University, August 25, 2012.

Daschiel, H. and Datcu, M. (2005) ‘Information mining in remote sensing image archives: system

evaluation’, IEEE Transactions on Geoscience and Remote Sensing, Vol. 43, No. 1,

pp.188–199.

Datcu, M. and Seidel, K. (2000) ‘Image information mining: exploration of image content in large

archives’, IEEE Conference on Aerospace, Vol. 3, pp.253–264.

Deshpande, D. (2011) ‘Association rule mining based on image content’, International Journal of

Information Technology and Knowledge Management, Vol. 4, No. 1, pp.143–146.

Devasena, L., Sumathi, T. and Hemalatha, M. (2011) ‘An experiential survey on image mining

tools, techniques and applications’, International Journal on Computer Science and

Engineering (IJCSE), Vol. 3, No. 3, pp.1155–1160.

Dey, N., Biswas, S., Roy, A.B., Das, A. and Chaudhuri, S.S. (2012a) ‘Analysis of

photoplethysmographic signals modified by reversible watermarking technique using

prediction-error in wireless telecardiology’, International Conference of Intelligent

Infrastructure, 47th Annual National Convention of CSI, McGraw-Hill Proceeding.

Dey, N., Das, P., Das, A. and Chaudhuri, S.S. (2012b) ‘DWT-DCT-SVD based intravascular

ultrasound video watermarking’, Second World Congress on Information and Communication

Technologies (WICT 2012), pp.224–229.

Dey, N., Das, P., Das, A. and Chaudhuri, S.S. (2012c) ‘DWT-DCT-SVD based blind watermarking

technique of gray scale image in electrooculogram signal’, International Conference on

Intelligent Systems Design and Applications (ISDA-2012), pp.680–685.

Dey, N., Das, P., Das, A. and Chaudhuri, S.S. (2012d) ‘DWT-DCT-SVD based intravascular

ultrasound video watermarking’, Second World Congress on Information and Communication

Technologies (WICT 2012), pp.224–229.

Dey, N., Das, P., Das, A. and Chaudhuri, S.S. (2012e) ‘DWT-DCT-SVD based blind watermarking

technique of gray scale image in electrooculogram signal’, International Conference on

Intelligent Systems Design and Applications (ISDA-2012), pp.680–685.

Dey, N., Das, P., Das, A. and Chaudhuri, S.S. (2012f) ‘Feature analysis for the blind-watermarked

electroencephalogram signal in wireless telemonitoring using Alattar’s method’, 5th

International Conference on Security of Information and Networks (SIN 2012), in technical

cooperation with ACM Special Interest Group on Security, Audit and Control (SIGSAC),

Jaipur, India, pp.87–94.

Dey, N., Mishra, G., Nandi, B., Pal, M., Das, A. and Chaudhuri, S.S. (2012g) ‘Wavelet based

watermarked normal and abnormal heart sound identification using spectrogram analysis’,

2012 IEEE International Conference on Computational Intelligence and Computing Research

(ICCIC), pp.1–7.

Dey, N., Mukhopadhyay, S., Das, A. and Chaudhuri, S.S. (2012h) ‘Using DWT analysis of P, QRS

and T components and cardiac output modified by blind watermarking technique within the

electrocardiogram signal for authentication in the wireless telecardiology’, I.J. Image,

Graphics and Signal Processing (IJIGSP), Vol. 4, No. 7, pp.33–46, ISSN:2074–9074.

Dey, N., Biswas, S., Das, P., Das, A. and Chaudhuri, S.S. (2012i) ‘Feature analysis for the reversible

watermarked electrooculography signal using low distortion prediction-error expansion’, 2012

International Conference on Communications, Devices and Intelligent Systems (CODIS),

pp.624–627.

Dey, N., Biswas, S., Das, P., Das, A. and Chaudhuri, S.S. (2012j) ‘Lifting wavelet transformation

based blind watermarking technique of photoplethysmographic signals in wireless

telecardiology’, Second World Congress on Information and Communication Technologies

(WICT 2012), pp.230–235.

Dey, N., Chakraborty, S. and Samanta, S. (2013a) Optimization of Watermarking in Biomedical

Signal, Lambert Publication, Heinrich-Böcking-Straße 6, 66121 Saarbrücken, Germany

[ISBN-13: 978-3-659-46460-7].

Dey, N., Das, P., Biswas, D., Maji, P., Das, A. and Chaudhuri, S.S. (2013b) ‘Visible watermarking

within the region of non-interest of medical images based on fuzzy c-means and Harris corner

detection’, The Fourth International Workshop Communications Security & Information

Assurance (CSIA-2013), Springer, pp.161–168.

Dey, N., Maji, P., Das, P., Das, A. and Chaudhuri, S.S. (2013c) ‘An edge based watermarking

technique of medical images without devalorizing diagnostic parameters’, International

Conference on Advances in technology and Engineering, pp.1–5.

Dey, N., Nandi, B., Das, P., Das, A. and Chaudhuri, S.S. (2013d) ‘Retention of electrocardiogram

features insignificantly devalorized as an effect of watermarking for a multi-modal biometric

authentication system’, Advances in Biometrics for Secure Human Authentication and

Recognition, ISBN-9781466582422, pp.450.

Dey, N., Samanta, S., Yang, X-S., Chaudhuri, S.S. and Das, A. (2013e) ‘Optimization of scaling

factors in electrocardiogram signal watermarking using Cuckoo Search’, in International

Journal of Bio-Inspired Computation (IJBIC), Vol. 5, No. 5, pp.315–326.

Dey, N., Dey, G., Chakraborty, S. and Chaudhuri, S.S. (2014) ‘Feature analysis of blind watermarked

electromyogram signal in wireless telemonitoring’, Concepts and Trends in Healthcare

Information System of the Annals of Information Systems Series, Vol. 16, No. 2014,

pp.205–229.

Dey, N., Dey, M., Biswas, D., Das, P., Das, A. and Chaudhuri, S.S. (2015) ‘Tamper detection of

electrocardiographic signal using watermarked bio-hash code in wireless cardiology’, Special

Issue of the International Journal of Signal and Imaging Systems Engineering, Vol. 8, No. 1,

pp.46–58.

Dey, N., Roy, A.B. and Dey, S. (2011) ‘A novel approach of color image hiding using RGB color

planes and DWT’, International Journal of Computer Applications, Vol. 36, No. 5, pp.19–24.

Dubey, R.S. (2010) ‘Image mining using content based image retrieval system’, (IJCSE)

International Journal on Computer Science and Engineering, Vol. 2, No. 7, pp.2353–2356.

Elsayed, A., Coenen, F., Jiang, C., García-Fiñana, M. and Sluming, V. (2010) ‘Corpus callosum

MR image classification’, Knowledge-Based Systems, Elsevier Science Publishers B.V.,

Amsterdam, The Netherlands, Vol. 23, No. 4, pp.330–336.

Fatma, S.N. (2012) ‘Image mining method and frameworks’, International Journal of

Computational Engineering Research (ijceronline.com) Vol. 2, No. 8, pp.131–145.

Fay, A., Ivey, R.T., Bomberger, N. and Waxman, A.M. (2003) ‘Image fusion & mining tools for a

COTS environment’, Proceedings of the Sixth International Conference of Information

Fusion, Vol. 1, pp.606–613.

Fernando, B., Fromont, E. and Tuytelaars, T. (2014) ‘Mining mid-level features for image

classification’, International Journal of Computer Vision, Vol. 108, No. 3, pp.1–30.

Flickner, M., Sawhney, H.S., Ashley, J., Huang, Q., Dom, B., Gorkani, M., Hafner, J., Lee, D.,

Petkovic, D., Steele, D. and Yanker, P. (1995) ‘Query by image and video content: the QBIC

system’, IEEE Computer, Vol. 28, No. 9, pp.23–32.

Gardner, G. and Keating, D. (1996) ‘Automatic detection of diabetic retinopathy using an artificial

neural network: a screening tool’, British Journal of Ophthalmology, Vol. 80, No. 11,

pp.940–944.

Gibson, S., Kreinovich, V., Longpre, L., Penn, B. and Starks, S.A. (2001) Intelligent Mining in

Image Databases, with Applications to Satellite Imaging and Web Search, Data Mining and

Computational Intelligence, pp.1–20, Springer-Verlag, Berlin.

Herold, J., Loyek, C. and Nattkemper, T.W. (2011) ‘Multivariate image mining’, Wires Data

Mining Knowledge Discovery, Vol. 1, No. 1, pp.2–13.

Jain, P.M., Gawande, A.D. and Gautam, L.K. (2013) ‘Image mining for image retrieval using

hierarchical K-means algorithm’, International Journal of Research in Computer Engineering

and Electronics, Vol. 2, No. 6, pp.1–6.

Jeremy, S. and Bonet, D. (2000) ‘Image preprocessing for rapid selection in pay attention mode’,

MIT Press.

Kazman, R. and Kominek, J. (1993) ‘Information organization in multimedia resources’, 11th

Annual International Conference on Systems Documentation, pp.149–162.

Khodaskar, A.A. and Ladhake, S.A. (2014) ‘Image mining: an overview of current research’,

Fourth International Conference on Communication Systems and Network Technologies

(CSNT), pp.433–438.

Lin, K., Jagadish, H.V. and Faloutsos, C. (1994) ‘The TV-tree: an index structure for

high-dimensional data’, The VLDB Journal, Vol. 3, No. 4, pp.517–542.

Mohan, V. and Kannan, A. (2010) ‘Color image classification and retrieval using image mining

techniques’, International Journal of Engineering Science and Technology, Vol. 2, No. 5,

pp.1014–1020.

Morsillo, N., Pal, C. and Nelson, R. (2008) ‘Mining the web for visual concepts’, Proceedings of

the 9th International Workshop on Multimedia Data Mining: Held in Conjunction with the

ACM SIGKDD 2008, pp.18–25.

Ordonez, C. and Omiecinski, E. (1998) ‘Image mining: a new approach for data mining’, Research

Paper, pp.1–20.

Pal, A.K., Das, P. and Dey, N. (2013) Odd-Even Embedding Scheme Based Modified Reversible

Watermarking Technique Using Blueprint, arXiv preprint arXiv:1303.5972.

Ping, D. and Yueshun, H. (2009) ‘Research of remote sensing image data mining technique based

on web’, Asia-Pacific Conference on Information Processing, Shenzhen, APCIP, Vol. 1,

pp.298–300.

Priyatharshini, R. and Chitrakala, S. (2013) ‘Association rule based image retrieval system’,

Communications in Computer and Information Science, Vol. 296, pp.17–26.

Rajendran, P. and Madheswaran, M. (2010a) ‘Hybrid medical image classification using

association rule mining with decision tree algorithm’, Journal of Computing, Vol. 2, No. 1,

pp.127–136.

Rajendran, P. and Madheswaran, M. (2010b) ‘Novel fuzzy association rule image mining

algorithm for medical decision support system’, International Journal of Computer

Applications, Vol. 1, No. 20, pp.0975–8887.

Rui, Y. and Huang, T.S. (1997) ‘Image retrieval: past, present and future’, Journal of Visual

Communication and Image Representation, Vol. 10, pp.39–62.

Sheela, L.J. and Shanthi, V. (2007) ‘Image mining techniques for classification and segmentation

of brain MRI data’, Journal of Theoretical & Applied Information Technology, Vol. 3, No. 4,

pp.115–121.

Silakari, S., Motwani, M. and Maheshwari, M. (2009) ‘Color image clustering using block

truncation algorithm’, IJCSI International Journal of Computer Science Issues, Vol. 4, No. 2,

pp.31–35.

Song, L., Jiang, F., Shi, Z., Molina, R. and Katsaggelos, A.K. (2014) ‘Toward dynamic scene

understanding by hierarchical motion pattern mining’, IEEE Transactions on Intelligent

Transportation Systems, Vol. 15, No. 3, pp.1273–1285.

Tahoun, M.A., Nagaty, K.A., El-Arief, T.I. and Megeed, M.A. (2005) ‘Robust content-based image

retrieval system using multiple features representations’, Proc. of Networking, Sensing and

Control, pp.116–122.

Tan, K.L., Ooi, B.C. and Thiang, L.F. (2001) Retrieving Similar Shapes Effectively and Efficiently,

Multimedia Tools and Applications, Kluwer Academic Publishers, The Netherlands.

Uehara, Y., Endo, S., Shiitani, S., Masumoto, D. and Nagata, S. (2001) ‘A computer-aided visual

exploration system for knowledge discovery from images’, Second International Workshop on

Multimedia Data Mining (MDM/KDD’2001), pp.102–109.

Wang, D., Hoi, S.C.H., He, Y. and Zhu, J. (2014) ‘Mining weakly labeled web facial images for

search-based face annotation’, IEEE Transactions on Knowledge and Data Engineering,

Vol. 26, No. 1, pp.166–179.

Wang, J.Z. and Li, J. (1997) ‘System for screening objectionable images using Daubechies’

wavelets and color histograms’, Proceedings of the Fourth European Workshop (IDMS’97),

pp.1–12.

Yi, H., Rajan, D. and Chia, L-T. (2005) ‘ARIRS: association rule based image retrieval system’,

International Workshop for Advanced Imaging Technology (IWAIT‘05), pp.1–6.

Zaiane, O.R. and Han, J.W. (1998) ‘Mining multimedia data’, CASCON’98, pp.83–96.

Zhan, R.H.W. (2009) ‘Multi-modal mining in web image retrieval computational intelligence and

industrial applications’, PACIIA 2009, Vol. 2, pp.425–428.

Zhang, H. and Zhong, D. (1995) ‘A scheme for visual features based image retrieval’, Storage and

Retrieval for Image and Video Databases, SPIE, pp.1–7.

Zhang, J., Hsu, W. and Lee, M.L. (2001) ‘An information-driven framework for image mining’,

Database and Expert Systems Applications in Computer Science, Vol. 2113, pp.232–242.

Zhang, J., Liu, H. and Ma, S. (2009) ‘Web image mining using concept sensitive Markov

stationary features’, IEEE International Conference on Multimedia and Expo, pp.462–465.

Zhu, R., Yao, M. and Liu, Y. (2009) ‘Image classification approach based on manifold learning in

web image mining’, Advanced Data Mining and Applications, Vol. 5678, pp.780–787.

Novel Adaptive Histogram Binning-Based Lesion Segmentation for Discerning Severity in COVID-19 Chest CT Scan Images

Article

Full-text available

Jan 2023

Coronavirus sickness (COVID-19) recently adversely disrupted the medical care system and the entire economy. Doctors, researchers, and specialists are working on new-fangled methods to detect COVID-19 relatively efficiently, such as constructing computerized COVID-19 detection systems. Medical imaging, such as Computed Tomography (CT), has a lot of opportunity as a solution to RT-PCR approaches for quantitative assessment and disease monitoring. COVID-19 diagnosis based on CT images can provide speedy and accurate results. A quantitative criterion for diagnosis is provided by an automated segmentation method of infection areas in the lungs. As an outcome, automatic image segmentation is in high demand as a clinical decision aid tool. To detect COVID-19, Computed Tomography images might be employed instead of the time-consuming RT-PCR assay. In this research, a unique technique is provided for segmenting infection areas in the lungs using CT scan images from COVID-19 patients. “Ground Glass Opacity (GGO)” regions were detected using Novel Adaptive Histogram Binning Based Lesion Segmentation (NAHBLS) method. Many metrics were also employed to evaluate the proposed method, including “Sorensen–Dice similarity”, “Sensitivity”, “Specificity”, “Precision”, and “Accuracy” measures. Experiments have shown that the proposed method can effectively separate the lung infections with good accuracy. The results show that the proposed Novel Adaptive Histogram Binning Based Lesion Segmentation based on automatic approach is effective at segmenting the lesion region of the image and calculated the Infection Rate (IR) over the lung region in Computed Tomography scan.

Novel Architecture for Image Classification Based on Rough Set

Article

Full-text available

Jan 2023

The Computed Tomography (CT) scan images classification problem is one of the most challenging problems in recent years. Different medical treatments have been developed based on the correctness of CT scan images classification. In this work, a novel deep learning architecture is proposed to correctly diagnose COVID-19 patients using CT scan images. In fact, a new classifier based on rough set theory is suggested. Extensive experiments showed that the novel deep learning architecture provides a significant improvement over well-known classifier. The new classifier produces 95% efficiency and a very low error rate on different metrics. The suggested deep learning architecture coupled with novel tolerance outperforms the other standard classification approaches for the detection of COVID-19 using CT-Scan images.

Image Mining for Real Time Quality Assurance in Rapid Prototyping

Conference Paper

Dec 2019

The development of new products can be a powerful engine for the success of an organization. In order to design prototypes efficiently using Additive Manufacturing techniques, it is advisable to perform Quality Assurance in real time, which can save resources and increase efficiency. However, it is important that the configurations within the printable file and the calibration of the 3D printer are free of errors. A potential solution to avoid errors and low-quality prototypes is the use of Image Mining for Real Time Quality Assurance of the printed prototype. As part of this paper, we developed such an Image Mining application using a design science research approach. A supervised machine learning approach is used to assign a quality class to the prototype in production. As a result, we identified the contribution Image Mining can make to Quality Assurance and the relationship between the accuracy of classification and the latency.

Novel Hybrid Genetic Arithmetic Optimization for Feature Selection and Classification of Pulmonary Disease Images

Article

Jan 2023

The difficulty in predicting early cancer is due to the lack of early illness indicators. Metaheuristic approaches are a family of algorithms that seek optimal values for uncertain problems, with several applications in optimization and classification. An automated system for recognizing illnesses can respond with accuracy, efficiency, and speed, helping medical professionals spot abnormalities and lowering death rates. This study proposes the Novel Hybrid GAO (Genetic Arithmetic Optimization algorithm-based Feature Selection) method as a way to choose the features for several machine learning algorithms that classify readily available data on COVID-19 and lung cancer. By choosing only the important features, feature selection approaches might improve performance. The proposed approach combines Genetic and Arithmetic Optimization to enhance the outcomes of the optimization.

Dental X-ray Identification System Based on Association Rules Extracted by k-Symbol Fractional Haar Functions

Article

Nov 2022

Several identification approaches have recently been employed in human identification systems for forensic purposes to decrease human effort and to boost the accuracy of identification. Dental identification systems provide automated matching by searching photographic dental features to retrieve similar models. In this study, the problem of dental image identification was investigated by developing a novel dental identification scheme (DIS) utilizing a fractional wavelet feature extraction technique and rule mining with an Apriori procedure. The proposed approach extracts the most discriminating image features during the mining process to obtain strong association rules (ARs). The proposed approach is divided into two stages. The first stage is feature extraction using a wavelet transform based on a k-symbol fractional Haar filter (k-symbol FHF), while the second stage is the Apriori algorithm of AR mining, which is applied to find the frequent patterns in dental images. The ARs generated for each dental image are saved alongside the image in the rules database for use in recognition by the dental identification system. The DIS method suggested in this study primarily enhances the Apriori-based dental identification system and aims to address the drawbacks of dental rule mining.
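The second stage described above relies on Apriori association rule mining. The following is a minimal sketch of the generic Apriori algorithm only, under the assumption that each image has already been reduced to a set of discrete feature tokens; the fractional Haar feature extraction is not reproduced, and the function names and token representation are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in sets if itemset <= t) / n

    current = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples from frequent itemsets."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                confidence = sup / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules
```

Here each transaction would stand for one image and each item for one quantized feature; the rules with high confidence are the "strong" ARs that such a system could store per image.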

Convolutional Neural Network Based Partial Face Detection

Conference Paper

Apr 2022

Due to the massive expansion of artificial intelligence, machine learning technology is being used in various areas of day-to-day life. There are many scenarios in which a simple crime could be prevented before it happens, or the person responsible for it could be found. A face is a distinctive feature that lets us differentiate easily among species, and it also plays a significant role in identifying individuals within our own species. A common problem with this critical feature is that a camera often fails to detect a person's face, producing a poor image. For example, when a robbery takes place where a security camera is installed, the robber's identity may be almost indistinguishable because of the low-quality camera. Developing a good face detection algorithm reduces hardware cost and is comparatively inexpensive. Facial recognition, widget control, and similar applications depend on detecting the face correctly. This study aims to create and enhance a machine learning model that correctly recognizes faces. A total of 627 samples were collected from the faces of different Bangladeshi people at four angles. In this work, five machine learning approaches (CNN, Haar Cascade, Cascaded CNN, Deep CNN, and MTCNN) were implemented to find the best accuracy on the dataset. After creating and running the models, the Multi-Task Convolutional Neural Network (MTCNN) achieved the best model accuracy on the training data, 96.2%, compared with the other machine learning models.

A New Method Combining Pattern Prediction and Preference Prediction for Next Basket Recommendation

Article

Oct 2021
Entropy

Market basket prediction, which is the basis of product recommendation systems, is the task of predicting what customers will buy in their next shopping basket based on analysis of their historical shopping records. Although product recommendation systems are developing rapidly and perform well in practice, state-of-the-art algorithms still have plenty of room for improvement. In this paper, we propose a new algorithm combining pattern prediction and preference prediction. In pattern prediction, sequential rules, periodic patterns and association rules are mined, and probability models are established based on their statistical characteristics, e.g., the distribution of periods of a periodic pattern, to make a more precise prediction. Products that have a higher probability are given priority for recommendation. If the quantity of recommended products is insufficient, then we make a preference prediction to select more products. Preference prediction is based on the frequency and tendency of products that appear in customers’ individual shopping records, where tendency is a new concept reflecting the evolution of customers’ shopping preferences. Experiments show that our algorithm outperforms the baseline methods and state-of-the-art methods on three of four real-world transaction sequence datasets.

A differentiation between Image Mining and Computer Vision in the application area of Big Data

Conference Paper

Dec 2020

Analyzing Tagore's Emotion With the Passage of Time in Song-Offerings: A Philosophical Study Based on Computational Intelligence

Article

Jul 2019

The emotions of humans can be observed through tears, smiles, etc. The emotion of poets is reflected through poetry and songs. The works of a poet give philosophical insights into the beauty and mystery of nature and the socio-economic conditions of the era, as well as the poet's personal state of mind. In the proposed work, ‘Song-Offerings’, a collection of poems and songs composed by Rabindranath Tagore, for which Tagore received the Nobel Prize for literature in 1913, has been analyzed. Earlier, most of the research work on Song-Offerings was based on Zipf's law or bibliometric laws. This article analyzes the changes in Tagore's emotion in Song-Offerings with the passage of time (1895-1912). Emotions are analyzed based on the Arousal-Valence Model. To analyze the arousal state, Plutchik's emotion model has been employed, and to find the valence, a fuzzy-based model has been used. The work reveals that the emotions of the poet gradually mellow with the passage of time, barring some transitional periods; nevertheless, the poet's submission to the Almighty remains unchanged during this period.

Towards Efficient for Learning Model Image Retrieval

Conference Paper

Sep 2018

IMAGE MINING TECHNIQUES FOR CLASSIFICATION AND SEGMENTATION OF BRAIN MRI DATA

Article

Nov 2019

L Jaba Sheela

Alattar’s Method based Reversible Watermarking Technique of EPR within Heart Sound in Wireless Telemonitoring

Conference Paper

Aug 2012

Mining Weakly Labeled Web Facial Images for Search-Based Face Annotation

Article

Jan 2014

This paper investigates a framework of search-based face annotation (SBFA) by mining weakly labeled facial images that are freely available on the World Wide Web (WWW). One challenging problem for search-based face annotation scheme is how to effectively perform annotation by exploiting the list of most similar facial images and their weak labels that are often noisy and incomplete. To tackle this problem, we propose an effective unsupervised label refinement (ULR) approach for refining the labels of web facial images using machine learning techniques. We formulate the learning problem as a convex optimization and develop effective optimization algorithms to solve the large-scale learning task efficiently. To further speed up the proposed scheme, we also propose a clustering-based approximation algorithm which can improve the scalability considerably. We have conducted an extensive set of empirical studies on a large-scale web facial image testbed, in which encouraging results showed that the proposed ULR algorithms can significantly boost the performance of the promising SBFA scheme.

Analysis of P-QRS-T Components Modified by Blind Watermarking Technique Within the Electrocardiogram Signal for Authentication in Wireless Telecardiology Using DWT

Article

Jul 2012

Presently, a considerable amount of work has been done in tele-monitoring, which involves the transmission of bio-signals and medical images over wireless media. Intelligent exchange of bio-signals amongst hospitals needs efficient and reliable transmission. Watermarking adds “ownership” information to multimedia content to prove authenticity, to verify signal integrity, or to achieve control over the copy process. This paper proposes a novel session-based blind watermarking method with a secret key that embeds a binary watermark image into the Electrocardiogram (ECG) signal. The ECG signal is a sensitive diagnostic tool that is used to detect various cardio-vascular diseases by measuring and recording the electrical activity of the heart in exquisite detail. The first part of this paper proposes a multi-resolution wavelet transform based system for detecting the ‘P’, ‘Q’, ‘R’, ‘S’, ‘T’ peaks of the original human ECG signal. The ‘R-R’ time lapse is an important component of the ECG signal that corresponds to the heartbeat of the person concerned. An abrupt increase in the height of the ‘R’ wave or changes in the measurement of the ‘R-R’ interval denote various disorders of the human heart. Similarly, the ‘P-P’, ‘Q-Q’, ‘S-S’ and ‘T-T’ intervals also correspond to different disorders of the heart, and their peak amplitudes indicate other cardiac diseases. In the proposed method, the ‘P Q R S T’ peaks are marked and stored over the entire signal, and the time interval between two consecutive ‘R’ peaks and the other peak intervals are measured to detect anomalies in the behavior of the heart, if any. The peaks are obtained by the composition of Daubechies wavelet sub-bands of the original ECG signal. High accuracy in the detection of the P, QRS and T components and in interval measurement is achieved by processing and thresholding the original ECG signal. The second part of the paper proposes a Discrete Wavelet Transformation (DWT) and Spread Spectrum based watermarking technique. In this approach, the generated watermarked signal, which has an acceptable level of imperceptibility and distortion, is compared to the original ECG signal. Finally, a comparative study is done of the intervals between two consecutive ‘R-R’ peaks, ‘P-R’, ‘Q-T’, ‘QTc’, QRS duration and cardiac output between the original ECG signal and the watermarked ECG signal, each with its P, QRS and T components detected.

Optimization of Watermarking in Biomedical Signal

Book

Jan 2014

Biomedical signal and information hiding has a long history. Written in a straightforward style that is easy to read and comprehend, this book addresses the need for improvements in data hiding algorithms and techniques within a biomedical signal to ensure the authenticity and security of patients’ information. The author treats the embedding of a watermark within a biomedical signal as an optimization problem, which is solved using Particle Swarm Optimization and Genetic Algorithm based techniques. This book also offers MATLAB result sets, tables, flow charts and illustrations to exemplify the complicated concepts discussed in the text. This book is published in the hope that it will interest students and research workers in biomedical signals and watermarking.

Analysis of Photoplethysmographic Signals Modified by Reversible Watermarking Technique using Prediction-Error in Wireless Telecardiology

Conference Paper

Jan 2013

In the present medical era, a considerable amount of work has been done in tele-monitoring that involves transmission of biomedical signals through wireless media. Exchange of biomedical signals amongst hospitals requires efficient and reliable transmission. Watermarking adds “ownership” information to multimedia content to prove authenticity, verify signal integrity, and achieve control over the copy process. The Photoplethysmography (PPG) signal is a very responsive and significant diagnostic tool in the medical world for analysing the functioning of the human heart. The variance in its characteristics also provides important information about cardiac diseases and raises an alarm about heart malfunctioning. This is combined with the Electrocardiography (ECG) signal, an invaluable medical tool that is used to detect various cardio-vascular diseases by measuring and recording the electrical activity of the heart in exquisite detail. This paper proposes a method of reversible binary watermark embedding into the PPG signal and a watermark extraction mechanism using a Prediction Error Based Algorithm. In this approach, the generated watermarked signal, which has an acceptable level of imperceptibility and distortion, is compared to the original PPG signal. Finally, a comparative study of the detected R peak from the ECG with the peak and valley of the PPG signal is done to measure the change in diagnostic value as an effect of watermarking. Telecardiology is achieved by successful transmission of the watermarked PPG signal, with reversibility of the watermark to avoid discrepancies in the data and so deliver precise treatment.

Odd-Even Embedding Scheme Based Modified Reversible Watermarking Technique using Blueprint

Conference Paper

Mar 2013

Digital watermarking is a technique of adding or hiding information in multimedia content in order to identify the owner of the data. A signal or digital image can be permanently embedded in other digital data, providing a good way to protect intellectual property from illegal replication. The cover data that is transmitted through the internet hides the watermark in a computer aided assertion method such that it becomes undetectable. Finally, it stands as a hindrance over many operations without harming the embedded host document. Unfortunately, many owners of digital materials such as images, text, audio and video are reluctant to distribute their documents on the web or other networked environments, because the ease of duplicating digital materials facilitates copyright violation. Digital media distribution occurs through various channels. The cover data may or may not hold any relation with the watermark information. In the last two decades, a considerable amount of research has been done on the digital watermarking of multimedia files such as audio, video, images and text. Different types of watermarking algorithms have been proposed by researchers to achieve a high level of security and authenticity. In our proposed method, a modified reversible watermarking technique is introduced, which employs a blueprint generation of the original image based on an odd-even embedding methodology to yield large data hiding capacity and security as well as high watermarked quality. The experimental results demonstrate that, no matter how much secret data is embedded, the watermarked quality is about 51 dB in the proposed scheme.

Intelligent mining in image databases, with applications to satellite imaging and to web search

Article

Jan 2001

S. Gibson

Image Mining: An Overview of Current Research

Conference Paper

Apr 2014

This paper gives a concise overview of recent developments in image mining techniques, which help in many applications. To improve the performance of image retrieval, researchers focus on image mining techniques. Image mining deals with the extraction of implicit knowledge, image data relationships, or other patterns not explicitly stored in the images. Efficient image searching, browsing and retrieval tools are required by users from various domains, including remote sensing, fashion, crime prevention, publishing, medicine, architecture, etc. Current research in image mining is still in its infancy. In this paper, we focus on current developments in image mining, particularly image mining frameworks, techniques and systems.

Analysis of P-QRS-T Components Modified by Blind Watermarking Technique Within the Electrocardiogram Signal for Authentication in Wireless Telecardiology Using DWT

Article

Jul 2012

Nilanjan Dey
