Kurdistan Journal for Applied Research kjar.spu.edu.iq
Volume 2, Issue 1, June 2017 P-ISSN: 2411-7684 E-ISSN: 2411-7706
An Online Content Based Email Attachments Retrieval System
http://dx.doi.org/10.24017/science.2017.1.12
Dr. Noor Ghazi M. Jameel
Technical College of Informatics
Sulaimani Polytechnic University
Sulaimani, Iraq
Noor.ghazi@spu.edu.iq
Dr. Esraa Zeki Mohammed
Kirkuk Dept.
State company for Internet Services
Kirkuk, Iraq
Isramohammed2@gmail.com
Dr. Loay Edwar George
Computer Science Dept.
University of Baghdad
Baghdad, Iraq
loayedwar57@scbaghdad.edu.iq
Abstract: E-mail is one of the most popular applications
in use today. As a result of continuous daily use,
thousands of messages accumulate in most users'
mailboxes, which makes it difficult to retrieve the
attachments of these messages after a period of time.
Most email providers have steadily improved their search
technology, but searching inside attachments remains
limited. Some email providers, such as Gmail, have added
keyword search inside attachments for some file types
(.pdf files, .doc documents, .ppt presentations), but
this feature is not yet supported for image files.
Neither email providers nor recent research have focused
on retrieving image attachments from the mailbox. This
paper introduces a novel idea of using Content Based
Image Retrieval (CBIR) in an email application to
retrieve images from email attachments based on their
entire contents. The work has two main phases: the first
connects to the email server to read emails and extracts
color features from the image attachments; the second
retrieves similar image attachments. The tests were
carried out on a mail inbox containing 100 messages with
500 image attachments and gave good precision and recall
rates when the threshold value is less than or equal to
0.4.
Keywords: CBIR, Color Features, Email Attachments,
Email Retrieval System, Image Retrieval, Similarity
Measure.
1. INTRODUCTION
The methodology for searching images efficiently is an
important research topic, and retrieving images that
match a user's needs is not a simple task [1].
These days, images are used in numerous applications;
hence, finding successful techniques for retrieving
images has attracted extensive interest. To overcome the
problems of the traditional approaches for retrieving
images based on keywords, CBIR was introduced [2].
The most recognized feature for image retrieval is color.
It is considered a primitive feature for classical image
retrieval systems. One of the methodologies used for
color feature extraction is the Color Histogram (CH). A
CH shows the distribution of color contents in an image.
It is a very fast and efficient technique. Many
commercial and academic systems have used CH for image
retrieval, such as QBIC, NETRA, RETIN, KIWI, and Image
Minor [3].
Email still serves as an important application for
storing the data and information that people use in
their day-to-day activities [4]. Some of this
information takes the form of attachments to email
messages. Attachments include images, audio, video,
PDF, Word documents, and so on. In this paper, an
online image retrieval system is introduced to retrieve
images from email attachments based on the content of
the image.
2. LITERATURE REVIEW
Recently, there has been a noticeable increase in the
use of CBIR methods in different applications, for
example:
George and Mohammed [5] improved retrieval performance
based on texture features. They used 600 samples from a
variety of human tissues, and the results showed very
high retrieval rates.
Alsmadi and Alhami [6] evaluated several approaches
to clustering emails based on their contents. Algorithms
were developed to classify a large collection of text.
Yuvaraj and Hariharan [7] presented similar-object
matching based on three features using computer
vision. The experiments were conducted using Matlab
software; the results indicated that region-based and
color-histogram-based methods are effective.
Dubey et al. [8] introduced two multi-channel decoded
local binary patterns; the experiments were applied to
10 databases containing a variety of natural scenes and
textures.
Pyykkö and Glowacka [9] used a deep neural network for
interactive content based image retrieval, using few
training samples to learn automatically from users'
interaction and feedback in order to reduce the training
time. Image features were extracted using Convolutional
Neural Networks (CNN).
Parthiban and Srinivasa [10] used the AdaBoost algorithm
to classify images based on a bag of features in order
to minimize the storage cost and enable efficient
retrieval.
3. CONCEPTS AND METHODS
3.1 Email System
Electronic mail is one of the most common internet
services, and it has remained one of the most important
internet applications over the years. Email has many
features, including sending messages with hyperlinks,
attachments, HTML text, and embedded photos [11].
3.1.1 Email System Architecture
The email system architecture is illustrated in Figure (1).
It contains two subsystems: (i) The user agents, which
are used to read, compose, send, and reply to messages,
display incoming messages, and organize messages by
filing, searching, and deleting them. Examples of the
most common user agents are Google Gmail, Microsoft
Outlook, Mozilla and Apple Mail. (ii) The message
transfer agents, which are used to send messages from
the source to the destination with the help of the
Simple Mail Transfer Protocol (SMTP). They are also
known as mail servers [12][13].
Figure 1 Architecture of email system [13]
3.1.2 Email Message Format
An email consists of an envelope and a message. The
envelope contains the sender and receiver addresses. The
message contains the header and the body. Messages must
be formatted in a standard way to be handled by message
transfer agents. RFC 822 is a standard format that
defines messages as having a header and a body, both
represented in ASCII text. Originally, the body was
supposed to be simple text. RFC 822 has since been
updated several times to allow email messages to carry
many different types of data: audio, video, images, PDF
documents, and so on [13]. The header specifies the
sender, the receiver, the subject of the message, and
some other information (e.g., content type, encoding
type, etc.). The body contains the actual information to
be read by the receiver. The general layout of an email
is illustrated in Figure (2) [14].
Figure 2 Electronic Email [14]
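As an illustration of the header and body layout described above, the following minimal Python sketch builds and then parses a message with the standard `email` package. It is not part of the proposed system, which was written in Visual Basic .NET; the addresses, subject, file name, and the helper names `build_sample_message` and `describe_message` are hypothetical.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_sample_message() -> bytes:
    """Create a small message with a text body and one image attachment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"      # hypothetical address
    msg["To"] = "receiver@example.com"      # hypothetical address
    msg["Subject"] = "Holiday photo"
    msg.set_content("Please find the photo attached.")
    # Placeholder bytes stand in for real image data.
    msg.add_attachment(b"\x00\x01\x02", maintype="image",
                       subtype="jpeg", filename="sea.jpg")
    return msg.as_bytes()


def describe_message(raw: bytes) -> dict:
    """Parse raw message bytes into header fields, body text and attachment names."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    return {
        "from": str(msg["From"]),
        "to": str(msg["To"]),
        "subject": str(msg["Subject"]),
        "content_type": msg.get_content_type(),
        "body": body.get_content().strip() if body else "",
        "attachments": [p.get_filename() for p in msg.iter_attachments()],
    }


if __name__ == "__main__":
    for key, value in describe_message(build_sample_message()).items():
        print(f"{key}: {value}")
```

The header fields carry the sender, receiver, subject, and content type, while the body and the attachment travel as separate parts of the same message.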
3.1.3 Search Mechanism in Email Agents
Email clients have become much smarter over the last 10
years, especially in their search features. Many
web-based email services and email clients offer
full-text search of messages, many companies offer
desktop applications that support indexing and searching
of file systems, emails, and browser caches, and there
are also many research prototypes that perform the
search operation [4]. User agents now offer wide
capabilities for searching the mailbox. Search
capabilities let users find messages quickly, for
example a message that someone sent in the last month
about a specific topic [13]. Gmail, Yahoo, and many
other email clients provide search capabilities such as:
searching the (From) or (To) fields for specific email
addresses or people; searching for a keyword in the
header or the body of the message; finding messages sent
or received before or after a specific date or within a
specific period of time; finding messages by file size;
finding messages that have files attached to them;
finding messages that are starred, unread, read, or chat
messages; and searching for attachment file names or for
files with extensions such as .jpg, .pdf, .doc, .ppt,
and .xls, returning the emails that contain a file with
that extension. However, there is still no search method
for retrieving image files from emails based on the
content of the image.
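The search options listed above all operate on message metadata and text. As a rough sketch, not taken from any provider's implementation, the Python function below assembles a search string from standard IMAP4 SEARCH keys (FROM, TEXT, SINCE, BEFORE, LARGER, UNSEEN, FLAGGED). The function name and its parameters are hypothetical.

```python
from datetime import date
from typing import List, Optional

_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(d: date) -> str:
    """Format a date as dd-Mon-yyyy, the form used by IMAP4 SEARCH."""
    return f"{d.day:02d}-{_MONTHS[d.month - 1]}-{d.year}"


def _quote(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_search_criteria(sender: Optional[str] = None,
                          keyword: Optional[str] = None,
                          since: Optional[date] = None,
                          before: Optional[date] = None,
                          larger_than: Optional[int] = None,
                          unread: bool = False,
                          starred: bool = False) -> str:
    """Combine the given filters into one IMAP4 SEARCH criteria string."""
    parts: List[str] = []
    if sender:
        parts.append(f"FROM {_quote(sender)}")
    if keyword:
        parts.append(f"TEXT {_quote(keyword)}")   # header or body text
    if since:
        parts.append(f"SINCE {imap_date(since)}")
    if before:
        parts.append(f"BEFORE {imap_date(before)}")
    if larger_than is not None:
        parts.append(f"LARGER {larger_than}")     # size in bytes
    if unread:
        parts.append("UNSEEN")
    if starred:
        parts.append("FLAGGED")
    return "(" + " ".join(parts) + ")" if parts else "ALL"


if __name__ == "__main__":
    print(build_search_criteria(sender="alice@example.com",
                                since=date(2017, 5, 1), unread=True))
```

None of these keys inspects the pixels of an image attachment, which is the gap that content-based retrieval is meant to fill.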
3.2 CBIR
In 1992, Kato [15] introduced the concept of CBIR to
describe the automatic retrieval of images from a
database using color and shape features [16]. The main
task of a CBIR system is similarity comparison, which
depends on finding the difference between the features
of the query image and the corresponding features of the
other images stored in a database [17].
3.3 Image Histogram
In this work, a conventional color histogram (CCH) is
used to indicate the occurrence of every color in an
image, representing the statistical behavior of each
color in the image:

$$H = \{h_0, h_1, \ldots, h_{N-1}\}$$

where $h_i$ represents the number of pixels of color
$C_i$ and $N$ is the number of colors [18].
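The counting step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: pixels are given as (R, G, B) tuples with 8-bit channels, and each channel is quantized into `levels` ranges so that the histogram has `levels ** 3` bins. The paper does not specify the number of colors, so `levels` and the function name `color_histogram` are hypothetical choices, and decoding of image files is left out.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: Iterable[Pixel], levels: int = 4) -> List[int]:
    """Count how many pixels fall into each quantized color bin.

    Each 8-bit channel is mapped to one of `levels` ranges, giving
    levels**3 bins. Entry i of the result is the pixel count h_i.
    """
    hist = [0] * (levels ** 3)
    for r, g, b in pixels:
        # Map each channel value to 0..levels-1.
        ri, gi, bi = (r * levels) // 256, (g * levels) // 256, (b * levels) // 256
        hist[ri * levels * levels + gi * levels + bi] += 1
    return hist


if __name__ == "__main__":
    sample = [(255, 0, 0), (250, 10, 5), (0, 0, 255)]  # two reds, one blue
    print(color_histogram(sample, levels=2))
```

With `levels=2` the two reddish pixels share one bin and the blue pixel falls into another, so the sum of all bins equals the number of pixels.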
4. THE PROPOSED SYSTEM
The proposed email attachments retrieval system, shown
in Figure (3), is client-based and uses the
query-by-example (QBE) paradigm. A sample image
representing what the user wants to find in the email
attachments is loaded into the system, and the images
similar to the given sample are retrieved from the email
attachments.
First, the user uploads the sample image from the main
system interface, enters an email ID, password, server
name, and email delivery protocol, and then connects to
the mail server. For test purposes, the system was
connected to a real mail server (the Hotmail server).
The mail server then checks the entered information; if
it is correct, the system reads each email from the
user's mailbox. The mailbox contains email messages with
and without attachments. The system checks whether each
email contains attachments. If the email contains image
attachment(s), the color image histogram features are
computed for them, to be used later for comparison with
the query image feature vector. The set of attachment
images that have high similarity to the query image is
then retrieved, displayed, and saved in a list in which
each file name combines the email number with the
attachment file name to avoid duplicates. The system
was developed using the Visual Basic .NET programming
language.
Figure 3 The Interface of the Email Attachments Retrieval System
The block diagram of the email attachments retrieval
system is shown in Figure (4) and explains the general
steps of the proposed system. The implementation of the
automated identification of attachment images is
illustrated in the flowchart in Figure (5) and involves
the following steps:
4.1 Loading Image, Read, Parse, and Check Email Attachments
This step loads the data of the input image. Through
the application, the user also enters an email ID,
password, server name, and email delivery protocol, and
then connects to the mail server. Port 110 is the
default POP3 server port for receiving emails. Port 995
is the common POP3 Secure Sockets Layer (SSL) port, used
to receive email over an implicit SSL connection. Port
143 is the default IMAP4 server port, and port 993 is
the common port for IMAP4 over SSL. SSL is now commonly
used, and many email services, such as Gmail, Outlook,
Office 365, and Yahoo, require an SSL connection. In
this system, the connection to the Hotmail server was
made using IMAP4 over SSL with the Hotmail IMAP server
name (imap-mail.outlook.com). If the connection
succeeds, each email is read from the mail server and
parsed into header and message body. The body is checked
for attachments. If it contains attachments, the files
are read and checked to see whether they are images with
the extensions JPG, BMP, or GIF. The color image
histogram is then computed for the image attachments.
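The steps of this stage can be sketched in Python with the standard `imaplib` and `email` modules. This is a hedged illustration, not the authors' Visual Basic .NET code: the function names are hypothetical, the port table simply restates the defaults listed above, and `.jpeg` is added to the extension filter as an assumption because the test data mention that extension.

```python
import imaplib
import os
from email import message_from_bytes, policy
from typing import Iterator, List, Tuple

# Default ports as listed in the text: (protocol, uses SSL) -> port.
DEFAULT_PORTS = {("pop3", False): 110, ("pop3", True): 995,
                 ("imap4", False): 143, ("imap4", True): 993}

# JPG, BMP, GIF come from the text; ".jpeg" is an added assumption.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".bmp", ".gif"}


def default_port(protocol: str, use_ssl: bool) -> int:
    """Return the conventional port for the chosen delivery protocol."""
    return DEFAULT_PORTS[(protocol.lower(), use_ssl)]


def image_attachments(raw: bytes) -> List[Tuple[str, bytes]]:
    """Parse one raw message and return (file name, data) for image files."""
    msg = message_from_bytes(raw, policy=policy.default)
    found = []
    for part in msg.walk():
        name = part.get_filename()
        if name and os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
            found.append((name, part.get_payload(decode=True)))
    return found


def fetch_raw_messages(host: str, user: str, password: str) -> Iterator[bytes]:
    """Yield every message in INBOX as raw bytes over IMAP4 with SSL."""
    conn = imaplib.IMAP4_SSL(host, default_port("imap4", True))
    try:
        conn.login(user, password)
        conn.select("INBOX", readonly=True)  # do not alter message flags
        _, data = conn.search(None, "ALL")
        for num in data[0].split():
            _, parts = conn.fetch(num, "(RFC822)")
            yield parts[0][1]
    finally:
        conn.logout()
```

A caller would loop over `fetch_raw_messages(...)`, pass each raw message to `image_attachments`, and compute a histogram for every returned image. Only `fetch_raw_messages` needs network access and valid credentials; the other two functions can be tried on any message bytes.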
4.2 Compute Color Image Histogram
A color histogram is computed for every image used in
the proposed system. The x-axis represents the colors
in the image, and the y-axis represents the number of
pixels of each color [18].
4.3 Distance Measure
The similarity measure between $Q_j$ (query image) and
$T_k$ (attachment image), having feature vectors
$\{q_{ji} \mid i = 0, \ldots, N-1\}$ and
$\{T_{ki} \mid i = 0, \ldots, N-1\}$, is computed using
the Euclidean distance metric [19]:

$$D(Q_j, T_k) = \sqrt{\sum_{i=0}^{N-1} \left(q_{ji} - T_{ki}\right)^2}$$

If the distance is less than or equal to the threshold,
the attachment image is retrieved and stored on the
computer to be displayed after the retrieval process
has completed and all emails have been checked.
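The distance test can be illustrated with a short Python sketch. The names `normalize`, `euclidean_distance`, and `retrieve` are hypothetical, as are the sample file names. The paper does not state whether histograms are normalized before comparison; the sketch assumes they are divided by the pixel count so that images of different sizes are comparable, which is an assumption rather than the authors' stated method.

```python
import math
from typing import Dict, List, Sequence


def normalize(hist: Sequence[float]) -> List[float]:
    """Scale a histogram so that its entries sum to 1 (assumed step)."""
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * len(hist)


def euclidean_distance(q: Sequence[float], t: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(q) != len(t):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, t)))


def retrieve(query: Sequence[float],
             attachments: Dict[str, Sequence[float]],
             threshold: float) -> List[str]:
    """Return attachment names whose distance to the query is <= threshold,
    ordered from most to least similar."""
    scored = sorted((euclidean_distance(query, feat), name)
                    for name, feat in attachments.items())
    return [name for dist, name in scored if dist <= threshold]


if __name__ == "__main__":
    query = normalize([9, 1])
    stored = {"1_sea.jpg": normalize([8, 2]), "2_car.bmp": normalize([0, 10])}
    print(retrieve(query, stored, threshold=0.4))
```

Keying the stored features by email number plus file name mirrors the naming scheme described for the result list.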
Figure 4 The System Block Diagram
Figure 5 System Flowchart
5. RESULTS AND DISCUSSION
This section presents the results of the tests conducted
to show the performance of the system whose structure
was introduced above. The tests are also arranged to
explore the effect of different threshold values on the
overall retrieval performance of the system.
Two metrics were used to evaluate retrieval [20]:

$$\text{Precision} = \frac{\text{number of relevant images retrieved}}{\text{total number of images retrieved}}$$

$$\text{Recall} = \frac{\text{number of relevant images retrieved}}{\text{total number of relevant images in the data set}}$$
The data sets used in this study are sets of email
attachment images with different extensions (e.g., .bmp,
.jpeg, .gif), covering different subjects (e.g., apples,
cars, chairs, baby faces, flowers, grass, mobiles, sea,
scanned documents) and varying in size.
About 500 images, taken from 100 email messages, were
used in this test, and another set of images was also
used for test purposes. Table (1) presents examples from
the ten image data sets that were used.
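Table 1: Examples of Images Data Set

Row   Example of images   No. of images
1     (image omitted)     50
2     (image omitted)     35
3     (image omitted)     15
4     (image omitted)     50
5     (image omitted)     100
6     (image omitted)     25
7     (image omitted)     50
8     (image omitted)     85
9     (image omitted)     50
10    (image omitted)     40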
One of the main concerns in the conducted tests is to
find a suitable value for the threshold parameter, one
that leads to more accurate retrieval. If the threshold
is too small, the number of retrieved images decreases
greatly and only very similar images are retrieved. If
the value is too large, images from other sets may be
retrieved. There is no analytical method for finding
the optimal threshold value; it is usually assessed by
trial (i.e., trying different values and tuning the
system performance), as shown in Table (2).
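The trial procedure can be sketched as a loop over candidate thresholds that reports precision and recall for each one. The Python below is an illustration only: the function names are hypothetical, and the distances and category labels are made-up toy values, not the measurements behind Table (2).

```python
from typing import Dict, Iterable, List, Set, Tuple


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision = hits / retrieved, recall = hits / relevant (0.0 if empty)."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def sweep_thresholds(distances: Dict[str, float], relevant: Set[str],
                     thresholds: Iterable[float]) -> List[Tuple[float, float, float]]:
    """For each threshold, retrieve images with distance <= threshold and
    return (threshold, precision, recall)."""
    rows = []
    for t in thresholds:
        retrieved = {name for name, d in distances.items() if d <= t}
        p, r = precision_recall(retrieved, relevant)
        rows.append((t, p, r))
    return rows


if __name__ == "__main__":
    # Toy distances from one query image to five stored attachments.
    toy = {"apple1": 0.10, "apple2": 0.35, "apple3": 0.45,
           "car1": 0.55, "sea1": 0.65}
    for t, p, r in sweep_thresholds(toy, {"apple1", "apple2", "apple3"},
                                    [0.3, 0.4, 0.5, 0.6, 0.7]):
        print(f"<= {t:.1f}  precision={p:.2f}  recall={r:.2f}")
```

In this toy run, raising the threshold first adds relevant images and then starts admitting images from other categories, which is the trade-off between recall and precision that the threshold controls.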
Table 2: The Effect of Distance Measure Threshold Value

Threshold   Apple                Chair                Baby Face
Value       Precision  Recall    Precision  Recall    Precision  Recall
<= 0.3      73%        26%       77%        21%       84%        25%
<= 0.4      53%        31%       59%        28%       76%        29%
<= 0.5      46%        45%       46%        57%       57%        40%
<= 0.6      32%        56%       36%        58%       48%        53%
<= 0.7      28%        68%       29%        65%       32%        69%

Threshold   Formal Paper         Flower               Grass
Value       Precision  Recall    Precision  Recall    Precision  Recall
<= 0.3      76%        19%       76%        30%       84%        34%
<= 0.4      66%        29%       65%        46%       76%        47%
<= 0.5      50%        32%       53%        56%       63%        55%
<= 0.6      42%        46%       43%        68%       51%        69%
<= 0.7      31%        59%       33%        79%       42%        75%

Threshold   Mobile               Red Cars             Sea
Value       Precision  Recall    Precision  Recall    Precision  Recall
<= 0.3      95%        21%       81%        30%       85%        21%
<= 0.4      83%        34%       60%        45%       74%        32%
<= 0.5      75%        48%       49%        56%       55%        36%
<= 0.6      63%        59%       33%        64%       49%        53%
<= 0.7      51%        71%       30%        73%       38%        70%

Threshold   White Cars
Value       Precision  Recall
<= 0.3      88%        15%
<= 0.4      78%        22%
<= 0.5      59%        34%
<= 0.6      48%        51%
<= 0.7      35%        71%
Figure (6) illustrates the effect of different values of
the threshold parameter on the precision for each image
category. Figure (7) shows the effect of different
threshold values on the recall for each image category.
Figure 6 The Effect of Threshold on Precision
Figure 7 The Effect of Threshold on Recall
6. CONCLUSION
The proposed retrieval system facilitates access to
image attachments in a mailbox through an
interactive user interface that allows the user to
quickly obtain an overview of similar images in an
email account. The color histogram can be used to
describe the color content of images. Testing
different threshold values helps to find the best
retrieval results. The system gave better rates when
the threshold value was less than or equal to 0.4.
7. REFERENCES
1. X. Qian, X. Tan, Y. Zhang, R. Hong, and M.
Wang, “Enhancing sketch-based image retrieval by
re-ranking and relevance feedback”, IEEE Trans.
Image Processing, Vol. 25, pp. 195-208, 2015.
2. M. Azodinia, and A. Hajdu, “A Novel
combinational relevance feedback based method
for content-based image retrieval”,
Acta Polytechnica Hungarica, Vol. 13, no. 5, pp.
121-134, 2016.
3. A. Saini and R. Bharti “A review on content based
image retrieval by different techniques”,
International Journal of Neural Systems
Engineering, Vol. 1, no. 1, pp. 1-6, 2017
4. S. B. Pitla, “Organizational Search in Email
Systems”, M.S. thesis, Dept. Mathematics and
Computer Science, Western Kentucky Univ., 2012.
5. L. E. George, and E. Z. Mohammed, "Tissues
image retrieval system based on Co-occurrence,
run length and roughness features", IEEE
Conference Publications, International Conference
on Computer Medical Applications (ICCMA),
DOI: 10.1109/ICCMA.2013.6506186, pp. 1-6,
2013.
6. I. Alsmadi, and I. Alhami, “Clustering and
classification of email contents”, Journal of King
Saud University Computer and Information
Sciences, Production and hosting by Elsevier B.V.
on behalf of King Saud University, Vol. 27, pp.
46-57, 2015.
7. D. Yuvaraj, and S. Hariharan, “Content-based
image retrieval based on integrating region
segmentation and colour histogram”, International
Arab Journal of Information Technology, Vol. 13,
pp. 203-207, 2016.
8. S. R. Dubey, S. K. Singh, and R. K. Singh,
“Multichannel decoded local binary patterns for
content based image retrieval", IEEE Trans. Image
Processing, Vol. 25, pp. 4018-4032, 2016.
9. J. Pyykkö and D. Glowacka, “Interactive content-based
image retrieval with deep neural networks”,
Symbiotic 2016, LNCS 9961, pp. 77-88, 2017.
10. Parthiban S. and Srinivasa Raghavan S., “Content
based image classification and retrieval using visual
bag of features and adaboost algorithm”, ARPN
Journal of Engineering and Applied Sciences, Vol.
12, No. 2, pp. 588-590, 2017.
11. J. F. Kurose, and K. W. Ross, “Application layer in
Computer Networking a Top-Down Approach”, 6th
ed., USA: Pearson Education, Inc., pp. 118-130,
2013.
12. A. S. Tanenbaum, and D. J. Wetherall, “The
application layer in Computer Networks”, 5th ed.,
USA: Pearson Education, Inc., pp. 623-646, 2011.
13. L. L. Peterson and B. S. Davie, “Application in
Computer Networks a systems approach”, 5th ed.,
USA: Elsevier, Inc., pp. 700-708, 2012.
14. B. A. Forouzan, “Remote logging, electronic mail,
and file transfer” in “Data Communications and
Networking”, 4th ed., USA: McGraw-Hill, pp. 824-
840, 2007.
15. T. Kato, "Database Architecture for Content-Based
Image Retrieval", Proceedings of Image Storage
and Retrieval Systems (SPIE), pp. 112-123, 1992.
16. J. Eakins, and M. Graham, "Content-based image
retrieval", University of Northumbria at Newcastle,
Report no. 39, 1999.
17. E. Aulia, "Hierarch Indexing for Region Based
Image Retrieval", M.Sc. Thesis, Department of
Industrial and Manufacturing Systems Engineering,
Louisiana State University, 2001.
18. J. Huang, "Color-Spatial Image Indexing and
Applications", Ph.D. Thesis, Cornell University,
1998.
19. C.-H. Wei, C.-T. Li, and R. Wilson, "A general
framework for content-based medical image
retrieval with its application to Mammograms",
Proceedings of the SPIE, Vol. 5748, pp. 134-143,
2005.
20. G. Brunner, "Structure features for content-based
image retrieval and classification problems", Ph.D.
Thesis, University of Freiburg, Germany, 2006.
Biography
Noor Ghazi M. Jameel received
the B.S. and M.S. degrees in
computer science from the
University of Technology, Iraq, in
2003 and the Ph.D. degree in
computer science from Sulaimani
University, Sulaimani, Kurdistan
Region, Iraq, in 2013. From 2003 to
2007, she was an Assistant Lecturer
with the Informatics Institute for Postgraduate Studies. From 2008 to
2013, she was with the Computer Science Institute, Sulaimani Polytechnic
University. Since 2013, she has been a Lecturer with the Computer
Networks Department, Sulaimani Polytechnic University, Technical
College of Informatics. Her research interests include information and
network security, machine learning, data mining, and computer
networks.
Esraa Zeki Mohammed received
the B.S. degree in computer science
from the University of Mosul, Iraq,
in 2001. M.S. and Ph.D. degrees in
computer science from Sulaimani
University, Sulaimani, Kurdistan
Region, Iraq in 2009 and 2013,
respectively. From 2002 till now she
worked as a senior programmer and
then head of advisory office in Ministry of
Communication/State Company for Internet Services.
Also she worked as a lecturer in Kirkuk Technical
Institute and Kirkuk University.
Loay Edwar George received the
B.S. in Physics, College of Science,
Baghdad University, Baghdad, Iraq
(1979). M.Sc. In Theoretical
Physics, College of Science,
Baghdad University, Baghdad, Iraq
(1983). Ph.D. In Digital Image
Processing, College of Science,
Baghdad University, Baghdad, Iraq (1997). He worked
as Head of Computer Science Department (Dec. 2010 -
Sep. 2015). Head of IT-Unit/ College of Science/
University of Baghdad (Jan.2008 - Dec.2010). IT
Consultant in the headquarter of the Ministry of Higher
Education and Scientific Research (for 1 year). Head of
the Directorate of "Software Development and Systems
Integration" in Al-Khawarezmi Company (for 4 years).
Head of the directorate of "Research and Development"
in Al-Khawarezmi company for specialized Software
Industry (for 4 years). Head of the research group in the
field of "Ionosphere and Geomagnetism", in the Space
Research Center (for 3 years).
... In the context of mortgage origination, where vast amounts of such data are exchanged, the need for robust cybersecurity measures is paramount. The increasing incidences of cyberattacks and data breaches in recent years have underscored the vulnerabilities inherent in digital communication systems [1]. Email and SMS, being the primary modes of communication between lenders, borrowers, and other stakeholders in the mortgage process, demand stringent security protocols to safeguard against such vulnerabilities. ...
... The landscape of secure digital communication has been extensively explored in recent research. Studies have focused on the implementation of advanced encryption techniques for email communications [1]. These techniques are pivotal in ensuring the confidentiality and integrity of sensitive data transmitted between lenders and borrowers. ...
... Research in email security has consistently highlighted the significance of encryption technologies. Studies show the effectiveness of end-to-end encryption in safeguarding email content against unauthorized access [1]. Additionally, the literature discusses the integration of digital signatures and secure email gateways as methods to enhance the security of email communications in the mortgage process. ...
Article
Full-text available
The increasing reliance on digital communication in mortgage origination processes has accentuated the need for robust cybersecurity measures, particularly in the realms of email and SMS communications. This research paper delves into the multifaceted aspects of securing such communications, considering the latest technological advancements, regulatory requirements, and industry practices. Drawing upon an array of academic research and contemporary articles, the study explores innovative methods for email and SMS encryption, the implementation of secure data transmission protocols, and the integration of cybersecurity practices into the mortgage origination process. Key themes include the use of end-to-end encryption technologies, adherence to FTC Safeguards Rule and other regulatory frameworks, and the development of secure messaging protocols tailored for the mortgage industry. The paper also evaluates the effectiveness of these methods in safeguarding sensitive borrower data, ensuring compliance, and maintaining the integrity of the mortgage origination process. By synthesizing theoretical insights and practical approaches, this study aims to provide a comprehensive understanding of the current landscape and future directions in secure communication within mortgage origination, emphasizing the balance between security, efficiency, and user convenience.
... Email -is a powerful, convenient, and efficient way of communicating over the Internet [1]. In the age of electronic technology, this type of communication is widely used in many fields. ...
Conference Paper
Full-text available
User privacy has become a prominent issue in the digital age, especially with the advent of the Internet and social media. Technologies have opened up new opportunities and different ways for us to communicate. At the same time, they have also brought other avenues and methods for cyberattacks. Email attacks such as the mass sending of malicious messages, links, and phishing dominate among them. Therefore, in our scientific article, we dealt with the most common type of cyberattacks that occur via e-mail. Machine learning methods (ML) have been actively involved in malicious email detection. To find out which algorithm is more effective, we tested different supervised ML algorithms such as Random Forest, Support Vector Machine (SVM), Decision Tree, Naïve Bayes, and K-Nearest Neighbors. And to work with real data, we used some datasets containing emails used in different phishing and bulk emails.KeywordsSpam Filtering SystemE-mail FilteringAlgorithm for MLRandom ForestSupport Vector MachineDecision TreeNaïve BayesK-Nearest Neighbors
Conference Paper
Email security is critical to all types of businesses, as it represents 80% of the plethora of official communication tools used by most organizations worldwide. Attackers use several techniques to trick users into performing harmful actions, mainly via emails. Identifying such activities or circumventing them is better than relying on the end user’s behavior of being unaware. Traditional e-mail systems use centralized servers to provide services, making them a single point of failure if servers are attacked or at least private information is leaked. Thus, a decentralized e-mail system can provide more trust and reliability. This study is an initial attempt to explore the use of Blockchain-based solutions to improve the security and privacy of traditional e-mail systems. This paper presents a two-fold coverage of this problem. First, a summary of common email security architectures is presented, outlined, and criticized for various parameters. Second, we propose a technique for solving the problem of phishing emails by targeting changes in the email system structure using two approaches. The first approach is to improve email security by using Blockchain technology whereas the second approach is by modifying the SSL protocol to disallow the use of similar domains, thereby preventing a considerable number of phishing attempts. We discuss each approach along with its advantages and disadvantages. The paper concludes with future research perspectives on this important topic using the proposed approach.
Conference Paper
Full-text available
Recent advances in deep neural networks have given rise to new approaches to content-based image retrieval (CBIR). Their ability to learn universal visual features for any target query makes them a good choice for systems dealing with large and diverse image datasets. However, employing deep neural networks in interactive CBIR systems still poses challenges: either the search target has to be predetermined, such as with hashing, or the computational cost becomes prohibitive for an online setting. In this paper, we present a framework for conducting interactive CBIR that learns a deep, dynamic metric between images. The proposed methodology is not limited to precalculated categories, hashes or clusters of the search space, but rather is formed instantly and interactively based on the user feedback. We use a deep learning framework that utilizes pre-extracted features from Convolutional Neural Networks and learns a new distance representation based on the user’s relevance feedback. The experimental results show the potential of applying our framework in an interactive CBIR setting as well as symbiotic interaction, where the system automatically detects what image features might best satisfy the user’s needs.
Article
Full-text available
Due to the extensive use of images in various fields, using effective approaches to retrieve the most related images given, a query image is of great importance. Contentbased image retrieval is the approach commonly used to address this issue. The contentbased image retrieval systems use many techniques to provide more accurate and comprehensive answers, among which, is relevance feedback. Relevance feedback is used by the system to help it retrieve more relevant images in response to a query. In this paper, we have proposed a novel relevance feedback method that is able to improve the precision of the content-based retrieval systems. The proposed method is based on multi-query relevance feedback, and similarity function refinement. © 2016, Budapest Tech Polytechnical Institution. All rights reserved.
Article
Full-text available
Information users depend heavily on emails’ system as one of the major sources of communication. Its importance and usage are continuously growing despite the evolution of mobile applications, social networks, etc. Emails are used on both the personal and professional levels. They can be considered as official documents in communication among users. Emails data mining and analysis can be conducted for several purposes such as: Spam detection and classification, subject classification, etc. In this paper, a large set of personal emails is used for the purpose of folder and subject classifications. Algorithms are developed to perform clustering and classification for this large text collection. Classification based on NGram is shown to be the best for such large text collection especially as text is Bi-language (i.e. with English and Arabic content).
Conference Paper
Full-text available
The research presented in this paper was aimed to improve the retrieval performance of an images retrieval system in medical applications based on texture features. In general, the work consists of two phases: (1) enrollment phase, which consist of feature extraction based on Co-occurrence matrix and run length matrix features combined with developed method to measure the roughness, (2) retrieving phase, which use the artificial neural network and similarity measurement. The conducted tests were carried on 600 medical images from four types of tissues (i.e., blood cells, breast tissues, GI tissues, liver tissues) and give very high precision and recall rates (100,98).
Article
This paper proposes content-based classification and retrieval of images using a visual bag of features and an AdaBoost classifier. The visual bag of features is extracted from the input images and then classified using the AdaBoost classifier algorithm. The proposed algorithm greatly reduces the storage cost and enables efficient search using an inverted data structure. The efficiency of the proposed algorithm is tested with the Mean Opinion Score (MOS). © 2006-2017 Asian Research Publishing Network (ARPN). All rights reserved.
Article
The local binary pattern (LBP) is widely adopted for image feature description because of its efficiency and simplicity. To describe color images, the LBPs from each channel of the image must be combined. The traditional way of combining them is to simply concatenate the LBPs from each channel, but this increases the dimensionality of the pattern. To cope with this problem, this paper proposes a novel method for image description with multichannel decoded local binary patterns. We introduce two schemas, one adder-based and one decoder-based, for combining the LBPs from more than one channel. Image retrieval experiments are performed to observe the effectiveness of the proposed approaches, which are compared with existing multichannel techniques. The experiments are performed over twelve benchmark natural scene and color texture image databases such as Corel-1k, MIT-VisTex, USPTex, Colored Brodatz, etc. It is observed that the introduced multichannel adder-based and decoder-based local binary patterns significantly improve the retrieval performance over each database and outperform the other multichannel approaches in terms of average retrieval precision and average retrieval rate.
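The traditional concatenation baseline mentioned in this abstract can be illustrated with a minimal sketch. It assumes the basic 8-neighbour LBP on interior pixels and plain Python lists as channels; the function names (`lbp_codes`, `lbp_histogram`, `concatenated_lbp`) are hypothetical and are not taken from the cited paper, whose adder and decoder schemas are not reproduced here.

```python
# Assumption: basic 8-neighbour LBP; each channel is a 2-D list of ints.
# Neighbour offsets (row, col), clockwise starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(channel):
    """Return the 8-bit LBP code of every interior pixel of one channel."""
    height, width = len(channel), len(channel[0])
    codes = []
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            centre = channel[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # Set the bit when the neighbour is at least as bright as the centre.
                if channel[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            codes.append(code)
    return codes


def lbp_histogram(channel):
    """256-bin histogram of the LBP codes of one channel."""
    hist = [0] * 256
    for code in lbp_codes(channel):
        hist[code] += 1
    return hist


def concatenated_lbp(channels):
    """Concatenate the per-channel histograms (the traditional combination)."""
    descriptor = []
    for channel in channels:
        descriptor.extend(lbp_histogram(channel))
    return descriptor


flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]   # every neighbour equals the centre
peak = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]   # every neighbour is darker than the centre
descriptor = concatenated_lbp([flat, peak, flat])
print(len(descriptor))
```

With three channels the descriptor has 3 × 256 bins, which is the growth in dimensionality that the abstract describes as the drawback of simple concatenation.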
Article
Developments in multimedia technology and an increasing number of image retrieval functions and capabilities have led to the rapid growth of CBIR techniques. Colour histograms can be compared in terms of speed and efficiency. We have presented a modified approach based on a composite colour image histogram. A major research perspective in CBIR emphasizes matching similar objects based on shape, colour and texture, using computer vision techniques to extract image features. The colour histogram is perhaps the most popular feature due to its simplicity. Image retrieval using colour histograms has both advantages and limitations. This paper presents some recommendations for improving CBIR systems using unlabelled images. The experimental results, obtained using Matlab, show that region-based histograms and colour histograms were effective in terms of performance.
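As a rough illustration of colour-histogram matching in general, the sketch below builds a normalised joint RGB histogram and compares two images with histogram intersection. The bin count, the choice of histogram intersection as the similarity measure, and the function names are assumptions made for illustration; the composite and region-based histograms of the cited paper are not reproduced.

```python
def colour_histogram(pixels, bins_per_channel=4):
    """Normalised joint RGB histogram of (r, g, b) tuples with values 0..255."""
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each 0..255 value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
        count += 1
    if count == 0:
        return hist
    return [value / count for value in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for normalised histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


red = colour_histogram([(255, 0, 0)] * 4)
blue = colour_histogram([(0, 0, 255)] * 4)
mixed = colour_histogram([(255, 0, 0)] * 2 + [(0, 0, 255)] * 2)

print(histogram_intersection(red, red))
print(histogram_intersection(red, blue))
print(histogram_intersection(red, mixed))
```

A global histogram like this ignores where colours appear in the image, which is one of the limitations that motivates region-based variants.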
Article
Sketch-based image retrieval often needs to optimize the trade-off between efficiency and precision. Index structures are typically applied to large-scale databases to achieve efficient retrieval. However, the performance can be affected by quantization errors. Moreover, the ambiguity of user-provided examples may also degrade the performance when compared with traditional image retrieval methods. Building sketch-based image retrieval systems that preserve the index structure is challenging. In this paper, we propose an effective sketch-based image retrieval approach with re-ranking and relevance feedback schemes. Our approach makes full use of the semantics in query sketches and the top-ranked images of the initial results. We also apply relevance feedback to find more relevant images for the input query sketch. The integration of the two schemes results in mutual benefits and improves the performance of sketch-based image retrieval.
Article
This paper describes visual interaction mechanisms for image database systems. The typical mechanisms for visual interactions are query by visual example and query by subjective descriptions. The former includes a sketch retrieval function and a similarity retrieval function, while the latter includes a sense retrieval function. We adopt both an image model and a user model to interpret and operate the contents of image data from the user's viewpoint. The image model describes the graphical features of image data, while the user model reflects the visual perception processes of the user. These models, automatically created by image analysis and statistical learning, are referred to as abstract indexes stored in relational tables. These algorithms are developed on our experimental database system, the TRADEMARK and the ART MUSEUM.