Detection and Prevention of Phishing Attacks
Abu Saad Choudhary
Department of Information Technology
Shree L R Tiwari College of Engineering
Thane-401107, India
chaudharyabusaad@gmail.com
Rucha Desai
Department of Information Technology
Shree L R Tiwari College of Engineering
Thane-401107, India
rucharockx@gmail.com
Lavkush Gupta
Department of Information Technology
Shree L R Tiwari College of Engineering
Thane-401107, India
Lavkushgupta9172@gmail.com
Madhuri Gedam
Department of Information Technology
Shree L R Tiwari College of Engineering
Thane-401107, India
madhuri.gedam@gmail.com
Abstract- Phishing is one of the main problems faced by the
cyber-world and leads to financial losses for both industries
and individuals. Detecting phishing attacks with high accuracy
has always been a challenging issue. At present, visual
similarity-based techniques are very useful for detecting
phishing websites efficiently. A phishing website looks very
similar in appearance to its corresponding legitimate website
in order to deceive users into believing that they are
browsing the correct website. Visual similarity-based phishing
detection techniques use a feature set such as text content,
text format, HTML tags, Cascading Style Sheets (CSS), images,
and so forth to make the decision. These approaches compare
the suspicious website with the corresponding legitimate
website using various features, and if the similarity is
greater than a predefined threshold value, the site is
declared phishing [2].
Keywords- Phishing Attack; URL; Real Time Model;
Phishing Detection
I. INTRODUCTION
Phishing is a crime in which a perpetrator sends a fake
e-mail, which appears to come from a popular and trusted brand
or organization, asking the recipient to enter personal
credentials such as a bank password, username, phone number,
address, credit card details, and so forth. The fake e-mails
often look surprisingly legitimate, and even the website where
the Internet user is asked to enter personal information looks
like a legitimate one. Phishing messages propagate over
e-mail, SMS, instant messengers, social networking sites,
VoIP, and so forth, but e-mail is the most popular way to
perform this attack, and the attack succeeds when the user
visits the link attached to the e-mail. Moreover, spear
phishing attacks are becoming popular these days. Business
e-mail compromise (BEC) was identified as a major Internet
threat in 2015. In BEC, the attacker uses spear phishing
methods to fool organizations and Internet users [1]. More
sophisticated spear phishing attacks target individuals or
groups within an organization. Phishing is metaphorically
similar to fishing in the water, but instead of trying to
catch a fish, attackers try to steal consumers' personal
information. When a user opens a fake webpage and enters the
username and protected password, the user's credentials are
acquired by the attacker. Phishing websites look very similar
in appearance to their corresponding legitimate websites in
order to attract a large number of Internet users. Recent
developments in phishing detection have led to the growth of
various new visual similarity-based approaches. Visual
similarity-based approaches compare the visual appearance of
the suspicious website to its corresponding legitimate website
using various parameters [1].
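The threshold comparison behind visual similarity-based detection can be sketched in a few lines of Python. This is only an illustration: the feature strings, the function names `jaccard` and `is_phishing`, the choice of Jaccard similarity, and the 0.8 threshold are all assumptions made for the example and are not taken from the cited approaches.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature sets: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_phishing(suspicious_features, legitimate_features, threshold=0.8):
    """Flag the suspicious page when it closely resembles the legitimate page.

    Assumption: the caller has already established that the suspicious page
    is not served from the legitimate site's own domain.
    """
    return jaccard(suspicious_features, legitimate_features) > threshold


# Hypothetical features extracted from page text, HTML tags, CSS and images.
legit = {"tag:form", "tag:input", "css:login-box", "text:sign in", "img:logo"}
clone = {"tag:form", "tag:input", "css:login-box", "text:sign in", "img:logo"}
other = {"tag:table", "text:weather", "css:forecast"}

print(is_phishing(clone, legit))   # look-alike page on a different domain
print(is_phishing(other, legit))   # unrelated page
```

A real system would replace the hand-written sets with features extracted from the rendered pages, and would tune the threshold on labelled data.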
II. RELATED WORK
A. Protecting user against phishing using Antiphishing: -
AntiPhishing is used to keep users away from fraudulent
websites, which in turn may lead to a phishing attack. Here,
AntiPhishing traces the sensitive information to be filled in
by the user and alerts the user whenever he/she is attempting
to share this information with an untrusted machine. The most
effective solution for this is training users to visit only
trusted websites [2].
B. Learning to Detect Phishing Emails: -
An alternative for detecting these attacks is the application
of machine learning to a feature set designed to reflect the
deception directed at the user. This approach can be used in
the detection of phishing websites, or of the messages sent
through e-mail that are used to target the victims [3].
C. Phishing detection system for e-banking using fuzzy data
mining: -
Phishing websites, primarily targeting e-banking services,
are too complex and dynamic to be easily identified and
classified. Because of the various ambiguities involved in
detection, certain data mining techniques may prove an
effective means of keeping e-commerce websites safe, since
they deal with various quality factors rather than exact
values [4].
D. Collaborative Detection of Fast Flux Phishing Dom
Here, two approaches are defined to find the correlation of
evidence from multiple DNS servers and multiple suspect FF
domains. Real-world examples are used to show that these
correlation approaches speed up the detection of FF domains.
The approaches are based on an analytical model which can
quantify the number of DNS queries needed to confirm an FF
domain [5].
E. A Prior-based Transfer Learning Method for the
Phishing Detection: -
Logistic regression is the basis of a prior-based transfer
learning method, which is presented here for a statistical
machine learning classifier. It is used for the detection of
phishing websites based on selected characteristics of the
URLs. Because the distribution of the features diverges across
distinct phishing areas, multiple models are proposed for
different regions [6].
Asian Journal of Convergence in Technology
ISSN NO: 2350-1146 I.F-5.11
Volume VII and Issue I
This work is licensed under a
Creative Commons Attribution-Noncommercial 4.0 International License
III. PROPOSED SYSTEM
Phishing has been a major security threat in which there
is a huge loss for companies as well as customers. These
phishing attacks are increasing day by day due to lack of
efficient detection techniques and effective preventive
measures. A comprehensive efficient detection technique
should be developed in order to detect and inform the web
users about the phishing attacks to make sure that their
sensitive data will not be disclosed during these attacks.
Various techniques exist for the detection of phishing, but
it is still a challenging task to detect fake websites with
the existing methods. Techniques such as blacklisting,
whitelisting, heuristics, and machine learning are available
to detect phishing, and machine learning is being used
extensively. To prevent phishing, data mining techniques are
proposed in this research work to identify phishing websites
and alert users before they reveal their passwords.
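The list-based and heuristic techniques named above can be combined in a simple lookup order, sketched below in Python. The host names, the four warning signs, and the score threshold are hypothetical choices made for illustration; they are not the rules used in this work.

```python
from urllib.parse import urlsplit

WHITELIST = {"example-bank.com"}          # hypothetical trusted hosts
BLACKLIST = {"examp1e-bank-login.test"}   # hypothetical known-bad hosts


def heuristic_score(url):
    """Count simple URL warning signs (illustrative rules only)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    score = 0
    if "@" in parts.netloc:               # userinfo can hide the real host
        score += 1
    if host.replace(".", "").isdigit():   # raw IPv4 address instead of a name
        score += 1
    if host.count(".") >= 4:              # unusually deep subdomain nesting
        score += 1
    if len(url) > 75:                     # very long URL
        score += 1
    return score


def classify(url, threshold=2):
    """Whitelist first, then blacklist, then fall back to heuristics."""
    host = urlsplit(url).hostname or ""
    if host in WHITELIST:
        return "legitimate"
    if host in BLACKLIST:
        return "phishing"
    return "suspicious" if heuristic_score(url) >= threshold else "unknown"


print(classify("https://example-bank.com/login"))
print(classify("http://user@192.168.0.1/login"))
```

URLs that end up as "unknown" or "suspicious" are the cases where a trained machine learning classifier would be consulted.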
IV. FLOWCHART
Fig. 1. Flowchart
Phishing is one of the major problems faced by cyber-
world and leads to financial losses for both industries and
individuals. Detection of phishing attack is always a
challenging issue. Phishing website looks very similar in
appearance to its corresponding legitimate website to deceive
users into believing that they are browsing the correct
website. As phishing sites use a host name that is very
close to that of the legitimate website, the edit distance
value will clearly be low. So, the IP address of the entered
website is compared with the IP address of the site in the
white list that yielded the minimum edit distance. If both
addresses are the same, the site is legitimate; otherwise it
is treated as suspicious. In this way the user is alerted [2].
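The white-list check described above can be sketched as follows. The white-list entries, the IP addresses (taken from documentation ranges), and the function names are assumptions made for the example. The resolved IP address is passed in by the caller rather than looked up, so that the sketch runs offline.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Hypothetical white list: host name -> known IP address.
WHITE_LIST = {
    "example-bank.com": "192.0.2.10",
    "example-mail.com": "198.51.100.7",
}


def check_site(host, resolved_ip, white_list=WHITE_LIST):
    """Compare the entered site with its nearest white-list entry."""
    nearest = min(white_list, key=lambda known: edit_distance(host, known))
    if white_list[nearest] == resolved_ip:
        return "legitimate", nearest
    return "alert", nearest


# A look-alike host name that resolves to a different address raises an alert.
print(check_site("examp1e-bank.com", "203.0.113.5"))
print(check_site("example-bank.com", "192.0.2.10"))
```

In a deployed checker the second argument would come from a DNS lookup of the entered host name.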
V. MACHINE LEARNING IMPLEMENTATION
The following algorithms were chosen based on their
performance on classification problems.
A. Random Forests
Random Forest is a popular machine learning algorithm
that belongs to the supervised learning technique. It can be
used for both Classification and Regression problems in ML.
It is based on the concept of ensemble learning, which is a
process of combining multiple classifiers to solve a complex
problem and to improve the performance of the model.
The "forest" it builds is an ensemble of decision trees,
usually trained with the "bagging" method. The general idea of
the bagging method is that a combination of learning models
improves the overall result. Random forest has nearly the same
hyperparameters as a decision tree or a bagging classifier.
Fortunately, there is no need to combine a decision tree with
a bagging classifier, because you can simply use the
classifier class of random forest. With random forest, you can
also deal with regression tasks by using the algorithm's
regressor. Random forest adds additional randomness to the
model while growing the trees. Instead of searching for the
most important feature among all features when splitting a
node, it searches for the best feature among a random subset
of features. This produces a wide diversity among the trees,
which generally results in a better model. Therefore, in
random forest, only a random subset of the features is taken
into consideration by the algorithm for splitting a node. You
can even make trees more random by additionally using random
thresholds for each feature rather than searching for the best
possible thresholds.
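The two sources of randomness described above, bootstrap samples (bagging) and a random subset of features per split, can be illustrated with a deliberately small Python sketch. To keep it short, each "tree" here is a one-split decision stump rather than a full decision tree, so this is a toy bagged ensemble and not a complete random forest. The feature columns (URL length, number of dots, number of special characters) and the data values are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_stump(X, y, features):
    """Return (feature, threshold, left_label, right_label) with lowest Gini."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:                       # no usable split in this sample
        label = majority(y)
        return (features[0], float("inf"), label, label)
    return best[1:]


def fit_forest(X, y, n_trees=15, n_features=2, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap sample
        feats = rng.sample(range(len(X[0])), n_features)      # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


# Hypothetical rows: [url_length, num_dots, num_special_chars]
X = [[20, 1, 0], [25, 2, 1], [30, 1, 0], [22, 2, 1], [28, 1, 1],
     [80, 5, 6], [95, 6, 7], [70, 4, 5], [110, 7, 9], [85, 5, 8]]
y = ["legitimate"] * 5 + ["phishing"] * 5

forest = fit_forest(X, y)
print(predict(forest, [10, 0, 0]))
print(predict(forest, [150, 9, 12]))
```

Replacing `fit_stump` with a recursive tree builder that draws a fresh feature subset at every node would turn the sketch into the full algorithm.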
B. Neural Networks
A neural network is structured as a set of interconnected
identical units (neurons). The interconnections are used to
send signals from one neuron to the other. In addition, the
interconnections have weights to enhance the delivery
among neurons. The neurons are not powerful by
themselves; however, when connected to others they can
perform complex computations. Neural networks are a set of
algorithms that are designed to recognize patterns. They
interpret sensory data through a kind of machine perception,
labelling or clustering raw input. The patterns they recognize
are numerical, contained in vectors, into which all real-world
data, be it images, sound, text or time series, must be
translated. Neural networks help us cluster and classify.
Machine learning algorithms that use neural networks
generally do not need to be programmed with specific rules
that define what to expect from the input. The neural
network's learning algorithm instead learns from processing
many labeled examples (i.e., data with "answers") that are
supplied during training and using this answer key to learn
what characteristics of the input are needed to construct the
correct output.
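The remark that single neurons are weak but connected neurons can perform complex computations can be illustrated with a three-neuron network that computes XOR, a function that no single neuron of this kind can represent. The weights below are set by hand purely for illustration; in practice they would be learned from labeled examples as described above. The function names are hypothetical.

```python
def neuron(inputs, weights, bias):
    """A single unit: weighted sum of its inputs followed by a step activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


def xor_network(x1, x2):
    """Two hidden neurons (OR, NAND) feeding one output neuron (AND)."""
    h_or = neuron([x1, x2], [1.0, 1.0], -0.5)
    h_nand = neuron([x1, x2], [-1.0, -1.0], 1.5)
    return neuron([h_or, h_nand], [1.0, 1.0], -1.5)


for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_network(a, b))   # outputs 0, 1, 1, 0 in that order
```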
C. Support Vector Machines
Support Vector Machine (SVM) is a supervised machine
learning discriminative model, which follows the principle of
drawing a separating hyperplane with maximum safety space,
called the margin, to minimize the risk of flawed predictions.
It can be used for both classification and regression
challenges; however, it is mostly used in classification
problems. SVM is preferred by many because it produces
significant accuracy with less computation power. The goal of
the SVM algorithm is to create the best line or decision
boundary that can segregate n-dimensional space into classes,
so that a new data point can easily be put in the correct
category in the future. This best decision boundary is called
a hyperplane. SVM chooses the extreme points/vectors that help
in creating the hyperplane. These extreme cases are called
support vectors, and hence the algorithm is termed Support
Vector Machine.
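A maximum-margin linear separator can be approximated with plain sub-gradient descent on the regularised hinge loss, as in the minimal Python sketch below. It is a linear SVM only (no kernels), the two features are assumed to be already standardized around zero, and the data, learning rate, and regularisation strength are illustrative assumptions. Labels are +1 for phishing and -1 for legitimate.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=500):
    """Full-batch sub-gradient descent on the regularised hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]          # gradient of (lam / 2) * |w|^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:                       # point lies inside the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_predict(w, b, x):
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical standardized features; +1 = phishing, -1 = legitimate.
X = [[2, 1], [1, 2], [2, 2], [-2, -1], [-1, -2], [-2, -2]]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
print(svm_predict(w, b, [3, 3]), svm_predict(w, b, [-3, -3]))
```

In this toy data the points nearest the boundary, [2, 1], [1, 2] and their mirror images, play the role of the support vectors: they are the ones that keep contributing to the gradient once the remaining points are outside the margin.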
D. Logistic Regression
Logistic regression is one of the most popular Machine
Learning algorithms, which comes under the Supervised
Learning technique. It is used for predicting the categorical
dependent variable using a given set of independent
variables. Logistic regression is another technique borrowed
by machine learning from the field of statistics. It is the go-to
method for binary classification problems (problems with
two class values). Logistic regression is very similar to
linear regression except in how they are used. Linear
regression is used for solving regression problems,
whereas logistic regression is used for solving
classification problems. Logistic Regression is a significant
machine learning algorithm because it has the ability to
provide probabilities and classify new data using continuous
and discrete datasets. Logistic Regression can be used to
classify the observations using different types of data and can
easily determine the most effective variables used for the
classification.
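The way logistic regression turns a weighted sum of features into a class probability can be sketched with a small gradient-descent implementation. The two features (number of dots in the URL, and whether the URL contains an "@" symbol), the data values, the learning rate, and the number of epochs are hypothetical and chosen only to keep the example readable. Label 1 means phishing and 0 means legitimate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Full-batch gradient descent on the average log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict_proba(w, b, x):
    """Estimated probability that x belongs to class 1 (phishing)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical rows: [num_dots_in_url, has_at_symbol]
X = [[1, 0], [2, 0], [1, 0], [2, 0], [4, 1], [5, 1], [6, 0], [5, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(X, y)
print(predict_proba(w, b, [7, 1]) > 0.5)   # many dots and an "@": phishing side
print(predict_proba(w, b, [0, 0]) > 0.5)   # no warning signs: legitimate side
```

The learned weights in `w` indicate how strongly each feature pushes a sample towards the phishing class, which is one way to see which variables are most effective for the classification.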
VI. BLOCK DIAGRAM
Fig. 2. Block Diagram
A. Creating a fake website:
As part of a phishing attack, attackers create a fake website
which appears similar to the original website. They use the
main features of the original website, such as its logo and
design, so that users do not suspect the fake website.
B. Linking a fake website through email:
Once the fake website has been created, attackers send
thousands of e-mails to multiple users and induce the e-mail
recipients (users) to click a URL which redirects to the fake
website.
C. Clicking a malicious URL:
Users who are not aware that the URL provided in the e-mail
is malicious click it, which directs them to the fake
website provided by the attackers. This is where the phishing
attack begins.
D. Entering sensitive information:
Once the user is redirected to the fake website, the
sensitive information such as login credentials and other
details are entered by the user in order to access the website
created by the attacker.
E. Compiling the stolen data and using it:
Once the user enters the sensitive information, all the
sensitive data is collected so that the attacker can sell the data
or use it for his/her own purpose [4].
VII. ADVANTAGES
1. It does not depend on the phishing technique.
2. It can detect pharming attacks, which are
undetectable by many existing systems.
3. Some systems try to detect phishing webpages. Some
systems detect a phishing webpage when the user opens
a new webpage [7].
a) Hardware requirements: -
4 GB RAM
10 GB HDD
Intel Pentium processor, 1.66 GHz
b) Software requirements: -
Windows 7
Python 3.6.0
Visual Studio Code
Fig. 3. Emergence of new malicious attacks.
Fig. 4. Emergence of malicious attacks in India.
There has been a sharp increase in the ransomware family
and its variants; however, phishing and other attacks have
remained consistent, with no sharp rise. India, however,
has seen its fair share of attacks, with Maharashtra being the
state most targeted by phishing attacks. Users from Maharashtra
have also been attacked by malware of various types [4].
VIII. CONCLUSION
Although many methods exist to prevent phishing attacks,
phishing still spreads across the entire network and the web,
duping individuals, organizations, and society. This research
work aims to present a data mining method to construct a model
to protect against phishing attacks. The architectural model
provides a powerful approach to identifying phishing sites
that works effectively without imposing high overhead on the
browser. Identifying different features helped in grouping
e-mails into different clusters and in detecting the cluster
specially designed by the phishers.
The results obtained using Gemini as a browser extension for
Firefox, Chrome and Internet Explorer are shown as an
example. In conclusion, phishing attacks are a very dangerous
threat to individuals, organizations, and society. The
proposed work is a very efficient methodology, in terms of
complexity and overhead, for detecting phishing attacks.
REFERENCES
[1] Choon Lin Tan, Kang Leng Chiew, San Nah Sze, "Phishing
Webpage Detection Using Weighted URL Tokens for Identity
Keywords Retrieval", in the proceedings of 9th International
Conference on Robotic, Vision, Signal Processing and Power
Applications, pp. 133-139, Springer Singapore, 2017.
[2] U Gürtürk, M Baykara, M Karabatak, "Identifying the Visitors with
Data Mining Methods from Web Log Files", International Journal of
Emerging Technologies in Engineering Research (IJETER), 5(3),
243-249, 2017.
[3] B. Gupta, A. Tewari, A. K. Jain, and D. P. Agrawal, "Fighting against
phishing attacks: state of the art and future challenges," Neural
Computing and Applications, vol. 28, no. 12, pp. 3629-3654, 2017.
[4] A. Aleroud and L. Zhou, "Phishing environments, techniques, and
countermeasures: A survey," Computers & Security, vol. 68, pp. 160-196,
2017. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S0167404817300810.
[5] Dipesh Vaya, Sarika Khandelwal, Teena Habpawat, "A Review on
Visual Cryptography", International Journal of Computer
Applications, Volume 174 (Issue 05), ISSN: 0975-8887, September
2017.
[6] The biggest phishing attacks of 2018 and what companies can do to
prevent them in 2019, available at:
https://www.techrepublic.com/article/the-biggest-phishingattacks-of-2018-and-what-companies-can-do-to-prevent-themin-2
[7] P. Yi, Y. Guan, F. Zou, Y. Yao, W. Wang and T. Zhu, "Web Phishing
Detection Using a Deep Learning Framework", Wireless
Communications and Mobile Computing, vol. 2018, pp. 1-9, 2018.
[8] K. L. Chiew, J. S.-F. Choo, S. N. Sze and K. S. C. Yong, "Leverage
Website Favicon to Detect Phishing Websites", Security and
Communication Networks, vol. 2018, pp. 1-11, 2018.
[9] A. Tewari, A. K. Jain, and B. B. Gupta, "Recent survey of various
defense mechanisms against phishing attacks," Journal of
Information Privacy and Security, vol. 12, no. 1, pp. 3-13, 2016.
[10] A. K. Jain and B. B. Gupta, "A novel approach to protect against
phishing attacks at client side using auto-updated white-list,"
EURASIP Journal on Information Security, vol. 2016, article 9, 11
pages, 2016.
[11] M. Moghimi and A. Y. Varjani, "New rule-based phishing detection
method," Expert Systems with Applications, vol. 53, pp. 231-242,
2016.
[12] G. A. Montazer and S. Yarmohammadi, "Detection of phishing
attacks in Iranian e-banking using a fuzzy-rough hybrid system,"
Applied Soft Computing, vol. 35, pp. 482-492, 2015.
[13] A. Mishra and B. B. Gupta, "Hybrid solution to detect and filter
zero-day phishing attacks," in Proceedings of the Emerging Research in
Computing, Information, Communication and Applications (ERCICA
'14), Bangalore, India, August 2014.
[14] K. L. Chiew, E. H. Chang, S. N. Sze, and W. K. Tiong, "Utilisation of
website logo for phishing detection," Computers & Security, vol. 54,
pp. 16-26, 2015.
[15] K. Parsons, A. McCormac, M. Pattinson, M. Butavicius, and C.
Jerram, "The design of phishing studies: challenges for researchers,"
Computers & Security, vol. 52, pp. 194-206, 2015.