TrCSVM: a novel approach for the
classification of melanoma skin
cancer using transfer learning
Lokesh Singh, Rekh Ram Janghel and Satya Prakash Sahu
Department of Information Technology, National Institute of Technology,
Raipur, India
Abstract
Purpose: The study aims to cope with the problems confronted in the skin lesion datasets with less training
data toward the classification of melanoma. The vital, challenging issue is the insufficiency of training data that
occurred while classifying the lesions as melanoma and non-melanoma.
Design/methodology/approach: In this work, a transfer learning (TL) framework Transfer Constituent
Support Vector Machine (TrCSVM) is designed for melanoma classification based on feature-based domain
adaptation (FBDA) leveraging the support vector machine (SVM) and Transfer AdaBoost (TrAdaBoost). The
working of the framework is twofold: at first, SVM is utilized for domain adaptation for learning much
transferrable representation between source and target domain. In the first phase, for homogeneous domain
adaptation, it augments features by transforming the data from source and target (different but related)
domains in a shared-subspace. In the second phase, for heterogeneous domain adaptation, it leverages
knowledge by augmenting features from source to target (different and not related) domains to a shared-
subspace. Second, TrAdaBoost is utilized to adjust the weights of wrongly classified data in the newly
generated source and target datasets.
Findings: The experimental results empirically prove the superiority of TrCSVM over the state-of-the-art TL methods on less-sized datasets, with an accuracy of 98.82%.
Originality/value: Experiments are conducted on six skin lesion datasets and performance is compared
based on accuracy, precision, sensitivity and specificity. The effectiveness of TrCSVM is evaluated on ten
other datasets towards testing its generalizing behavior. Its performance is also compared with two existing TL
frameworks (TrResampling, TrAdaBoost) for the classification of melanoma.
Keywords Melanoma, Pigmented skin lesion, Transfer learning, Support vectors, TrAdaBoost, Domain
adaptation
Paper type Research paper
1. Introduction
Skin cancer has recently evolved into a fatal disease, and its diagnosis is therefore of keen interest to medical practitioners (Perkins et al., 2005). In the case of melanoma detection, the main aim is the automated classification of pigmented skin lesions as benign or malignant. To perform such classification, the first task is to collect pigmented skin lesion images with their respective labels for training the classification model. Because the distribution of lesion images differs across distinct skin lesion datasets, a huge amount of labeled data is needed to train classification models and maintain adequate classification performance. However, the process of labeling the data is quite expensive. To minimize this effort in melanoma detection, a learning model is needed that exploits knowledge from previously trained images to help learning on other lesion images. In such a context, transfer learning (TL) can save a meaningful amount of labeling effort (Lu et al., 2015). TL (Weiss et al., 2016) is a branch of machine learning that transfers useful knowledge from one domain, named the source, to a new domain, named the target (Yao and Doretto, 2010). The challenging issue is how to identify the useful knowledge of the source domain in view of the varying distributions and embed it into the target domain (Day and Khoshgoftaar, 2017; Wang and Deng, 2018). A classifier trained on the labeled data of one domain (source) cannot be applied to another domain (target) if the domains are different (Zhou and Tsang, 2019). In this scenario, domain adaptation comes to the rescue by leveraging knowledge from the
source domain toward improving the learning efficiency in the target domain (Liu et al., 2020; Whitaker, 2019). The setting of domain adaptation (Wang and Deng, 2018) is shown in Figure 1, which depicts the transformation of data between domains with the type of learning (inductive and transductive) used toward mapping the feature space.
Therefore, a TL framework is designed based on feature-based domain adaptation
(FBDA) using the support vector machine (SVM) and TrAdaBoost, which overcomes the
differences among the domains, so that a classifier trained on a source domain can generalize
well toward the target domain. For FBDA, SVM is used toward learning the augmented
feature subspace among the source and target domain to match the distributions. In other
words, informative support vectors are extracted from the source and target domain for
generating a new source training dataset and new target test dataset. Leveraging the domain
adaptation in the framework, knowledge obtained from the new source training dataset
(augmented feature subspace) is used to make the distribution closer to the new target test
datasets over the same and different domain mappings. Then boosting-based TL method,
namely TrAdaBoost, is utilized to fine-tune the weights of wrongly classified data on the
newly generated source and target domain. The reason for using AdaBoost is to focus more
on wrongly classified samples in every iteration. The work aims to cope with the challenging
problems confronted in skin lesion datasets with insufficient training data toward the
classification of melanoma as benign or malignant by incorporating FBDA in TL. The key
contributions and originality of the work are as follows:
(1) An FBDA-based TL framework, TrCSVM, is proposed leveraging SVM and TrAdaBoost, which deals with the challenging issues encountered in dermatoscopic skin lesion datasets with less training data toward the classification of melanoma as benign or malignant.
(2) Domain adaptation (homogeneous and heterogeneous) is leveraged utilizing SVM toward creating a new source training dataset and a new target testing dataset by augmenting features from the source and target domains, respectively, while TrAdaBoost is used for fine-tuning the weights: it reduces the weights of wrongly classified data in the new source domain and increases the weights of wrongly classified data in the new target domain.
(3) Conventional machine learning methods, namely SVM, decision tree (DT) and Naïve Bayes (NB), are employed as base learning models of TrCSVM to learn a classification model on the new source training dataset.
(4) The performance of TrCSVM is assessed on six dermoscopic pigmented skin lesion
datasets and ten other non-skin lesion benchmark datasets to test the generalizing
behavior of TrCSVM.
(5) The efficacy of TrCSVM in the classification of skin lesions is compared with two existing TL methods, namely TrResampling and TrAdaBoost, and its efficiency in handling the challenging issues confronted in the classification of melanoma is demonstrated.
Figure 1. Domain adaptation settings
(6) To the best of our knowledge, TrCSVM is proposed for the first time for the classification of dermoscopic pigmented skin lesion images; such an approach has not hitherto been reported in the literature.
The rest of this paper is structured as follows. The related work using TL approaches is discussed in Section 2. A brief overview of the ensemble-based TL methods employed in the experimentation is given in Section 3. The methodology carried out for the experiment is discussed elaborately in Section 4. Section 5 presents the experimental results along with the summarized dataset details and a comparison of the designed method with previous work. The discussion is presented in Section 6, and the paper is concluded in Section 7.
2. Related work
The studies reported in the literature, in general, use TL approaches toward solving the issue of insufficient data, where models trained for a specific source task are reused for a new target task. The studies using deep neural networks (DNNs) for detecting melanoma usually train the network from scratch or transfer knowledge from ImageNet using AlexNet (Hosny et al., 2018), VGG-16 (Ding et al., 2018), GoogLeNet (Kassem et al., 2020) and ResNet (Chaturvedi et al., 2020). Pegah et al. (Ahn et al., 2017) proposed a system for the segmentation and classification of skin lesions, which detects and segments the vessels into pigmented and non-pigmented lesions and reduces the presence of blood vessels in the region of skin lesions; the designed method clusters the hemoglobin component utilizing the k-means approach. Waheed et al. (2017) developed a machine learning-based model which discriminates the color and texture features of pigmented skin lesions for classifying melanoma as benign or malignant. The investigation conducted in Abhinav Sagar (2020) demonstrates the efficiency of DNNs in achieving accuracy more promising than that of medical experts using TL. Though DNNs work effectively on image classification tasks, the requirement for huge training data poses a difficulty for medical imaging (Liu et al., 2017a). The difficulty is overcome by introducing MelaNet, a DNN-based framework, in Zunair and Ben Hamza (2020) toward detecting melanoma. The working of MelaNet is twofold. First, toward balancing the training dataset, dermatoscopic images are created synthetically for the outnumbered (minority) class; the synthetic images are then utilized to boost training. Second, a DNN is trained by minimizing the focal loss function to help the classification model learn from hard samples. Melanoma is differentiated from the nevus lesion in Almaraz-Damian et al. (2020) by designing a novel computer-aided diagnostic (CAD) system. The developed system differs from conventional CAD systems by employing handcrafted features extracted from the ABCD rule together with deep learning features. TL is used as a feature extractor, and the features are fused utilizing a mutual information (MI) measure, which selects more significant features than the conventional systems. An automated system was designed in Hosny et al. (2019), which classifies pigmented skin lesions using TL and a DNN. TL was applied to AlexNet, whose weights were fine-tuned; the classification layer was replaced by a softmax layer, and the data were augmented using fixed and random rotation angles. This softmax layer classifies the lesions as melanoma or nevus appropriately. It has been indicated in research that CNNs (Naeem et al., 2020) classify pigmented lesion images comparably to dermatologists. Utilizing such a deep learning method, a novel system is designed in El-khatib et al. (2020) using multiple classifiers, where every classifier provides the decision system a specific weight, which assists the system in making the right decision. The system differentiates the pigmented lesions utilizing TL models such as NN, CNN, GoogleNet, NasNet-Large and ResNet-101 during training.
It has been observed from the related studies that effective and accurate melanoma
investigation with higher classification rates plays a vital role in classifying the pigmented
skin lesion. The majority of the work reported in the literature for melanoma detection using
TL is conducted either using deep learning models or by designing CAD systems. In the reported work, predictive modeling is performed on a different but related problem (same feature space). In comparison, our work bridges this gap by performing predictive modeling not only on the same feature space but also on a different feature space. Toward classifying melanoma, our work deals with the problem of small-sized skin lesion datasets by incorporating FBDA in TL so that a classifier trained on a source domain can generalize well toward the target domain.
3. Existing ensemble-based transfer learning methods
Boosting-based TL methods employ ensemble approaches over both the source and the target samples using an update mechanism, incorporating only those samples of the source domain that are useful for the classification of target-domain instances. Such mapping is performed by assigning higher weights to the source-domain samples that improve the training of the target, while the weights of samples that induce negative transfer are reduced (Liu et al., 2017b). In this work, an ensemble-based TL framework is designed, which utilizes
the boosting-based TL strategy, namely, TrAdaBoost. The performance of conducted
experiments is also assessed by comparing it with another ensemble-based TL method,
namely, TrResampling, to evaluate the effectiveness of a designed framework. These
approaches are used for adjusting the data of the source and target domain to utilize
informative samples for the better training of a classifier.
3.1 TrAdaBoost: boosting-based transfer learning algorithm
TrAdaBoost (Paper et al., 2013; Pan and Yang, 2010) is an ensemble TL methodology based on AdaBoost (Xu and Sun, 2012) that adaptively changes the weights of the source and target data. In every iteration, TrAdaBoost reduces the weights of mistakenly classified different-distribution (source) training samples by multiplying them by a factor smaller than one. Therefore, in the next iteration, the wrongly classified different-distribution training samples that are not similar to the same-distribution samples influence the learning procedure less than in the current iteration. After numerous iterations, the different-distribution training samples that resemble the same-distribution samples retain higher weights, while the different-distribution samples that are not similar to the same-distribution samples end up with lower weights. Thus, the higher-weighted samples help the learning methods train better classifiers (Dai et al., 2007).
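To make the re-weighting rule concrete, the following sketch implements a TrAdaBoost-style update loop with a scikit-learn decision stump as the weak learner. It is a minimal illustration under stated assumptions: the fixed source factor beta_src, the clipping of the target error rate and the stump base learner are choices made here for illustration, not taken from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tradaboost_weight_update(X_src, y_src, X_tgt, y_tgt, n_iter=20):
    """Minimal TrAdaBoost-style re-weighting sketch (after Dai et al., 2007)."""
    n_src, n_tgt = len(X_src), len(X_tgt)
    X = np.vstack([X_src, X_tgt])
    y = np.concatenate([y_src, y_tgt])
    w = np.ones(n_src + n_tgt) / (n_src + n_tgt)                      # initial weights
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_src) / n_iter))    # fixed source factor

    for _ in range(n_iter):
        clf = DecisionTreeClassifier(max_depth=1)                     # weak learner (assumption)
        clf.fit(X, y, sample_weight=w / w.sum())
        err = (clf.predict(X) != y).astype(float)

        # weighted error computed on the target (same-distribution) part only
        eps = np.sum(w[n_src:] * err[n_src:]) / np.sum(w[n_src:])
        eps = min(max(eps, 1e-10), 0.499)                             # keep beta_t well defined
        beta_t = eps / (1.0 - eps)

        # misclassified source samples lose weight ...
        w[:n_src] *= beta_src ** err[:n_src]
        # ... while misclassified target samples gain weight
        w[n_src:] *= beta_t ** (-err[n_src:])
    return w
```

In TrCSVM, as described later, this kind of update is applied to the newly generated source and target datasets rather than to the raw data.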
3.2 TrResampling: weighted resampling based transfer learning algorithm
TrResampling (Liu et al., 2017b) is a weighted-resampling-based TL methodology. In this method, a new source training set is generated from the actual source training set iteratively using weighted resampling. At first, weights are initialized randomly on the actual source set. Then instances with higher weights are chosen from the actual source set with high probability toward designing the new set. The process continues until the newly created source training set is equal in size to the actual source set. The labeled instances of the target dataset are then aggregated with the newly designed set, so that the new source training dataset comprises the higher-weighted instances obtained from the actual source set; thereafter, TrAdaBoost, a boosting-based TL strategy, is utilized to adjust the influence of the data while developing the model.
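The resampling step described above can be illustrated with a short sketch. It assumes NumPy arrays and uses numpy.random.choice for the weighted draw; the function name and the random weight initialization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def tr_resample(X_src, y_src, X_tgt_labeled, y_tgt_labeled, seed=None):
    """Weighted-resampling sketch in the spirit of TrResampling (Liu et al., 2017b)."""
    rng = np.random.default_rng(seed)
    n_src = len(X_src)

    # 1. initialise source weights randomly and normalise them to a distribution
    w = rng.random(n_src)
    w /= w.sum()

    # 2. draw instances with probability proportional to their weight until the
    #    new set is as large as the actual source set
    idx = rng.choice(n_src, size=n_src, replace=True, p=w)
    X_new, y_new = X_src[idx], y_src[idx]

    # 3. aggregate the labeled target instances to form the new source training set,
    #    which is then handed to TrAdaBoost for weight adjustment
    X_new = np.vstack([X_new, X_tgt_labeled])
    y_new = np.concatenate([y_new, y_tgt_labeled])
    return X_new, y_new
```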
4. Proposed methodology
4.1 Preparing actual source and target dataset
This section discusses how the source and target datasets are prepared for training and
testing, respectively, to deal with the problem of the small size of skin lesion dataset using TL
toward the classification of melanoma.
4.1.1 Actual source dataset. For preparing the actual source training dataset, we have used the ISIC-2017 challenge official dataset comprising 2,000 dermoscopic pigmented skin lesion images. We have applied rotation, vertical-horizontal flips, horizontal-vertical shear and zoom operations on the images and augmented the dataset to 50,000 images. After data augmentation, we have extracted 112 features based on shape, boundary irregularity, texture and color (Dalila et al., 2017). The augmented ISIC-2017 dataset thus becomes a feature set of size 50,000 × 112, which is used as the actual source training dataset in the framework and is further utilized to generate a new source training dataset using the SVM.
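A rough sketch of how such an augmented source feature set could be assembled is given below, using Keras' ImageDataGenerator for the rotation, flip, shear and zoom operations. The directory path, the parameter values and the placeholder feature extractor are assumptions, since the paper does not specify the implementation; the actual 112 descriptors follow Dalila et al. (2017).

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation operations named in the paper: rotation, flips, shear and zoom.
augmenter = ImageDataGenerator(
    rotation_range=40,          # exact parameter values are assumptions
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    vertical_flip=True,
)

# Stream augmented batches from a local copy of ISIC-2017 (hypothetical path).
flow = augmenter.flow_from_directory(
    "isic2017/train", target_size=(224, 224), batch_size=32, class_mode="binary"
)

def extract_112_features(batch_images):
    """Placeholder for the 112 shape / border / texture / colour descriptors."""
    return np.zeros((len(batch_images), 112))   # stand-in values only

# Build the 50,000 x 112 source feature matrix batch by batch.
features, labels = [], []
for _ in range(50_000 // flow.batch_size):
    imgs, lbls = next(flow)
    features.append(extract_112_features(imgs))
    labels.append(lbls)
X_source = np.vstack(features)       # shape approximately (50,000, 112)
y_source = np.concatenate(labels)
```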
4.1.2 Actual target dataset. We have used PH2, HAM10000, MED-NODE, Dermatology
Atlas, Dermnet Atlas and Dermis as target datasets for testing. Features of PH2 and
HAM10000 datasets are available publicly, while the rest of the datasets are available as
image datasets; therefore, 112 features are extracted for each dataset to prepare the actual
target datasets.
4.2 Generation of new source training dataset using feature-based domain adaptation
This section discusses the FBDA process utilizing the SVM in the proposed TL framework
toward generating the new source training dataset. In feature-based domain adaptation, an augmented feature space is learned between the source domain and target domain to match their distributions. The working of this process is twofold. In the first phase, the source and target belong to the same domain (same feature space), and informative features from both domains are extracted and augmented toward learning intermediate representations. The work conducted in the first phase is referred to as homogeneous learning for domain adaptation, owing to the interpolation among domains. In the second phase, the source and target belong to different domains (different feature spaces), and features are augmented toward matching the distributions of the different domains. This work is defined as heterogeneous learning for domain adaptation. In both phases, the features augmented from the two domains are referred to as the augmented feature space or the new source training dataset, which is learned between the new source and target datasets.
Figure 2 schematically shows the working of this process. The FBDA process can be defined
as follows: utilizing SVM, support vectors are extracted to generate a new target testing dataset NT_Tr and a new source training dataset NS_Tr from the actual source set S_Tr and target set T_Tr, such that |NS_Tr| ≤ |S_Tr|, while minimizing the risk of information loss, where

NS_Tr ⊆ {S_Tr, T_Tr}   (1)

In both phases, the ISIC-2017 skin lesion dataset is used as the actual source dataset, while the list of target datasets for testing is given in Table 2.
4.2.1 First iteration. In the first iteration, support vectors from the actual source S_Tr and target T_Tr are extracted as follows:

SSV_1 = SVM(S_Tr)   (2)
TSV_1 = SVM(T_Tr)   (3)

The first sets of support vectors of the source, SSV_1, and target, TSV_1, obtained in the first iteration are then deleted from the actual source set and actual target set, respectively, creating further training sets S_Tr2 and T_Tr2 as:

S_Tr2 = S_Tr - SSV_1   (4)
T_Tr2 = T_Tr - TSV_1   (5)
4.2.2 Second iteration. In the second iteration, SVM is again applied to S_Tr2 and T_Tr2, respectively, to extract another set of support vectors as:

SSV_2 = SVM(S_Tr2)   (6)
TSV_2 = SVM(T_Tr2)   (7)

Since not all the informative samples are extracted as support vectors in a single iteration, the procedure is repeated until the nth iteration, which minimizes the information loss by extracting promising support vectors only.
4.2.3 nth iteration. For the nth iteration, the extraction can be expressed as follows:

SSV_n = SVM(S_Trn)   (8)
TSV_n = SVM(T_Trn)   (9)

The support vectors obtained from the actual source and target domains in each iteration together make up the new source training dataset NS_Tr; after NS_Tr is constructed, the features left at the end in the old source dataset are discarded. The new source training dataset NS_Tr is defined as:

NS_Tr = Σ_{i=1}^{n} [SSV[i] + TSV[i]]   (10)
Figure 2. The process of feature-based domain adaptation
The new target testing dataset NT_Tr is designed by removing the support vectors of the target, obtained per iteration, from the old target dataset, i.e. after NS_Tr is constructed, the left-over features of the old target dataset make up the new target testing dataset as:

NT_Tr = Σ_{i=1}^{n} [T_Tr[i] - TSV[i]]   (11)

The process of extracting the support vectors from the source and target sets continues iteratively until the size of the new source training dataset NS_Tr becomes less than or equal to the size of the actual source training set S_Tr. The pseudo-code of the FBDA process is given in Algorithm 1, while the designed framework is described in Algorithm 2 and shown pictorially in Figure 3.
4.3 Transfer learning framework based on feature-based domain adaptation: TrCSVM
In this work, a TL framework, TrCSVM, is designed based on FBDA, utilizing SVM and the boosting-based TL method TrAdaBoost, so that a classifier trained on a source dataset generalizes well to the target dataset for classifying the pigmented skin lesions as benign or malignant. The new source training dataset NS_Tr and new target testing dataset NT_Tr are designed utilizing the constituent support vector method (CSVM), as discussed in Section 4.2. The TrAdaBoost method is then utilized to leverage the knowledge obtained from the new source training dataset. TrAdaBoost iteratively reduces the weights of misclassified samples in the new source domain set and increases the weights of incorrectly classified data in the new target domain set, giving more focus to the wrongly classified samples.
Algorithm 1: Feature-based domain adaptation using SVM
Input:
S_Tr = original source training dataset
T_Tr = original target training dataset
NS_Tr = new source training dataset
NT_Tr = new target testing dataset
SSV = support vectors of the source dataset
TSV = support vectors of the target dataset
NS_Tr = [ ]
Method:
1. Apply SVM on S_Tr and T_Tr to extract support vectors
2. Generate a new target testing dataset NT_Tr and a new source training dataset NS_Tr with at most the same size as S_Tr using the following steps:
3. i = 1
4. repeat {
5. SSV[i] = SVM(S_Tr[i])   // obtain support vectors from the actual source dataset by applying SVM
6. TSV[i] = SVM(T_Tr[i])   // obtain support vectors from the actual target dataset by applying SVM
7. NS_Tr = Σ_{i=1}^{n} [SSV[i] + TSV[i]]   // aggregate the support vectors of source and target to generate the new source training dataset
8. NT_Tr = Σ_{i=1}^{n} [T_Tr[i] - TSV[i]]   // remove the support vectors of the target from the old target dataset
9. if (|NS_Tr| ≤ |S_Tr|)   // check the size of the new source training dataset
10. i = i + 1; continue
11. else break
12. }
13. end
Output: the new source training dataset NS_Tr and the new target testing dataset NT_Tr
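A minimal Python sketch of Algorithm 1 is shown below, using the support_ attribute of scikit-learn's SVC to obtain the support-vector indices in each iteration. The kernel, the regularization constant and the exact stopping rule are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.svm import SVC

def constituent_support_vectors(X_src, y_src, X_tgt, y_tgt, kernel="linear", C=1.0):
    """FBDA sketch: iteratively pool source and target support vectors into a
    new source training set NS_Tr and leave the remaining target data as NT_Tr."""
    Xs, ys = X_src.copy(), y_src.copy()
    Xt, yt = X_tgt.copy(), y_tgt.copy()
    ns_X, ns_y = [], []

    while len(np.unique(ys)) > 1 and len(np.unique(yt)) > 1:
        sv_s = SVC(kernel=kernel, C=C).fit(Xs, ys).support_   # source support vectors
        sv_t = SVC(kernel=kernel, C=C).fit(Xt, yt).support_   # target support vectors

        ns_X += [Xs[sv_s], Xt[sv_t]]
        ns_y += [ys[sv_s], yt[sv_t]]

        # remove the extracted support vectors before the next iteration (eqs. 4-5)
        Xs, ys = np.delete(Xs, sv_s, axis=0), np.delete(ys, sv_s)
        Xt, yt = np.delete(Xt, sv_t, axis=0), np.delete(yt, sv_t)

        # stop once the pooled set reaches the size of the actual source set
        if sum(len(x) for x in ns_X) >= len(X_src):
            break

    NS_X, NS_y = np.vstack(ns_X), np.concatenate(ns_y)   # new source training set
    NT_X, NT_y = Xt, yt                                  # left-over target data = NT_Tr
    return NS_X, NS_y, NT_X, NT_y
```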
This updating process is based on the computation of the training error over the normalized weights of the target and employs a procedure adapted from the standard AdaBoost method. The weighted majority algorithm (WMA) fine-tunes the weights of source instances by iteratively reducing the weights of wrongly classified source samples at a constant ratio and keeping the current weights of properly classified source instances. The core concept is that the weights of source samples that are consistently misclassified converge toward zero, so these samples are effectively excluded from the output of the final classifier, because the classifier uses only the last N/2 boosting iterations for its output. Thus, utilizing the designed framework, a classification model is learned on the re-weighted labeled samples for classifying the pigmented skin lesions as benign or malignant. The key advantages of the designed framework, which differ from existing state-of-the-art methods, are as follows:
(1) An augmented feature space is learned between the source and target domains to match their distributions.
(2) The risk of information loss is minimized by utilizing SVM in the framework, which generates a new source training dataset by extracting informative support vectors from the source and target domains.
(3) Misclassified samples receive more focus in each iteration, as TrAdaBoost adjusts the weights of the training data.
(4) The risk of overfitting is decreased by aggregating several weak learners utilizing the
TrAdaBoost.
Figure 3. Proposed framework of TrCSVM
4.4 Other base learners
Table 1 summarizes the base learners, namely DT, SVM and NB, employed in the TrCSVM framework.
5. Experimental results
To illustrate the efficacy of the designed framework TrCSVM, experiments are conducted on six benchmark publicly available pigmented skin lesion datasets, namely PH2, HAM10000, MED-NODE, Dermatology Atlas, Dermnet Atlas and Dermis, and on ten other non-skin benchmark datasets to test the generalizing behavior of TrCSVM. The models are designed using the sklearn (Scikit-Learn, 2020), pandas (Pandas Pydata, 2020), matplotlib (Matplotlib, 2020), seaborn (Seaborn Pydata, 2020), numpy (NumPy, 2020) and glob (Techbeamers, 2020) libraries of Python. Models are run on an NVIDIA Quadro P4000 14-core GPU with 8 GB of graphics memory.
memory. The experiment aims to evaluate the effectiveness of a designed framework for
classifying pigmented skin lesions as benign or malignant.
Algorithm 2: TrCSVM Framework
Input:
X = feature space
Y = label space
Mapping function: f : X → Y
New source training dataset of n samples: NS_Tr = (x_i, y_i), i = 1, ..., n
New target testing dataset of m samples: NT_Tr = (x_j, y_j), j = 1, ..., m
The highest number of iterations: N
Base classification method: Learner
Weak models used for the boosting output: N/2
Method:
for t = 1 to N do
    Search for the candidate weak classifier for Y which reduces the error for NS_Tr
    Update the weights of the source through WMA, decreasing the weights of wrongly classified samples
    Update the weights of the target through AdaBoost, employing the target error rate (e_t)
    Normalize the weights for NS_Tr
end for
Output: the target classification
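Putting the two stages together, a rough end-to-end sketch of the TrCSVM workflow might look as follows. It reuses the hypothetical helpers sketched earlier (constituent_support_vectors and tradaboost_weight_update), uses random stand-in data in place of the real 112-feature matrices, and is not the authors' code.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Stand-in source and target feature matrices (112 features, binary labels).
rng = np.random.default_rng(0)
X_source, y_source = rng.random((500, 112)), rng.integers(0, 2, 500)
X_target, y_target = rng.random((100, 112)), rng.integers(0, 2, 100)

# Stage 1: feature-based domain adaptation (Algorithm 1) builds NS_Tr and NT_Tr.
NS_X, NS_y, NT_X, NT_y = constituent_support_vectors(X_source, y_source,
                                                     X_target, y_target)

# Stage 2: TrAdaBoost-style re-weighting (Algorithm 2) on the new datasets.
weights = tradaboost_weight_update(NS_X, NS_y, NT_X, NT_y, n_iter=20)

# Stage 3: train each base learner on the re-weighted new source data and
# evaluate it on the new target testing data.
base_learners = {
    "TrCSVM-SVM": SVC(kernel="linear"),
    "TrCSVM-DT": DecisionTreeClassifier(),
    "TrCSVM-NB": GaussianNB(),
}
for name, clf in base_learners.items():
    clf.fit(NS_X, NS_y, sample_weight=weights[: len(NS_X)])
    print(name, accuracy_score(NT_y, clf.predict(NT_X)))
```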
Table 1. Summarized details of the state-of-the-art base learners

1. Decision tree (DT): The decision tree aims to develop a model that can predict the target variable's value using decision rules. Decision trees are computationally fast to train and test and are suited to datasets with mixed attributes (Xia et al., 2017).
2. SVM: A supervised learning method that can be used for both classification and regression; generally, SVM is used in classification problems (Sisodia et al., 2010). It constructs a single hyperplane or a set of hyperplanes in a high-dimensional space for classification or for detecting outliers.
3. Naïve Bayes (NB): A probabilistic classifier based on Bayes' theorem with a strong assumption of independence between each pair of features.
5.1 Data acquisition
Table 2 discusses the summarized details of benchmark publicly available datasets employed
for assessing the effectiveness of the proposed TrCSVM framework. Experiments are
performed on a total of 16 benchmark datasets, out of which six are the pigmented skin lesion
datasets acquired from different sources. The remaining ten non-skin datasets are acquired from the UCI and KEEL repositories, which have already been validated in Liu et al. (2017b), Liu and Zhang (2015) and Liu et al. (2015) as well. The datasets are acquired from distinct sources and are organized into two categories, i.e. binary and multiclass, with different characteristics. The datasets differ in category, features, instances and classes. Some datasets have no missing values, while four of the 16 datasets contain missing values, as stated in Table 2. Since these datasets contain very few missing values, we have deleted the rows with missing data to avoid complexity and to reduce the computation time.
Table 2. Summarized details of the benchmark less-sized datasets

Dataset category | Dataset | Source/reference | Class type | Instances | Features | Classes | Missing values
Skin lesion | PH2 (Mendonça et al., 2015) | ADDI (2020) | Multiclass | 200 | 15 | 03 | No
Skin lesion | HAM10000 (Tschandl et al., 2018) | ViDIR (Dataverse, 2020) | Multiclass | 10,015 | 192 | 07 | No
Skin lesion | MED-NODE (Giotis et al., 2015) | MED-NODE (Rug, 2020) | Binary class | 140 | 112 | 02 | No
Skin lesion | Dermatology Atlas (Kim et al., 2004) | Derm. Atlas (Dermoscopy Atlas, 2020) | Multiclass | 250 | 112 | 03 | No
Skin lesion | Dermnet Atlas (Liao et al., 2016) | Drmnt. Atlas (Dermnet, 2020) | Binary class | 180 | 112 | 02 | No
Skin lesion | Dermis (Xu et al., 2018) | Dermis (2020) | Binary class | 210 | 112 | 02 | No
Other (non-skin lesion) | Heart-C | KEEL (2020) | Multiclass | 303 | 13 | 05 | Yes
Other (non-skin lesion) | Heart-Statlog | KEEL (2020) | Binary class | 270 | 13 | 02 | No
Other (non-skin lesion) | Hepatitis | KEEL (2020) | Binary class | 155 | 19 | 02 | No
Other (non-skin lesion) | Iris | KEEL (2020) | Multiclass | 150 | 04 | 03 | No
Other (non-skin lesion) | Letter | KEEL (2020) | Multiclass | 20,000 | 16 | 26 | No
Other (non-skin lesion) | Mushroom | KEEL (2020) | Binary class | 8,124 | 22 | 02 | Yes
Other (non-skin lesion) | Diabetes | UCI (2020) | Multiclass | 30,201 | 04 | 20 | No
Other (non-skin lesion) | Segment | UCI (2020) | Multiclass | 2,310 | 19 | 07 | No
Other (non-skin lesion) | Sick | UCI (2020) | Binary class | 2,800 | 29 | 02 | Yes
Other (non-skin lesion) | Soybean | UCI (2020) | Multiclass | 307 | 35 | 19 | Yes

5.2 Homogeneous domain adaptive setting
The available pigmented skin lesion target datasets are drawn from the same domain; they are related but do not exactly match. Therefore, homogeneous domain adaptation TL is used in this section for building an effective model toward the target domain, as long as the input feature space remains the same. In TrCSVM, three machine learning classifiers, namely DT, NB and SVM, are required as base learners by TrAdaBoost. Ten-fold cross-validation is used for
proper error estimation of the employed base learners. Table 3 reports the results of the base learners used in the framework: TrCSVM with SVM outperforms the other base learners on most skin datasets, while TrCSVM with NB and TrCSVM with DT perform best on the Dermatology Atlas and HAM10000 datasets, respectively.
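The ten-fold error estimation of the base learners can be reproduced in outline with scikit-learn's cross_val_score, as sketched below; the feature matrix and labels are random stand-ins for the 112-feature skin lesion data of any one dataset.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

# X, y stand in for the 112-feature matrix and labels of one skin lesion dataset.
rng = np.random.default_rng(0)
X, y = rng.random((200, 112)), rng.integers(0, 2, 200)   # illustrative stand-ins

for name, clf in [("SVM", SVC(kernel="linear")),
                  ("DT", DecisionTreeClassifier()),
                  ("NB", GaussianNB())]:
    scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")  # ten-fold CV
    print(f"{name}: mean accuracy = {scores.mean():.4f} (+/- {scores.std():.4f})")
```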
As observed from the average values in Table 3, SVM is the best-performing base learner among the base classifiers for handling the un-weighted training samples to the advantage of TrCSVM.
To better demonstrate the TL ability of the designed methodology, we considered 3%, 10%, 30% and 50% labeled target data of the skin lesion datasets, respectively. It has been observed from Table 4 that TrCSVM demonstrates good transferability at 3% labeled target data, i.e. with less than 10% labeled data, in terms of accuracy, precision, sensitivity and specificity. The results indicate that the TL of TrCSVM benefits from less labeled target training data, i.e. TrCSVM performs best when the labeled target training data is as low as 3%. Figure 4 gives a visual illustration of the TL ability of TrCSVM on 3%, 10%, 30% and 50% labeled target data on the skin lesion datasets based on accuracy. The figure clearly illustrates the powerful TL ability of TrCSVM on the PH2 dataset at 3% labeled target data.
5.3 Heterogeneous domain adaptive setting
The experiments are conducted toward comparing the performance of TrCSVM with the
existing TL methods under the different domain target data. For experimentation,
heterogeneous domain adaptation TL is used where the task remains the same while the
feature spaces of target and source domain differ. Since the employed other (non-skin)
datasets belong to a different domain, they are not related and do not exactly match.
Therefore, heterogeneous domain adaptation TL is used in this section for building an
effective model under the varying dimensionality of the target dataset.
In order to evaluate the effectiveness of the proposed TL framework TrCSVM, it is compared with the existing TL frameworks TrResampling (Liu et al., 2017b) and TrAdaBoost (Wang and Pineau, 2015) on ten other non-skin datasets. Table 5 compares the TL ability of the proposed TrCSVM with the existing TrResampling and TrAdaBoost at 3%, 10% and 30% labeled target data on ten other datasets acquired from the UCI and KEEL repositories. From Table 5, it has been observed that TrCSVM provides promising accuracy at 3% labeled target data on eight of the ten datasets, while the Heart-C and Soybean datasets achieve better accuracy when the labeled target data are more than 3%, i.e. at 10% and 30%, respectively. The average performances of TrAdaBoost and TrResampling are found to be almost the same at 3% labeled target data, and on the Iris and Mushroom datasets they achieve exactly the same accuracy. The average performances of TrAdaBoost and TrResampling are likewise almost the same at 10% and 30% labeled target data on all non-skin datasets. The average performance of TrCSVM is found to be superior to that of TrAdaBoost and TrResampling on all other non-skin datasets with 10% and 30% labeled target data.
Table 3. Accuracy (%) of distinct base learners of TrCSVM

Skin lesion dataset | TrCSVM-SVM | TrCSVM-DT | TrCSVM-NB
MED-NODE | 89.46 | 88.35 | 87.12
PH2 | 98.82 | 96.64 | 96.80
HAM10000 | 82.23 | 82.68 | 81.83
Dermatology Atlas | 82.14 | 83.27 | 84.91
Dermnet Atlas | 79.24 | 78.28 | 77.13
Dermis | 87.74 | 86.53 | 85.25
Average | 86.60 | 85.95 | 85.50
Table 4. Evaluating TrCSVM using SVM as a base classifier on labeled target data over skin lesion datasets (values per cell: Ac/Pr/Se/Sp; Ac = accuracy, Pr = precision, Se = sensitivity, Sp = specificity)

Skin lesion dataset | 3% labeled | 10% labeled | 30% labeled | 50% labeled
MED-NODE | 89.5/88.2/89.1/89.2 | 87.3/88.2/86.7/85.3 | 84.8/82.4/83.6/82.6 | 81.6/81.9/80.2/80.2
HAM10000 | 82.2/80.5/81.8/81.1 | 81.9/80.6/80.0/80.9 | 77.2/78.1/76.7/76.4 | 71.6/70.8/69.8/70.3
Dermatology Atlas | 82.1/81.3/81.8/82.2 | 79.7/78.7/77.9/77.3 | 75.3/74.1/72.8/73.8 | 70.9/68.3/69.8/69.9
Dermnet Atlas | 79.2/79.5/77.9/77.3 | 74.2/73.2/72.8/71.9 | 69.5/68.7/65.3/69.0 | 65.3/63.3/64.3/64.7
Dermis | 87.7/86.7/86.2/85.7 | 86.4/84.9/85.4/85.8 | 82.2/79.9/81.7/81.3 | 78.4/77.4/76.6/76.3
PH2 | 98.8/97.5/97.3/96.6 | 98.1/96.9/97.2/96.5 | 95.8/94.3/94.7/93.2 | 92.3/90.2/91.3/91.5
5.4 Statistical hypothesis test
Further, we applied a non-parametric statistical test to compare the performances of TrCSVM, TrResampling and TrAdaBoost. Friedman's non-parametric statistical test is used for assessing the overall performance of the methods based on accuracy.
Friedman's non-parametric test with the Iman-Davenport extension (Stąpor, 2017; Hollander and Wolfe, 2013) is defined as follows. Let T_xy be the rank of the yth of M classifiers on the xth of N datasets, and let

T_y = (1/N) Σ_{x=1}^{N} T_xy

where T_y is the mean rank of the yth classifier. The Friedman test thereafter compares the mean ranks of the classifiers based on the test statistic

F_F = (N - 1) χ²_F / [N(M - 1) - χ²_F]

where

χ²_F = [12N / (M(M + 1))] [Σ_{y=1}^{M} T_y² - M(M + 1)²/4]
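The test can be run directly with scipy.stats.friedmanchisquare, as sketched below using the accuracies from the 3% column of Table 5; the Iman-Davenport statistic is then computed from the formula above. The p-value obtained this way may differ slightly from the value reported in Table 5, which may be based on the Iman-Davenport F distribution.

```python
from scipy.stats import friedmanchisquare

# Accuracy at 3% labeled target data (Table 5), one list per method.
trcsvm = [72.34, 78.54, 75.64, 90.12, 68.47, 64.92, 98.12, 84.63, 98.18, 61.26]
trres  = [70.90, 79.67, 73.21, 87.05, 67.97, 62.28, 97.76, 81.69, 95.71, 63.94]
tradb  = [69.87, 81.93, 69.70, 85.45, 67.97, 62.37, 97.76, 78.76, 96.04, 71.89]

chi2, p_value = friedmanchisquare(trcsvm, trres, tradb)
print(f"Friedman chi-square = {chi2:.3f}, p = {p_value:.6f}")

# Iman-Davenport extension of the Friedman statistic, as in the formula above.
N, M = len(trcsvm), 3                        # N datasets, M classifiers
ff = (N - 1) * chi2 / (N * (M - 1) - chi2)
print(f"Iman-Davenport F = {ff:.3f}")
```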
Table 5 reports the experimental results of the Friedman test, which show the higher ranking of TrCSVM compared with TrResampling and TrAdaBoost with respect to the ratio of labeled target data. The p-values obtained from the Friedman test, shown in Table 5, indicate significant differences between TrCSVM and both TrAdaBoost and TrResampling, which confirms the superiority of TrCSVM on the employed datasets.

Table 5. Comparison of the proposed TrCSVM with other techniques based on accuracy (%) on labeled target data on the other datasets (values per cell: TrCSVM/TrRes/TrAdB; TrRes = TrResampling, TrAdB = TrAdaBoost)

Other dataset | 3% labeled | 10% labeled | 30% labeled
Diabetes | 72.34/70.90/69.87 | 69.67/66.46/65.86 | 68.46/67.04/67.09
Heart-C | 78.54/79.67/81.93 | 84.15/82.20/80.71 | 89.78/86.71/86.53
Heart-Statlog | 75.64/73.21/69.70 | 78.27/75.11/74.30 | 78.65/77.57/77.98
Hepatitis | 90.12/87.05/85.45 | 87.26/86.91/85.92 | 88.63/86.82/83.72
Iris | 68.47/67.97/67.97 | 84.18/81.10/79.09 | 92.11/88.59/88.21
Letter | 64.92/62.28/62.37 | 70.12/70.75/71.23 | 81.28/79.09/79.06
Mushroom | 98.12/97.76/97.76 | 99.12/98.88/98.99 | 99.99/99.97/99.97
Segment | 84.63/81.69/78.76 | 95.28/91.58/91.58 | 94.28/93.28/92.88
Sick | 98.18/95.71/96.04 | 98.29/96.37/95.65 | 98.67/97.31/97.28
Soybean | 61.26/63.94/71.89 | 85.19/84.94/81.06 | 91.24/90.57/90.53
Average | 79.22/78.01/78.17 | 85.15/83.43/82.43 | 88.30/86.69/86.32
Rank | 1/3/2 | 1/2/3 | 1/2/3
p-value | 0.000000011 | 0.00000149 | 0.000000143

Figure 4. Performance of TrCSVM on 3%, 10%, 30% and 50% labeled target data on skin lesion datasets based on accuracy
6. Discussion
In this work, we have designed an FBDA-based TL framework utilizing SVM and TrAdaBoost toward melanoma classification. The developed domain-adaptive TL framework overcomes the variations between the source and target domains so that a classifier trained on one domain (source) generalizes well to the other domain (target). Our work learns an augmented feature subspace in which the domain-adaptive settings leverage TL toward learning transferable representations by incorporating domain adaptation into the TL pipeline. Compared with TrResampling and TrAdaBoost, TrCSVM has achieved superior classification performance in this work. Unlike TrResampling and TrAdaBoost, TrCSVM minimizes the risk of information loss by utilizing SVM to extract useful support vectors from the source and target domains. The designed framework not only fine-tunes the weights of the training data while focusing more on misclassified samples per iteration but also decreases the risk of overfitting and reduces the generalization error, even after n iterations when the training error has reached zero. The work strongly implies the better TL ability of the designed framework toward classifying the pigmented skin lesions as benign or malignant. A comparative study of the proposed work with existing work for the classification of melanoma is reported in Table 6. The results in Table 6 illustrate the superior TL ability of TrCSVM in terms of accuracy toward melanoma classification.
7. Conclusion
The work aims to design a TL framework based on FBDA to cope with the challenging issues confronted in skin lesion datasets with less training data toward classifying melanoma as benign or malignant. The work comprises a newly designed TL framework, namely TrCSVM, utilizing SVM and the boosting-based TL method TrAdaBoost. In this work, a domain-adaptive approach is designed utilizing SVM that generates a new source training dataset by feature augmentation from both domains while minimizing the risk of information loss. Our aim in utilizing SVM for generating an augmented feature subspace is based on its de facto ability to find informative samples, i.e. support vectors, from the source and target domains. The boosting-based TL method TrAdaBoost is then utilized in the framework for fine-tuning the weights of wrongly classified data in the source and target domains. The results empirically prove the superior TL ability of the proposed framework, compared with existing techniques, not merely on six pigmented skin lesion datasets but also on ten other datasets, with improved classification performance. Thus, utilizing the designed framework, a new classification model is learned on the re-weighted labeled samples for classifying the pigmented skin lesions as benign or malignant.

Table 6. A comparative study of the proposed work with existing work for the classification of melanoma

S.N. | Reference | Classification method | Accuracy | Specificity | Sensitivity | Precision
1 | Uddin and Bansal (2020) | DenseNet201 + SVM | 92 | - | - | -
2 | Salido (2018) | AlexNet | 93 | 91 | 91 | -
3 | Rodrigues et al. (2020) | DenseNet201 + KNN | 93.16 | 93.15 | 93.16 | 93.25
4 | Saad et al. (2019) | ResNet-50 | 95 | 80 | 93.24 | 93.75
5 | Hosny et al. (2018) | AlexNet | 98.61 | 98.93 | 98.33 | 97.73
6 | Akram et al. (2020) | ECNCA | 98.8 | 97.45 | 97 | 99
7 | Proposed method | TrCSVM | 98.82 | 98.91 | 98.12 | 96.68
References
Abhinav Sagar, D.J. (2020), Convolutional neural networks for classifying melanoma images,
bioRxiv, pp. 1-12.
ADDI (2020), ADDI - automatic computer-based diagnosis system for dermoscopy images, available
at: https://www.fc.up.pt/addi/ph2_database.html (accessed 13 May 2020).
Ahn, E., Kim, J., Bi, L., Kumar, A., Li, C., Fulham, M. and Feng, D.D. (2017), Saliency - based lesion
segmentation via background detection in dermoscopic images,IEEE Journal of Biomedical
and Health Informatics, Vol. 21 No. 6, pp. 1685-1693.
Akram, T., Lodhi, H.M., Naqvi, S.R., Naeem, S., Alhaisoni, M., Ali, M., Haider, S. and Qadri, N.N. (2020),
A multilevel features selection framework for skin lesion classification,Human-Centric
Computing and Information Sciences, Vol. 10, pp. 1-26.
Almaraz-Damian, J.A., Ponomaryov, V., Sadovnychiy, S. and Castillejos-Fernandez, H. (2020),
Melanoma and nevus skin lesion classification using handcraft and deep learning feature
fusion via mutual information measures,Entropy, Vol. 22 No. 4, pp. 1-23, doi: 10.3390/
E22040484.
Chaturvedi, S.S., Gupta, K. and Prasad, P.S. (2020), Skin lesion analyser: an efficient seven-way multi-
class skin cancer classification using MobileNet,International Conference on Advanced
Machine Learning Technologies and Applications (AMLTA), Springer, Singapore, pp. 1-12.
Dai, W., Yang, Q., Xue, G.R. and Yu, Y. (2007), Boosting for transfer learning,ACM International
Conference Proceeding Series, Vol. 227, pp. 193-200, doi: 10.1145/1273496.1273521.
Dalila, F., Zohra, A., Reda, K. and Hocine, C. (2017), Segmentation and classification of melanoma and
benign skin lesions,Optik, Vol. 140, pp. 749-761, doi: 10.1016/j.ijleo.2017.04.084.
Dataverse (2020), The HAM10000 dataset, a large collection of multi-source dermatoscopic images of
common pigmented skin lesions - ViDIR Dataverse, available at: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DBW86T (accessed 13 May 2020).
Day, O. and Khoshgoftaar, T.M. (2017), A survey on heterogeneous transfer learning,Journal of Big
Data, Vol. 29 No. 4, pp. 1-42, doi: 10.1186/s40537-017-0089-0.
Dermis (2020), DermIS, available at: https://www.dermis.net/dermisroot/en/home/index.htm
(accessed 13 May 2020).
Dermnet (2020), Dermatology education | just another WordPress site, available at: http://www.
dermnet.com/ (accessed 13 May 2020).
Dermoscopy Atlas (2020), Dermoscopy atlas, available at: http://www.dermoscopyatlas.com/
(accessed 13 May 2020).
Ding, P., Zhang, Y., Deng, W., Jia, P. and Kuijper, A. (2018), A light and faster regional convolutional
neural network for object detection in optical remote sensing images,ISPRS Journal of
Photogrammetry and Remote Sensing, Vol. 14, 1 June 2017, pp. 208-218, doi: 10.1016/j.isprsjprs.
2018.05.005.
El-khatib, H., Popescu, D. and Ichim, L. (2020), Deep learning based methods for automatic
diagnosis of skin lesions,Sensors, Vol. 20, No. 6, pp. 25-30, doi: 10.3390/s20061753.
Giotis, I., Molders, N., Land, S., Biehl, M., Jonkman, M.F. and Petkov, N. (2015), MED-NODE: a
computer-assisted melanoma diagnosis system using non-dermoscopic images,Expert Systems
with Applications, Vol. 42 No. 19, pp. 6578-6585, doi: 10.1016/j.eswa.2015.04.034.
Hollander, M. and Wolfe, D.A. (2013), Nonparametric Statistical Methods, John Wiley and Sons,
New York.
Hosny, K.M., Kassem, M.A. and Foaud, M.M. (2018), Skin cancer classification using deep learning
and transfer learning,2018 9th Cairo International Biomedical Engineering Conference,
CIBEC, pp. 90-93.
Hosny, K.M., Kassem, M.A. and Foaud, M.M. (2019), Classification of skin lesions using transfer
learning and augmentation with Alex-net,PLoS One, Vol. 14 No. 5, pp. 1-17, doi: 10.1371/
journal.pone.0217293.
Kassem, M.A., Hosny, K.M. and Fouad, M.M. (2020), Skin lesions classification into eight classes for
ISIC 2019 using deep convolutional neural network and transfer learning,IEEE Access, Vol. 8,
pp. 114822-114832, doi: 10.1109/ACCESS.2020.3003890.
KEEL (2020), KEEL: a software tool to assess evolutionary algorithms for data mining problems
(regression, classification, clustering, pattern mining and so on), available at: https://sci2s.ugr.
es/keel/datasets.php (accessed 13 May 2020).
Kim, G.R., Aronson, A.R., Mork, J.G., Cohen, B.A. and Lehmann, C.U. (2004), Application of a medical
text indexer to an online dermatology atlas,Medinfo, Vol. 11 No. Pt 1, pp. 287-291.
Liao, H., Li, Y. and Luo, J. (2016), Skin disease classification versus skin lesion characterization:
achieving robust diagnosis using multi-label deep neural networks,Proceedings of
International Conference on Pattern Recognition, pp. 355-360, doi: 10.1109/ICPR.2016.7899659.
Liu, X. and Zhang, H. (2015), Bagging based ensemble transfer learning,Journal of Ambient
Intelligence and Humanized Computing, Vol. 7, pp. 29-36 doi: 10.1007/s12652-015-0296-5.
Liu, X., Wang, G., Cai, Z. and Zhang, H. (2015), A multiboosting based transfer learning algorithm,
Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol. 19 No. 3,
pp. 381-388.
Liu, L., Yan, R., Maruvanchery, V., Kayacan, E., Chen, I. and Tiong, L.K. (2017), Transfer learning on
convolutional activation feature as applied to a building quality assessment robot,
International Journal of Advanced Robotic Systems, Vol. 3 No. 14, pp. 1-12, doi: 10.1177/
1729881417712620.
Liu, X., Liu, Z., Wang, G., Cai, Z. and Zhang, H. (2017), Ensemble transfer learning algorithm,IEEE
Access, Vol. 6, pp. 2389-2396, doi: 10.1109/ACCESS.2017.2782884.
Liu, F., Member, S., Zhang, G. and Lu, J. (2020), Heterogeneous domain adaptation: an unsupervised
approach,IEEE Transactions on Neural Networks and Learning Systems, pp. 1-15.
Lu, J., Behbood, V., Hao, P., Zuo, H., Xue, S. and Zhang, G. (2015), Transfer learning using
computational intelligence: a survey,Knowledge-Based Systems, Vol. 80, pp. 14-23.
Matplotlib (2020), Matplotlib 3.1.2 documentation, available at: https://matplotlib.org/3.1.1/tutorials/
index.html (accessed 23 Mar 2020).
Mendonca, T.F., Celebi, M.E., Mendonca, T. and Marques, J.S. (2015), PH2: a public database for the
analysis of dermoscopic images,Dermoscopy Image Analysis, April 2015, pp. 1-24, doi: 10.1201/
b19107-14.
Naeem, A., Farooq, M.S., Khelifi, A. and Abid, A. (2020), Malignant melanoma classification using
deep learning: datasets, performance measurements, challenges and opportunities,IEEE
Access, Vol. 8, pp. 110575-110597, doi: 10.1109/ACCESS.2020.3001507.
Numpy (2020), NumPy NumPy, available at: https://numpy.org/ (accessed 23 March 2020).
Pan, S.J. and Yang, Q. (2010), A survey on transfer learning,IEEE Transactions on Knowledge and
Data Engineering, Vol. 22 No. 10, pp. 1345-1359, doi: 10.1109/TKDE.2009.191.
Pandas Pydata (2020), Pandas Python data analysis library, available at: https://pandas.pydata.
org/ (accessed 23 March 2020).
Paper, C., Yao, Y. and Doretto, G. (2013), Boosting for transfer learning with multiple sources,
December, doi: 10.1109/CVPR.2010.5539857.
Perkins, J.L., Liu, Y., Mitby, P.A., Neglia, J.P., Hammond, S., Stovall, M., Meadows, A.T., Hutchinson,
R., Dreyer, Z., Robison, L.L. and others (2005), Nonmelanoma skin cancer in survivors of
childhood and adolescent cancer: a report from the Childhood Cancer Survivor Study,Journal
of Clinical Oncology, Vol. 23 No. 16, pp. 3733-3741, doi: 10.1200/JCO.2005.06.237.
Rodrigues, D.d.A., Ivo, R.F., Satapathy, S.C., Wang, S., Hemanth, J. and Rebouças Filho, P.P. (2020), A
new approach for classification skin lesion based on transfer learning, deep learning, and IoT
system,Pattern Recognition Letters, pp. 8-15.
Rug (2020), Dermatology database used in MED-NODE, available at: http://www.cs.rug.nl/
imaging/databases/melanoma_naevi/ (accessed 13 May 2020).
Saad, M., Islam, S.M.R. and Fazal, F.B. (2019), Deep residual network-based melanocytic lesion
classification with transfer learning,2019 5th International Conference on Advances in
Electronics Engineering (ICAEE 2019), pp. 160-164, doi: 10.1109/ICAEE48663.2019.8975418.
Salido, J.A.A. and Ruiz, C. Jr. (2018), Using deep learning for melanoma detection in dermoscopy
images,International Journal of Machine Learning and Computing, Vol. 8 No. 1, pp. 61-68,
doi: 10.18178/ijmlc.2018.8.1.664.
Scikit-Learn (2020), Scikit-learn: machine learning in Python, scikit-learn 0.22.2 documentation,
available at: https://scikit-learn.org/stable/ (accessed 23 March 2020).
Seaborn Pydata (2020), An introduction to seaborn seaborn 0.10.0 documentation, available at:
https://seaborn.pydata.org/introduction.html (accessed 23 March 2020).
Sisodia, D., Shrivastava, S.K. and Jain, R.C. (2010), ISVM for face recognition,Proceedings 2010
International Conference on Computational Intelligence and Communication Networks, CICN,
pp. 554-559, doi: 10.1109/CICN.2010.109.
Stąpor, K. (2017), Evaluation of classifiers: current methods and future research directions, Annals of
Computer Science and Information Systems, Vol. 12, pp. 37-40, doi: 10.15439/2017F530.
Techbeamers (2020), Python glob module - Glob() method explained with examples, available at:
https://www.techbeamers.com/python-glob/ (accessed 23 March 2020).
Tschandl, P., Rosendahl, C. and Kittler, H. (2018), Data descriptor: the HAM10000 dataset, a large
collection of multi-source dermatoscopic images of common pigmented skin lesions,Scientific
Data, Vol. 5, pp. 1-9, doi: 10.1038/sdata.2018.161.
UCI (2020), UCI machine learning repository: data sets, available at: https://archive.ics.uci.edu/ml/
datasets.php (accessed 13 May 2020).
Uddin, M.S. and Bansal, J.C. (2020), Automatic skin lesion segmentation and melanoma detection:
transfer learning approach with U-net and DCNN-SVM,Studies in Computational Intelligence,
Vol. 669, December, pp. 1-481, doi: 10.1007/978-981-13-7564-4.
Waheed, Z., Waheed, A., Zafar, M. and Riaz, F. (2017), An efficient machine learning approach for the
detection of melanoma using dermoscopic images,International Conference on
Communication, Computing and Digital Systems (C-CODE), pp. 316-319.
Wang, M. and Deng, W. (2018), Deep visual domain Adaptation : a survey,Neurocomputing,
Vol. 312, pp. 135-153.
Wang, B. and Pineau, J. (2015), Online boosting algorithms for anytime transfer and multitask
learning,Proceedings of the National Conference on Artificial Intelligence, Vol. 4, pp. 3038-3044.
Weiss, K., Khoshgoftaar, T.M. and Wang, D. (2016), A survey of transfer learning,Journal of Big
Data, Vol. 3, pp. 1-40, doi: 10.1186/s40537-016-0043-6.
Whitaker, U.K.L. (2019), Transfer learning: domain adaptation,Deep Learning for NLP and Speech
Recognition, pp. 495-535.
Xia, Y., Liu, C., Li, Y.Y. and Liu, N. (2017), A boosted decision tree approach using Bayesian hyper-
parameter optimization for credit scoring,Expert Systems with Applications, Vol. 78,
pp. 225-241, doi: 10.1016/j.eswa.2017.02.017.
Xu, Z. and Sun, S. (2012), Multi-source transfer learning with multi-view adaboost,International
Conference on Neural Information Processing, pp. 332-339.
Xu, H., Lu, C., Berendt, R., Jha, N. and Mandal, M. (2018), Automated analysis and classification of
melanocytic tumor on skin whole slide images,Computerized Medical Imaging and Graphics,
Vol. 66, June 2017, pp. 124-134, doi: 10.1016/j.compmedimag.2018.01.008.
Yao, Y. and Doretto, G. (2010) Boosting for transfer learning with multiple sources,Proceedings of
the IEEE Computer Society Conference on Computer Vision and Pattern Recognition,
pp. 1855-1862, doi: 10.1109/CVPR.2010.5539857.
Zhou, J.T. and Tsang, I.W. (2019), Multi-class heterogeneous domain adaptation,Journal of Machine
Learning Research, Vol. 20, pp. 1-31.
Zunair, H. and Ben Hamza, A. (2020), Melanoma detection using adversarial training and deep
transfer learning,Physics in Medicine and Biology, Nos 1-11, pp. 1-12.
Further reading
Codella, N.C.F., Gutman, D., Celebi, M.F., Helba, B., Marchetti, M.A., Dusza, S.W., Kalloo, A., Liopyris,
K., Mishra, N., Kittler, N. and Halpern, A. (2018), Skin lesion analysis toward melanoma
detection: a challenge at the international symposium on biomedical imaging (ISBI) 2016,
hosted by the international skin imaging collaboration (ISIC),Proceedings of the International
Symposium on Biomedical Imaging, 2018-April, pp. 168-172, doi: 10.1109/ISBI.2018.8363547.
Li, Y. and Shen, L. (2018), Skin lesion analysis towards melanoma detection using deep learning
network,Sensors, Vol. 18 No. 2, p. 556.
Mendonca, T., Ferreira, P.M., Marques, J.S., Marcal, A.R.S. and Rozeira, J. (2013), PH2 - a dermoscopic
image database for research and benchmarking, Annual International Conference of the IEEE
Engineering in Medicine and Biology Society, pp. 5437-5440, doi: 10.1109/EMBC.2013.6610779.
About the authors
Lokesh Singh received the M.E. degree in Computer Science and Engineering from the Institute of
Engineering and Technology, University of Devi Ahilya Vishwavidhyalaya, Indore in 2010. He earned
his B.E. degree in Computer Science and Engineering from MIT Ujjain. He is currently a research scholar
in Information Technology Department, NIT Raipur. His research interests include Machine Learning,
Deep Learning and Image Processing. Lokesh Singh is the corresponding author and can be contacted
at: lsingh.phd2017.it@nitrr.ac.in
Rekh Ram Janghel is serving as an Assistant Professor in the Department of Information
Technology at National Institute of Technology Raipur. He did Ph.D. from Indian Institute of
Information Technology and Management Gwalior and M. Tech from National Institute of Technology,
Raipur (C.G.) in 2007 and B. Tech from Rungta College of Engineering and Technology, Bhilai (C.G) in
2005. He secured the first position in his post-graduation from NIT Raipur. His area of research includes
Deep Learning, Machine Learning, Biomedical Healthcare System, Expert Systems, Neural Networks,
Hybrid Computing and Soft Computing. He has numerous publications in various international journals
and conferences.
Satya Prakash Sahu received the B.E. and M.Tech. Degrees in Computer Science and Engineering
from the Rajiv Gandhi Technological University, Bhopal, India and the Ph.D. degree in Information
Technology from the National Institute of Technology Raipur, India. He is an Assistant Professor in the
Department of Information Technology, NIT Raipur. His research of interest includes artificial
intelligence, machine learning, image processing, medical imaging and soft computing. He has authored
more than 20 research papers in national and international conferences and journals.
... Thus, when the available data is rare with unequal class distribution, transfer learning (TL) approaches (Jasil and Ulagamuthalvi, 2021) compensate for data scarcity utilizing auxiliary data (Al-Stouhi, 2013). Transfer AdaBoost (TrAdaBoost) (Dai et al., 2007) and Transfer Constituent Support Vector Machine (TrCSVM) (Singh et al., 2020), boosting-based TL approaches, apply ensemble approaches over the instances of both ends (source and target) alongside an update procedure, incorporating those source examples that are useful in the target example's classification. These approaches conduct such mapping by assigning higher weights to source samples, which improves the target's training and decreases the weights of those instances that induce negative transfer. ...
... To conduct a comparison, one machine learning method AdaBoost (Freund and Schapire, 1999) and two TL methods TrAdaBoost (Pan and Yang, 2010) and TrCSVM (Singh et al., 2020) Table 3. Summarized details of datasets with absoluterarity and high-class imbalance ratio Absolute-rarity in skin lesion datasets adequate towards training. The models in the experiment, trained on dataset "A", are utilized to transfer knowledge to a model trained for classifying the datasets "B" and "C" and vice versa. ...
Article
Full-text available
Purpose Automated skin lesion analysis plays a vital role in early detection. Having relatively small-sized imbalanced skin lesion datasets impedes learning and dominates research in automated skin lesion analysis. The unavailability of adequate data poses difficulty in developing classification methods due to the skewed class distribution.
Design/methodology/approach Boosting-based transfer learning (TL) paradigms like the Transfer AdaBoost algorithm can compensate for such a lack of samples by taking advantage of auxiliary data. However, in such methods, beneficial source instances representing the target have a fast and stochastic weight convergence, which results in "weight-drift" that negates transfer. In this paper, a framework is designed utilizing "Rare-Transfer" (RT), a boosting-based TL algorithm that prevents "weight-drift" and simultaneously addresses absolute-rarity in skin lesion datasets. RT prevents the weights of source samples from quick convergence. It addresses absolute-rarity using an instance transfer approach incorporating the best-fit set of auxiliary examples, which improves balanced error minimization. It compensates for class imbalance and scarcity of training samples in absolute-rarity simultaneously for inducing balanced error optimization.
Findings Promising results are obtained utilizing RT compared with state-of-the-art techniques on absolute-rare skin lesion datasets, with an accuracy of 92.5%. The Wilcoxon signed-rank test examines significant differences between the proposed RT algorithm and the conventional algorithms used in the experiment.
Originality/value Experimentation is performed on four absolute-rare skin lesion datasets, and the effectiveness of RT is assessed based on accuracy, sensitivity, specificity and area under curve. The performance is compared with existing ensemble and boosting-based TL methods.
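The boosting-style update referred to in the citation contexts and the abstract above can be illustrated with a short sketch. This is a simplified, NumPy-only illustration of a TrAdaBoost-style rule (misclassified source instances are down-weighted, misclassified target instances are up-weighted, as in AdaBoost); the `damping` argument is a hypothetical stand-in for the rare-transfer idea of slowing source-weight convergence, not the exact correction used in the cited work.

```python
# Minimal sketch of a TrAdaBoost-style instance-weight update (illustrative only).
import numpy as np

def update_weights(w_src, w_tgt, err_src, err_tgt, n_iters, damping=1.0):
    """One boosting round of source/target weight updates.

    w_src, w_tgt     : current instance weights (1-D arrays)
    err_src, err_tgt : 0/1 arrays, 1 where the weak learner misclassified
    n_iters          : total number of boosting iterations N
    damping          : >1 slows the source-weight decay (rare-transfer idea, assumed)
    """
    n_src = len(w_src)
    # Weighted error of the weak learner on the target data only
    eps_t = np.sum(w_tgt * err_tgt) / np.sum(w_tgt)
    eps_t = np.clip(eps_t, 1e-10, 0.499)          # keep beta_tgt well defined

    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_src) / n_iters))
    beta_tgt = eps_t / (1.0 - eps_t)

    # Misclassified source instances are down-weighted (suspected negative transfer);
    # the optional damping factor slows this decay.
    w_src_new = w_src * np.power(beta_src, err_src / damping)
    # Misclassified target instances are up-weighted, as in AdaBoost.
    w_tgt_new = w_tgt * np.power(beta_tgt, -err_tgt)

    total = w_src_new.sum() + w_tgt_new.sum()
    return w_src_new / total, w_tgt_new / total
```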
... These innovative features are expected to contribute to improved skin lesion classification. Singh et al. [39] focus on a novel approach called TrCSVM for the classification of melanoma skin cancer. Transfer learning forms the basis of this approach. ...
Article
Full-text available
Melanoma skin cancer, an aggressive neoplasm arising from the malignant proliferation of melanocytes, represents a formidable challenge in the field of oncology due to its high metastatic potential and significant mortality rates. The timely identification of skin cancer plays a pivotal role in prevention and can substantially mitigate the incidence of certain skin malignancies, notably squamous cell carcinoma and melanoma, which tend to have a higher likelihood of successful treatment when detected in their early stages. Various researchers have suggested both automated and conventional methods for precise lesion segmentation to diagnose medical conditions associated with melanoma lesions. Nevertheless, the substantial visual resemblance among lesions and the significant intraclass variations pose challenges, resulting in reduced classification accuracy. To alleviate these issues, we have proposed an automated skin cancer diagnosis framework known as Multi-scale GC-T2. In our work, we have utilized the DermIS and DermQuest datasets, to which several pre-processing techniques for noise reduction and data augmentation using the Median Enhanced Weiner Filter (MEWF) are applied to enhance image quality. Besides, an Enriched Manta-Ray Optimization Algorithm (ENMAR) is adapted for ensuring the quality of pre-processed images. Also, to minimize model complexity, the appropriate lesion area is segmented accurately by integrating semantic segmentation with a DRL approach (i.e., Advanced Deep Q Network (AdDNet) and HAar-U-Net (HAUNT)). Following that, we designed the classifier Multi-scale GC-T2, in which appropriate features are extracted using a Multi-scale Graph Convolution Network (M-GCN). We have then proposed a punishment-and-reward mechanism for enhancing the feature processing, and a tri-movement attention mechanism is utilized for minimizing feature dimensionality. Finally, the feature maps are fused using a tri-level feature fusion module, and the sigmoid function is incorporated for classifying skin cancer. The proposed Multi-scale GC-T2 framework is implemented in MATLAB 2020A, and the performance of the proposed model is validated by evaluating metrics such as accuracy, sensitivity, specificity and F1-score. The experimental results unequivocally highlight the proposed Multi-scale GC-T2 framework's superiority over existing models.
... A novel approach is reported in the work of Singh et al. [10]; they propose the transfer learning (TL) framework Transfer Constituent Support Vector Machine (TrCSVM), with which they obtained an overall recognition accuracy of 98.82%. ...
Article
Full-text available
Melanoma is a skin cancer with a high fatality rate. Due to the great degree of similarity among the many forms of skin lesions, a proper diagnosis is difficult to make. Dermatologists can treat patients and save their lives by accurately classifying skin lesions in their early stages. This paper proposes a model for highly accurate skin lesion classification. The proposed model made use of the transfer learning models GoogleNet and VGG16. This model efficiently distinguished between benign and malignant cancerous skin lesions, the two classes of skin disease considered. A total of 1800 benign cancer images and 1498 malignant cancer images retrieved from the internet were taken into account for the proposed strategy. VGG16 obtained the highest recognition accuracy in the evaluation, with recognition rates of 99.62% for training and 84.97% for validation.
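As a rough illustration of the kind of transfer learning pipeline summarised above, the following sketch builds a binary benign/malignant classifier on top of a pre-trained VGG16 backbone. The input size, head layers and hyperparameters are assumptions for illustration, not the cited paper's exact configuration.

```python
# Hedged sketch: VGG16 backbone as a frozen feature extractor with a small binary head.
import tensorflow as tf

base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False                      # first stage: feature extraction only

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # benign vs malignant
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```

Unfreezing the last convolutional block of `base` and re-compiling with a smaller learning rate would turn the same sketch into fine-tuning rather than pure feature extraction.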
... Human aging may originate from a decrease in functional body productivity, an increased risk of cardiovascular episodes, and chronic degenerative diseases such as melanomas or skin lesions and musculoskeletal diseases, among others [26][27][28][29]. In consequence, geriatric medical care is indispensable for reducing physical accidents or clinical problems that put patients' lives at risk [10]. ...
Article
This study describes the development (design, construction, instrumentation, and control) of a nursing mobile robotic device to monitor vital signs in home-cared patients. The proposed device measures electrocardiography potentials, oxygen saturation, skin temperature, and non-invasive arterial pressure of the patient. Additionally, the nursing robot can supply assistance in the gait cycle for people who require it. The robotic device's structural and mechanical components were built using 3D-printing techniques. The instrumentation includes embedded electronic devices and sensors to determine the robot's relative position with respect to the patient. With this information, together with the available physiological measurements, the robot can work in three different scenarios: (a) in the first one, a robust control strategy regulates the mobile robot operation, including the tracking of the patient under uncertain working scenarios, leading to the selection of an appropriate sequence of movements; (b) the second one helps the patients, if they need it, to perform a controlled gait cycle during outdoor and indoor excursions; and (c) the third one verifies the state of health of the users by measuring their vital signs. A graphical user interface (GUI) collects, processes, and displays the information acquired by the bioelectrical amplifiers and signal processing systems. Moreover, it allows easy interaction between the nursing robot, the patients, and the physician. The proposed design has been tested with five volunteers, showing efficient assistance for primary health care.
Figure: Main stages of the home-care nursing controlled mobile robot.
... A total of 3672 images from different sources were used to evaluate the diagnosis efficiency of the proposed algorithm, which achieved an accuracy of 96.47%. Lokesh et al. [40] introduced a framework named Transfer Constituent Support Vector Machine (TrCSVM) based on transfer learning (TL) for the classification of melanoma from skin lesions using feature-based domain adaptation (FBDA). The presented framework comprises a support vector machine (SVM) and Transfer AdaBoost (TrAdaBoost). ...
Article
Skin cancer is a deadly disease, and its early diagnosis enhances the chances of survival. Deep learning algorithms for skin cancer detection have become popular in recent years. A novel framework based on deep learning is proposed in this study for the multiclassification of skin cancer types such as Melanoma, Melanocytic Nevi, Basal Cell Carcinoma and Benign Keratosis. The proposed model, named SCDNet, combines Vgg16 with convolutional neural networks (CNN) for the classification of different types of skin cancer. Moreover, the accuracy of the proposed method is also compared with four state-of-the-art pre-trained classifiers in the medical domain, namely Resnet 50, Inception v3, AlexNet and Vgg19. The performance of the proposed SCDNet classifier, as well as the four state-of-the-art classifiers, is evaluated using the ISIC 2019 dataset. The accuracy rate of the proposed SCDNet is 96.91% for the multiclassification of skin cancer, whereas the accuracy rates for Resnet 50, Alexnet, Vgg19 and Inception-v3 are 95.21%, 93.14%, 94.25% and 92.54%, respectively. The results showed that the proposed SCDNet performed better than the competing classifiers.
... Because the distribution ratio of lesion pictures in different skin lesion datasets may vary, a large quantity of labelled data is required to train the classification models and maintain adequate classification performance. Dermoscopy imaging methods are utilised to examine the skin lesion at a deeper level, which aids in melanoma diagnosis [14]. In dermatology, for example, visual evaluation of a skin lesion shows a substantial error rate when compared with gold-standard pathology. ...
Article
Full-text available
To distinguish melanoma from other skin illnesses, doctors examine pigmented lesions on the skin. Damaged DNA causes cells to expand uncontrollably, and the rate of growth is currently increasing rapidly. Melanoma is a kind of skin cancer induced by UV radiation from the sun, with a survival rate of about 15-20 percent. Increased UV light on the earth's surface is also aiding the spread of skin cancer throughout the globe. Melanoma is diagnosed late, resulting in severe malignancy and spread to other bodily organs such as the liver, lungs, and brain. Automatic melanoma diagnosis from dermoscopic skin samples is a difficult problem. The purpose of Computer Vision, Machine Learning, and Deep Learning in the era of digital pictures is to extract information from them and develop new knowledge. Deep convolutional neural network (DCNN) models have been extensively studied for skin disease detection, with some achieving diagnostic results that are equivalent to or even better than dermatologists. Pre-processing first applies a filter or kernel to reduce noise and artefacts; the models are then trained on a variety of tiny, unbalanced datasets to ensure that the moderately complicated models outperform the bigger models. Finally, to minimise overfitting, DropOut regularization is introduced.
... Deep learning algorithms extract relevant feature vectors in images and automatically classify them with pre-trained networks. Transfer learning is applied to pre-trained networks [56][57][58] to extract features, and supervised learning [59,60] is used to classify samples. ...
Article
Full-text available
Skin cancer is a complex public health problem and one of the most common types of cancer worldwide. A biopsy of the skin lesion gives the definitive diagnosis of skin cancer. However, before the definitive diagnosis, specialists observe some symptoms that justify the request for a biopsy and consider an early diagnosis. Early diagnosis of skin cancer is subject to errors due to the lack of experience of specialists and the similarity of its characteristics with other diseases. This work proposes a CNN architecture, called EfficientAttentionNet, to provide early diagnosis of melanoma and non-melanoma skin lesions. The methodology presents the stages of development of the proposed classification model and the benefits of each stage. In the first step, the set of images from the International Society for Digital Skin Imaging (ISDIS) is pre-processed to eliminate the hair around the skin lesion. Then, a Generative Adversarial Network (GAN) model generates synthetic images to balance the number of samples per class in the training set. In addition, a U-net model creates masks for regions of interest in the images. Finally, EfficientAttentionNet is trained with the mask-based attention mechanism to classify skin lesions. The proposed model achieved high performance, being a reference for future research in the classification of skin lesions.
Article
Full-text available
Skin cancer is the most prevalent kind of cancer in people. It is estimated that more than 1 million people get skin cancer every year in the world. The effectiveness of the disease's therapy is significantly impacted by early identification of this illness. Preprocessing is the initial detection stage, enhancing the quality of skin images by removing undesired background noise and objects. This study aims to compile preprocessing techniques for skin cancer imaging that are currently accessible. Researchers looking into automated skin cancer diagnosis might use this article as an excellent place to start. The fully convolutional encoder–decoder network and Sparrow search algorithm (FCEDN-SpaSA) are proposed in this study for the segmentation of dermoscopic images. The individual wolf method and the ensemble ghosting technique are integrated to generate a neighbour-based search strategy in SpaSA for stressing the correct balance between navigation and exploitation. The classification procedure is accomplished by using an adaptive CNN technique to discriminate between normal skin and malignant skin lesions suggestive of disease. Our method provides classification accuracies comparable to commonly used incremental learning techniques while using less energy, storage space, memory access, and training time (only network updates with new training samples, no network sharing). In a simulation, the segmentation performance of the proposed technique on the ISBI 2017, ISIC 2018, and PH2 datasets reached accuracies of 95.28%, 95.89%, 92.70%, and 98.78%, respectively; the classification performance, assessed on the same datasets, reached an accuracy of 91.67%. The efficiency of the suggested strategy is demonstrated through comparisons with cutting-edge methodologies.
Article
Full-text available
Melanoma is a type of skin cancer with a high mortality rate. The different types of skin lesions result in an inaccurate diagnosis due to their high similarity. Accurate classification of the skin lesions in their early stages enables dermatologists to treat the patients and save their lives. This paper proposes a model for a highly accurate classification of skin lesions. The proposed model utilized the transfer learning and pre-trained model with GoogleNet. The model parameters are used as initial values, and then these parameters will be modified through training. The latest well-known public challenge dataset, ISIC 2019, is used to test the ability of the proposed model to classify different kinds of skin lesions. The proposed model successfully classified the eight different classes of skin lesions, namely, melanoma, melanocytic nevus, basal cell carcinoma, actinic keratosis, benign keratosis, dermatofibroma, vascular lesion, and Squamous cell carcinoma. The achieved classification accuracy, sensitivity, specificity, and precision percentages are 94.92%, 79.8%, 97%, and 80.36%, respectively. The proposed model can detect images that do not belong to any one of the eight classes where these images are classified as unknown images.
Article
Full-text available
Melanoma remains the most harmful form of skin cancer. Convolutional neural network (CNN) based classifiers have become the best choice for melanoma detection in the recent era. The research has indicated that classifiers based on CNN classify skin cancer images on par with dermatologists, which has allowed a quick and life-saving diagnosis. This study provides a systematic literature review of the latest research on melanoma classification using CNN. We restrict our study to the binary classification of melanoma. In particular, this research discusses the CNN classifiers and compares the accuracies of these classifiers when tested on non-published datasets. We conducted a systematic review of existing literature, identifying the literature through a systematic search of the IEEE, Medline, ACM, Springer, Elsevier, and Wiley databases. A total of 5112 studies were identified, out of which 55 well-reputed studies were selected. The main objective of this study is to collect state-of-the-art research that identifies the recent research trends, challenges and opportunities for melanoma diagnosis and to investigate existing solutions for melanoma detection using deep learning. Moreover, a taxonomy for melanoma detection is presented that summarizes the broad variety of existing melanoma detection solutions. Lastly, a proposed model, challenges and opportunities are presented, which helps researchers in the domain of melanoma detection.
Article
Full-text available
In this paper, a new Computer-Aided Detection (CAD) system for the detection and classification of dangerous skin lesions (melanoma type) is presented, through a fusion of handcraft features related to the medical algorithm ABCD rule (Asymmetry Borders-Colors-Dermatoscopic Structures) and deep learning features employing Mutual Information (MI) measurements. The steps of a CAD system can be summarized as preprocessing, feature extraction, feature fusion, and classification. During the preprocessing step, a lesion image is enhanced, filtered, and segmented, with the aim to obtain the Region of Interest (ROI); in the next step, the feature extraction is performed. Handcraft features such as shape, color, and texture are used as the representation of the ABCD rule, and deep learning features are extracted using a Convolutional Neural Network (CNN) architecture, which is pre-trained on Imagenet (an ILSVRC Imagenet task). MI measurement is used as a fusion rule, gathering the most important information from both types of features. Finally, at the Classification step, several methods are employed such as Linear Regression (LR), Support Vector Machines (SVMs), and Relevant Vector Machines (RVMs). The designed framework was tested using the ISIC 2018 public dataset. The proposed framework appears to demonstrate an improved performance in comparison with other state-of-the-art methods in terms of the accuracy, specificity, and sensibility obtained in the training and test stages. Additionally, we propose and justify a novel procedure that should be used in adjusting the evaluation metrics for imbalanced datasets that are common for different kinds of skin lesions.
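As a rough sketch of the fusion step summarised above, the snippet below concatenates handcrafted (ABCD-style) descriptors with deep CNN features, ranks the combined features by their mutual information with the labels and trains an SVM on the selected subset. The array shapes, the number of retained features and the RBF kernel are assumptions for illustration, not the cited CAD system's exact design.

```python
# Hedged sketch: feature-level fusion of handcrafted and deep features via mutual information.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.svm import SVC

def fuse_and_classify(handcraft, deep, labels, k=200):
    X = np.hstack([handcraft, deep])        # simple feature-level fusion by concatenation
    mi = mutual_info_classif(X, labels)     # MI of each fused feature with the class label
    top = np.argsort(mi)[-k:]               # keep the k most informative features
    clf = SVC(kernel="rbf", probability=True).fit(X[:, top], labels)
    return clf, top
```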
Article
Full-text available
Skin lesion datasets consist predominantly of normal samples with only a small percentage of abnormal ones, giving rise to the class imbalance problem. Also, skin lesion images are largely similar in overall appearance owing to the low inter-class variability. In this paper, we propose a two-stage framework for automatic classification of skin lesion images using adversarial training and transfer learning toward melanoma detection. In the first stage, we leverage the inter-class variation of the data distribution for the task of conditional image synthesis by learning the inter-class mapping and synthesizing under-represented class samples from the over-represented ones using unpaired image-to-image translation. In the second stage, we train a deep convolutional neural network for skin lesion classification using the original training set combined with the newly synthesized under-represented class samples. The training of this classifier is carried out by minimizing the focal loss function, which assists the model in learning from hard examples, while down-weighting the easy ones. Experiments conducted on a dermatology image benchmark demonstrate the superiority of our proposed approach over several standard baseline methods, achieving significant performance improvements. Interestingly, we show through feature visualization and analysis that our method leads to context based lesion assessment that can reach an expert dermatologist level.
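The focal loss minimised in the second stage described above has a standard binary form, sketched below; the alpha and gamma values are the commonly used defaults rather than necessarily those of the cited work.

```python
# Short NumPy sketch of the binary focal loss, which down-weights easy examples.
import numpy as np

def focal_loss(y_true, p_pred, alpha=0.25, gamma=2.0, eps=1e-7):
    p = np.clip(p_pred, eps, 1.0 - eps)
    p_t = np.where(y_true == 1, p, 1.0 - p)             # probability of the true class
    alpha_t = np.where(y_true == 1, alpha, 1.0 - alpha)
    return np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t))
```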
Article
Full-text available
Melanoma is considered to be one of the deadliest skin cancer types, whose frequency of occurrence has risen in the last few years; its earlier diagnosis, however, significantly increases the chances of patients' survival. In the quest for the same, a few computer-based methods, capable of diagnosing the skin lesion at initial stages, have been recently proposed. Despite some success, however, a margin for improvement exists, due to which the machine learning community still considers this an outstanding research challenge. In this work, we come up with a novel framework for skin lesion classification, which integrates deep feature information to generate the most discriminant feature vector, with the advantage of preserving the original feature space. We utilize recent deep models for feature extraction by taking advantage of transfer learning. Initially, the dermoscopic images are segmented, and the lesion region is extracted, which is later used to retrain the selected deep models to generate fused feature vectors. In the second phase, a framework for discriminant feature selection and dimensionality reduction, entropy-controlled neighborhood component analysis (ECNCA), is proposed. This hierarchical framework optimizes fused features by selecting the principal components and extricating the redundant and irrelevant data. The effectiveness of our design is validated on four benchmark dermoscopic datasets: PH2, ISIC MSK, ISIC UDA, and ISBI-2017. To authenticate the proposed method, a fair comparison with the existing techniques is also provided. The simulation results clearly show that the proposed design is accurate enough to categorize the skin lesion with 98.8%, 99.2%, 97.1% and 95.9% accuracy with the selected classifiers on all four datasets, and by utilizing less than 3% of the features.
Article
Full-text available
The main purpose of the study was to develop a high accuracy system able to diagnose skin lesions using deep learning–based methods. We propose a new decision system based on multiple classifiers like neural networks and feature–based methods. Each classifier (method) gives the final decision system a certain weight, depending on the calculated accuracy, helping the system make a better decision. First, we created a neural network (NN) that can differentiate melanoma from benign nevus. The NN architecture is analyzed by evaluating it during the training process. Some biostatistic parameters, such as accuracy, specificity, sensitivity, and Dice coefficient are calculated. Then, we developed three other methods based on convolutional neural networks (CNNs). The CNNs were pre-trained using large ImageNet and Places365 databases. GoogleNet, ResNet-101, and NasNet-Large, were used in the enumeration order. CNN architectures were fine-tuned in order to distinguish the different types of skin lesions using transfer learning. The accuracies of the classifications were determined. The last proposed method uses the classical method of image object detection, more precisely, the one in which some features are extracted from the images, followed by the classification step. In this case, the classification was done by using a support vector machine. Just as in the first method, the sensitivity, specificity, Dice similarity coefficient and accuracy are determined. A comparison of the obtained results from all the methods is then done. As mentioned above, the novelty of this paper is the integration of these methods in a global fusion-based decision system that uses the results obtained by each individual method to establish the fusion weights. The results obtained by carrying out the experiments on two different free databases shows that the proposed system offers higher accuracy results.
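The accuracy-weighted decision fusion described above can be sketched in a few lines: each classifier contributes a soft vote weighted by its measured accuracy. The classifier objects and the simple normalisation scheme are assumptions for illustration rather than the cited system's exact fusion rule.

```python
# Hedged sketch: fuse classifier probabilities with weights proportional to validation accuracy.
import numpy as np

def fused_predict_proba(classifiers, val_accuracies, X):
    w = np.asarray(val_accuracies, dtype=float)
    w = w / w.sum()                                    # normalise the fusion weights
    probs = [clf.predict_proba(X) for clf in classifiers]
    return sum(wi * pi for wi, pi in zip(w, probs))    # weighted soft vote

# fused = fused_predict_proba([nn, cnn1, cnn2, svm], [0.91, 0.93, 0.94, 0.88], X_test)
# y_pred = fused.argmax(axis=1)
```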
Article
Full-text available
Domain adaptation leverages the knowledge in one domain - the source domain - to improve learning efficiency in another domain - the target domain. Existing heterogeneous domain adaptation research is relatively well-progressed, but only in situations where the target domain contains at least a few labeled instances. In contrast, heterogeneous domain adaptation with an unlabeled target domain has not been well-studied. To contribute to the research in this emerging field, this paper presents: (1) an unsupervised knowledge transfer theorem that guarantees the correctness of transferring knowledge; and (2) a principal angle-based metric to measure the distance between two pairs of domains: one pair comprises the original source and target domains and the other pair comprises two homogeneous representations of two domains. The theorem and the metric have been implemented in an innovative transfer model, called a Grassmann-Linear monotonic maps-geodesic flow kernel (GLG), that is specifically designed for heterogeneous unsupervised domain adaptation (HeUDA). The linear monotonic maps meet the conditions of the theorem and are used to construct homogeneous representations of the heterogeneous domains. The metric shows the extent to which the homogeneous representations have preserved the information in the original source and target domains. By minimizing the proposed metric, the GLG model learns the homogeneous representations of heterogeneous domains and transfers knowledge through these learned representations via a geodesic flow kernel. To evaluate the model, five public datasets were reorganized into ten HeUDA tasks across three applications: cancer detection, credit assessment, and text classification. The experiments demonstrate that the proposed model delivers superior performance over the existing baselines.
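The principal angle-based distance mentioned above rests on a standard computation: the cosines of the principal angles between two subspaces are the singular values of the product of their orthonormal bases. The sketch below shows only this generic computation, not the GLG model itself.

```python
# Standard computation of principal angles between two subspaces (not the GLG model).
import numpy as np

def principal_angles(A, B):
    """A, B: (d x k) matrices whose columns span two k-dimensional subspaces."""
    Qa, _ = np.linalg.qr(A)                  # orthonormal basis of span(A)
    Qb, _ = np.linalg.qr(B)                  # orthonormal basis of span(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    s = np.clip(s, -1.0, 1.0)                # guard against rounding outside [-1, 1]
    return np.arccos(s)                      # principal angles in radians
```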
Article
Melanoma skin cancer is one of the most common diseases in the world. It is essential to diagnose melanoma at an early stage. Visual inspection during the medical examination of skin lesions is not a simple task, as there is a similarity between lesions. Also, medical experience and disposition can result in inaccurate diagnoses. Technologies such as the Internet of Things (IoT) have helped to create effective health systems. Doctors can use them anywhere, with the guarantee that more people can be diagnosed without prejudice to subjective factors. Transfer Learning and Deep Learning are increasingly significant in the clinical diagnosis of different diseases. This work proposes the use of Transfer Learning and Deep Learning in an IoT system to assist doctors in the diagnosis of common skin lesions, typical nevi, and melanoma. This work uses Convolutional Neural Networks (CNNs) as resource extractors. The CNN models used were: Visual Geometry Group (VGG), Inception, Residual Networks (ResNet), Inception-ResNet, Extreme Inception (Xception), MobileNet, Dense Convolutional Network (DenseNet), and Neural Architecture Search Network (NASNet). For the classification of injuries, the Bayes, Support Vector Machines (SVM), Random Forest (RF), Perceptron Multilayer (MLP), and the K-Nearest Neighbors (KNN) classifiers are used. This study used two datasets: the first provided by the International Skin Imaging Collaboration (ISIC) at the International Biomedical Imaging Symposium (ISBI); the second is PH². For ISBI-ISIC, this study examined lesions between nevi and melanomas. In PH², this work analyzed the diagnosis based on lesions of common nevus, atypical nevi, and melanomas. The DenseNet201 extraction model, combined with the KNN classifier achieved an accuracy of 96.805 % for the ISBI-ISIC dataset and 93.167 % for the PH². Thus, an approach focused on the IoT system is reliable and efficient for doctors who assist in the diagnosis of skin lesions.
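The extractor/classifier pairing reported above (DenseNet201 features with a KNN classifier) can be sketched as follows; the input size, preprocessing and value of k are assumptions for illustration, not the cited study's exact setup.

```python
# Hedged sketch: frozen DenseNet201 as feature extractor, KNN as classifier.
import numpy as np
import tensorflow as tf
from sklearn.neighbors import KNeighborsClassifier

extractor = tf.keras.applications.DenseNet201(weights="imagenet",
                                              include_top=False, pooling="avg")

def extract_features(images):                        # images: (n, 224, 224, 3) in [0, 255]
    x = tf.keras.applications.densenet.preprocess_input(images.astype("float32"))
    return extractor.predict(x, verbose=0)           # pooled 1920-dimensional feature vectors

# knn = KNeighborsClassifier(n_neighbors=5).fit(extract_features(X_train), y_train)
# y_pred = knn.predict(extract_features(X_test))
```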
Chapter
Domain adaptation is a form of transfer learning, in which the task remains the same, but there is a domain shift or a distribution change between the source and the target. As an example, consider a model that has learned to classify reviews on electronic products for positive and negative sentiments, and is used for classifying the reviews for hotel rooms or movies. The task of sentiment analysis remains the same, but the domain (electronics and hotel rooms) has changed. The application of the model to a separate domain poses many problems because of the change between the training data and the unseen testing data, typically known as domain shift. For example, sentences containing phrases such as “loud and clear” will be mostly considered positive in electronics whereas negative in hotel room reviews. Similarly, usage of keywords such as “lengthy” or “boring” which may be prevalent in domains such as book reviews might be completely absent in domains such as kitchen equipment reviews.