A Hybrid Method of Superpixel Segmentation
Algorithm and Deep Learning Method in
Histopathological Image Segmentation
Abdulkadir Albayrak, Gokhan Bilgin
Department of Computer Engineering
Yildiz Technical University, 34220 Istanbul, Turkey
{albayrak, gbilgin}@yildiz.edu.tr
Abstract—Manual analysis of cell morphology in high-resolution histopathological images is a tedious and time-consuming task for pathologists. In recent years, computer-assisted diagnostic systems have gained considerable importance in assisting pathologists with the analysis of cellular structures. In this
study, the simple linear iterative clustering (SLIC) superpixel
segmentation method and convolutional neural network are
combined to segment the cellular structures in histopathological
images. The proposed study is mainly composed of two stages. First, the SLIC superpixel method is used as a pre-segmentation algorithm to separate cellular superpixels from non-cellular superpixels. Then, a convolutional neural network (CNN) based deep learning algorithm is used to classify those superpixels in order to obtain the final segmentation of the whole image. The overall accuracy of the system in classifying the superpixels was observed to be 0.9876. The analysis and the confusion matrix of the study are presented in the experimental studies section.
Index Terms—Histopathological images, cell segmentation,
SLIC superpixel algorithm, CNN, deep learning.
I. INTRODUCTION
The morphological examination of cellular structures is a very crucial step for early detection and for tracking disease progression. Especially in recent years, with the development of imaging technologies, it has become possible to obtain more detailed information about cells. Analyzing cells one by one is very important given that cancerous cells have different characteristics than healthy cells. The
pathologists examine the morphological changes in suspicious
tissues and organs with high resolution imaging devices, which
makes it easier to diagnose the disease. Depending on the
imaging device, it may be possible to scan a very large area
of the related tissue. Therefore, analysis of a large number
of cellular structures can be difficult and time consuming
for pathologists. This problem has led to the need for the
development of secondary decision support systems which can
analyze the images obtained from imaging devices.
In recent years, especially in the medical field, secondary decision support systems have begun to be developed for pathologists, to be used in disease detection through pathological examination.
These systems use image processing and machine learning
techniques to distinguish cellular structures from background
areas such as fat and connective tissue. Many studies have recently been reported in the literature in this area, mostly based on thresholding, cluster-based, and graph-based segmentation methods [1]–[4].
As an alternative to thresholding-based, clustering-based
and graph-based algorithms, superpixel based algorithms have
begun to be used for image segmentation in recent years. The
application of superpixel segmentation methods to biomedical images is still limited, although they are widely applied in the field of computer vision, especially in saliency detection [5]. While the superpixel approach has been applied in some biomedical image processing areas, such as segmentation of brain MRI images, optic disc segmentation, and glaucoma screening, there are only a few studies on the superpixel approach in digital histopathological images [6]–[9]. The superpixel approach is important for the histopathological image processing field given that cellular structures have characteristics that differ from those of their local surrounding regions.
Unlike the superpixel method, deep learning algorithms have been used extensively in the medical image processing field. Deep learning is a neural network approach that can be used for both the classification and the segmentation of digital images [10]–[12]. The contribution of recently developed hardware technology has been very important here. After its initial success in handwriting recognition [13], deep learning has achieved promising results in the fields of content-based image retrieval, object recognition, and biomedical image processing. In our proposed study, the
superpixels obtained from high resolution histopathological
images were classified by a convolutional neural network
(CNN) algorithm according to the labels given in the ground
truth.
The paper is organized as follows: Section II describes the proposed method, which first creates the superpixels and then classifies those superpixels with the proposed convolutional neural network. In Section III, the data set description, parameter settings, and experimental results are provided. In the final section, conclusions and future works are discussed.
978-1-5386-5150-6/18/$31.00 © 2018 IEEE
II. MATERIAL AND METHODS
A. Pre-segmentation of Histopathological Images with Simple
Linear Iterative Clustering (SLIC) Algorithm
In the last decades, the development of imaging technology has led to an increase in the number of medical digital images. This extraordinary increase in data is the biggest obstacle to obtaining and creating labeled data. However, with the development of image processing techniques and machine learning methods, it has become possible to process these data without labels, or with only a very small amount of labeling.
In this study, the SLIC superpixel method was used as a pre-segmentation algorithm to perform segmentation of cellular and non-cellular structures in digital histopathological images [14]. The superpixel method used in this study performs the segmentation process depending on the pixel intensity values (the R, G, and B values of the pixels) and the spatial distance of the pixels (their x and y coordinates) to the superpixel centers in the image. Equation 1 describes the color distance computed by the SLIC segmentation method for each pixel in the proposed study:
d_{rgb} = \sqrt{(r_j - r_i)^2 + (g_j - g_i)^2 + (b_j - b_i)^2} \quad (1)
where j represents the center pixel and i represents the pixel to be clustered. The value of d_{rgb} is the color distance of the corresponding pixel to the center; r, g, and b represent the brightness values of the respective pixels. Equation 2 gives the distance of the coordinates of each pixel to the related cluster center:
d_{xy} = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2} \quad (2)
where x_j and y_j are the horizontal and vertical coordinates of each center pixel, and x_i and y_i are the coordinates of each pixel to be clustered. The two distances are combined as:
d_s = d_{rgb} + (m/N) \times d_{xy} \quad (3)
where d_s is the sum of the color distance and the (x, y) plane distance normalized by the grid interval N. Here, the normalization ensures that the coordinate information does not directly overwhelm the brightness term. The value of m is defined to set the compactness of the superpixels.
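The distance measure in Equations 1–3 can be sketched in a few lines; this is an illustrative NumPy version of the distance computation only (not the full iterative SLIC clustering), and the function and variable names are our own:

```python
import numpy as np

def slic_distance(center, pixel, m, N):
    """Combined SLIC distance (Eqs. 1-3) between a cluster center and a pixel.

    center, pixel: (r, g, b, x, y) tuples.
    m: compactness parameter weighting spatial vs. color distance.
    N: grid interval (expected superpixel spacing) used for normalization.
    """
    cr, cg, cb, cx, cy = center
    pr, pg, pb, px, py = pixel
    d_rgb = np.sqrt((cr - pr) ** 2 + (cg - pg) ** 2 + (cb - pb) ** 2)  # Eq. 1
    d_xy = np.sqrt((cx - px) ** 2 + (cy - py) ** 2)                    # Eq. 2
    return d_rgb + (m / N) * d_xy                                      # Eq. 3
```

In the full algorithm, each pixel is assigned to the cluster center that minimizes d_s, and the centers are then updated iteratively.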
The local shape of each superpixel obtained in an image depends on the m parameter. As can be seen in Equation 3, a low value of the m parameter reduces the effect of the coordinate information. This allows each superpixel to take a convex or concave shape, so that the related superpixels will not be optimal input images for the deep learning method in the next step. If the m parameter has a high value, each superpixel will have an almost square shape. Since the aim is to classify the superpixels with the deep learning method after the pre-segmentation process, an approximately quadratic structure makes it possible for the superpixels to be classified with a higher success rate by the deep learning method.
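The effect of m described above can be illustrated on a toy example; the image, centers, and parameter values below are hypothetical and serve only to show how the spatial term takes over as m grows:

```python
import numpy as np

# Hypothetical 8x8 grayscale image: dark columns 0-2, bright columns 3-7.
img = np.zeros((8, 8))
img[:, 3:] = 100.0
# Two cluster centers as (intensity, row, col): one dark, one bright.
centers = [(0.0, 4, 1), (100.0, 4, 6)]
N = 4  # grid interval used to normalize the spatial term

def assign(m):
    """Label each pixel with its nearest center under d_s = d_c + (m/N)*d_xy."""
    labels = np.zeros(img.shape, dtype=int)
    for r in range(8):
        for c in range(8):
            ds = [abs(img[r, c] - ci) + (m / N) * np.hypot(r - cr, c - cc)
                  for ci, cr, cc in centers]
            labels[r, c] = int(np.argmin(ds))
    return labels

low = assign(m=1)      # color dominates: boundary follows the intensity edge
high = assign(m=1000)  # space dominates: boundary moves to the spatial midline
```

With m = 1 the label boundary sits at the color edge (between columns 2 and 3); with m = 1000 it sits halfway between the two centers (between columns 3 and 4), i.e. the superpixels become spatially regular.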
B. Classification of Superpixels via Convolutional Neural Net-
works
Convolutional neural networks (CNN) are one of the deep learning algorithms, proposed by LeCun et al. in 1998 [15]. When first proposed, the CNN was applied to small-size data due to memory constraints. However, with the development of hardware technology, it has begun to be applied to large-scale data in recent years [16], [17]. There are basically six processing steps in the CNN algorithm:
•Image Input Layer: At this stage, each image to be used for training or testing passes through certain preliminary processing in order to become an appropriate input. This pre-processing could be re-sizing the image to the appropriate dimensions or converting the image from one channel to three channels (from grayscale to color) if needed.
•Convolutional Layer: At this stage, each image in the training set is convolved with k generalized kernels to create activation maps. The activation maps are then used for further analysis in the next stages. The first convolutional layer allows low-level features to be extracted. The padding and stride size can be changed depending on the size of the input image.
•Rectified Linear Unit (ReLU): The activation layer (also called the ReLU layer) is generally applied after the convolutional layer. Its main purpose is to map the activation maps obtained from the convolutional layer onto a non-linear plane for better analysis.
•Pooling Layer: Down-sampling is performed at this stage to provide a hierarchical analysis of the data. Higher-level features can be extracted if the number of layers after this stage is increased. The maximum or the average of the relevant values is selected while down-sampling the data.
•Fully Connected Layer: This layer correlates the high-level features extracted in the previous layers with the particular classes. Several fully connected layers can be used in a CNN algorithm. The key point is that the number of outputs of the last fully connected layer must be equal to the number of classes.
•Classification (Softmax) Layer: The softmax layer is the final output layer of the CNN algorithm, performing multi-class classification.
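As a concrete sketch of how these six steps compose for the architecture used here (two convolutional + ReLU + max-pool blocks followed by three fully connected layers and a softmax), the forward pass can be written in plain NumPy. The kernel count, weight values, and fully connected layer widths below are illustrative placeholders, not the trained parameters of the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, k):
    """'Valid' 2-D convolution of a single-channel image with one kernel."""
    kh, kw = k.shape
    h, w = x.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x):
    return np.maximum(0.0, x)

def maxpool(x, s=2):
    """Non-overlapping s x s max pooling (odd borders are truncated)."""
    h, w = x.shape[0] // s, x.shape[1] // s
    return x[:h * s, :w * s].reshape(h, s, w, s).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(img, k1, k2, W1, W2, W3):
    x = maxpool(relu(conv2d(img, k1)))   # 15x15 -> 13x13 -> 6x6
    x = maxpool(relu(conv2d(x, k2)))     # 6x6 -> 4x4 -> 2x2
    x = x.ravel()                        # flatten to 4 features
    x = relu(W1 @ x)                     # fully connected layer 1
    x = relu(W2 @ x)                     # fully connected layer 2
    return softmax(W3 @ x)               # last FC: 2 outputs = 2 classes

img = rng.random((15, 15))               # a stand-in 15x15 superpixel patch
k1, k2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
W1, W2, W3 = (rng.standard_normal((8, 4)),
              rng.standard_normal((4, 8)),
              rng.standard_normal((2, 4)))
probs = forward(img, k1, k2, W1, W2, W3)  # two class probabilities
```

A real implementation learns many kernels per convolutional layer and trains all weights by backpropagation; the point here is only the flow of shapes from a 15 × 15 input down to a two-class softmax output.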
The CNN algorithm proposed in this paper consists of
two convolutional layers, two ReLU layers, two max-pool
layers and three fully connected layers followed by a softmax
classification layer.
III. EXPERIMENTAL STUDIES
A. Data Set Description
The data set used in this study was obtained from the Beck Laboratory at Harvard University. The data set consists of high-resolution histopathological images of renal cell carcinoma selected from The Cancer Genome Atlas (TCGA) data portal and is publicly available. There are 810 high-resolution 400 × 400 histopathological images of 10 kidney renal cell carcinomas. The images were scanned at 40× magnification.
TCGA is a large-scale cancer research organization financed by the American National Cancer Institute and the National Human Genome Research Institute, and it conducts research to find solutions for the 25 most common cancer types. In addition to collecting molecular and clinical data, TCGA also obtains whole slide images (WSI) as part of its cancer research. Fig. 1 shows sample images taken from the data set together with the ground truth images annotated by pathologists. The data set was introduced in detail in [18].
Fig. 1: Image samples of renal cell carcinoma taken from the
data set
The proposed study investigates the success of the combination of a superpixel segmentation algorithm and a CNN for cell segmentation in histopathological images. The study basically consists of two steps: obtaining the superpixels with the SLIC method, and classifying these superpixels with the CNN method. The processing steps followed in the proposed method are represented in Fig. 2.
B. Pre-segmentation of Images by Using the SLIC Method
In the first phase of the study, each input image in the data set was smoothed with a 5 × 5 median filter to remove noise and various lighting effects. Each cleaned image was then divided into approximately 3000 superpixels. The resulting superpixels were labeled as cellular or non-cellular according to the ground truth accompanying the data set.
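The labeling step above can be sketched as a majority vote of the ground-truth mask inside each superpixel. The helper below is an illustrative assumption (the paper does not specify its exact labeling rule), written in NumPy:

```python
import numpy as np

def label_superpixels(segments, gt_mask, threshold=0.5):
    """Label each superpixel cellular (1) or non-cellular (0).

    segments: int array, superpixel id per pixel (e.g., output of SLIC).
    gt_mask:  binary array, 1 where the ground truth marks a cell.
    A superpixel is cellular if at least `threshold` of its pixels are.
    """
    labels = {}
    for sp in np.unique(segments):
        inside = gt_mask[segments == sp]
        labels[int(sp)] = int(inside.mean() >= threshold)
    return labels

segments = np.array([[0, 0, 1, 1],
                     [0, 0, 1, 1]])
gt_mask = np.array([[1, 1, 0, 0],
                    [1, 0, 0, 1]])
print(label_superpixels(segments, gt_mask))  # -> {0: 1, 1: 0}
```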
Superpixels with disproportionate dimensions (such as 1 × 10, 10 × 2, etc.) were eliminated due to their shape irregularity. The determination of the number of superpixels and the adaptation of the obtained superpixels as input to the deep learning model are two important key points that need to be considered. An average cell is approximately 30 × 30 pixels, so the optimal number of superpixels for a 400 × 400 image in this data set could range from 250 to 750. The interval of values
TABLE I: OVERALL RESULTS OF THE CONFUSION MATRIX OBTAINED FOR THE PROPOSED CNN MODEL

                          Predicted
                    Cell. SP   Noncell. SP   Time
Actual  Cell. SP      0.9451      0.0549     ~7750 sec.
        Noncell. SP   0.0083      0.9917
was kept wide because the similarity of neighbouring pixels (e.g., a large area of fat tissue could form a single large superpixel) may change from image to image. Besides the number of superpixels, the shape of the superpixels is also crucial. Considering the spatial similarities of neighbouring pixels, a superpixel that includes those pixels may have a convex or concave shape. This may negatively affect the classification performance of the CNN algorithm, since re-sizing the image could change the shape of the superpixel. For this reason, it is proposed that the histopathological images be divided into a high number of superpixels to ensure that each superpixel remains at a certain size. In the first phase of this work, each image is divided into 3000 superpixels. This number was determined empirically, but in future studies it may become possible to automate this choice by performing a detailed analysis of the number of superpixels. As can be seen in Equation 3, the m parameter represents the 'compactness' of the superpixels: the higher the value of m, the more regular the shape of the superpixels. Assigning a small value to m reduces the effect of the coordinate information. If a quadratic-like image can be given as input to the CNN algorithm, this contributes to increasing its classification success. Based on the \sqrt{Width \times Height / NumberOfSuperpixels} grid-interval relation, setting m to approximately 50 provides superpixels with quadratic dimensions. In this study, the m parameter is set to 50. Depending on the parameters stated in the first phase of the study, cellular and non-cellular superpixels were determined, and in this way a two-class structure was established.
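The square-root relation above can be checked with a quick calculation: for the settings of this study it gives the expected grid spacing between superpixel centers (the compactness value m = 50 itself is the paper's empirical choice):

```python
import math

width, height = 400, 400   # image size used in this data set
n_superpixels = 3000       # superpixels per image in the first phase

# Expected spacing between superpixel centers (grid interval):
grid_interval = math.sqrt(width * height / n_superpixels)
# An average superpixel therefore covers roughly 7 x 7 pixels before
# the disproportionate ones are filtered out.
print(round(grid_interval, 2))  # -> 7.3
```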
In the second phase of the proposed study, the cellular and non-cellular superpixels are classified using the proposed CNN algorithm. Each superpixel obtained in the first stage has approximately the same aspect ratio (about 12 × 13). In this context, the superpixels did not need to undergo many shape changes to be appropriate for the CNN algorithm. Each superpixel was resized to 15 × 15 to serve as a regular input image, since the dimensions of the input images of the proposed CNN algorithm are 15 × 15. Table I shows the success of the proposed CNN algorithm when the parameters specified in Section II were used.
C. Experimental Results
Table I presents the overall results of the confusion matrix obtained for the proposed CNN algorithm, trained and tested with five iterations, 100 epochs, and 5-fold cross-validation in each iteration. The columns of the table indicate cellular superpixels, non-cellular superpixels, and the running time of the algorithm, respectively; the rows represent cellular and non-cellular superpixels. Based on these results, the classification performance of the proposed CNN algorithm was 0.9451 for cellular superpixels and 0.9917 for non-cellular superpixels. The overall accuracy of the system was observed to be 0.9876. In terms of running time, the overall classification with the proposed CNN method took approximately 7750 seconds. The number of iterations and the number of epochs are crucial for the CNN algorithm to model the data and classify it accurately. As the number of iterations and the number of epochs in each iteration increase, the ability of the CNN algorithm to learn the data increases. This is also important for showing that the results of the proposed algorithm are not accidental.
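The reported figures are mutually consistent, which can be verified with a small back-calculation; the cellular-superpixel fraction below is not stated in the paper but is implied by the reported accuracies:

```python
# Per-class accuracies and overall accuracy reported in Table I:
acc_cell, acc_noncell, overall = 0.9451, 0.9917, 0.9876

# Overall accuracy is the class-prior-weighted mix of per-class accuracies:
#   overall = p * acc_cell + (1 - p) * acc_noncell
# Solving for the implied fraction p of cellular superpixels:
p = (acc_noncell - overall) / (acc_noncell - acc_cell)
print(round(p, 3))  # implied cellular fraction, about 0.088
```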
All experiments in this paper were implemented on a workstation with a 64-bit Windows 10 operating system, an Intel(R) Core(TM) i5-6200K CPU at 2.30 GHz, 8 GB of RAM, an NVIDIA GeForce 940M graphics card, and MATLAB version 2018a.
IV. CONCLUSION AND FUTURE WORKS
In this study, the SLIC superpixel segmentation method and a convolutional neural network have been combined for segmenting cellular structures in high-resolution histopathological images. The proposed study is composed of two stages: first, the SLIC superpixel spatial clustering method was used as a pre-segmentation algorithm to separate cellular superpixels from non-cellular superpixels; then, a CNN-based deep learning algorithm was used to classify those superpixels in order to obtain the final segmentation of the whole image. Kidney renal cell carcinoma histopathological images from The Cancer Genome Atlas (TCGA) data portal were used in this study to evaluate the success of the proposed method. The overall accuracy of the system was observed to be 0.9876. The proposed CNN algorithm was trained and tested with five iterations, 100 epochs, and 5-fold cross-validation in each iteration. The results showed that combining the SLIC superpixel segmentation algorithm with the CNN algorithm can produce accurate results in the segmentation of histopathological images. Although the use of the CNN algorithm may be slower than other algorithms (such as thresholding-based or clustering-based algorithms) due to the long duration of the training phase, this can be compensated in the test phase. A more detailed study comparing the combination of superpixels and the CNN with other well-known segmentation algorithms would be a good follow-up for comparing time and segmentation performance. In addition, the proposed algorithm can be improved by performing a more detailed analysis of the number of superpixels and of the m parameter, which affects the shape of each superpixel.
REFERENCES
[1] S. Biswas and D. Ghoshal, “Blood cell detection using thresholding
estimation based watershed transformation with sobel filter in frequency
domain,” Procedia Computer Science, vol. 89, pp. 651–657, 2016.
[2] K. Y. Win and S. Choomchuay, “Automated segmentation of cell nuclei
in cytology pleural fluid images using otsu thresholding,” in International Conference on Digital Arts, Media and Technology, ICDAMT’17. IEEE, 2017, pp. 14–18.
[3] G. Wu, X. Zhao, S. Luo, and H. Shi, “Histological image segmenta-
tion using fast mean shift clustering method,” Biomedical Engineering
Online, vol. 14, no. 1, p. 24, 2015.
[4] A. Belsare, M. Mushrif, M. Pangarkar, and N. Meshram, “Breast
histopathology image segmentation using spatio-colour-texture based
graph partition method,” Journal of Microscopy, vol. 262, no. 3, pp.
260–273, 2016.
[5] Z. Ren, S. Gao, L.-T. Chia, and I. W.-H. Tsang, “Region-based saliency
detection and its application in object recognition,” IEEE Transactions
on Circuits and Systems for Video Technology, vol. 24, no. 5, pp. 769–
779, 2014.
[6] D. Mahapatra and J. M. Buhmann, “A field of experts model for optic
cup and disc segmentation from retinal fundus images,” in IEEE 12th
International Symposium on Biomedical Imaging, ISBI’15. IEEE, 2015,
pp. 218–221.
[7] S. Thorat and S. Jadhav, “Optic disc and cup segmentation for glaucoma
screening based on superpixel classification,” Int. Journal of Advanced
Computer Science and Applications, vol. 4, pp. 167–72, 2015.
[8] A. Albayrak and G. Bilgin, “Superpixel approach in high resolution
histopathological image segmentation,” in IEEE 25th Signal Processing
and Communications Applications Conference, SIU’17. IEEE, 2017,
pp. 1–4.
[9] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi,
M. Ghafoorian, J. A. van der Laak, B. van Ginneken, and C. I. Sánchez,
“A survey on deep learning in medical image analysis,” Medical Image
Analysis, vol. 42, pp. 60–88, 2017.
[10] E. Ahmed, M. Jones, and T. K. Marks, “An improved deep learning
architecture for person re-identification,” in IEEE Conference on Com-
puter Vision and Pattern Recognition, CVPR’15, 2015, pp. 3908–3916.
[11] I. Higgins, L. Matthey, X. Glorot, A. Pal, B. Uria, C. Blundell,
S. Mohamed, and A. Lerchner, “Early visual concept learning with
unsupervised deep learning,” arXiv preprint arXiv:1606.05579, 2016.
[12] C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “Pointnet: Deep learning on
point sets for 3d classification and segmentation,” IEEE Conference on
Computer Vision and Pattern Recognition, CVPR’17, vol. 1, no. 2, p. 4,
2017.
[13] Y. Guo, Y. Liu, A. Oerlemans, S. Lao, S. Wu, and M. S. Lew, “Deep
learning for visual understanding: A review,” Neurocomputing, vol. 187,
pp. 27–48, 2016.
[14] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk,
“Slic superpixels compared to state-of-the-art superpixel methods,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 34,
no. 11, pp. 2274–2282, 2012.
[15] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning
applied to document recognition,” Proceedings of the IEEE, vol. 86,
no. 11, pp. 2278–2324, 1998.
[16] N. Hatipoglu and G. Bilgin, “Cell segmentation in histopathological
images with deep learning algorithms by utilizing spatial relationships,”
Medical & Biological Engineering & Computing, vol. 55, no. 10, pp.
1829–1848, 2017.
[17] Y. Lv, Y. Duan, W. Kang, Z. Li, and F.-Y. Wang, “Traffic flow
prediction with big data: a deep learning approach,” IEEE Transactions
on Intelligent Transportation Systems, vol. 16, no. 2, pp. 865–873, 2015.
[18] H. Irshad, L. Montaser-Kouhsari, G. Waltz, O. Bucur, J. Nowak, F. Dong,
N. W. Knoblauch, and A. H. Beck, “Crowdsourcing image annotation
for nucleus detection and segmentation in computational pathology:
evaluating experts, automated methods, and the crowd,” in Pacific
Symposium on Biocomputing, PB’14. NIH Public Access, 2014, pp.
294–305.