Citation: Ahmed, I.A.; Senan, E.M.; Rassem, T.H.; Ali, M.A.H.; Shatnawi, H.S.A.; Alwazer, S.M.; Alshahrani, M. Eye Tracking-Based Diagnosis and Early Detection of Autism Spectrum Disorder Using Machine Learning and Deep Learning Techniques. Electronics 2022, 11, 530. https://doi.org/10.3390/electronics11040530

Academic Editors: Cecilia Di Ruberto, Alessandro Stefano, Albert Comelli, Lorenzo Putzu and Andrea Loddo

Received: 15 January 2022; Accepted: 6 February 2022; Published: 10 February 2022

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Eye Tracking-Based Diagnosis and Early Detection of Autism
Spectrum Disorder Using Machine Learning and Deep
Learning Techniques
Ibrahim Abdulrab Ahmed 1,*,†, Ebrahim Mohammed Senan 2,*,†, Taha H. Rassem 3,*,†, Mohammed A. H. Ali 4,†, Hamzeh Salameh Ahmad Shatnawi 1, Salwa Mutahar Alwazer 1 and Mohammed Alshahrani 1

1 Computer Department, Applied College, Najran University, Najran 66462, Saudi Arabia; hsshatnawi@nu.edu.sa (H.S.A.S.); smalwazer@nu.edu.sa (S.M.A.); moaalshahrani@nu.edu.sa (M.A.)
2 Department of Computer Science & Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad 431004, India
3 Faculty of Science and Technology, Bournemouth University, Poole BH12 5BB, UK
4 Department of Mechanical Engineering, Faculty of Engineering, University of Malaya, Kuala Lumpur 50603, Malaysia; hashem@um.edu.my
* Correspondence: iaalqubati@nu.edu.sa (I.A.A.); senan1710@gmail.com (E.M.S.); tahahussein@ieee.org (T.H.R.)
† These authors contributed equally to this work.
Abstract:
Eye tracking is a useful technique for detecting autism spectrum disorder (ASD). Atypical visual attention is one of the most important behavioural markers of ASD. The
eye-tracking technique provides useful information about children’s visual behaviour for early
and accurate diagnosis. It works by scanning the paths of the eyes to extract a sequence of eye
projection points on the image to analyse the behaviour of children with autism. In this study, three
artificial-intelligence techniques were developed, namely, machine learning, deep learning, and a
hybrid technique between them, for early diagnosis of autism. The first technique, neural networks
[feedforward neural networks (FFNNs) and artificial neural networks (ANNs)], is based on feature
classification extracted by a hybrid method between local binary pattern (LBP) and grey level co-
occurrence matrix (GLCM) algorithms. This technique achieved a high accuracy of 99.8% for FFNNs
and ANNs. The second technique used a pre-trained convolutional neural network (CNN) model,
such as GoogleNet and ResNet-18, on the basis of deep feature map extraction. The GoogleNet and
ResNet-18 models achieved high performances of 93.6% and 97.6%, respectively. The third technique
used the hybrid method between deep learning (GoogleNet and ResNet-18) and machine learning
(SVM), called GoogleNet + SVM and ResNet-18 + SVM. This technique depends on two blocks. The
first block used CNN to extract deep feature maps, whilst the second block used SVM to classify the
features extracted from the first block. This technique proved its high diagnostic ability, achieving
accuracies of 95.5% and 94.5% for GoogleNet + SVM and ResNet-18 + SVM, respectively.
Keywords: autism spectrum disorder; eye tracking; machine learning; neural networks; convolutional neural network; GLCM; local binary pattern
1. Introduction
Autism spectrum disorder (ASD) is a neurodevelopmental disorder that affects children who manifest heterogeneous characteristics, such as differences in behaviour, problems communicating with others and social disabilities [1]. According to the World Health Organization, ASD affects one in 160 children worldwide [2]. It appears in childhood and persists into adulthood. ASD is associated with many factors, such as genetic, cognitive and neurological factors [3]. Symptoms appear in childhood, but the diagnosis of ASD is often not made until 2–3 years after the onset of symptoms, usually at the age of 4 years [4]. Detecting autism is a difficult task that requires effort and a long period
to improve cases. For early detection of autism, many behavioural and physiological
techniques have been used to identify autism effectively and accurately in children [
5
].
Predictive indicators that inform parents very early about the behaviour, physiological status and course of their children are also needed, along with informing scientific research centres so that appropriate solutions and treatments can be found.
Although clinical and physiological characteristics are not identified early, some vital behavioural characteristics have a high ability to indicate autism and its severity. Eye-tracking technology is one of the most important and promising indicators for ASD because it is fast, inexpensive, easy to analyse and applicable at all ages. Eye movement tracking is an investigational procedure that generates, tracks and captures points and calculates eye movement through these points. Many studies have demonstrated that eye movements have a strong effect on the response to visual and verbal cues as biomarkers of ASD [6]. Some studies have also shown that early detection of ASD by tracking eye movements correlates with clinical testing [7]. Some of these correlations are due to genetic factors [8]. In addition, diagnosis by eye tracking is useful in the short-term detection of children with ASD.
Eye tracking is a sensitive tool for examining behaviour and customising eyesight to process a range of information about visual stimuli [9]. In previous years, researchers focused on diagnosing ASD through eye tracking and on the biological and behavioural patterns of eye movement, especially in children exposed to multiple developmental disorders, including ASD [10]. Eye-tracking technology, as a biomarker for assessing children with autism, has many advantages. Firstly, it provides ease of eye tracking for young children, which means early detection of autism risks. Secondly, eye-tracking data provide a range of information that is used as biomarkers, which indicate atypical visual focus [11]. Thirdly, eye-tracking technology is an easy and straightforward measure that is related to the screening tools used to diagnose ASD [12]. Thus, eye-tracking models that measure non-social and social orientation performance have significant correlations in ASD. These models take advantage of social attention deficits to detect ASD at an early age. However, evidence regarding positive eye-tracking outcomes as a strong predictor of long-term outcomes in children with ASD is lacking.
One of the greatest challenges in ASD is the heterogeneous response to treatment and the search for effective treatments to improve responses in children with autism. For this reason, researchers and experts have set out to develop new treatment methods that target children who are less responsive to the currently available treatments. In this study, artificial intelligence systems were developed to enhance the early detection of autism on an eye-tracking dataset containing two classes; children in the first class (ASD) tend to show lower social visual attention (SVA) than children in the second class, typical development (TD).
The major contributions in this work are as follows:
- This research aimed to diagnose an ASD dataset and distinguish cases of autism from cases of TD.
- This research also aimed to enhance images, remove all noise from the eye-tracking path area and extract the paths of eye points falling on the image with high efficiency using overlapping filters.
- The most important representative features from the areas of the eye tracks were extracted using local binary pattern (LBP) and grey level co-occurrence matrix (GLCM) algorithms. The features of the two methods were merged into one vector, called hybrid, and classified using two classifiers, FFNN and ANN.
- The dataset was balanced; also, the parameters of the GoogleNet and ResNet-18 deep learning models were adjusted and modified to extract the deep feature maps for diagnosing autism with high efficiency.
- A hybrid technology between deep learning models (GoogleNet and ResNet-18) and machine learning algorithms (SVM) was developed to obtain superior results for ASD diagnosis.
- Artificial intelligence techniques, namely, machine learning, deep learning and hybrid techniques, could help experts and autism treatment centres in the early detection of autism in children.
The rest of the paper is organised as follows: Section 2 describes a set of relevant previous studies. Section 3 presents the materials and methods for the ASD dataset and contains subsections for the three proposed systems. Section 4 presents the results achieved by machine learning, deep learning and the hybrid technique between them. Section 5 provides a discussion and comparison between the proposed systems. Section 6 concludes the paper.
2. Related Work
Here, a set of recent relevant previous studies is reviewed.
Lord et al. presented a method called the autism diagnostic observation schedule for assessing autism. It is a set of features and unstructured observational tasks in which the doctor evaluates the response of children through some desirable and undesirable situations, with a focus on behaviour that indicates autism [12]. Schopler et al. presented a scale for assessing autism through the child autism rating scale, which is based on the doctor's assessment of children's behaviour through two scales, a social communication questionnaire and a social responsiveness scale; it is considered a more reliable measure [13]. Moore et al. introduced a system for tracking eye data, extracting features and training them on machine learning classifiers. The study was applied to 71 people divided into 31 with autism and 40 controls. The authors used different stimuli to evaluate the performance of the system. The system achieved an accuracy of 74% [14]. Thorup et al. presented two eye-tracking behaviours, namely, referential looking and joint attention, for autism detection. Recent studies using dynamic stimuli indicated a difference between eye and head movement during the application of joint attention tasks; this difference is a unique characteristic of people with autism. The results concluded that relying on the eye and head signals together is better than relying on eye movement alone. Eye movement tracking is a powerful visual attention assessment tool for understanding the behaviour of people with autism [15]. Jones et al. investigated differential attention and visual fixation with characteristic features of visual stimuli. They focused on distinguishing between children with ASD and those without, rather than determining the severity of the autism spectrum [16]. Bacon et al. revealed that biomarkers, such as genetics, physiological behaviour, neurodevelopment and eye tracking, contribute to the diagnosis of autism. Eye tracking was used to assess social stimuli in several samples, such as children with ASD, Down syndrome, Rett syndrome and Williams syndrome [17].

Mazumdar et al. presented a method that was based on extracting and classifying eye-tracking features through machine learning algorithms. The features were extracted from display behaviour, image content and scene centres. The system achieved high performance in distinguishing children with autism spectrum disorder from typically developing children [18]. Belen et al. presented the EyeXplain Autism method, which enables clinicians to track eyes, analyse data and interpret data extracted by a DNN [19]. Oliveira et al. proposed a computational method that was based on integrating the concepts of visual attention and artificial intelligence techniques through the analysis of eye-tracking data. These data were categorised by a machine learning algorithm and the system reached an accuracy of 90% [20]. Li et al. proposed a sparsely grouped input variables for neural network (SGIN) method for identifying stimuli that differentiate grouping with clinical features [21]. Yaneva et al. presented an approach to detecting autism in adults by eye tracking. Eye movements were recorded, and machine learning algorithms were trained to detect autism. Effects were detected based on eyesight and other variables. The system achieved an accuracy of 74% [22].
3. Materials and Methods
In this section, techniques, methods and materials for evaluating an ASD dataset
were analysed, as shown in Figure 1. Data were collected from children with autism spectrum disorder and from typically developing children. All images were subjected to pre-processing.
After optimised images were obtained, three techniques were applied. Firstly, a Neural
Networks technique based on extracting features from the segmented eye-tracking area
was applied using a snake model. Then, features were extracted by LBP and GLCM hybrid
methods. Subsequently, the features of the two methods were combined into one vector
for each image and diagnosed using ANN and FFNN algorithms. The second technique
for detecting ASD used convolutional neural network (CNN) models, such as GoogleNet
and ResNet-18. The third technique was a hybrid technique between the two techniques of
machine learning (SVM) and deep learning (GoogleNet and ResNet-18), which are called
GoogleNet + SVM and ResNet-18 + SVM, one of the contributions of this study.
Figure 1. Methodology for diagnosing ASD datasets by using the proposed systems.
3.1. Dataset
In this work, the dataset from the Figshare data repository was used; these images were collected and prepared by Carette et al. [23]. The dataset contains 547 images of children divided into two classes: ASD, which contains 219 images, and TD, which contains 328 images. In addition, images were collected from 59 children as follows: 29 children with ASD (25 males and 4 females) and 30 children with TD (13 males and 17 females) [24]. Figure 2 depicts samples from the ASD and TD dataset.
Figure 2. Samples from the ASD and TD dataset [24]. (a) ASD original images; (b) pre-processed ASD images; (c) TD original images; (d) pre-processed TD images.
3.2. Average and Laplacian Filters
Most images contain noise caused by many factors, whether when taking images or
storing them. This noise must be treated to obtain important and accurate features. Image
optimisation is the first step in image processing to repair damaged features that have
an effect on the diagnostic process. Several filters are used to enhance images, remove
noise and increase edge contrast. In this work, all images were enhanced by average and
Laplacian filters. Firstly, an average filter was applied to all the images. The filter works
at a size of 5 × 5 and keeps moving on the image until the whole image is processed and
smoothed by reducing the differences between the different pixels and replacing each
central pixel, with the average value of the adjacent 24 pixels. Equation (1) describes the
mechanism of how to smooth an image with an average filter [25].
f(L) = \frac{1}{L} \sum_{i=0}^{L-1} y(L-1)  (1)

where f(L) is the enhanced image (output), y(L − 1) is the previous input and L is the number of pixels in the average filter.
Secondly, a Laplacian filter, which detects edges and shows the edges of scenes taken
from eye tracking, was applied. Equation (2) shows how the Laplacian filter works on
the image.
\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}  (2)

where \nabla^2 f represents the second-order differential of the image and (x, y) are the coordinates in the two-dimensional matrix.
Finally, a fully optimised image was obtained by subtracting the image optimised by
the Laplacian filter from the resultant image by the average filter, as in Equation (3).
\mathrm{Image}_{enhanced} = f(L) - \nabla^2 f  (3)
Figure 2b shows some samples of the data set after the enhancement process.
All images were enhanced to be inputted into the following three proposed systems.
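As an illustration of this pipeline, the two filters and the subtraction of Equation (3) can be sketched as follows (a minimal example assuming OpenCV and NumPy; the 5 × 5 kernel follows the text, everything else is illustrative rather than the authors' implementation):

```python
import cv2
import numpy as np

def enhance(image: np.ndarray) -> np.ndarray:
    """Smooth with a 5x5 average filter, detect edges with a Laplacian,
    then subtract the Laplacian from the smoothed image (Equation (3))."""
    smoothed = cv2.blur(image, (5, 5))                  # Equation (1): 5x5 mean filter
    laplacian = cv2.Laplacian(smoothed, cv2.CV_64F)     # Equation (2): second derivatives
    enhanced = smoothed.astype(np.float64) - laplacian  # Equation (3)
    return np.clip(enhanced, 0, 255).astype(np.uint8)
```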
3.3. First Proposed System Using Neural Networks Techniques
3.3.1. Snake Algorithm (Segmentation)
The snake algorithm is one of the latest algorithms for appropriate segmentation: it identifies the region of interest (ROI) and isolates it from the rest of the image for further analysis. The algorithm moves along the edges of the ROI, where the curve C is represented by setting the level-set function φ, and the model starts from zero at the first boundary region of the ROI I [26]; φ represents the lesion region. The curve C divides each subregion f_k into two subregions, inside and outside f, with respect to φ, as shown in Equation (4):

\mathrm{inside}(C) = f^{+} = \{ x \in f_k : \varphi(x) > 0 \}  (4)
\mathrm{outside}(C) = f^{-} = \{ x \in \Omega \setminus f_k : \varphi(x) < 0 \}

where x represents the sub-regions belonging to the lesion region.
The snake model begins with a contour growing inward. ROI boundaries are determined by an initial contour determination and an eye-tracking scene map calculation. The model moves within the ROI boundary when φ > 0 is set, whilst the area outside the ROI is calculated by subtracting the currently calculated subregion from the previously calculated subregion, as shown in Equation (5):

f_1 = f_0 - f_1, \quad f_2 = f_1 - f_2, \quad f_3 = f_2 - f_3  (5)

The outer sub-area is calculated using Equation (6):

f_k = f_{k-1} - f_k  (6)
The segmentation algorithm is applied by the snake model, which moves toward the boundary of the object through the external energy when the level set is set to zero. Equation (7) describes the energy function of the φ function:

f_{spq}(\varphi) = \lambda L_{spq}(\varphi) + \nu A_{spq}(\varphi)  (7)

where ν is a constant and λ > 0. The terms L_{spq}(\varphi) and A_{spq}(\varphi) are defined in Equations (8) and (9):

L_{spq}(\varphi) = \int spq(I) \, \delta_{\varepsilon}(\varphi) \, |\nabla \varphi| \, dx  (8)

A_{spq}(\varphi) = \int spq(I) \, H_{\varepsilon}(-\varphi) \, dx  (9)
where H_ε(φ) indicates the Heaviside function and δ_ε(φ) indicates the univariate Dirac delta function. When the zero-level curve C is pushed to a smooth plane, L_{spq}(φ) is reduced. The energy in spq(I) speeds up the moving curves and defines the boundaries of a scene for eye tracking. The parameter ν of A_{spq}(φ) takes a positive or negative value depending on where the snake model lies relative to an ROI. When the value of ν is positive, the snake model is located outside the ROI; when the value of ν is negative, the snake model is located inside the ROI, which speeds up the determination of the scene region. The spq(I) function is a mathematical expression that takes two values [1, −1], either inside or outside an ROI. When set to a value of 1, the snake model is outside the object and modifies the compressive force
to shrink the contour; when set to −1, the contour expands when inside the object, as shown in Equation (10):

spq(I) = \begin{cases} \dfrac{(I(x) - I_{GFI}) \, M_k}{\max(|I(x) - I_{GFI}|)}, & I(x) \neq 0 \\ 0, & I(x) = 0 \end{cases}  (10)
The H_ε(φ) and δ_ε(φ) functions are the smoothed part of the entire image, as calculated using Equations (11) and (12):

H_{\varepsilon}(z) = \frac{1}{2}\left(1 + \frac{2}{\pi} \arctan\frac{z}{\varepsilon}\right)  (11)

\delta_{\varepsilon}(z) = \frac{\varepsilon}{\pi (z^2 + \varepsilon^2)}  (12)
The algorithm stops when the pixel values are similar between two consecutive contours; this point is called the algorithm stop point, as in Equation (13). If

\sum_{i=0}^{row} \sum_{j=0}^{col} M^k_{i,j} < \frac{\text{stopping value}}{100} \sum_{i=0}^{row} \sum_{j=0}^{col} oldM^k_{i,j},  (13)

then the snake model stops running. Here, oldM^k_{i,j} indicates the last computed mask for the snake model; M^k_{i,j} refers to the new mask of the snake model; and row and col are the maximum numbers of rows and columns in the images. The model stops moving between the value range of 98 and 100 obtained by calculating the average intensity of the initial contour. Figure 3 describes samples from the dataset after segmentation and ROI selection along the eye-tracking path.
Figure 3. ASD dataset after segmentation and selection of ROI.
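The evolution described by Equations (4)–(13) is a level-set active contour; a hedged sketch of ROI extraction with an off-the-shelf morphological geodesic active contour (scikit-image; the file name, iteration count and balloon force are assumptions, not the authors' settings) might look like this:

```python
import numpy as np
from skimage import color, io
from skimage.segmentation import (inverse_gaussian_gradient,
                                  morphological_geodesic_active_contour)

# Edge-stopping map: values close to zero near strong edges
image = color.rgb2gray(io.imread("scanpath.png"))  # hypothetical file name
gimage = inverse_gaussian_gradient(image)

# Initial level set covering the frame; the contour shrinks (balloon < 0)
# until it locks onto the scan-path boundary, mirroring the phi > 0 /
# phi < 0 inside-outside split of Equation (4).
init_ls = np.zeros(image.shape, dtype=np.int8)
init_ls[10:-10, 10:-10] = 1

mask = morphological_geodesic_active_contour(gimage, 200,
                                             init_level_set=init_ls,
                                             smoothing=1, balloon=-1)
```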
3.3.2. Morphological Method
The morphological method provides further image optimisation after segmentation. The segmentation process leaves holes that do not belong to the ROI, and these holes must be removed. Many morphological operations, such as erosion, dilation, opening and closing, create structuring elements of a specific size, move them to every position of the image and replace the target pixel with suitable pixels on the basis of adjacent pixels. In this study, two methods were used. The first method is the adjacent union test, which is called 'fits'. The second method, called 'hits', tests the adjacent intersection. The morphological processes produced improved binary images, as shown in Figure 4, which illustrates samples of the dataset before and after the morphological process [27].
Figure 4. Some images of the dataset before and after the morphological method. (a) ASD class, (b) TD class.
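As a sketch, this kind of hole removal is commonly expressed as an opening followed by a closing with a small structuring element (assuming OpenCV; the input file name and the 3 × 3 kernel are illustrative):

```python
import cv2

binary = cv2.imread("segmented_roi.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

# Opening (erosion then dilation) removes specks outside the ROI;
# closing (dilation then erosion) fills holes left inside it.
opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
cleaned = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)
```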
3.3.3. Feature Extraction
In this work, the most important representative features from the ROI were extracted
by two algorithms, LBP and GLCM. Then, the extracted features from the two algorithms
were combined with each other to produce strong representative features. Extracting
the features from several methods and combining them are two of the most important
recent methods that have an effect on accurate diagnosis. The LBP algorithm is one of
the methods to extract features. In this study, the LBP was set to a size of 4 × 4. The method selects the central pixel (g_c) one at a time, determines the neighbouring pixels (g_p), which are 15 pixels, and replaces the central pixel in accordance with Equation (14). Thus, each central pixel was replaced on the basis of its adjacent pixels, and the process was repeated until all pixels of the image were processed; 203 features were extracted for each image.

LBP_{R,P}(x_c, y_c) = \sum_{p=0}^{P-1} s(g_p - g_c) \cdot 2^p  (14)

where P represents the number of neighbouring pixels and the binary threshold function s(x) is determined as in Equation (15):

s(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0 \end{cases}  (15)
The features were then extracted by the GLCM algorithm, which captures the different grey levels in the ROI. The algorithm extracted features from the region of the eye-tracking track and collected spatial information that determines the relationship between the centre pixel and adjacent pixels in accordance with a distance d and an angle θ. The four representations of the angle are 0°, 45°, 90° and 135°; the value of d is 1 when the angle is θ = 0° or θ = 90°, and d = √2 when the angle is θ = 45° or θ = 135°. The GLCM algorithm produced 13 essential features for each image. Figure 5 describes the hybridisation of features extracted by the LBP and GLCM algorithms.
Figure 5. Hybrid LBP and GLCM algorithms.
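A hedged sketch of building this hybrid vector with scikit-image's LBP and GLCM utilities follows; the radius, neighbour count and the four GLCM properties are illustrative choices, whereas the paper reports 203 LBP and 13 GLCM features per image:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern

def hybrid_features(gray: np.ndarray) -> np.ndarray:
    """Concatenate an LBP histogram and GLCM statistics into one 'hybrid' vector."""
    # LBP with P neighbours on a circle of radius R (uniform patterns)
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=np.arange(11), density=True)

    # GLCM at distance d = 1 and the four angles 0, 45, 90 and 135 degrees
    glcm = graycomatrix(gray, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    glcm_stats = [graycoprops(glcm, prop).mean()
                  for prop in ("contrast", "homogeneity", "energy", "correlation")]

    return np.concatenate([lbp_hist, glcm_stats])
```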
3.3.4. Classification
In this section, the ASD dataset was evaluated based on two neural network algorithms,
namely, an ANN and an FFNN.
ANN and FFNN Algorithms
An ANN is a computerised neural network that consists of an input layer with many neurons; many hidden layers of interconnected neurons, in which many complex arithmetic operations are performed to solve the problem at hand; and an output layer that contains as many neurons as there are classes to be classified [28]. The algorithm analyses and interprets large and complex data to produce clear patterns. Each neuron is connected to the others by specific weights w that have a role in reducing the error between the predicted and actual output. The ANN algorithm updates the weights in each iteration until a minimum squared error is obtained between the actual output X and the predicted output Y, as described by Equation (16):

MSE = \frac{1}{n} \sum_{i=1}^{n} (X_i - Y_i)^2  (16)

where n is the number of data points.
In this study, an ANN was evaluated on the ASD dataset. A total of 216 features
(neural cells) were inputted into the input layer, trained through 10 interconnected hidden
layers with certain weights, and then fed to the output layer that contains two classes
(neurons), ASD and TD. Figure 6 describes the ANN architecture for the ASD dataset, in which 216 features were entered and processed through 10 layers, and two classes were produced.
Figure 6. Architecture of the ANN and FFNN algorithms for ASD dataset.
An FFNN is a computational neural network for solving complex problems, similar in structure to the ANN algorithm, with an input layer, hidden layers and an output layer. All neurons in the hidden layers are interconnected from one layer to the next by connections called weights w. The working mechanism of the algorithm is to feed the neurons in the forward direction (the forward stage), where the output of each neuron is calculated on the basis of the weights obtained from the previous layer. The weights are then adjusted and updated from the hidden layer to the output layer [29]. The minimum squared error is calculated between the actual and predicted output, and the process continues until the minimum error is obtained, as in the above equation.
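A minimal sketch of such a classifier with scikit-learn is shown below; the 216-input/two-class shape follows the text, whilst a single hidden layer of 10 units stands in for the 10-layer description, and the data are placeholders:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Placeholder data standing in for the 547 hybrid LBP+GLCM vectors
X = np.random.rand(547, 216)
y = np.random.randint(0, 2, 547)  # 0 = TD, 1 = ASD

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, stratify=y)

clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500)  # 216 -> 10 -> 2
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```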
3.4. CNN Models
CNNs are a modern type of neural network that arises as a variant of the multilayer perceptron. They are designed to process two-dimensional data, especially medical images and audio signals converted into spectrograms, and can also be modified and adapted to perform tasks in one or several dimensions. The idea began to develop with
Kunihiko Fukushima, who developed the neocognitron, a hierarchical neural network that mimics the processing of the visual cortex [30].
A convolution layer is a kind of linear operation between the filter and the image; in other words, it is the operation of two functions, x(t) and w(t), denoted by (x ∗ w)(t) or s(t), as in Equation (17). There are three important parameters for convolutional layers: the filter, the stride p and zero-padding.

s(t) = (x * w)(t) = \int x(a) \, w(t-a) \, da  (17)

x(t) is the input, w(t) is the filter and s(t) is the output of the convolutional layer, called the deep feature map. If t is an integer value and w is defined only on integer values, then, as in Equation (18):

s(t) = (x * w)(t) = \sum_{a} x(a) \, w(t-a)  (18)
One of the essential things that we must keep in mind when implementing a CNN is the dimension of the input images or signals we are working on; the filter must be adapted to the dimensions of the input (three dimensions in the case of colour images, two dimensions in the case of black-and-white images). In the two-dimensional case, the convolution formula, with input I and kernel K, is as in Equation (19):

s(i, j) = (I * K)(i, j) = \sum_{m} \sum_{n} I(m, n) \, K(i-m, j-n)  (19)
Colour images have three dimensions, including three two-dimensional layers (RGB
channels). In this case, convolution layers consist of three 2D convolutions, one for the red
layer R, one for the green layer G, and one for the blue layer B, and adding the results. One
problem with p-step convolution is matching the input dimensions to the base dimension.
One of the solutions is the zero-padding process (filling with zeros), to increase the input
dimensions; the zero padding is adding a column of zeros to the right, left, bottom or
top when necessary and as needed. Usually, the dimensions of the resulting convolution
operation are less than the first input, which is why the zero-padding operation is used
to equalise the dimensions, or it is handled by preserving the input edges of the original
image. A convolutional layer consists of implementing multiple convolutions, adding bias
values to all inputs, and obtaining deep feature maps [31].
After convolutional layer operations, ReLU (Rectified Linear Unit) layers are used for
further processing of the input. The purpose of this layer is to pass positive values and
suppress negative values and convert them to zero. Equation (20) describes the working
mechanism of the ReLU layer.
ReLU(x) = \max(0, x) = \begin{cases} x, & x \ge 0 \\ 0, & x < 0 \end{cases}  (20)
Convolutional layers produce millions of parameters that cause overfitting. A dropout
layer, which stops 50% of the neurons in each iteration and passes the other 50% and so on,
is used to solve this problem. In the present work, this percentage was manually set to 50%.
However, this layer doubled the training time of the network.
After a convolutional layer is executed, it produces large dimensions. For the accelera-
tion process, reducing the dimensions is necessary and this task is performed by pooling.
Pooling layers interact inside the CNN in the same manner as the convolutional layer and
they perform small operations in the input matrix areas. Pooling layers have two methods
for reducing dimensions: max pooling and average pooling. In the max pooling method,
the max value is chosen from amongst the set of values specified in the matrix, as described
in Equation (21). In the average pooling method, the set of values specified in the matrix is
averaged and the matrix is represented by the average value, as described in Equation (22).
P(i, j) = \max_{m,n = 1, \dots, k} A[(i-1)p + m; \, (j-1)p + n]  (21)

P(i, j) = \frac{1}{k^2} \sum_{m,n = 1, \dots, k} A[(i-1)p + m; \, (j-1)p + n]  (22)

where A represents the matrix; m, n are the dimensions of the matrix; k is the pooling window size; and p is the stride.
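For illustration, Equations (20)–(22) correspond directly to standard framework layers (a sketch assuming PyTorch; the tensor shape and 2 × 2 windows are illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 8, 32, 32)                     # a batch of deep feature maps
relu = nn.ReLU()                                  # Equation (20)
max_pool = nn.MaxPool2d(kernel_size=2, stride=2)  # Equation (21)
avg_pool = nn.AvgPool2d(kernel_size=2, stride=2)  # Equation (22)

y = max_pool(relu(x))  # negatives zeroed, then 32x32 -> 16x16
```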
The last layer in CNNs is the fully connected layer. This layer is characterised by the
connection of each neuron with the other neurons. In this layer, the deep feature maps
are converted from 2D to a 1D (unidirectional) representation. This layer is responsible for classifying all the inputs into the appropriate classes. The number of fully connected layers varies from one CNN model to another, and more than one fully connected layer can be used in the same network. CNN models take a long time to train. Following the last fully connected layer is the SoftMax activation function, a non-linear function that produces the two classes, ASD and TD, for the ASD dataset. Equation (23) describes how the SoftMax function works:

y(x_i) = \frac{\exp(x_i)}{\sum_{j=1}^{n} \exp(x_j)}  (23)

y(x) represents the SoftMax function, with a value between 0 ≤ y(x) ≤ 1.
This section focuses on two CNN models, GoogLeNet and ResNet-18.
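A hedged sketch of preparing these two pre-trained models for the two-class ASD/TD task (assuming PyTorch/torchvision; only the replaced classification head reflects the paper's setup, the rest is illustrative):

```python
import torch.nn as nn
from torchvision import models

# Load ImageNet-pretrained backbones and replace the classifier head
# with a two-class (ASD vs. TD) output, as described in the text.
googlenet = models.googlenet(weights=models.GoogLeNet_Weights.DEFAULT)
googlenet.fc = nn.Linear(googlenet.fc.in_features, 2)

resnet18 = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
resnet18.fc = nn.Linear(resnet18.fc.in_features, 2)
```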
3.4.1. CNN GoogLeNet Model
The GoogLeNet model is a type of CNN used for classification, pattern recognition and many computer vision tasks. The network contains 22 deep layers (27 layers including pooling layers). It is distinguished by its computational efficiency and by greatly reducing dimensions whilst preserving important information. The first convolutional layer works with a 7 × 7 filter, which is large compared to other filter sizes and greatly reduces the dimensions of the input images: all 49 pixels are represented by one pixel whilst preserving the important information. The second convolutional layer further reduces the dimensions (size) of the image. The network has three max pooling layers of size 3 × 3 that down-sample the input by reducing the height and width of the image. GoogLeNet also contains one average pooling layer of size 7 × 7 that significantly reduces the dimensions of the input image [32]. GoogLeNet contains 7 million parameters, as described in Table 1. Figure 7 describes the GoogLeNet infrastructure for diagnosing ASD.
Table 1. The number of parameters per layer in the GoogLeNet model.
Layers Parameters
Conv1 9000
Conv2 115 K
Inception3a 164 K
Inception3b 389 K
Inception4a 376 K
Inception4b 449 K
Inception4c 510 K
Inception4d 605 K
Inception4e 868 K
Inception5a 1 M
Inception5b 1 M
FC8 1 M
Total 7 M
Figure 7. Structure of GoogLeNet model.
3.4.2. CNN ResNet-18 Model
The ResNet-18 model is a type of deep feature extraction CNN. ResNet-18 belongs
to the ResNet-xx family of networks. The ResNet-18 network consists of 18 deep layers
divided into five convolutional layers for extracting deep feature maps, a ReLU layer,
one average pooling layer for reducing image dimensions and a fully connected layer for
converting feature maps from 2D to 1D and classifying all inputted images represented by
feature vectors into their appropriate class [33]. The SoftMax activation is a function that
classifies the dataset into two classes, ASD and TD. Figure 8describes the architecture of
the ResNet-18 model, which contains many layers and more than 11.5 million parameters
as described in Table 2.
Figure 8. Structure of ResNet-18 model.
Table 2. The number of parameters per layer in the ResNet-18 model.
Layers Parameters
Conv1 9472
conv2.1 36,928
conv2.2 36,928
conv2.3 36,928
conv2.4 36,928
conv3.1 73,856
conv3.2 147,584
conv3.3 147,584
conv3.4 147,584
conv4.1 295,168
conv4.2 590,080
conv4.3 590,080
conv4.4 590,080
conv5.1 1,180,160
conv5.2 2,359,808
conv5.3 2,359,808
conv5.4 2,359,808
FCL 513,000
Total 11,511,784
3.5. Hybrid of Deep Learning and Machine Learning
This section describes a new technique, which is a hybrid technique between machine
learning and deep learning networks, for the early detection of ASD. In deep learning,
models require highly efficient hardware resources and consume a long time to train the
dataset. Thus, to solve these challenges, hybrid techniques are used [34]. They require
medium-efficient hardware resources and they do not consume a long time when being
implemented. In this work, a two-block technique was used. In the first block, the CNN
models used were GoogleNet and ResNet-18; these models were used for extracting
deep feature maps. The second block, representing machine learning algorithms, is SVM,
which classifies the features extracted from CNN models. Figure 9a,b describe the hybrid
techniques GoogleNet + SVM and ResNet-18 + SVM, which consist of deep learning and
machine learning. The fully connected layer was replaced by the SVM algorithm.
Figure 9. Hybrid technique between deep learning and machine learning. (a) GoogleNet + SVM and (b) ResNet-18 + SVM.
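A minimal sketch of the two-block idea, assuming PyTorch for the frozen backbone and scikit-learn for the SVM (the tensors are placeholders, and the RBF kernel is an assumption rather than the authors' reported configuration):

```python
import torch
import torch.nn as nn
from sklearn.svm import SVC
from torchvision import models

# Block 1: frozen ResNet-18 backbone used as a deep-feature extractor
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()  # the fully connected layer is dropped
backbone.eval()

# Placeholder tensors standing in for the pre-processed image batches
train_images = torch.randn(32, 3, 224, 224)
test_images = torch.randn(8, 3, 224, 224)
train_labels = torch.randint(0, 2, (32,)).numpy()
test_labels = torch.randint(0, 2, (8,)).numpy()

with torch.no_grad():
    train_feats = backbone(train_images).numpy()  # 512-D deep feature maps
    test_feats = backbone(test_images).numpy()

# Block 2: an SVM classifies the extracted features (replacing the FC layer)
svm = SVC(kernel="rbf")
svm.fit(train_feats, train_labels)
print("accuracy:", svm.score(test_feats, test_labels))
```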
4. Experimental Results
4.1. Splitting Dataset
The proposed systems were implemented on the ASD dataset. The dataset contains
547 images divided into two classes, namely, ASD containing 219 images (40%) and TD
containing 328 images (60%). The dataset was divided into 80% for training and validation,
80%:20%, respectively (350:88 images), and 20% for testing (109 images). Table 3 shows
the splitting of the dataset. The ASD images were divided into 175 training and validation
images (140 images for training and 35 images for validation) and 44 images for testing.
The TD images were divided into 262 images for training and validation (207 images
for training and 55 images for validation) and 66 images for testing. All systems were
implemented by a fifth generation i5 processor with 8 GB RAM and 4 GB GPU.
Table 3. Splitting the ASD dataset for training and testing.

Class   Training (80%)   Validation (20%)   Testing (20%)
ASD     140              35                 44
TD      207              55                 66
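For illustration, the reported counts can be reproduced with a two-stage split (a sketch assuming scikit-learn; stratification and the random seed are assumptions, with the data as placeholders):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(547, 224, 224, 3)  # placeholder images
y = np.array([1] * 219 + [0] * 328)   # 219 ASD and 328 TD labels

# 547 images -> 438 for training + validation and 109 for testing
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=109, stratify=y, random_state=0)

# 438 -> 350 for training and 88 for validation
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=88, stratify=y_trainval, random_state=0)
```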
4.2. Evaluation Metrics
The performance of all the proposed systems on the ASD dataset was evaluated using mathematical measures. Accuracy, precision, sensitivity, specificity and AUC were computed from a confusion matrix that contains all correctly classified images (called TP and TN) and incorrectly classified images (called FP and FN) [35], as shown in the following equations:

Accuracy = \frac{TN + TP}{TN + TP + FN + FP} \times 100\%  (24)

Precision = \frac{TP}{TP + FP} \times 100\%  (25)

Sensitivity = \frac{TP}{TP + FN} \times 100\%  (26)

Specificity = \frac{TN}{TN + FP} \times 100\%  (27)

AUC = \frac{\text{True Positive Rate}}{\text{False Positive Rate}} = \frac{Sensitivity}{Specificity}  (28)
where TP is the number of correctly classified ASD cases, TN is the number of TD cases
correctly classified as normal, FN is the number of ASD cases but classified as normal TD
and FP is the number of TD cases but classified as ASD.
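For reference, Equations (24)–(27) follow directly from the four confusion-matrix counts (a small sketch with placeholder counts):

```python
def metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute the evaluation measures of Equations (24)-(27) in percent."""
    return {
        "accuracy":    100 * (tn + tp) / (tn + tp + fn + fp),  # Eq. (24)
        "precision":   100 * tp / (tp + fp),                   # Eq. (25)
        "sensitivity": 100 * tp / (tp + fn),                   # Eq. (26)
        "specificity": 100 * tn / (tn + fp),                   # Eq. (27)
    }

print(metrics(tp=43, tn=66, fp=0, fn=1))  # placeholder counts
```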
4.3. Results of Neural Networks (ANN and FFNN) Algorithms
Neural networks are considered one of the efficient tools for classifying medical images
and they depend on the performance of the previous phases, such as determining ROI and
extracting representative features [36]. The neural networks algorithm divides the dataset
into a set for training and validation and a set for testing the quality of the algorithm’s
performance on new samples. In the present study, the dataset was divided into 80% for
training and validation and 20% for testing. Figure 10 shows the training process for the
ANN and FFNN algorithms. The process consists of 216 neurons, which represent the
extracted features for each image, in the input layer and 10 hidden layers, in which all
operations are performed to diagnose the inputted features. The output layer contains two
neurons, ASD and TD. In this section, the results of two neural networks algorithms (ANN
and FFNN) are discussed.
Figure 10. Training of ANN algorithm on ASD dataset.
4.3.1. Performance Analysis
Cross entropy is one of the measures of system performance; it measures the difference between the actual and predicted output. Figure 11 describes the errors during the training, validation and testing phases of the new samples for the ANN and FFNN algorithms. Figure 11a describes the performance of the FFNN algorithm, with an entropy of 0.002613 during epoch 15. Figure 11b shows that the ANN algorithm achieved an entropy of 7.2545 × 10⁻⁷ during epoch 37. Therefore, the FFNN performed better than the ANN. The blue colour represents the training stage, the green colour represents the validation stage, the red colour represents the testing stage and the crossed lines represent the best performance. During the training phase, as the epochs increase, the minimum error decreases. Training stops when the validation error reaches a minimum.
Figure 11. Performance designs of the ASD dataset: (a) FFNN algorithm and (b) ANN algorithm.
4.3.2. Gradient
Figure 12 shows the gradient and validation values for the FFNN and ANN algorithms. The FFNN reached a gradient of 4.6389 × 10⁻¹⁰ during epoch 15, which is the minimum error value, and a validation value of zero, which stopped training at epoch 15. The ANN reached 6.6099 × 10⁻⁷ during epoch 37, which is the minimum error value, and a validation value of zero, which stopped training during epoch 37.
Figure 12. Display of gradient and validation of ASD dataset. (a) FFNN algorithm and (b) ANN algorithm.
4.3.3. Receiver Operating Characteristic (ROC)
ROC is a measure of performance evaluation of algorithms during the training, vali-
dation and testing phases. As the curve approaches the left corner, the algorithm works
with high efficiency. The x-axis represents the false positive rate (FPR), i.e., 1 − specificity; the y-axis represents sensitivity, also called the true positive rate (TPR). Figure 13 describes
the performance of the ANN algorithm during the training, validation, and testing phases,
which achieved an overall ratio of 99.77% for the ASD dataset. ROC is also called area
under the curve (AUC).
Figure 13. ROC plot of ASD dataset by ANN.
4.3.4. Regression
Regression is an evaluation measure that predicts a continuous output variable on the basis of the values of other variables; here, the FFNN algorithm predicts outputs on the basis of the actual target values. As the value of R approaches 1, the relationship between the actual and predicted variables is strong and the error between them approaches its minimum. Figure 14 describes the regression of the ASD dataset by the FFNN algorithm. The value of R was 1 during the training phase, indicating an error rate of zero, and R was 0.9948 and 0.9945 during the validation and testing phases, respectively. The overall R was 0.9982 (99.82%), suggesting that a close relationship exists between the actual and predicted values and that the error rate is very low, near its minimum.
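As an illustrative sketch (an assumption, since the paper does not publish code), the regression value R can be computed as the correlation between the network outputs and the targets:

```python
# Minimal sketch: R is the correlation between predicted outputs and targets.
import numpy as np

targets = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0])          # toy target values
predictions = np.array([0.98, 0.03, 0.95, 0.99, 0.02, 0.07]) # toy network outputs

# Pearson correlation coefficient between targets and predictions
r = np.corrcoef(targets, predictions)[0, 1]
print(f"R = {r:.4f}")   # R close to 1 means predictions track the targets closely
```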
Figure 14. Regression plot of ASD dataset by FFNN.
4.3.5. Error Histogram
The performance of the networks can also be checked using an error histogram. Errors during data training and data validation are represented by blue and green histogram bins, respectively, whilst test errors are represented by red histogram bins. The histogram provides information about outliers that behave differently from the original data. Figure 15 describes the error histograms of the ASD dataset for the FFNN and ANN algorithms. For both the FFNN and the ANN, the errors fell between −0.02424 and 0.02424, with the zero-error bin lying between these two values.
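A minimal sketch of the same idea, with toy targets and predictions (not the paper's data):

```python
# Minimal sketch: an error histogram counts per-sample errors
# (target - prediction) in bins, as plotted in Figure 15.
import numpy as np

targets = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0])
predictions = np.array([0.99, 0.01, 0.97, 1.00, 0.02, 0.00, 0.98, 0.01])

errors = targets - predictions
counts, edges = np.histogram(errors, bins=5)
for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print(f"[{lo:+.4f}, {hi:+.4f}): {n}")  # mass concentrates near the zero-error bin
```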
4.3.6. Confusion Matrix
The confusion matrix is one of the most important measures for evaluating system performance. It is a table that records all cases classified correctly and incorrectly between the target classes and the output classes. The confusion matrix summarises all the inputted images as TP and TN, which indicate correctly classified images, and FP and FN, which indicate incorrectly classified images. In this section, the confusion matrices of the ASD dataset produced by the FFNN and the ANN are described. In the confusion matrix produced by the FFNN classifier, the classes are represented as follows: class 1 is the ASD case and class 2 is the TD case. The FFNN achieved superior results, with accuracy, precision, sensitivity, specificity and AUC of 99.8%, 99.8%, 99.5%, 100% and 99.85%, respectively. Figure 16 describes the confusion matrix of the ASD dataset produced by the ANN classifier during the training, validation and testing phases. The ANN reached accuracy, precision, sensitivity, specificity and AUC of 100% for all measures during the training and validation phases. Meanwhile, the accuracy, precision, sensitivity, specificity and AUC achieved during the testing phase were 98.7%, 100%, 100%, 98.2% and 99.77%, respectively. The overall accuracy of the ANN classifier was 99.8%.
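The metrics reported throughout this section follow directly from the confusion-matrix counts; a minimal sketch, assuming scikit-learn and toy labels:

```python
# Minimal sketch: deriving accuracy, precision, sensitivity and specificity
# from a 2x2 confusion matrix, with ASD as positive and TD as negative class.
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])   # 1 = ASD, 0 = TD (toy labels)
y_pred = np.array([1, 1, 0, 0, 0, 0, 1, 1])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
sensitivity = tp / (tp + fn)                   # true positive rate (recall)
specificity = tn / (tn + fp)                   # true negative rate
print(accuracy, precision, sensitivity, specificity)
```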
Figure 15. Error histogram of ASD dataset. (a) FFNN algorithm and (b) ANN algorithm.
Figure 16. Confusion matrix for ANN algorithm of ASD dataset.
Table 4 summarises the overall results achieved by the FFNN and ANN algorithms on the ASD dataset for early detection of autism. The two algorithms achieved an equal overall accuracy of 99.8%. For the precision measure, the FFNN reached 99.8%, whilst the ANN reached 100%. For the sensitivity measure, the FFNN reached 99.5%, whilst the ANN reached 100%. For the specificity measure, the FFNN reached 100%, whilst the ANN reached 99.7%. For the AUC measure, the FFNN and the ANN achieved 99.85% and 99.77%, respectively.
Table 4. Performance of the FFNN and ANN algorithms on the ASD dataset.

Dataset  Measure        FFNN   ANN
ASD      Accuracy %     99.8   99.8
         Precision %    99.8   100
         Sensitivity %  99.5   100
         Specificity %  100    99.7
         AUC %          99.85  99.77
4.4. Results of Deep Learning Models
In this section, the ASD dataset was evaluated on two pre-trained models, namely, GoogleNet and ResNet-18, by transfer learning. These networks are trained on millions of images covering more than a thousand classes, and the experience gained is then transferred to perform new tasks on a new dataset. CNNs require a large dataset to obtain high accuracy, but medical datasets are not sufficiently large [37]. CNN networks can overcome this challenge by applying data-augmentation techniques. In the present study, a data-augmentation technique was applied for the GoogleNet and ResNet-18 models. Table 5 describes the size of the dataset before and after applying data augmentation to obtain a balanced dataset and mitigate overfitting. In this technique, flipping, multi-angle rotation, displacement and shearing were applied to create artificial images from each image. Each ASD image was augmented ten times and each TD image seven times to balance the dataset during the training and validation phase; a minimal augmentation sketch follows Table 5.
Table 5. Balancing the ASD dataset during the training phase.

Phase                            Training and Validation (80%)   Testing (20%)
Name of class                    ASD      TD                     ASD   TD
No. images before augmentation   175      262                    44    66
No. images after augmentation    1750     1834                   44    66
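A minimal sketch of such an augmentation pipeline, assuming torchvision; the specific magnitudes (30° rotation, 10% shift, 10° shear) are illustrative choices, not the paper's exact settings:

```python
# Minimal sketch (an assumption, not the authors' exact pipeline): the paper
# names flipping, multi-angle rotation, displacement (shift) and shearing.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),                     # flipping
    transforms.RandomAffine(
        degrees=30,                                             # multi-angle rotation
        translate=(0.1, 0.1),                                   # displacement
        shear=10,                                               # shearing
    ),
    transforms.Resize((224, 224)),                              # GoogleNet/ResNet-18 input size
    transforms.ToTensor(),
])

# Applying `augment` repeatedly to one PIL image yields artificial copies,
# e.g. ten per ASD image and seven per TD image to balance the classes.
```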
Table 6 describes the tuning of the CNN models (GoogleNet and ResNet-18): the Adam optimiser was used, and the learning rate, mini-batch size, maximum epochs, validation frequency and execution environment were chosen as shown.
Table 6. Training parameter options for the GoogleNet and ResNet-18 models.

Options                GoogleNet  ResNet-18
Optimiser              adam       adam
Mini-batch size        20         15
Max epochs             4          8
Initial learn rate     0.0003     0.0001
Validation frequency   3          5
Execution environment  GPU        GPU
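The paper does not publish code, but a transfer-learning setup matching the Table 6 options for ResNet-18 (Adam, learning rate 0.0001, mini-batch size 15, 8 epochs, GPU) might be sketched in PyTorch as follows; `loader` is a hypothetical DataLoader over the augmented training images:

```python
# Minimal PyTorch sketch (an assumption) of transfer learning with ResNet-18
# using the Table 6 settings: Adam, lr 0.0001, batch size 15, 8 epochs, GPU.
import torch
import torch.nn as nn
from torchvision import models

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)   # replace the 1000-class head with ASD/TD
model = model.to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train(loader, epochs=8):
    model.train()
    for _ in range(epochs):
        for images, labels in loader:           # loader yields mini-batches of size 15
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
```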
Table 7 summarises the results obtained by GoogleNet and ResNet-18. The ResNet-18 model was found to be superior to the GoogleNet model. Both models achieved strong results for early diagnosis of ASD, making them useful for helping clinicians reach and support diagnostic decisions. The ResNet-18 model reached accuracy, precision, sensitivity, specificity and AUC of 97.6%, 97.5%, 97%, 97% and 97.56%, respectively, whilst the GoogleNet model achieved 93.6%, 93%, 94.5%, 94.5% and 99.48%, respectively.
Table 7. Performance of the GoogleNet and ResNet-18 models on the ASD dataset.

Dataset  Measure        GoogleNet  ResNet-18
ASD      Accuracy %     93.6       97.6
         Precision %    93         97.5
         Sensitivity %  94.5       97
         Specificity %  94.5       97
         AUC %          99.48      97.56
Figure 17 describes the confusion matrices produced by the GoogleNet and ResNet-18 models for detection of autism in the ASD dataset. The confusion matrix places correctly classified samples on the main diagonal and incorrectly classified samples off the diagonal. The GoogleNet model achieved an overall accuracy of 93.6%, with diagnostic accuracies of 97.7% for the ASD class and 90.9% for the TD class. Meanwhile, the ResNet-18 model achieved an overall accuracy of 97.6%, with diagnostic accuracies of 95.5% for the ASD class and 99% for the TD class. Figure 18 describes the AUC measure for GoogleNet and ResNet-18 to evaluate the performance of the two models on the ASD dataset. The GoogleNet model achieved an AUC of 99.48%, whilst the ResNet-18 model achieved 97.56%.
Figure 17. Confusion matrices of ASD dataset. (a) GoogleNet and (b) ResNet-18.
4.5. Results of Hybrid CNN Models with SVM
This section presents techniques that combine a machine learning algorithm (SVM) with deep learning models (GoogleNet and ResNet-18). One reason for using this technique is that deep learning models require computers with high specifications and take a long time to train. Thus, a technique consisting of two blocks was introduced. The first block consists of deep learning models (GoogleNet and ResNet-18) used to extract deep feature maps, and the second block is a machine learning algorithm (SVM) that quickly and accurately classifies the deep feature maps extracted from the first block. The two hybrid methods developed are GoogleNet + SVM and ResNet-18 + SVM. Table 8 summarises the results of these hybrid techniques; a minimal sketch of the approach follows the table. The GoogleNet + SVM system achieved better results than the ResNet-18 + SVM system. The GoogleNet + SVM system achieved accuracy, precision, sensitivity, specificity and AUC of 95.5%, 95%, 96%, 96% and 99.69%, respectively, whilst the ResNet-18 + SVM system achieved 94.5%, 95%, 93.5%, 93.5% and 94.51%, respectively.
Figure 18. Area under the curve (AUC) of ASD dataset. (a) GoogleNet and (b) ResNet-18.
Table 8. Performance of the GoogleNet + SVM and ResNet-18 + SVM systems on the ASD dataset.

Dataset  Measure        GoogleNet + SVM  ResNet-18 + SVM
ASD      Accuracy %     95.5             94.5
         Precision %    95               95
         Sensitivity %  96               93.5
         Specificity %  96               93.5
         AUC %          99.69            94.51
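A minimal sketch of the two-block design, assuming PyTorch and scikit-learn; the RBF kernel and the `extract_features` helper are our illustrative choices, not details taken from the paper:

```python
# Minimal sketch of the hybrid technique: block 1 uses a pre-trained CNN as a
# fixed feature extractor, block 2 trains an SVM on the deep feature maps.
import torch
import torch.nn as nn
from torchvision import models
from sklearn.svm import SVC

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = nn.Identity()            # drop the classification head, keep 512-d features
backbone = backbone.to(device).eval()

@torch.no_grad()
def extract_features(loader):
    feats, labels = [], []
    for images, y in loader:           # loader yields (image batch, label batch)
        feats.append(backbone(images.to(device)).cpu())
        labels.append(y)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

# Block 2: SVM classifier on the deep features (kernel choice is illustrative)
# X_train, y_train = extract_features(train_loader)
# svm = SVC(kernel="rbf").fit(X_train, y_train)
# X_test, y_test = extract_features(test_loader)
# print("accuracy:", svm.score(X_test, y_test))
```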
Figure 19 shows the confusion matrices of the ASD dataset produced by GoogleNet + SVM and ResNet-18 + SVM. Correctly classified samples (TP and TN) lie on the main diagonal and incorrectly classified samples (FP and FN) lie off the diagonal. Firstly, the GoogleNet + SVM system reached an overall accuracy of 95.5%; its accuracies in diagnosing the ASD and TD classes were 100% and 92.4%, respectively. Secondly, the ResNet-18 + SVM system achieved an overall accuracy of 94.5%; its accuracies in diagnosing the ASD and TD classes were 89.4% and 98%, respectively. Figure 20 shows the results of the two networks for the AUC measure, where GoogleNet + SVM achieved an AUC value of 99.69%, whilst ResNet-18 + SVM achieved 94.51%.
Figure 19. Confusion matrices of ASD dataset. (a) GoogleNet + SVM and (b) ResNet-18 + SVM.
Figure 20. Area under the curve (AUC) of ASD dataset. (a) GoogleNet + SVM and (b) ResNet-18 + SVM.
5. Discussion and Comparison between the Proposed Systems
In this work, three artificial-intelligence approaches were developed to classify the ASD dataset for early detection of autism: neural networks (ANN and FFNN), CNNs (GoogleNet and ResNet-18) and hybrid techniques combining deep learning with machine learning (GoogleNet + SVM and ResNet-18 + SVM). The dataset was divided into 80% for training and validation and 20% for testing, with data-augmentation techniques applied in the training phase of the CNN models to balance the dataset. The first proposed system implements two neural network algorithms, an FFNN and an ANN, on the basis of segmentation of the ROI by a snake model to determine the eye-tracking regions, followed by extraction of hybrid features through the LBP and GLCM algorithms, which produced 216 features per image (a minimal feature-extraction sketch follows this paragraph). These features were fed to the FFNN and the ANN and processed through 10 hidden layers. The output layer produced two classes, ASD and TD. The two algorithms achieved superior performance, reaching an equal accuracy of 99.8%. The second proposed system used the CNN models GoogleNet and ResNet-18 to diagnose the same dataset for early detection of autism. The ResNet-18 model performed better than the GoogleNet model, with accuracies of 97.6% and 93.6%, respectively. The third proposed system is a hybrid technique between the CNN models (GoogleNet and ResNet-18) and an SVM classifier: the CNN models extract deep feature maps, whilst the SVM classifies the features extracted from the CNN models. The two hybrid methods applied were GoogleNet + SVM and ResNet-18 + SVM. GoogleNet + SVM achieved better results than ResNet-18 + SVM, with accuracies of 95.5% and 94.5%, respectively. The results of all the proposed systems show that the ANN and FFNN classifiers achieved better results than the CNN models and the hybrid techniques; nevertheless, all the proposed systems demonstrated strong results.
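A minimal sketch of such hybrid LBP + GLCM feature extraction, assuming scikit-image; the neighbourhood, distances, angles and texture properties are illustrative choices and do not necessarily reproduce the paper's exact 216-feature configuration:

```python
# Minimal sketch of hybrid LBP + GLCM texture features with scikit-image.
import numpy as np
from skimage.feature import local_binary_pattern, graycomatrix, graycoprops

def hybrid_features(gray):                      # gray: 2-D uint8 image (the segmented ROI)
    # LBP histogram (uniform patterns, radius 1, 8 neighbours)
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)

    # GLCM statistics over several offsets and angles
    glcm = graycomatrix(gray, distances=[1, 2], angles=[0, np.pi / 4, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    glcm_feats = np.hstack([graycoprops(glcm, p).ravel()
                            for p in ("contrast", "correlation", "energy", "homogeneity")])

    return np.hstack([lbp_hist, glcm_feats])    # one fused feature vector per image
```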
Table 9 summarises the accuracy achieved by all the systems in diagnosing each class. Firstly, for the ASD class, the best diagnostic accuracy was achieved by the ANN and GoogleNet + SVM, which reached 100%, whilst the FFNN algorithm achieved 99.5%. The GoogleNet and ResNet-18 models achieved 97.7% and 95.5%, respectively, and the ResNet-18 + SVM hybrid model achieved 89.4%. Secondly, for the TD class, the best diagnostic accuracy was achieved by the FFNN classifier, which reached 100%, whilst the remaining models, namely the ANN, ResNet-18, ResNet-18 + SVM, GoogleNet + SVM and GoogleNet, achieved 99.7%, 99%, 98%, 92.4% and 90.9%, respectively. Figure 21 also compares the evaluation of all the proposed systems on the ASD dataset at the level of each class.
Table 9. Accuracy reached by the proposed systems in the diagnosis of each class.

         Neural Networks   Deep Learning           Hybrid
Disease  FFNN   ANN        GoogleNet  ResNet-18    GoogleNet + SVM  ResNet-18 + SVM
ASD      99.5   100        97.7       95.5         100              89.4
TD       100    99.7       90.9       99           92.4             98
Figure 21. Evaluation of the performance of the proposed systems for diagnosing autism at the level of each class.
The misclassified samples were distributed as follows. First, for the neural network algorithms: the ANN algorithm failed by 0.3%, classifying one TD image as ASD, whereas the FFNN algorithm failed by 0.3%, classifying one ASD image as TD. Second, for the CNN models: GoogleNet failed by 6.4%, classifying one ASD image as TD and six TD images as ASD, whilst ResNet-18 failed by 2.4%, classifying three ASD images as TD and one TD image as ASD. Third, for the hybrid techniques: GoogleNet + SVM failed by 4.5%, classifying five TD images as ASD, whilst ResNet-18 + SVM failed by 5.5%, classifying seven ASD images as TD and two TD images as ASD.
Table 10 summarises the performance of previous relevant systems and compares them with the proposed methods. The proposed systems outperform the previous studies. The previous systems reached accuracies between 59% and 95.75%, whilst our system reached 97.60%. For precision, the previous systems ranged between 57% and 90%, whilst our system reached 97.50%. Previous systems reached sensitivities between 68% and 96.96%, whilst our system reached 97%. Previous systems reached specificities between 50% and 93%, whilst our system reached 97%. Finally, previous systems achieved AUC values between 71.5% and 86%, whilst our system reached 97.56%. Figure 22 compares the performance of our system with that of the previous systems.
Table 10. Comparison of the performance results of the proposed methods with previous studies.

Previous Studies            Accuracy %  Precision %  Sensitivity %  Specificity %  AUC %
Zhao, Z. et al. [38]        84.62       -            89.47          80             86.00
Akter, T. et al. [39]       74.20       -            74.20          68.8           71.5
Oliveira, J.S. et al. [20]  79.50       90.00        69.00          93             -
Mazumdar, P. et al. [18]    59.00       57.00        68.00          50.00          -
Raj, S. et al. [40]         95.75       -            96.96          91.48          -
Proposed model              97.60       97.50        97.00          97.00          97.56
Figure 22. Comparison between the performance of the previous systems and our proposed system.
6. Conclusions and Future Work
Autism is a neurodevelopmental disorder that affects children and has spread in many countries of the world. In this study, an ASD dataset was evaluated using artificial-intelligence techniques, including neural networks, deep learning and a hybrid method between them. The dataset was divided into 80% for training and validation and 20% for testing for all the proposed systems. In the first proposed system, FFNN and ANN classifiers were used, and classification was conducted on the basis of features extracted by a hybrid method combining the LBP and GLCM algorithms. This system achieved superior results. In the second proposed system, the CNN models GoogleNet and ResNet-18 were used on the basis of the transfer-learning technique, and deep feature maps were extracted and classified by fully connected layers. The two models achieved promising results. In the third proposed system, hybrids between CNN and SVM, called GoogleNet + SVM and ResNet-18 + SVM, were used on the basis of two blocks. The first block used the CNN models (GoogleNet and ResNet-18) to extract deep feature maps, whilst the second block used an SVM classifier for classification. The hybrid models achieved superior results. In general, the first proposed system, using the FFNN and ANN classifiers, achieved the best performance amongst the proposed systems.
Future work following this paper will extract features using CNN models and combine them with features extracted by the LBP and GLCM algorithms into a single feature vector for each image, which will then be classified using the ANN, FFNN and SVM algorithms.
Author Contributions: Conceptualization, I.A.A., E.M.S., T.H.R., M.A.H.A., H.S.A.S., S.M.A. and M.A.; methodology, E.M.S.; software, E.M.S. and T.H.R.; validation, T.H.R. and M.A.H.A.; formal analysis, T.H.R.; investigation, M.A.H.A.; resources, E.M.S., T.H.R. and M.A.H.A.; data curation, E.M.S.; writing—original draft preparation, E.M.S.; writing—review and editing, H.S.A.S., S.M.A. and M.A.; visualization, T.H.R. and S.M.A.; supervision, I.A.A.; project administration, I.A.A., H.S.A.S. and S.M.A.; funding acquisition, I.A.A. and M.A. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported by the Scientific Research Deanship at Najran University under Grant NU/-/SERC/10/604.
Data Availability Statement: The autism spectrum disorder dataset used to support the findings of this study was collected from https://figshare.com/articles/dataset/Visualization_of_Eye-Tracking_Scanpaths_in_Autism_Spectrum_Disorder_Image_Dataset/7073087/1 (accessed on 28 May 2021).
Acknowledgments: The authors are grateful to Najran University, Scientific Research Deanship, for the financial support of this research.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Eslami, T.; Mirjalili, V.; Fong, A.; Laird, A.R.; Saeed, F. ASD-DiagNet: A hybrid learning approach for detection of autism spectrum disorder using fMRI data. Front. Neuroinform. 2019, 13, 70.
2. Prelock, P.A. Autism Spectrum Disorders. Handb. Lang. Speech Disord. 2021, 129–151.
3. Klin, A.; Mercadante, M.T. Autism and the pervasive developmental disorders. Rev. Bras. de Psiquiatr. 2006, 28, S1–S2.
4. Russell, A.J.; Murphy, C.M.; Wilson, E.; Gillan, N.; Brown, C.; Robertson, D.M.; Murphy, D.G. The mental health of individuals referred for assessment of autism spectrum disorder in adulthood: A clinic report. Autism 2016, 20, 623–627.
5. Dawson, G. Early behavioral intervention, brain plasticity, and the prevention of autism spectrum disorder. Dev. Psychopathol. 2008, 20, 775–803.
6. Loth, E.; Charman, T.; Mason, L.; Tillmann, J.; Jones, E.J.; Wooldridge, C.; Buitelaar, J.K. The EU-AIMS Longitudinal European Autism Project (LEAP): Design and methodologies to identify and validate stratification biomarkers for autism spectrum disorders. Mol. Autism 2017, 8, 1–19.
7. Kwon, M.K.; Moore, A.; Barnes, C.C.; Cha, D.; Pierce, K. Typical levels of eye-region fixation in toddlers with autism spectrum disorder across multiple contexts. J. Am. Acad. Child Adolesc. Psychiatry 2019, 58, 1004–1015.
8. Constantino, J.N.; Kennon-McGill, S.; Weichselbaum, C.; Marrus, N.; Haider, A.; Glowinski, A.L.; Jones, W. Infant viewing of social scenes is under genetic control and is atypical in autism. Nature 2017, 547, 340–344.
9. Gredebäck, G.; Johnson, S.; von Hofsten, C. Eye tracking in infancy research. Dev. Neuropsychol. 2010, 35, 340–344.
10. Falck-Ytter, T.; Nystrom, P.; Gredeback, G.; Gliga, T.; Bolte, S. Reduced orienting to audiovisual synchrony in infancy predicts autism diagnosis at 3 years of age. J. Child Psychol. Psychiatry 2018, 59, 872–880.
11. Guillon, Q.; Hadjikhani, N.; Baduel, S.; Roge, B. Visual social attention in autism spectrum disorder: Insights from eye tracking studies. Neurosci. Biobehav. Rev. 2014, 42, 279–297.
12. Lord, C.; Risi, S.; DiLavore, P.S.; Shulman, C.; Thurm, A.; Pickles, A. Autism from 2 to 9 years of age. Arch. Gen. Psychiatry 2006, 63, 694–701.
13. Chlebowski, C.; Green, J.A.; Barton, M.L.; Fein, D. Using the childhood autism rating scale to diagnose autism spectrum disorders. J. Autism Dev. Disord. 2010, 40, 787–799.
14. Moore, A.; Wozniak, M.; Yousef, A.; Barnes, C.C.; Cha, D.; Courchesne, E.; Pierce, K. The geometric preference subtype in ASD: Identifying a consistent, early-emerging phenomenon through eye tracking. Mol. Autism 2018, 9, 19.
15. Thorup, E.; Nystrom, P.; Gredeback, G.; Bolte, S.; Falck-Ytter, T. Altered gaze following during live interaction in infants at risk for autism: An eye tracking study. Mol. Autism 2016, 7, 1–10.
16. Jones, W.; Klin, A. Attention to eyes is present but in decline in 2–6-month-old infants later diagnosed with autism. Nature 2013, 504, 427–431.
17. Bacon, E.C.; Moore, A.; Lee, Q.; Barnes, C.C.; Courchesne, E.; Pierce, K. Identifying prognostic markers in autism spectrum disorder using eye tracking. Autism 2020, 24, 658–669.
18. Mazumdar, P.; Arru, G.; Battisti, F. Early detection of children with autism spectrum disorder based on visual exploration of images. Signal Process. Image Commun. 2021, 94, 116184.
19. De Belen, R.A.J.; Bednarz, T.; Sowmya, A. EyeXplain Autism: Interactive System for Eye Tracking Data Analysis and Deep Neural Network Interpretation for Autism Spectrum Disorder Diagnosis. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 8–13 May 2021; pp. 1–7.
20. Oliveira, J.S.; Franco, F.O.; Revers, M.C.; Silva, A.F.; Portolese, J.; Brentani, H.; Nunes, F.L. Computer-aided autism diagnosis based on visual attention models using eye tracking. Sci. Rep. 2021, 11, 1–11.
21. Li, B.; Barney, E.; Hudac, C.; Nuechterlein, N.; Ventola, P.; Shapiro, L.; Shic, F. Selection of Eye-Tracking Stimuli for Prediction by Sparsely Grouped Input Variables for Neural Networks: Towards Biomarker Refinement for Autism. In Proceedings of the ACM Symposium on Eye Tracking Research and Applications, Stuttgart, Germany, 2–5 June 2020; pp. 1–8.
22. Yaneva, V.; Eraslan, S.; Yesilada, Y.; Mitkov, R. Detecting high-functioning autism in adults using eye tracking and machine learning. IEEE Trans. Neural Syst. Rehabil. Eng. 2020, 28, 1254–1261. Available online: https://ieeexplore.ieee.org/abstract/document/9082703/ (accessed on 16 June 2021).
23. Carette, R.; Elbattah, M.; Dequen, G.; Guérin, J.; Cilia, F.; Bosche, J. Learning to Predict Autism Spectrum Disorder Based on the Visual Patterns of Eye-Tracking Scan Paths. In Proceedings of the 12th International Conference on Health Informatics, Prague, Czech Republic, 22–24 February 2019.
24. Visualization of Eye-Tracking Scanpaths in Autism Spectrum Disorder: Image Dataset. Available online: https://figshare.com/articles/dataset/Visualization_of_Eye-Tracking_Scanpaths_in_Autism_Spectrum_Disorder_Image_Dataset/7073087/1 (accessed on 28 May 2021).
25. Tsuchimoto, S.; Shibusawa, S.; Iwama, S.; Hayashi, M.; Okuyama, K.; Mizuguchi, N.; Ushiba, J. Use of common average reference and large-Laplacian spatial-filters enhances EEG signal-to-noise ratios in intrinsic sensorimotor activity. J. Neurosci. Methods 2021, 353, 109089.
26. Senan, E.M.; Jadhav, M.E. Techniques for the Detection of Skin Lesions in PH2 Dermoscopy Images Using Local Binary Pattern (LBP). In International Conference on Recent Trends in Image Processing and Pattern Recognition, Singapore; Springer: Berlin/Heidelberg, Germany, 2020; Volume 1381, pp. 14–25.
27. Senan, E.M.; Jadhav, M.E.; Kadam, A. Classification of PH2 Images for Early Detection of Skin Diseases. In Proceedings of the 2021 6th International Conference for Convergence in Technology (I2CT), Maharashtra, India, 2–4 April 2021; pp. 1–7.
28. Senan, E.M.; Abunadi, I.; Jadhav, M.E.; Fati, S.M. Score and Correlation Coefficient-Based Feature Selection for Predicting Heart Failure Diagnosis by Using Machine Learning Algorithms. Comput. Math. Methods Med. 2021, 2021, 8500314.
29. Al-Shoukry, S.; Rassem, T.H.; Makbol, N.M. Alzheimer's diseases detection by using deep learning algorithms: A mini-review. IEEE Access 2020, 8, 77131–77141.
30. Fukushima, K.; Miyake, S. Neocognitron: A Self-Organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position. In Biological Cybernetics; Springer: Berlin/Heidelberg, Germany, 1980; pp. 267–285.
31. Senan, E.M.; Alsaade, F.W.; Al-mashhadani, M.I.A.; Theyazn, H.H.; Al-Adhaileh, M.H. Classification of histopathological images for early detection of breast cancer using deep learning. J. Appl. Sci. Eng. 2021, 24, 323–329.
32. Hmoud, A.M.; Senan, E.M.; Alsaade, W.; Aldhyani, T.H.; Alsharif, N.; Alqarni, A.A.; Jadhav, M.E. Deep Learning Algorithms for Detection and Classification of Gastrointestinal Diseases. Complexity 2021, 2021, 6170416.
33. Jing, E.; Zhang, H.; Li, Z.; Liu, Y.; Ji, Z.; Ganchev, I. ECG Heartbeat Classification Based on an Improved ResNet-18 Model. Comput. Math. Methods Med. 2021, 2021, 6649970.
34. Mohammed, B.A.; Senan, E.M.; Rassem, T.H.; Makbol, N.M.; Alanazi, A.A.; Al-Mekhlafi, Z.G.; Almurayziq, T.S.; Ghaleb, F.A. Multi-Method Analysis of Medical Records and MRI Images for Early Diagnosis of Dementia and Alzheimer's Disease Based on Deep Learning and Hybrid Methods. Electronics 2021, 10, 2860.
35. Senan, E.M.; Al-Adhaileh, M.H.; Alsaade, F.W.; Aldhyani, T.H.; Alqarni, A.A.; Alsharif, N.; Alzahrani, M.Y. Diagnosis of Chronic Kidney Disease Using Effective Classification Algorithms and Recursive Feature Elimination Techniques. J. Healthc. Eng. 2021, 2021, 1004767.
36. Nourani, V.; Alami, M.T.; Vousoughi, F.D. Wavelet-entropy data pre-processing approach for ANN-based groundwater level modeling. J. Hydrol. 2015, 524, 255–269.
37. Senan, E.M.; Alzahrani, A.; Alzahrani, M.Y.; Alsharif, N.; Aldhyani, T.H. Automated Diagnosis of Chest X-Ray for Early Detection of COVID-19 Disease. Comput. Math. Methods Med. 2021, 2021, 6919483.
38. Zhao, Z.; Tang, H.; Zhang, X.; Qu, X.; Hu, X.; Lu, J. Classification of children with autism and typical development using eye-tracking data from face-to-face conversations: Machine learning model development and performance evaluation. J. Med. Internet Res. 2021, 23, e29328.
39. Akter, T.; Ali, M.H.; Khan, M.I.; Satu, M.S.; Moni, M.A. Machine Learning Model to Predict Autism Investigating Eye-Tracking Dataset. In Proceedings of the 2021 2nd International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), Dhaka, Bangladesh, 5–7 January 2021; pp. 383–387. Available online: https://ieeexplore.ieee.org/abstract/document/9331152/ (accessed on 28 May 2021).
40. Raj, S.; Masood, S. Analysis and detection of autism spectrum disorder using machine learning techniques. Procedia Comput. Sci. 2020, 167, 994–1004. Available online: https://www.sciencedirect.com/science/article/pii/S1877050920308656 (accessed on 28 May 2021).
... In the realm of communication aids for ASD children, Maslawati et al. explored the perception of Picture Exchange Communication System (PECS) usage [25]. Ibrahim et al. demonstrated the utility of eye-tracking techniques for providing valuable information about the visual behavior of children, facilitating early and accurate autism diagnosis [26]. In the context of stimuli response studies, Joshua Glauser et al. investigated differences in event-related potentials (ERPs) in infants with low or high parental risk of ASD in response to stimuli from mothers versus strangers, correlating with social communication skills [27]. ...
Article
Full-text available
In the recent past, the global prevalence of autism spectrum disorder (ASD) has witnessed a remarkable surge, underscoring its significance as a widespread neurodevelopmental disorder affecting children, with an incidence rate of 0.62%. Individuals diagnosed with ASD often grapple with challenges in language acquisition and comprehending verbal communication, compounded by difficulties in nonverbal communication aspects such as gestures and eye contact. Eye movement analysis, a multifaceted field spanning industrial engineering to psychology, offers invaluable insights into human attention and behavior patterns. The present study proposes an economical eye movement analysis system that adroitly integrates Neuro Spectrum Net (NSN) techniques with Kalman filtering, enabling precise eye position estimation. The overarching objective is to enhance deep learning models for early autism detection by leveraging eye-tracking data, a critical consideration given the pivotal role of early intervention in mitigating the disorder’s impact. Through the synergistic incorporation of NSN and contrast-limited adaptive histogram equalization for feature extraction, the proposed model exhibits superior scalability and accuracy when compared to existing methodologies, thereby holding promising potential for clinical applications. A comprehensive series of experiments and rigorous evaluations underscore the system’s efficacy in eye movement classification and pupil position identification, outperforming traditional Recurrent Neural Network approaches. The dataset utilized in the aforementioned scholarly article is accessible through the Zenodo repository and can be retrieved via the following link: [https://zenodo.org/records/10935303?preview=1].
... Using the ABIDE repository, the MLP has been used in four different configurations to obtain enough samples for the DNN study. Ahmed et al. (2022) evaluated how well DL and ML methods performed in predicting ASD using eye-tracking data. The ability to pay abnormal attention to images is one of the most crucial components of effective learning. ...
Article
Full-text available
Autism spectrum disorder (ASD) is a neurological condition characterized by difficulties with communication and socializing, and repetitive activities. If the underlying reason is hereditary, early detection is still important, and machine learning offers a fascinating way to identify the condition more rapidly and economically. However, the unique issues of higher computational costs, longer execution times, and lower effectiveness affect the traditional methods. The proposed project aims to create an automated artificial intelligence tool for ASD identification that combines several state-of-the-art mining techniques to deliver the best possible level of disease prediction accuracy. For accurate and effective ASD identification, this research suggests an automated and lightweight method dubbed the auto-encoded warm equilibrium automated learner. To speed up the handicap detection process, a unique warm optimized feature selection methodology is applied to minimize the dimensionality of attributes. In addition, auto-encoded term memory equilibrium learning, a powerful deep learning technique, is designed to accurately and less frequently detect ASD from the given data. Moreover, the classifier performs better when hyperparameters are tuned using the equilibrium optimization model. The results of the proposed AE 2 L model have been tested and validated using a variety of parameters utilizing the well-known ASD dataset that was taken from the UCI repository.
... Pooling the layers that learn feature maps with either the maximum or the average operator yields the most significant features as a result. At some point in time, the FC layers will supply the Softmax layer with resultant features for the layer to classify [55], [56]. In order to resolve non-linear issues, non-linear layers such as rectified linear unit (ReLU) functions were added to the network in order to make it more robust. ...
Article
Full-text available
Autism spectrum disorder (ASD) is a developmental disease characterised by restricted and repetitive behaviours, as well as difficulty in social communication and interaction, in children. The clinical diagnosis of ASD is reached by behavioural screening, which delays early intervention. Electroencephalography (EEG) is a method for analysing the brain’s electrical activity that has proven useful in the diagnosis of several neurological illnesses. Pre-trained deep Convolutional Neural Networks (CNNs) were used to extract features from the spectral profiles of the EEG dataset and classify patients into mild, moderate, and severe patients, as well as age-matched control subjects. Accordingly, the primary goal of this study is to use the pre-trained CNNs as classifiers in order to reap the benefits of transfer learning, and the secondary goal is to propose a hybrid model by employing decision tree (DT), K nearest neighbour (KNN), and a Support Vector Machine (SVM) machine learning classification techniques to categorise the features of the pre-trained CNN networks into mild, moderate, severe, and normal categories. The results show that using SqueezeNet for transfer learning improves classification accuracy to 85.5%, and that using SqueezeNet for hybrid models improves classification accuracy to 87.8% using SVM. Therefore, a hybrid model based on the combination of SqueezeNet and SVM might be utilised to automatically diagnose ASD based on the individual’s EEG data.
Article
Full-text available
Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition with varying degrees of severity. Early diagnosis and classification of autism severity are crucial for personalized intervention and support. This study proposes a novel approach using image processing techniques to analyze facial features and eye gaze patterns to differentiate between mild, moderate, and severe autism cases. A comprehensive dataset comprising facial videos and images of individuals aged 2 to 12 years with ASD was collected, along with normally developing children. Features such as facial asymmetry is calculated using SIFT feature, eye size, inter-eye distance, and eye openness were extracted for both groups using canny and adaptive thresholding techniques. Support Vector Machine (SVM) classifiers were employed to classify autism cases into three severity levels. Results revealed that eye gaze patterns were significantly lower for autism cases and higher for severe cases. Facial asymmetry was higher for autism cases, showing greater deviations from mild to severe cases. Severe autism cases exhibited extreme stiffness in facial muscle control, leading to the absence of facial expressions. Additionally, inter-eye distance increased, and eye openness decreased for severe autism cases. The proposed method demonstrates promising discrimination performance, as evidenced by high accuracy of 98.70 and sensitivity and specificity of 100 and 97.30. This research contributes valuable insights into the potential use of image processing techniques for early autism diagnosis and effective severity classification.
Article
Full-text available
Background Ophthalmopathy occurring in childhood can easily lead to irreversible visual impairment, and therefore a great deal of clinical and fundamental researches have been conducted in pediatric ophthalmopathy. However, a few studies have been performed to analyze such large amounts of research using bibliometric methods. This study intended to apply bibliometric methods to analyze the research hotspots and trends in pediatric ophthalmopathy, providing a basis for clinical practice and scientific research to improve children's eye health. Methods Publications related to pediatric ophthalmopathy were searched and identified in the Web of Science Core Collection (WoSCC) database. Bibliometric and visualized analysis was performed using the WoSCC analysis system and CiteSpace.6.2.6 software, and high-impact publications were analyzed. Results This study included a total of 7,177 publications from 162 countries and regions. Of these, 2,269 from the United States and 1,298 from China. The centrality and H-index were highest in the United States at 0.27 and 66, respectively. The University of London and Harvard University had the highest H-index at 37. Freedman,Sharon F published 55 publications, with the highest H-index at 19. The emerging burst keyword in 2020–2023 was “eye tracking,” and the burst keywords in 2021–2023 were “choroidal thickness,” “pediatric ophthalmology,” “impact” and “childhood glaucoma.” Retinopathy of prematurity, myopia, retinoblastoma and uveitis in juvenile idiopathic arthritis were the main topics in the high-impact publications, with clinical studies in the majority, especially in retinopathy of prematurity. Conclusion Eye health in children is a research hotspot, with the United States publishing the largest number of papers and having the greatest influence in research on pediatric ophthalmopathy, and China coming in second. The University of London and Stanford University had the greatest influence. Freedman, Sharon F was the most influential author. Furthermore, “choroidal thickness,” “pediatric ophthalmology,” “impact,” “childhood glaucoma” and “eye tracking”are the latest hotspots in the field of pediatric ophthalmopathy. These hotspots represent hot diseases, hot technologies and holistic concepts, which are exactly the research trends in the field of pediatric ophthalmopathy, providing guidance and grounds for clinical practice and scientific research on children's eye health.
Article
Full-text available
Autism spectrum disorder (ASD) is a complex developmental issue that affects the behavior and communication abilities of children. It is extremely needed to perceive it at an early age. The research article focuses on attentiveness by considering eye positioning as a key feature and its implementation is completed in two phases. In the first phase, various transfer learning algorithms are implemented and evaluated to predict ASD traits on available open-source image datasets Kaggle and Zenodo. To reinforce the result, fivefold cross-validation is used on the dataset. Progressive pre-trained algorithms named VGG 16, VGG 19, InceptionV3, ResNet152V2, DenseNet201, ConNextBase, EfficientNetB1, NasNetMobile, and InceptionResNEtV2 implemented to establish the correctness of the result. The result is being compiled and analyzed that ConvNextBase model has the best diagnosing ability on both datasets. This model achieved a prediction accuracy of 80.4% on Kaggle with a batch size of 16, a learning rate of 0.00002, 10 epochs and 6 units, and a prediction accuracy of 80.71% on the Zenodo dataset with a batch size of 4, a learning rate of 0.00002, 10 epochs and 4 units. The accuracy of the model ConvNextBase is found challenging in nature as compared to an existing model. Attentiveness is a parameter that will accurately diagnose the visual behavior of the participant which helps in the automatic prediction of autistic traits. In the second phase of the proposed model, attentiveness is engrossed in identifying autistic traits. The model uses a dlib library that uses HOG and Linear SVM-based face detectors to identify a particular facial parameter called EAR and it is used to measure participants' attentiveness based on the eye gaze analysis. If the EAR value is less than 0.20 for more than 100 consecutive frames, the model concludes the participant is un-attentive. The model generated a special graph for a time period by continuously plotting the value of EAR based on the attention level. The average EAR value will depict the attentiveness of the participant.
Article
Full-text available
Cardiovascular disease (CVD) is one of the most common causes of death that kills approximately 17 million people annually. The main reasons behind CVD are myocardial infarction and the failure of the heart to pump blood normally. Doctors could diagnose heart failure (HF) through electronic medical records on the basis of patient’s symptoms and clinical laboratory investigations. However, accurate diagnosis of HF requires medical resources and expert practitioners that are not always available, thus making the diagnosing challengeable. Therefore, predicting the patients’ condition by using machine learning algorithms is a necessity to save time and efforts. This paper proposed a machine-learning-based approach that distinguishes the most important correlated features amongst patients’ electronic clinical records. The SelectKBest function was applied with chi-squared statistical method to determine the most important features, and then feature engineering method has been applied to create new features correlated strongly in order to train machine learning models and obtain promising results. Optimised hyperparameter classification algorithms SVM, KNN, Decision Tree, Random Forest, and Logistic Regression were used to train two different datasets. The first dataset, called Cleveland, consisted of 303 records. The second dataset, which was used for predicting HF, consisted of 299 records. Experimental results showed that the Random Forest algorithm achieved accuracy, precision, recall, and F1 scores of 95%, 97.62%, 95.35%, and 96.47%, respectively, during the test phase for the second dataset. The same algorithm achieved accuracy scores of 100% for the first dataset and 97.68% for the second dataset, while 100% precision, recall, and F1 scores were reached for both datasets.
Article
Full-text available
Dementia and Alzheimer’s disease are caused by neurodegeneration and poor communication between neurons in the brain. So far, no effective medications have been discovered for dementia and Alzheimer’s disease; thus, early diagnosis is necessary to avoid the development of these diseases. In this study, efficient machine learning algorithms were assessed on the Open Access Series of Imaging Studies (OASIS) dataset for dementia diagnosis. Two CNN models (AlexNet and ResNet-50) and hybrid techniques between deep learning and machine learning (AlexNet+SVM and ResNet-50+SVM) were also evaluated for the diagnosis of Alzheimer’s disease. For the OASIS dataset, we balanced the dataset, replaced the missing values, and applied the t-distributed stochastic neighbour embedding algorithm (t-SNE) to represent the high-dimensional data in a low-dimensional space. All of the machine learning algorithms, namely, support vector machine (SVM), decision tree, random forest, and k-nearest neighbours (KNN), achieved high performance for diagnosing dementia. The random forest algorithm achieved an overall accuracy of 94% and precision, recall, and F1 scores of 93%, 98%, and 96%, respectively. The second dataset, an MRI image dataset, was evaluated with the AlexNet and ResNet-50 models and the AlexNet+SVM and ResNet-50+SVM hybrid techniques. All models achieved high performance, but the hybrid methods combining deep learning and machine learning outperformed the pure deep learning models. The AlexNet+SVM hybrid model achieved accuracy, sensitivity, specificity, and AUC scores of 94.8%, 93%, 97.75%, and 99.70%, respectively.
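The hybrid CNN + SVM pattern recurs across these studies and is easy to sketch: the CNN's final classification layer is removed so it emits a deep feature vector, and an SVM is trained on those vectors. The dataset path and hyperparameters below are placeholders, not the study's exact setup:

```python
# Sketch of a hybrid AlexNet + SVM pipeline: AlexNet (minus its last layer)
# extracts 4096-d deep features, and an SVM classifies them.
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder
from sklearn.svm import SVC

transform = T.Compose([T.Resize((224, 224)), T.ToTensor(),
                       T.Normalize([0.485, 0.456, 0.406],
                                   [0.229, 0.224, 0.225])])
dataset = ImageFolder("mri_dataset/", transform=transform)   # hypothetical path
loader = DataLoader(dataset, batch_size=32, shuffle=False)

alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT).eval()
# Drop the final fully connected layer so the network outputs 4096-d features.
alexnet.classifier = torch.nn.Sequential(*list(alexnet.classifier.children())[:-1])

feats, labels = [], []
with torch.no_grad():
    for x, y in loader:
        feats.append(alexnet(x).numpy())
        labels.append(y.numpy())
X, y = np.vstack(feats), np.concatenate(labels)

svm = SVC(kernel="rbf").fit(X, y)   # the second block: SVM on deep features
```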
Article
Full-text available
In March 2020, the World Health Organization declared the COVID-19 pandemic, warning of its dangers and its rapid spread throughout the world. In March 2021, the second wave of the pandemic began with a new strain of COVID-19 that was more dangerous for some countries, including India, which recorded 400,000 new cases and more than 4,000 deaths daily. This pandemic has overloaded the medical sector, especially radiology. Deep learning techniques have been used to reduce the burden on hospitals and assist physicians in making accurate diagnoses. In our study, two deep learning models, ResNet-50 and AlexNet, were introduced to diagnose X-ray datasets collected from many sources. Each network diagnosed a multiclass (four-class) dataset and a two-class dataset. The images were processed to remove noise, and a data augmentation technique was applied to the minority classes to create a balance between the classes. The features extracted by the convolutional neural network (CNN) models were combined with traditional grey-level co-occurrence matrix (GLCM) and local binary pattern (LBP) features into a 1-D vector for each image, producing more representative features for each disease. Network parameters were tuned for optimum performance. The ResNet-50 network reached accuracy, sensitivity, specificity, and area under the curve (AUC) of 95%, 94.5%, 98%, and 97.10%, respectively, on the multiclass dataset (COVID-19, viral pneumonia, lung opacity, and normal), while it reached accuracy, sensitivity, specificity, and AUC of 99%, 98%, 98%, and 97.51%, respectively, on the binary dataset (COVID-19 and normal).
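The fusion of handcrafted GLCM and LBP features into a single 1-D vector per image can be sketched with scikit-image; the distances, angles, and histogram binning below are illustrative choices, not the paper's exact settings:

```python
# Sketch of handcrafted GLCM + LBP feature extraction for one greyscale
# 8-bit image; parameter choices are illustrative assumptions.
import numpy as np
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern

def handcrafted_features(gray_img):
    """Concatenate GLCM statistics with a uniform-LBP histogram."""
    glcm = graycomatrix(gray_img, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    glcm_stats = np.hstack([graycoprops(glcm, prop).ravel()
                            for prop in ("contrast", "homogeneity",
                                         "energy", "correlation")])
    lbp = local_binary_pattern(gray_img, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.hstack([glcm_stats, lbp_hist])

# The handcrafted vector is then concatenated with the CNN feature vector:
# fused = np.hstack([cnn_features, handcrafted_features(img)])
```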
Article
Full-text available
Currently, nearly two million patients die of gastrointestinal diseases worldwide. Video endoscopy is one of the latest technologies in the medical imaging field for the diagnosis of gastrointestinal diseases, such as stomach ulcers, bleeding, and polyps. Medical video endoscopy generates many images, so doctors need considerable time to follow up all the images. This creates a challenge for manual diagnosis and has encouraged investigations into computer-aided techniques to diagnose all the generated images in a short period and with high accuracy. The novelty of the proposed methodology lies in developing a system for diagnosis of gastrointestinal diseases. This paper introduces three networks, GoogleNet, ResNet-50, and AlexNet, which are based on deep learning and evaluates them for their potential in diagnosing a dataset of lower gastrointestinal diseases. All images are enhanced, and the noise is removed before they are inputted into the deep learning networks. The Kvasir dataset contains 5,000 images divided equally into five types of lower gastrointestinal diseases (dyed-lifted polyps, normal cecum, normal pylorus, polyps, and ulcerative colitis). In the classification stage, pretrained convolutional neural network (CNN) models are tuned by transferring learning to perform new tasks. The softmax activation function receives the deep feature vector and classifies the input images into five classes. All CNN models achieved superior results. AlexNet achieved an accuracy of 97%, sensitivity of 96.8%, specificity of 99.20%, and AUC of 99.98%.
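Transfer learning of this kind amounts to swapping the pre-trained network's classification head for a new five-way layer and training on the new data. A minimal PyTorch sketch, with the choice of ResNet-50 and the learning rate as assumptions rather than the paper's exact configuration:

```python
# Sketch of fine-tuning a pre-trained CNN for the five Kvasir classes;
# backbone choice and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

NUM_CLASSES = 5   # dyed-lifted polyps, normal cecum, normal pylorus,
                  # polyps, ulcerative colitis

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new classifier head

criterion = nn.CrossEntropyLoss()   # applies softmax over the 5 logits
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, labels):
    """One optimisation step on a batch of endoscopy images."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```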
Article
Full-text available
Background: Previous studies have shown promising results in identifying individuals with autism spectrum disorder (ASD) by applying machine learning (ML) to eye-tracking data collected while participants viewed varying images (i.e., pictures, videos, and web pages). Although gaze behavior is known to differ between face-to-face interaction and image-viewing tasks, no study has investigated whether eye-tracking data from face-to-face conversations can also accurately identify individuals with ASD. Objective: The objective of this study was to examine whether eye-tracking data from face-to-face conversations could classify children with ASD and typical development (TD). We further investigated whether combining features on visual fixation and length of conversation would achieve better classification performance. Methods: Eye tracking was performed on children with ASD and TD while they were engaged in face-to-face conversations (including 4 conversational sessions) with an interviewer. By implementing forward feature selection, four ML classifiers were used to determine the maximum classification accuracy and the corresponding features: support vector machine (SVM), linear discriminant analysis, decision tree, and random forest. Results: A maximum classification accuracy of 92.31% was achieved with the SVM classifier by combining features on both visual fixation and session length. The classification accuracy of the combined features was higher than that obtained using visual fixation features alone (maximum classification accuracy 84.62%) or session length alone (maximum classification accuracy 84.62%). Conclusions: Eye-tracking data from face-to-face conversations could accurately classify children with ASD and TD, suggesting that ASD might be objectively screened in everyday social interactions. However, these results will need to be validated with a larger sample of individuals with ASD (varying in severity and with a balanced sex ratio) using data collected from different modalities (e.g., eye tracking, kinematic, electroencephalogram, and neuroimaging). In addition, individuals with other clinical conditions (e.g., developmental delay and attention deficit hyperactivity disorder) should be included in similar ML studies for detecting ASD.
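Forward feature selection with an SVM can be sketched with scikit-learn's SequentialFeatureSelector; the feature files, linear kernel, and stopping tolerance below are assumptions for illustration:

```python
# Sketch of forward feature selection over fixation and session-length
# features with an SVM, as described above; inputs are hypothetical.
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X = np.load("fixation_and_session_features.npy")   # hypothetical feature matrix
y = np.load("asd_td_labels.npy")                   # hypothetical ASD/TD labels

svm = SVC(kernel="linear")
sfs = SequentialFeatureSelector(svm, direction="forward",
                                n_features_to_select="auto", tol=1e-3, cv=5)
sfs.fit(X, y)

X_sel = sfs.transform(X)
acc = cross_val_score(svm, X_sel, y, cv=5).mean()
print("kept feature indices:", np.flatnonzero(sfs.get_support()))
print("cross-validated accuracy:", acc)
```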
Article
Full-text available
Chronic kidney disease (CKD) is among the top 20 causes of death worldwide and affects approximately 10% of the world's adult population. CKD is a disorder that disrupts normal kidney function. Due to the increasing number of people with CKD, effective prediction measures for its early diagnosis are required. The novelty of this study lies in developing a diagnostic system to detect chronic kidney disease. This study assists experts in exploring preventive measures for CKD through early diagnosis using machine learning techniques. The study evaluated a dataset collected from 400 patients and containing 24 features. The mean and mode statistical methods were used to replace the missing numerical and nominal values, respectively. To choose the most important features, recursive feature elimination (RFE) was applied. The four classification algorithms applied in this study were support vector machine (SVM), k-nearest neighbors (KNN), decision tree, and random forest. All the classification algorithms achieved promising performance, and the random forest algorithm outperformed all the others, reaching 100% accuracy, precision, recall, and F1-score. CKD is a serious, life-threatening disease with high rates of morbidity and mortality. Therefore, artificial intelligence techniques are of great importance in the early detection of CKD, supporting experts and doctors in early diagnosis before kidney failure develops.
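The imputation and RFE steps map directly onto pandas and scikit-learn; the file name, label column, and number of retained features below are assumptions:

```python
# Sketch of mean/mode imputation followed by recursive feature elimination
# (RFE), as described above; file name, label column, and the number of
# features to keep are illustrative assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

df = pd.read_csv("ckd_dataset.csv")                  # hypothetical file
for col in df.columns:
    if df[col].dtype.kind in "if":                   # numeric -> mean
        df[col] = df[col].fillna(df[col].mean())
    else:                                            # nominal -> mode
        df[col] = df[col].fillna(df[col].mode()[0])

X = pd.get_dummies(df.drop(columns=["class"]))       # 'class' assumed label
y = df["class"]

rfe = RFE(RandomForestClassifier(random_state=0), n_features_to_select=12)
rfe.fit(X, y)
print("retained features:", list(X.columns[rfe.support_]))
```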
Article
Full-text available
An advantage of using eye tracking for diagnosis is that it is non-invasive and can be performed on individuals of different functional levels and ages. Computer-aided diagnosis using eye tracking data is commonly based on eye fixation points within regions of interest (ROIs) in an image. However, besides the need to demarcate every ROI in each image or video frame used in the experiment, the diversity of visual features contained in each ROI may compromise the characterisation of visual attention in each group (case or control) and, consequently, diagnostic accuracy. Although some approaches use eye tracking signals to aid diagnosis, it remains a challenge to identify frames of interest when videos are used as stimuli and to select relevant characteristics extracted from the videos. This is mainly observed in applications for autism spectrum disorder (ASD) diagnosis. To address these issues, the present paper proposes: (1) a computational method integrating concepts from visual attention modelling, image processing, and artificial intelligence to learn a model for each group (case and control) from eye tracking data, and (2) a supervised classifier that uses the learned models to perform the diagnosis. Although this approach is not disorder-specific, it was tested in the context of ASD diagnosis, obtaining average precision, recall, and specificity of 90%, 69%, and 93%, respectively.
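For contrast, the conventional ROI-based analysis this abstract critiques reduces to counting fixation points inside hand-drawn regions. A toy sketch with invented coordinates:

```python
# Toy sketch of classic ROI-based fixation analysis; ROI rectangles and
# fixation coordinates are invented for illustration.
import numpy as np

rois = {"eyes": (100, 60, 220, 110),     # (x_min, y_min, x_max, y_max)
        "mouth": (130, 150, 190, 190)}
fixations = np.array([[120, 80], [150, 170], [300, 40], [160, 95]])

def fixation_counts(points, regions):
    """Count how many fixation points fall inside each rectangular ROI."""
    counts = {}
    for name, (x0, y0, x1, y1) in regions.items():
        inside = ((points[:, 0] >= x0) & (points[:, 0] <= x1) &
                  (points[:, 1] >= y0) & (points[:, 1] <= y1))
        counts[name] = int(inside.sum())
    return counts

print(fixation_counts(fixations, rois))   # {'eyes': 2, 'mouth': 1}
```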
Article
Full-text available
Based on a convolutional neural network (CNN) approach, this article proposes an improved ResNet-18 model for heartbeat classification of electrocardiogram (ECG) signals through appropriate model training and parameter adjustment. Because of the model's residual structure, the CNN's layered structure can be deepened to achieve better classification performance. The results of applying the proposed model to the MIT-BIH arrhythmia database show that it achieves higher accuracy (96.50%) than other state-of-the-art classification models; specifically, for the ventricular ectopic heartbeat class, its sensitivity is 93.83% and its precision is 97.44%.
1. Introduction
With the acceleration of the economy, the incidence and mortality of cardiovascular diseases (CVDs) have continued to increase in recent years, and the trend is becoming more and more evident, especially among young people. CVDs are the number one cause of death worldwide. Arrhythmia is very common and can lead to cardiac arrest or even death [1]. According to the World Health Organization (WHO), most patients with acute CVDs show a clinical manifestation of loss of consciousness after the onset of symptoms and, if not treated, may die within 24 hours [2]. Therefore, the accurate and timely detection of patients' abnormal heartbeats in electrocardiograms (ECGs) has become an important problem to address in the medical field. Some arrhythmia types are very rare [3], so patients must be monitored for a long time to identify the type of arrhythmia. The ECG has been used as the main method for diagnosing CVDs [4] and is of great significance in the detection of arrhythmia. The ECG signal consists of three waves, the P wave, the QRS complex, and the T wave [5], as shown in Figure 1 of the cited article.
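The residual structure the abstract credits for deeper training can be sketched generically; the 1-D basic block below (channel count and beat length invented) shows the shortcut connection that defines ResNet-style models:

```python
# Generic 1-D residual (basic) block of the kind underlying ResNet-18;
# channel count and beat length are illustrative assumptions.
import torch
import torch.nn as nn

class BasicBlock1d(nn.Module):
    """Conv -> BN -> ReLU -> Conv -> BN, plus an identity shortcut."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv1d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm1d(channels)
        self.conv2 = nn.Conv1d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm1d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)   # the residual (shortcut) connection

block = BasicBlock1d(32)
beats = torch.randn(8, 32, 250)     # (batch, channels, samples per beat)
print(block(beats).shape)           # torch.Size([8, 32, 250])
```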
Conference Paper
Over the past decade, deep neural networks (DNNs) applied to eye tracking data have made tremendous progress in their ability to perform autism spectrum disorder (ASD) diagnosis. Despite their promising accuracy, DNNs are often seen as 'black boxes' by physicians unfamiliar with the technology. In this paper, we present EyeXplain Autism, an interactive system that enables physicians to analyse eye tracking data, perform automated diagnosis, and interpret DNN predictions. We discuss the design, development, and a sample scenario to illustrate the potential of our system to aid ASD diagnosis. Unlike existing eye tracking software, our system combines traditional eye tracking visualisation and analysis tools with data-driven knowledge to enhance medical decision-making for physicians.