
A fuzzy-ontology oriented case-based reasoning framework for semantic diabetes diagnosis


Objective: Case-based reasoning (CBR) is a problem-solving paradigm that uses past knowledge to interpret or solve new problems. It is suitable for experience-based and theory-less problems. Building a semantically intelligent CBR system that mimics expert thinking can solve many problems, especially medical ones. Methods: Knowledge-intensive CBR using formal ontologies is an evolution of this paradigm. Ontologies can be used for case representation and storage, and they can serve as background knowledge. Using standard medical ontologies, such as SNOMED CT, enhances interoperability and integration with health care systems. Moreover, utilizing vague or imprecise knowledge further improves the semantic effectiveness of CBR. This paper proposes a fuzzy ontology-based CBR framework. It proposes a fuzzy case-base OWL2 ontology and a fuzzy semantic retrieval algorithm that handles many feature types. Material: This framework is implemented and tested on the diabetes diagnosis problem. The fuzzy ontology is populated with 60 real diabetic cases. The effectiveness of the proposed approach is illustrated with a set of experiments and case studies. Results: The resulting system can answer complex medical queries related to the semantic understanding of medical concepts and the handling of vague terms. The resulting fuzzy case-base ontology has 63 concepts, 54 (fuzzy) object properties, 138 (fuzzy) datatype properties, 105 fuzzy datatypes, and 2640 instances. The system achieves an accuracy of 97.67%. We compare our framework with existing CBR systems and a set of five machine-learning classifiers; our system outperforms all of these systems. Conclusion: Building an integrated CBR system can improve its performance. Representing CBR knowledge using a fuzzy ontology and building a case retrieval algorithm that treats different feature types differently improve the accuracy of the resulting systems.
Artificial Intelligence in Medicine 65 (2015) 179–208
Journal homepage: www.elsevier.com/locate/aiim
Shaker El-Sappagh a, Mohammed Elmogy b,*, A.M. Riad c
a Department of Mathematics, College of Science, King Saud University, PO 2455, Riyadh, Saudi Arabia
b Information Technology Department, Faculty of Computers & Information, Mansoura University, PO 35516, Mansoura, Egypt
c Information Systems Department, Faculty of Computers & Information, Mansoura University, PO 35516, Mansoura, Egypt
Article history: Received 30 October 2014; received in revised form 2 June 2015; accepted 5 August 2015
Keywords: Case-based reasoning; Knowledge based system; Fuzzy ontology; Semantic retrieval; Diabetes diagnosis; Standard SNOMED CT terminology
© 2015 Elsevier B.V. All rights reserved.
1. Introduction

Diabetes is a complex, chronic illness requiring continuous medical care with multifactorial risk-reduction strategies beyond glycemic control. According to the World Health Organization (WHO), diabetes will be the seventh leading cause of death in 2030 [1]. Globally, about 336 million people are living with type 2 diabetes mellitus, and this figure is set to rise to over 552 million by 2030 [2]. In 2014, 9% of adults 18 years and older had diabetes [1].
There are three main types of diabetes. The first is type 1 diabetes mellitus, or insulin-dependent diabetes mellitus, which occurs when the pancreas cannot produce sufficient insulin. The second is type 2 diabetes mellitus, or non-insulin-dependent diabetes mellitus, which occurs when the body cannot effectively use the insulin it produces. The third is gestational diabetes, which occurs in pregnant women. A patient who shows diabetes symptoms but is not yet diabetic is called a pre-diabetes patient.

* Corresponding author. Tel.: +0020 1098889791; fax: +0020 502223754. E-mail address: melmogy@mans.edu.eg (M. Elmogy).
Early diagnosis of diabetes is critical to its care because early care can prevent long-term microvascular complications, such as retinopathy, nephropathy, and neuropathy, as well as cardiovascular disease. Moreover, early diagnosis can prevent a pre-diabetes patient from becoming diabetic. At present, the results for early detection of diabetes are not highly accurate. Therefore, there is a need to develop a diabetes diagnosis system with better accuracy. Clinical decision support systems (CDSS) can help in this regard.
Existing rule-based diabetes diagnosis systems are mainly based on the A1C criteria or the plasma glucose criteria, the latter being either the fasting plasma glucose (FPG) or the 2-h plasma glucose (2-h PG) value after a 75-g oral glucose tolerance test (OGTT). For example, they make decisions using rules such as: if (A1C ≥ 6.5% or FPG ≥ 126 mg/dL or 2-h PG ≥ 200 mg/dL) then the patient is diabetic [3]. However, diabetes diagnosis is more complicated than these direct decisions. Diabetes is related to other diseases, including renal, heart, and foot diseases. Moreover, it has symptoms related to hyperglycemia or hypoglycemia. True-or-false decisions about these symptoms (e.g., thirst = true) are not enough. Diabetes diagnosis is a theory-less and unstructured problem, and it depends on the physician’s experience.

http://dx.doi.org/10.1016/j.artmed.2015.08.003
0933-3657/© 2015 Elsevier B.V. All rights reserved.
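The threshold rule quoted above can be written as a single predicate. The following is only a minimal sketch: the function name, the parameter names, and the handling of missing measurements are illustrative assumptions and are not taken from any of the cited systems.

```python
def meets_glycemic_criteria(a1c_percent=None, fpg_mg_dl=None, two_h_pg_mg_dl=None):
    """Crisp rule: True if any available measurement reaches its threshold.

    Hypothetical helper; a measurement left as None is treated as unavailable.
    """
    checks = [
        (a1c_percent, 6.5),        # A1C >= 6.5 %
        (fpg_mg_dl, 126.0),        # FPG >= 126 mg/dL
        (two_h_pg_mg_dl, 200.0),   # 2-h PG >= 200 mg/dL
    ]
    return any(value is not None and value >= threshold
               for value, threshold in checks)


if __name__ == "__main__":
    print(meets_glycemic_criteria(a1c_percent=5.6, fpg_mg_dl=130.0))
```

The predicate returns only true or false and ignores symptoms and related diseases, which is the limitation of such direct rules that the paragraph above describes.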
For experience-based problem solving, case-based reasoning (CBR) is one of the most suitable AI techniques for decision support [4]. CBR imitates human reasoning, and it is suitable when a problem cannot be formulated as a set of generalized rules. It is appropriate in a medical context, where symptoms represent the problem, and diagnosis and treatment represent the solution. The CBR paradigm has been successfully used in various medical fields, from lung disease and eating disorders to diabetes and Alzheimer’s disease [5]. Many studies have utilized CBR for diabetes diagnosis [6–9]. Although any CBR system relies on a set of specific previous experiences, its reasoning power can be improved by general knowledge about the domain [10].
Ontologies can enhance the capabilities of CBR by creating knowledge-intensive CBR (KI-CBR) systems [11]. They can play many roles in CBR, such as background domain ontology, case-base ontology, semantic similarity measurement, and others [12]. Ontologies can enhance CBR systems in many dimensions, as shown in Fig. 1. In this figure, we suggest three types of KI-CBR paradigms. In part (a) of Fig. 1, the case-base is stored in a traditional database, and the domain knowledge is stored in an ontology. In part (b), the case-base is stored in a crisp ontology, and the domain knowledge is stored in an ontology. In part (c), the case-base is stored in a fuzzy ontology, and the domain knowledge is stored in an ontology. We have selected the most complicated and recent approach (part c). For diabetes diagnosis, researchers have made efforts toward diabetes ontology development [13]. Nevertheless, the literature on ontology-based CBR for diabetes is not rich with studies [7,8].
The most critical steps in the CBR paradigm are case representation and case retrieval. We concentrate on these two steps to improve the performance of medical CBR.

Building a case base requires less effort and time than building the knowledge base of a rule-based system. No generalized knowledge is required to build a successful CBR system. However, collecting patient cases requires integration between the CDSS and the distributed electronic health record (EHR) environment. As a result, the standardization of CBR knowledge and data is critical to achieving interoperability. Interoperability between EHR systems and CDSS facilitates the automatic collection of knowledge from patients’ EHRs, supports the integration of CDSS in the healthcare environment, and eases the physician’s querying process.
EHRs use standards such as the Health Level 7 reference information model (HL7 RIM) [14] and the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) ontology [15], SCT for short, for data storage and exchange, and these standards can be utilized in CBR. RIM can be used as a standard case-base structure, and SCT can be used as background knowledge to enhance semantic retrieval [16,17]. El-Sappagh et al. [9] proposed a standard data model for a diabetes case-base. SCT is a huge ontology, which affects the performance of the CBR retrieval algorithm. Creating a reference set from SCT for diabetes is therefore required. El-Sappagh et al. [18] proposed a standard OWL2 diabetes diagnosis ontology built from an SCT reference set. As far as we know, no studies utilize SCT reference sets in CBR systems for diabetes diagnosis, although this is required for semantic retrieval and for the integration of CDSS in the EHR environment. Using the created SCT-based OWL2 ontology for semantic retrieval requires encoding the unstructured knowledge of the case-base with the same codes. The encoding process is not straightforward, and it requires a methodology. El-Sappagh et al. [19] proposed an encoding methodology and utilized it to encode the case-base contents.
Physicians often describe patients using imperfect and linguistic data, and their knowledge and natural language involve a great deal of imprecision and vagueness. As Zadeh [20] argued, much of the knowledge that humans acquire through experience is perception-based and thus subject to imprecision and inaccuracy. Such knowledge, when not treated in a way that can consider and convey its inherent imprecision, usually leads to poor effectiveness of the knowledge-based systems that use it. As a result, the KI-CBR paradigm must handle imprecise knowledge representation and reasoning [21]. The existing fuzzy CBR systems utilize imprecise knowledge through the use of fuzzy logic for case representation and relevant fuzzy pattern matching techniques for similarity assessment [22]. A survey of existing fuzzy CBR systems for diabetes diagnosis indicates that there are few works in this field. Moreover, the lack of an ontological representation of this knowledge restricts the effectiveness of these systems because they do not take advantage of the reasoning capabilities that ontologies provide. A fuzzy ontology focuses on assigning a meaning to the fuzziness of the ontology’s components. This is an important characteristic, as it makes the fuzzy ontology’s imprecision explicit, thus facilitating more efficient knowledge acquisition and ontology reuse. Moreover, it enables the definition of more effective semantic similarity measures, which facilitate case retrieval. For diabetes, the existing fuzzy CBR systems have not used fuzzy ontologies, or even crisp ontologies, as background domain knowledge or as case-base ontologies [8]. On the other hand, ontologies and fuzzy logic have been utilized for diabetes in other reasoning methods, such as rule-based expert systems [23].
In this paper, we present a fuzzy KI-CBR framework that handles and exploits imprecise knowledge through the effective integration of fuzzy logic in the ontology-based CBR paradigm. A fuzzy case-base ontology and a fuzzy semantic retrieval algorithm are proposed and integrated to build an intelligent CBR system for diabetes diagnosis. This approach introduces fuzzy semantics to CBR in two places. The first is the representation of imprecise knowledge itself, and the second is case retrieval. In particular, our proposed framework is built using a fuzzy ontology that supports the representation of imprecise case-specific knowledge, while the retrieval of cases is enabled by a highly customizable fuzzy semantic similarity framework. As most CBR studies did not implement the entire cycle [12,24], we concentrate on the most critical and most related steps (i.e., case representation and retrieval). Case adaptation, reuse, retention, and case-base maintenance will be handled in other works.

Fig. 1. KI-CBR frameworks.
Importantly, our system is implemented in six modules: case source preparation, case-base ontology engineering, terminology server, fuzzy case-base ontology population, case retrieval engine, and case query parser. We implement and test the proposed framework on a real case-base. The system has a user-friendly interface; it supports the selection of standard medical concepts from an SCT dialog, and it implements the clinical distance in the case retrieval process. As a result, the system achieves a high level of performance compared to traditional CBR systems, other CBR systems in the literature, and machine learning algorithms. The system’s accuracy is 97.67%. It is therefore highly accurate and can be applied in a real medical environment.
The remainder of the paper is organized as follows. Section 2 reviews studies related to KI-CBR, especially for diabetes, and shows their limitations. Section 3 presents a set of preliminaries, including our dataset description. Section 4 illustrates the research methodology used in the study. Section 5 presents the proposed CBR framework. Implementation and evaluation are discussed in Section 6. Finally, Section 7 concludes the paper and highlights future work directions.
2. Related work

A physician can rely on clinical practice guidelines (CPGs) to diagnose diabetes. However, CPGs are long plain-text documents. Languages such as Arden Syntax can be used for representing and sharing this medical knowledge. Arden Syntax can convert CPGs into actionable rules for implementing rule-based CDSS. Samwald et al. [25] proposed a development environment including a compiler and rule engine for Arden Syntax rules. However, diabetes diagnosis is an ill-formed, theory-less, and experience-based problem. Depending on rules is not suitable because there will be many exceptional cases. Rule results often require adaptation by a physician. Rules cannot be customized for specific patient conditions. It is also time-consuming to build and maintain a large rule-base.
CBR is one of the most suitable AI techniques for experience-based problems because it is easier for an expert physician to formulate specific cases than to formulate generalized rules. Traditional CBR has been used for diabetes diagnosis in many studies [4–7]. An evolution of this paradigm is ontology-based CBR [21]. This approach is generally more effective in retrieving similar cases than traditional ones [10]. Ontologies play many roles in enhancing CBR semantics, ranging from case storage and representation to case adaptation and reuse [11]. Moreover, semantic case retrieval algorithms can be improved by using case-base and domain background knowledge in the form of ontologies [26,27].
2.1. Regarding the role of ontology in diabetes management

In the diabetes domain, ontologies have been used in many CDSSs [13,23,28–30]. For example, Chen et al. [13] introduced an ontology for diabetes drugs and an ontology for patients’ symptoms. These ontologies utilized the Semantic Web Rule Language (SWRL) and the Java Expert System Shell (JESS) to determine potential prescriptions for patients. Rahimi et al. [28] developed a type 2 diabetes mellitus (T2DM) ontology (DMO) to diagnose and manage patients with diabetes, and they proposed an algorithm to query the ePBRN data repository to diagnose T2DM. Sherimon et al. [29] proposed a dynamic adaptive questionnaire ontology for gathering the diabetic patient’s medical history. Hayuhardhika et al. [30] developed an ontology for diabetes and used a weighted tree similarity algorithm for diagnosis.

However, regarding diabetes diagnosis, none of these ontologies is designed for CBR, and few studies have used ontologies in CBR [6,8]. In diabetes diagnosis systems, ontologies have not been utilized for the case-base, the background knowledge, or case retrieval. Jaya and Uma [7] listed the roles of ontology in a diabetes diagnosis CBR system. El-Sappagh et al. [31] proposed a case-base ontology engineering methodology and a diabetes case-base ontology. However, that study provides no decision support capabilities. The resulting OWL2 ontology can be utilized in the current study to store and retrieve cases semantically. In addition, this ontology is crisp and cannot handle the vagueness that exists in the diabetes diagnosis environment [20].
2.2. Regarding the encoding of medical data

Some medical knowledge is stored in unstructured form. This knowledge is not suitable for CBR. To enhance the semantic intelligence of a CBR system, the textual contents of the case-base have to be encoded in a formal way. Samwald et al. [32] asserted that building a CDSS requires encoding clinical data using ontologies. They developed a CDSS for pharmacogenomic knowledge representation and reasoning based on an OWL2 ontology [33]. Furthermore, using standard medical ontologies, such as SCT, supports the implementation of semantically intelligent case retrieval algorithms [34], enhances interoperability and seamless integration between the CDSS and the EHR environment [16], and supports the creation of standard encoded case-base knowledge [35]. As a result, the unstructured medical data of EHRs are standardized into a unified form, which facilitates the automatic collection of case knowledge from distributed EHR environments. Moreover, the CBR system becomes more intelligent by interpreting the meaning of medical concepts. In addition, the case retrieval algorithm can calculate the clinical distance between patients rather than geometric or semantic distances. To the best of our knowledge, standard medical ontologies such as SCT have not been used in diabetes diagnosis CBR systems. El-Sappagh et al. [18] proposed an OWL2 ontology for SCT to be used as background domain knowledge with CBR. In addition, this ontology can be used to encode the unstructured knowledge of the diabetes case-base into a formal and standard form [19].
2.3. Regarding the fuzzification of medical data

Diagnosis of diabetes depends on the physician’s experience and the patient’s description of his or her case. Most medical data are described using vague terms (i.e., partially known) [36]. Vagueness can be handled using fuzzy logic (FL) [20]. FL is useful for CBR because CBR is fundamentally analogical reasoning, which can operate with linguistic expressions. FL facilitates knowledge elicitation from a domain expert, eases the transfer of knowledge between domains, and enhances similarity measurement. Fuzzy logic has been integrated with CBR in hybrid systems [37,38] and used for calculating the fuzzy similarity between cases [22]. However, there are no real studies in the literature on fuzzy-CBR systems for diabetes diagnosis. Thirugnanam et al. [39] built a hybrid CDSS for diabetes diagnosis using a neural network, fuzzy logic, and CBR. This study used the fuzzy and CBR reasoning mechanisms separately, and no fuzziness was added to enhance the CBR functionality. Most CBR systems in the literature utilized FL in the case retrieval step only. Building fuzzy case-base knowledge is required to support fuzziness in CBR systems. However, these hybrid systems have not benefited from the reasoning capabilities of fuzzy ontologies in CBR systems.
2.4. Regarding the role of fuzzy ontology in CDSS

As crisp ontologies have proved their role in CBR, fuzzy ontologies can extend the capabilities of crisp ontologies [40]. Crisp ontologies are not suitable for addressing imprecise and vague knowledge, which is inherent in real-world domains [41]. A fuzzy ontology can come from two sources: as a mapping of a fuzzy database [42] or as an extension of a crisp ontology [40]. Fuzzy ontologies have been used in medical and non-medical systems [41,43–47]. Ali et al. [47] proposed T2FOBOMIE, an opinion mining system based on a type-2 fuzzy rough ontology. Rodríguez et al. [41] proposed a fuzzy ontology-based system for modeling human behavior. Torshizi et al. [43] utilized a fuzzy ontology to build an intelligent rule-based system to determine the severity of benign prostatic hyperplasia and recommend appropriate therapies. Carlsson et al. [44] discussed the capabilities of fuzzy ontologies over crisp ones and utilized them in a knowledge mobilization application. Mezei et al. [45] asserted that fuzzy ontologies are critical to building actionable knowledge to aid complex decisions, and they proposed a fuzzy wine ontology. Molinera et al. [46] proposed a decision support system for recommending smartphones using fuzzy ontologies. Lee and Wang [23] used a fuzzy ontology to build a diabetes diagnosis CDSS. This system is based on the rule-based reasoning paradigm. It used the freely available Pima Indians dataset, which is not representative diabetes data. It achieved an accuracy of 91.2%.
2.5. Regarding the role of fuzzy ontology in CBR

For CBR systems, several studies have utilized fuzzy ontologies for case-base representation and fuzzy retrieval processes [21,48]. Alexopoulos et al. [21] proposed a fuzzy ontology-based CBR system using fuzzy algebra. Ali et al. [48] proposed a type-2 fuzzy ontology-based CBR system for collision avoidance of autonomous underwater vehicles. Fuzzy ontologies can enhance CBR in different ways: physicians can more easily define experience cases using natural-like language, cases can be indexed more efficiently, and fuzzy-semantic retrieval algorithms can be implemented. Fuzzy ontologies have been utilized for diabetes in many fields [49]; however, there is no fuzzy ontology-based CBR for diabetes management. CBR effectiveness is further improved if ontology-based CBR systems can utilize vague or imprecise knowledge [21]. We argue that there is a difference between ontology-based fuzzy CBR [50] and fuzzy-ontology based CBR [21]. The former builds a fuzzy CBR system and uses a crisp ontology to enhance its functionality, whereas the latter builds a fuzzy ontology for its case-base. Alexopoulos et al. [21] concentrated only on fuzzy properties using fuzzy algebra. Fuzzy ontologies can extend query cases. Fuzzy-ontology-based KI-CBR is an as yet unstudied topic, especially in medical domains such as diabetes diagnosis. Moreover, there are no studies on diabetes diagnosis that incorporate subsets of standard ontologies such as SCT, the Unified Medical Language System (UMLS), the Gene Ontology (GO), the International Classification of Diseases (ICD), the Disease Ontology, or Logical Observation Identifiers Names and Codes (LOINC) as background domain knowledge [51]. In addition, in our study, we are the first to separate the case-base ontology from the background knowledge ontology. This separation plays a great role in the medical domain because the case-base and domain ontologies are usually huge; moreover, many standard ontologies can be utilized simultaneously in the same CBR system.
As shown in Fig. 2, the purpose of this paper is to propose, implement, and test a fuzzy KI-CBR framework using characteristics of ontology, fuzzy logic, and standard medical terminology (i.e., SCT). The major contributions of this research can be summarized as follows:

• We propose an integrated fuzzy knowledge-intensive CBR framework. This system (shown in Fig. 5) is distinctive in its novel architecture and can be applied in the development of a variety of CDSS.
• We introduce an efficient way to develop the case-base fuzzy ontology, which is the backbone of the proposed system. This ontology is built on our previously published crisp ontology [31] and on the top-level crisp CBR ontology, CBROnto, proposed in [52]. The step-by-step tutorial on the fuzzy ontology development process can help interested readers conduct experiments. The proposed fuzzy ontology is the first in the medical domain.
• We propose a fuzzy semantic retrieval algorithm for retrieving cases from the fuzzy ontology according to the physician’s new problems. This hybrid algorithm is accurate and takes into account the types of the patient’s features, including numerical, fuzzy, ordinal, lexical, and semantic types. Moreover, the fuzzy types are represented in a fuzzy ontology, and the semantic types are based on a standard diabetes diagnosis SCT ontology.
• To perform the case study, we develop a Java-based prototype based on the most popular CBR API (i.e., JCOLIBRI). The internal intelligent processes of the prototype control the query processes. The physician enters the patient description data in a new case QV. The system converts the crisp query-case vector into a fuzzy semantic vector QFSV. The QFSV is passed to the retrieval engine, which retrieves the cases most similar to the QFSV case based on the clinical distances between patients. The experimental results generated with this prototype support the efficiency of the proposed architecture.
proposed
framework
utilizes
our
previously
proposed
ontologies
such
as
the
crisp
case-base
ontology
[31]
and
the
Fuzz
y CBR Ontology
Fuzzy-CBR KI-CBR
Fuzzy-ontology based CBR
Domain
ontolog
y
Case-base
ontolog
y
Fuzzy ca
se-base
ontolog
y
Utilizes
Subclass of
Fig.
2.
Our
research
focus.
S.
El-Sappagh
et
al.
/
Artificial
Intelligence
in
Medicine
65
(2015)
179–208
183
Case C1Case C
m
Case Ci
fi1
. . .
fin
Problem Pi
Solution Si
Case bas
e CB
Query cas
e C
q
fq1
. . .
fqn
Problem Pq
Solution Sq
?
Retrieve
Similar
ity
Fig.
3.
The
correspondence
between
stored
cases
and
query
case.
diabetes
standard
ontology
from
SCT
taxonomy
[18].
In
addition,
to
prepare
our
case-base
contents,
we
utilize
the
case-base
stan-
dard
data
model
[9],
use
the
pre-processing
step
to
handle
noisy
data,
select
relevant
features,
and
calculate
the
weight
vector
[35].
We
utilized
our
encoding
methodology
to
encode
the
case-base
unstructured
knowledge
into
standard
form
[19].
Moreover,
we
utilize
our
fuzzification
methodology
to
fuzzify
the
case-base
vague
concepts.
3. Preliminaries

To make the article self-contained, this section defines some concepts and terminology before discussing the proposed framework.
3.1. Case-based reasoning

Generally, CBR is an AI technique for solving a problem by remembering similar past experiences. For example, physicians look for groups of known symptoms, and engineers take many of their ideas from previously successful solutions. The main concept of CBR is that “similar problems have similar solutions.” CBR knowledge takes the form of a case-base of previous experiences (either successes or failures). Unlike rule-based reasoning, it does not depend on an explicit model of the problem for the inference process; it simply utilizes the captured experience in the same way an expert usually takes it in and processes it. Newly solved problems can be added as new experiences to the CBR system’s experience-base (case-base), which supports the auto-learning process. CBR can be defined as a cyclic process named “the four Rs” [54]: (i) Retrieve the most similar cases, (ii) Reuse the cases that might solve the problem, (iii) Revise the proposed solution if necessary, and (iv) Retain the new solution as part of a new case. The most important aspects of a CBR system are the case-base knowledge representation and the case retrieval algorithm, and these are our contributions in the current paper.
Definition 1. A case-base CB is a finite set of cases $\{C_1, C_2, \ldots, C_m\}$, where $m$ is the number of cases in CB.

Definition 2. A case is a contextualized piece of knowledge representing an experience. The $i$th experience case $C_i \in CB$ is formally defined as $C_i = \langle P_i, S_i \rangle$, where $P_i$ and $S_i$ respectively represent the case problem description and the case solution features.

Definition 3. A case retrieval algorithm is an algorithm that takes as input a query case $C_q$, a case base CB, and a feature weighting vector $W$; it calculates the level of similarity between $C_q$ and every case in CB; and finally it returns the solutions of the most similar cases.
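Definitions 1 to 3 can be mirrored by a small data structure and a top-k retrieval routine. This is a sketch under stated assumptions rather than the framework’s implementation: the class `Case`, the function `retrieve`, and the toy `overlap` similarity are hypothetical names, and the weighting vector $W$ is assumed to be folded into whatever similarity function the caller supplies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Case:
    """C_i = <P_i, S_i>: a problem description and its solution."""
    problem: Dict[str, object]   # P_i: feature name -> value
    solution: str                # S_i: e.g. a diagnosis label


def retrieve(query_problem: Dict[str, object],
             case_base: List[Case],
             similarity: Callable[[Dict[str, object], Dict[str, object]], float],
             k: int = 3) -> List[Tuple[float, Case]]:
    """Score every stored case against the query and return the k most similar."""
    scored = [(similarity(case.problem, query_problem), case) for case in case_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep insertion order
    return scored[:k]


def overlap(stored: Dict[str, object], query: Dict[str, object]) -> float:
    """Toy similarity: fraction of query features with an identical stored value."""
    if not query:
        return 0.0
    return sum(stored.get(name) == value for name, value in query.items()) / len(query)


if __name__ == "__main__":
    cb = [
        Case({"thirst": True, "fatigue": False}, "solution A"),
        Case({"thirst": True, "fatigue": True}, "solution B"),
    ]
    for score, case in retrieve({"thirst": True, "fatigue": True}, cb, overlap, k=1):
        print(score, case.solution)
```

Passing the similarity as a parameter keeps the retrieval loop independent of how individual feature types are compared.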
The k-nearest neighbour (k-NN) algorithm is the most applicable retrieval algorithm. More formally, let $C_q = \langle P_q, X \rangle$ be a query case, where $P_q$ is the query case’s problem and $X$ denotes its solution. The main objective of the CBR system is to determine the value of $X$, which is unknown before the execution of the case retrieval process. In general, multiple features describe the problem situations of both the historical cases in the case-base and the target case. Let $M = \{1, 2, \ldots, m\}$ index the historical cases and let $N = \{1, 2, \ldots, n\}$ index the attributes, where $n$ is the total number of attributes. Let $f = \{f_1, f_2, \ldots, f_n\}$ be a finite set of $n$ features concerning the problem situations of both the historical cases and the target case, where $f_j$ denotes the $j$th attribute, $j \in N$. Let $W = (w_1, w_2, \ldots, w_n)^T$ be a weight vector that determines the importance of the case features, where $w_j$ denotes the weight or the importance degree of attribute $f_j$, such that $\sum_{j=1}^{n} w_j = 1$ and $0 \le w_j \le 1$, $j \in N$. Let $C_i = (f_{i1}, f_{i2}, \ldots, f_{in})^T$ be a vector of feature values for the problem situation of historical case $C_i$, where $f_{ij}$ denotes the value of attribute $f_j$ in the historical problem situation $C_i$, $i \in M$, $j \in N$. Let $C_q = (f_{q1}, f_{q2}, \ldots, f_{qn})^T$ be a vector of feature values for the problem situation of the target case $C_q$, where $f_{qj}$ denotes the value of attribute $f_j$ in the current problem situation $C_q$, $j \in N$.
As
shown
in
Fig.
3,
the
correspondence
between
query
case’
and
the
historical
cases’
features
can
be
easily
defined.
The case retrieval algorithm depends on the level of similarity between two cases, $SIM(C_i, C_q)$, i.e. the global similarity, where $SIM(C_i, C_q) \in [0, 1]$. The similarity function $SIM(C_i, C_q)$ is the collection of feature-level similarities $sim(f_{ij}, f_{qj})$, the local similarity, where $sim(f_{ij}, f_{qj}) \in [0, 1]$.
Many existing CBR studies assume that all features are of the same datatype (e.g. numerical) and provide a single local similarity function $sim(f_{ij}, f_{qj})$ to measure the similarity between $f_{ij}$ and $f_{qj}$. This is not the normal case [55]. In our study, we propose one of the most complete similarity measures, which takes into account the numerical, nominal, ordinal, fuzzy, and semantic feature types, as shown in Fig. 4.
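To make the idea of type-specific local similarity concrete, the following sketch gives one commonly used function for three of the feature types named above. The formulas (range-normalised difference, exact match, rank distance) are assumptions chosen for illustration, not the paper's own definitions; the fuzzy and semantic types are omitted here. The FPG bounds 96 and 156 are the minimum and maximum reported for that feature in Table 1.

```python
def sim_numeric(a, b, lo, hi):
    """Numerical features: 1 minus the absolute difference normalised by the value range."""
    if hi == lo:
        return 1.0
    return 1.0 - min(abs(a - b) / (hi - lo), 1.0)


def sim_nominal(a, b):
    """Nominal features: exact match or nothing."""
    return 1.0 if a == b else 0.0


def sim_ordinal(a, b, scale):
    """Ordinal features: 1 minus the rank distance normalised by the scale length."""
    if len(scale) < 2:
        return 1.0
    return 1.0 - abs(scale.index(a) - scale.index(b)) / (len(scale) - 1)


if __name__ == "__main__":
    print(sim_numeric(110, 130, lo=96, hi=156))       # FPG in mg/dL
    print(sim_nominal("Urban", "Rural"))              # residence
    print(sim_ordinal("a", "c", ["a", "b", "c", "d"]))  # ordered scale a > b > c > d
```

Each function returns a value in [0, 1], so the results can be combined by any global similarity that expects local similarities in that interval.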
The global similarity between the two cases, $SIM(C_i, C_q)$, can be defined by a distance method. The most widely used measures are the Euclidean distance and the Hamming distance, as shown in the following equation:

$$SIM(C_i, C_q) = \begin{cases} 1 - \sqrt{\sum_{j} w_j^2 \times dist^2(f_{ij}, f_{qj})} & \text{if the Euclidean distance is used,} \\ 1 - \sum_{j} w_j \times dist(f_{ij}, f_{qj}) & \text{if the Hamming distance is used,} \end{cases} \quad (1)$$

where the function $sim(f_{ij}, f_{qj})$ is defined in terms of the function $dist(f_{ij}, f_{qj})$.
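Eq. (1) can be sketched in a few lines. The sketch assumes the Euclidean branch is one minus the square root of the sum of $w_j^2 \times dist^2$ and the Hamming branch is one minus the sum of $w_j \times dist$; the function name `global_similarity` and the `dist` callback are hypothetical.

```python
import math


def global_similarity(case_values, query_values, weights, dist, metric="euclidean"):
    """Weighted global similarity between a stored case and the query case.

    dist(a, b) is a local distance in [0, 1]; the weights are assumed to sum to 1.
    """
    dists = [dist(a, b) for a, b in zip(case_values, query_values)]
    if metric == "euclidean":
        # w_j^2 * dist^2 equals (w_j * dist)^2
        return 1.0 - math.sqrt(sum((w * d) ** 2 for w, d in zip(weights, dists)))
    if metric == "hamming":
        return 1.0 - sum(w * d for w, d in zip(weights, dists))
    raise ValueError(f"unknown metric: {metric}")


if __name__ == "__main__":
    absolute = lambda a, b: abs(a - b)
    print(global_similarity([1.0, 1.0], [0.0, 0.0], [0.5, 0.5], absolute, "hamming"))
    print(global_similarity([1.0, 1.0], [0.0, 0.0], [0.5, 0.5], absolute, "euclidean"))
```

With weights that sum to 1 and local distances in [0, 1], both branches stay within [0, 1].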
3.2.
Case
representation
The contents of a case-base must be defined at the very beginning of building a CBR system. These contents determine all of the subsequent steps, such as the case-base ontology, the case-base fuzzy ontology, and case retrieval.
After checking with the domain experts, CPGs, and handbooks of case histories in the diabetes diagnosis domain, we defined our case to contain the features described in Table 1. The data have been obtained and managed by the hospitals of Mansoura University, Mansoura, Egypt.
All
the
features
that
affect
the
diabetes
diagnosis
have
been
collected
by
our
domain
experts.
Some
data
are
collected
from
a
diagnostic
biochemical
lab
(AutoLab,
Mansoura,
Egypt).
The
used
data
set
was
collected
from
January
2010
through
184
S.
El-Sappagh
et
al.
/
Artificial
Intelligence
in
Medicine
65
(2015)
179–208
Fig. 4. The case feature types (data-property range types): numeric (integer or decimal), nominal, SCT instance, ordinal (e.g. a > b > c > d), and fuzzy (fuzzy number).
Fig. 5. Diabetes diagnosis and other related complaints case base data model. Entities (each keyed by CaseID): patient case (gender, BMI, age, residence, occupation), diabetes lab test, haematological profile, kidney function test, liver function test, lipid profile, global symptoms, urination symptom, disease, and diagnosis.
August 2013. There were 67 eligible patients enrolled in this study. However, seven control subjects were excluded due to limited blood samples for testing AFP. Our data set contains 70 features for describing diabetic patients and for linking diabetes with other disorders such as cancer, kidney diseases, and liver diseases. The data set is distributed as 33.3% pre-diabetic patients, 53% diabetic patients, and 13.7% normal patients. Table 1 describes the features considered in this study.
3.3.
The
structure
of
a
diabetes
diagnosis
case
Fig. 5 shows an Entity Relationship (ER) model for all entities and attributes used in our data set. This data model is compatible with HL7 RIM [56]. This compatibility facilitates integration with EHRs and supports the automatic collection of cases. Moreover, this data model has been fuzzified with our proposed fuzzification methodology into a fuzzy ER model, then converted to a fuzzy case-base database, which was the source of instances for our proposed fuzzy case-base ontology. These entities and attributes were enriched by entities and attributes from diabetes diagnosis CPGs, such as those in the National Guidelines Clearing House¹. Entities and features related to diabetes treatment, medications, and drugs are out of scope.
Definition 4. Diabetes diagnosis cases are defined according to our data model. A case $C = \langle P, S \rangle$ is defined as follows: $P = \{LFT, LP, GS, A, B, R, G, O, KFT, LT, US, HP, DI\}$, where LFT = liver function tests, LP = lipid profile, GS = global symptoms, A = age, B = BMI, R = residence, G = gender, O = occupation, KFT = kidney function tests, LT = lab tests, US = urination symptoms, HP = haematological profile, and $DI = \{L + N + C + H\}$, where L = probable liver problem, N = probable nephropathy problem, C = probable cancer type, and H = probable hypercholesterolemia problem. $S(P)$ is the solution part, which describes the diagnosis of diabetes, including diabetic, prediabetic, gestational-diabetic, and prediabetic-gestational. $S = DD$, where DD = diabetes diagnosis.

¹ http://www.guideline.gov/.
Our diagnostic features can be numerical features (e.g., age, lab tests, BMI, and so on), ordinal features (e.g., the features in the Global symptoms table in Fig. 5), and text features (e.g., sex, occupation, etc.). None of these features has been encoded as SCT concepts because such coding would not enhance the semantic retrieval algorithm of CBR. On the other hand, patient disorders are instance features, and we have mapped them to standard SCT concepts in another work [18]. We concentrated on the CBR semantic retrieval aspect, not on sharing and interoperability issues. For example, if the feature HbA1c = 6.4 is encoded in SCT as |43396009: Hemoglobin A1c measurement| = 6.4, this code enhances semantic interoperability but does not enhance the semantic retrieval process in CBR. On the other hand, if the patient has a disorder such as nephropathy, this concept has a long sub-tree of disorders (e.g., caliectasis, amyloid nephropathy, calyceal fistula, and so on), which different physicians may describe differently.
The
semantic
similarity
of
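Table 1
The patient attributes used to describe cases. Data type = {P = primitive, I = instance of SCT concept, N = numerical, C = categorical, F = fuzzy, O = ordinal}.

Feature type | Feature name | Data type | Normal range | UoM | Min-mean-max | F. No.
Demographics | Residence | P, C | {Urban, Rural} | | | 1
Demographics | Occupation | P, C | {Farmer, Police} | | | 2
Demographics | Gender | P, C | {Male, Female} | | | 3
Demographics | Age | P, N, F | | Year | 29–48.117–74 | 4
Demographics | BMI | P, N, F | 18.5–25 | kg/m2 | 20–33.117–45 | 5
Diabetes lab tests | HbA1C | P, N, F | 5 | mmol/L | 5–6.373–7.4 | 6
Diabetes lab tests | 2h PG | P, N, F | <139 | mg/dL | 165–202.733–235 | 7
Diabetes lab tests | FPG | P, N, F | <99 | mg/dL | 96–129.633–156 | 8
Haematological profile | Prothrombin INR | P, N, F | 0–1 | % | 1–1.16–1.4 | 9
Haematological profile | Red cell count | P, N, F | 4.2–5.4 | 10^6/cmm | 3.8–5.194–5.88 | 10
Haematological profile | Hbg | P, N, F | 12–16 | g/dL | 9.8–12.332–13.4 | 11
Haematological profile | Haematocrit (PCV) | P, N, F | 37–47 | vol% | 31.1–35.215–36.8 | 12
Haematological profile | MCV | P, N, F | 80–90 | | 26.8–71.908–76.4 | 13
Haematological profile | MCH | P, N, F | 27–32 | pg | 3.3–25.47–29.4 | 14
Haematological profile | MCHC | P, N, F | 30–37 | % | 1.8–35.465–41.7 | 15
Haematological profile | Platelet count | P, N, F | 150–400 | 10^3/cmm | 135–316.183–2000 | 16
Haematological profile | White cell count | P, N, F | 4–11 | 10^3/cmm | 6–8.055–9.2 | 17
Haematological profile | Basophils | P, N, F | 0–1 | % | 0–1.013–5 | 18
Haematological profile | Lymphocytes | P, N, F | 20–45 | % | 21.2–25.768–29 | 19
Haematological profile | Monocytes | P, N, F | 2–10 | % | 1.7–2.942–4 | 20
Haematological profile | Eosinophils | P, N, F | 1–4 | % | 1–1.897–3.4 | 21
Symptoms | Urination frequency | O | | | | 22
Symptoms | Vision | O | | | | 23
Symptoms | Thirst | O | | | | 24
Symptoms | Hunger | O | | | | 25
Symptoms | Fatigue | O | | | | 26
Kidney function lab tests | Serum potassium | P, N, F | 3.5–5.3 | mEq/L | 2.4–3.767–4.3 | 27
Kidney function lab tests | Serum urea | P, N, F | 5–50 | mg/dL | 17–31.56–67 | 28
Kidney function lab tests | Serum uric acid | P, N, F | 3.0–7.0 | mg/dL | 3–4.237–7.9 | 29
Kidney function lab tests | Serum creatinine | P, N, F | 0.7–1.4 | mg/dL | 0.9–1.35–3.6 | 30
Kidney function lab tests | Serum sodium | P, N, F | 135–150 | mEq/L | 134–137.833–158 | 31
Lipid profile | LDL cholesterol | P, N, F | 0–130 | mg/dL | 50–94.917–170 | 32
Lipid profile | Total cholesterol | P, N, F | 0–200 | mg/dL | 158–209.367–275 | 33
Lipid profile | Triglycerides | P, N, F | 60–160 | mg/dL | 78–144.767–189 | 34
Lipid profile | HDL cholesterol | P, N, F | 45–65 | mg/dL | 30–55.533–65 | 35
Tumor markers | FERRITIN | P, C | 28–397 | ng/mL | | 36
Tumor markers | AFP Serum | P, C | 0.5–5.5 | IU/ml | | 37
Tumor markers | CA-125 | P, C | 1.9–16.3 | U/mL | | 38
Urine analysis, chemical examination | Protein | O | | | | 39
Urine analysis, chemical examination | Blood | O | | | | 40
Urine analysis, chemical examination | Bilirubin | O | | | | 41
Urine analysis, chemical examination | Glucose | O | | | | 42
Urine analysis, chemical examination | Ketones | O | | | | 43
Urine analysis, chemical examination | Urobilinogen | O | | | | 44
Urine analysis, microscopic examination | Pus | O | | | | 45
Urine analysis, microscopic examination | RBcs | O | | | | 46
Urine analysis, microscopic examination | Crystals | O | | | | 47
Liver function tests | S. albumin | P, N, F | 3.5–5.0 | g/dL | 1.9–4.082–5.4 | 48
Liver function tests | Total bilirubin | P, N, F | 0.0–1.0 | mg/dL | 0.8–1.317–3 | 49
Liver function tests | Direct bilirubin | P, N, F | 0.0–0.3 | mg/dL | 0.3–0.533–1.6 | 50
Liver function tests | SGOT (AST) | P, N, F | 0–40 | U/L | 35–54.567–165 | 51
Liver function tests | SGPT (ALT) | P, N, F | 0–45 | U/L | 35–57.317–183 | 52
Liver function tests | Alk. phosphatase | P, N, F | 64–306 | U/L | 170–214.2–360 | 53
Liver function tests | GT | P, N, F | 7–32 | U/L | 18–35.833–98 | 54
Liver function tests | Total protein | P, N, F | 6.0–8.7 | g/dL | 3.1–4.858–8.7 | 55
Females history | Amenorrhea | I | | | | 56
Females history | Birth | I | | | | 57
Females history | Dysmenorrhea | I | | | | 58
Diagnosis | Diabetes type | P, C | | | | 59
Nephropathy | Nephropathy check | I | | | | 60
Lipid disease | Hypercholesteremia's check | I | | | | 61
Cancer type | Tumor markers | I | | | | 62
Liver disease | Liver problem | I | | | | 63
Radiological examination | Radiological examination | I | | | | 64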
these concepts is critical in the KI-CBR retrieval engine.
Moreover,
the
case
solution
features
are
not
encoded
because
these
features
do
not
participate
in
measuring
similarity
between
cases.
3.4.
Ontology
An ontology is a formal, explicit specification of a shared conceptualization. It is a unified view of a domain, which describes its instances, concepts, and the relationships between them [57]. The main advantage of using an ontology is that it supports the sharing and reuse of formally represented knowledge by explicitly stating the concepts, relationships, and axioms of a domain. An ontology is defined in a particular language.
OWL2 is the most recent ontology representation language defined by the W3C (http://www.w3.org/TR/owl2-overview/). In addition, an ontology mainly
depends
on
a
specific
description
logic
(DL).
For
example,
OWL2
is
based
on
the
SROIQ
(D)
DL.
DL
is
a
formal
logic
that
can
enhance
the
reasoning
capabilities
of
CBR
systems.
An ontology can be formally defined as follows:
Definition 5. An ontology is defined as $O = \langle I, C, R, A \rangle$, where $O$ is an ontology for a domain of interest; $C$ is a set of concepts in the domain; $I$ is a set of individuals or instances in the domain; $R$ is a set of relations among concepts, including object and data relationships; and $A$ is a set of axioms holding among concepts, relations, or individuals. Axioms provide explicit logical assertions about these three elements.
3.5.
Fuzzy
sets
The
backbone
of
the
proposed
framework
is
the
case-base
fuzzy
ontology.
This
ontology
is
defined
by
the
combination
of
fuzzy
sets
theory
with
crisp
ontology.
Fuzzy
set
theory
was
introduced
by
Zadeh
[58]
to
address
vague
and
imprecise
concepts.
Classical
sets
are
defined
by
characteristic
functions:
Definition 6. Let $X$ be a set and $A$ be a subset of $X$ ($A \subseteq X$). Then the function in the following equation:

$$\chi_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases} \quad (2)$$

is called the characteristic function of the set $A$ in $X$.
Fuzzy sets introduce the concept of partial membership, where an element can be a member of a set with a certain degree in [0,1] rather than in {0, 1} as in crisp sets. As a result, they allow reasoning with linguistic terms. A fuzzy set can be defined as follows:
Definition 7. A fuzzy set $A$ over a universe of discourse $X$ is defined by a membership function $\mu_A$ (or simply $A$), which maps each element $x$ to a value in [0,1], as shown in the following equation:

$$\mu_A(x): X \to [0, 1] \quad (3)$$

where $A$ is the fuzzy set, $\mu_A$ is the degree of membership, $x \in X$, and $\mu_A(x) \in [0, 1]$. A fuzzy set $A$ can be defined as a set of ordered pairs: $A = \{x/\mu_A(x) \mid x \in X\}$.
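The contrast between a characteristic function and a membership function can be illustrated with a short sketch. The trapezoidal shape is an assumption chosen for illustration (the definitions above do not fix any particular shape), and the breakpoints used in the example are arbitrary numbers, not clinical thresholds.

```python
def characteristic(x, crisp_set):
    """Characteristic function of a crisp set: membership is either 0 or 1."""
    return 1 if x in crisp_set else 0


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership function with a <= b <= c <= d.

    Membership rises linearly on (a, b), equals 1 on [b, c], and falls linearly on (c, d).
    """
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


if __name__ == "__main__":
    print(characteristic(2, {1, 2, 3}))
    for value in (5, 15, 25, 35):
        print(value, trapezoidal(value, 0, 10, 20, 30))
```

Checking the plateau first means that shoulder-shaped sets (a = b or c = d) are handled without a division by zero.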
3.6.
Fuzzy
ontology
Handling vagueness is a vital part of any suitable medical diagnosis system. Fuzzy logic-based systems employ the classical fuzzy logic theory, which can handle vagueness at a certain level. Given the success of crisp ontologies and the applicability of fuzzy logic to case representation and retrieval in CBR, the integration of these two technologies (in a fuzzy ontology) is expected to enhance the performance of CBR systems. Up to 17 formal definitions of a fuzzy ontology can be found in the literature [59]. One definition is an ontology that uses fuzzy logic to provide a natural representation of imprecise and vague knowledge and eases reasoning over it. Formally speaking, a fuzzy ontology can be defined as follows:
Definition 8. A Fuzzy OWL ontology $FO$ consists of a fuzzy ontology structure $FO_S$ and fuzzy ontology instances $FO_I$, so $FO = (FO_S, FO_I)$ [42]:
$FO_S = FID_0 \cup FAxiom_0$, where $FID_0 = FCID_0 \cup FDRID_0 \cup FOPID_0 \cup FDPID_0$ is a set of fuzzy class descriptions, and $FAxiom_0$ is a set of fuzzy class and property axioms defined over $FID_0$:
$FCID_0$ is a set of fuzzy classes or concepts. Each fuzzy class may be a user-defined fuzzy class, or one of the two predefined fuzzy classes owl:Thing and owl:Nothing.
$FDRID_0$ is a set of fuzzy datatypes. Each fuzzy datatype may be a predefined XML Schema fuzzy datatype.
$FOPID_0$ is a set of fuzzy object properties.
$FDPID_0$ is a set of fuzzy data properties.
$FO_I = FIID_0 \cup FAxiom_0$, where $FIID_0$ is a set of individuals, and $FAxiom_0$ is a set of fuzzy individual axioms.
FO
=
fuzzy
ABOX
A
+
fuzzy
TBOX
T.
A
Fuzzy
TBOX
is
a
finite
set
of
fuzzy
concept
inclusion
axioms
of
the
form
˛
n,
and
fuzzy
role
inclusion
axioms
of
the
form
˛
n,
where
n
(0,
1]
and
can
be
a
concept
inclusion
axiom
or
a
role
inclusion
axiom.
A
Fuzzy
ABOX
is
a
finite
set
of
fuzzy
concept
and
fuzzy
role
assertions
axioms
of
the
form
˛
n,
where
n
(0,
1]
and
˛
is
a
role
or
concept
assertion
of
the
form
a:C
˛,(a,b):R
˛,(a,b):¬R
˛,a
/=
b,
and
a
=
b.
The
main
idea
in
fuzzy
DLs
is
that
concepts
and
roles
are
inter-
preted
as
fuzzy
subsets
of
an
interpretation’s
domain.
In
fuzzy
DLs,
axioms
can
occur
with
a
certain
degree
of
truth.
The notion of satisfaction of a fuzzy axiom $E$ by a fuzzy interpretation $I$, denoted $I \models E$, is defined in [60] as follows:
I
˛
iff
1
˛
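$I \models \mathrm{trans}(R)$ iff $\forall x, y \in \Delta^I$, $R^I(x, y) \ge \sup_{z \in \Delta^I} R^I(x, z) \otimes R^I(z, y)$
$I \models R_1 \sqsubseteq R_2$ iff $\forall x, y \in \Delta^I$, $R_1^I(x, y) \le R_2^I(x, y)$
$I \models \mathrm{inv}(R_1, R_2)$ iff $\forall x, y \in \Delta^I$, $R_1^I(x, y) = R_2^I(y, x)$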
A concept $C$ is satisfiable iff there is an interpretation $I$ and an individual $x \in \Delta^I$ such that $C^I(x) > 0$. For a set of axioms $\varepsilon$, we say that $I$ satisfies $\varepsilon$ iff $I$ satisfies each element in $\varepsilon$. $I$ is a model of $E$ iff $I \models E$. $I$ satisfies (is a model of) a fuzzy KB $K = \langle A, T \rangle$, denoted $I \models K$, iff $I$ is a model of each component $A$, $T$, respectively. An axiom $E$ is a logical consequence of a knowledge base $K$, denoted $K \models E$, iff every model of $K$ satisfies $E$.
3.7.
The
fuzzy-semantic
case
representation
Given
a
case
base
crisp
ontology,
elements
that
can
be
fuzzified
include
datatypes,
object
properties
(through
fuzzy
modifiers),
and
data
properties.
Moreover,
fuzzy
case
base
ontology
can
include
crisp
assertions
side-by-side
with
fuzzy
assertions.
Cases
are
stored
in
a
fuzzy
ontology
as
concept
instances.
As a result, a case-base $CB$ is defined as $CB = \{O_1, O_2, \ldots, O_m\}$, where $m$ is the number of cases and $O_k$ is the $k$th case. Each case in the case-base ontology is defined as follows:
Definition 9. A case $O_k$ is a conjunctive set of predicates of the form $O_k = P_1 \wedge P_2 \wedge \cdots \wedge P_n$, where $P_i$ is the $i$th predicate, which takes one of four forms: a (fuzzy) concept assertion $a{:}C_i \ge \alpha$; a (fuzzy) object property assertion $(a,b){:}R_i \ge \alpha$; a (fuzzy) data property assertion $(a,v){:}T_i \ge \alpha$, for $a, b$ as abstract individuals and $v$ as a literal value; or a (fuzzy) data property assertion $(a,v){:}T_i$, for $v$ as a fuzzy linguistic term defined using a fuzzy datatype.
By converting the physician query into a semantic query of the form $O_Q = P_{Q1} \wedge P_{Q2} \wedge \cdots \wedge P_{Qn}$, the similarity calculation between these predicates becomes straightforward.
This
similar-
ity
depends
on
the
inference
capabilities
of
the
utilized
ontology
reasoners.
The
querying
process
will
be
detailed
in
subsequent
sections.
As we argue, a (fuzzy) ontology can play many roles in every phase of CBR, including case representation, indexing, retrieval, adaptation, and maintenance. For the case representation and retrieval steps, these roles include the following:
It
considerably
reduces
the
knowledge
acquisition
bottleneck
[11].
It
allows
knowledge
engineers
to
use
knowledge
already
acquired,
conceptualized,
and
implemented
in
a
formal
language,
such
as
DLs
based
languages.
It
supports
persistence
of
cases
and
indexes
using
individuals
or
concepts
that
are
embedded
in
the
ontology
[31].
It
can
be
used
as
a
vocabulary
to
define
the
case
structure,
either
if
the
cases
are
embedded
as
individuals
in
the
ontology
itself,
or
if
the
cases
are
stored
in
a
different
persistent
media
such
as
a
database
[10].
It
can
play
the
role
of
terminology
to
define
the
query
vocabulary
[31].
The
user
can
better
express
his
requirements
if
he
can
use
a
richer
vocabulary
to
define
the
query.
During
the
similarity
computation,
the
ontology
allows
the
user
to
bridge
the
semantic
gap
between
the
query
terminology
and
the
case
base
terminology
[18].
It
supports
dynamic
case
storage
where
features
can
be
added,
updated,
or
deleted
from
the
case
base.
It
preserves
storage
space
in
which
many
cases
can
point
to
the
same
feature
values.
It
can
define
a
semantic
index
of
cases
for
an
in-memory
case
base
[61].
Ontology’s
description
logic
reasoners
such
as
Pellet
and
FaCT++
[62]
can
check
the
case
base
consistency,
redundancy,
and
ade-
quacy,
which
is
not
possible
in
a
regular
database
environment
[42].
Moreover, reasoners significantly enhance the effectiveness of the case retrieval process.
Domain’s
background
knowledge
such
as
SCT
medical
terminol-
ogy
can
be
integrated
with
case
base
ontology
to
create
a
KI-CBR
[11].
For
active
CDSS,
ontology
supports
interoperability
between
CBR-based
CDSS
and
EHR
system
database.
Ontology
provides
a
common
understanding
of
a
domain.
As
a
result,
it
supports
the
implementation
of
distributed
CBR
systems
[52].
By
using
an
ontology,
complex
relations
between
case
features
can
be
created.
For example, the relationship between diabetes symptoms and disorders can be used to infer the values of missing features.
Heterogeneous
cases,
which
have
no
fixed
structure,
can
be
designed.
They
may
have
different
structures
with
different
types
and
numbers
of
features.
Cases can have relationships with each other, such as Cause, ISA, Part Of, Result From, etc. These relationships can handle incomplete cases and allow default values (by inheritance) [11].
Compound
features,
which
contain
many
other
simple
and
com-
pound
features,
can
be
defined.
Utilizing (fuzzy) ontology engineering methodologies can help make the CBR knowledge acquisition process more efficient. In the medical domain, where there are many standard ontologies such as SCT, GO, etc., ontology reuse has many benefits, such as standardization of the CDSS knowledge, interoperability, distribution of knowledge, and so on.
Moreover,
many
ontologies
can
be
integrated
with
the
CBR,
where
each
case
feature
can
be
semanti-
cally
connected
with
an
ontology.
For example, the patient disease feature can be connected to the Disease ontology; the patient gene feature can be connected to the GO ontology; the patient lab test features can be connected to the LOINC ontology, etc.
Ontology-based
representation
of
cases
enables
reusing
and
adaptation
in
a
variety
of
application
scenarios
[21].
Creating
a
fuzzy
case-base
ontology
from
a
fuzzy
case-base
database
is
supported
by
methodologies
[40],
languages
[63],
tools
[48],
and
reasoners
[60].
These
fuzzy
ontologies
add
vagueness
to
the
KI-CBR
systems.
4.
Research
methodology
As
shown
in
Fig.
6,
we
follow
a
specific
methodology
to
finish
this
study.
To
accomplish
the
purpose
of
this
study,
we
have
utilized
some
existing
technologies
and
studies.
Moreover,
we
have
utilized
our
previous
research
studies
to
complete
some
specific
steps.
In
the
figure,
we
make
a
clear
cut
between
the
current
study
goals
and
the
other
utilized
works.
In
the
first
step,
the
detailed
understanding
of
the
nature
of
diabetes
mellitus
disease
and
its
diagnosis
process
requires
deep
interviews
with
the
domain
experts.
The
next
step
involves
the
col-
lection
of
patients
EHR
records
to
implement
the
case-base
fuzzy
ontology.
This
dataset
will
determine
the
structure
of
the
case-base
ontology,
and
it
will
be
used
to
populate
the
ontology.
However, the collected medical data needed preparation processes, including pre-processing to enhance the quality of the data and calculate the weight vector, coding to formalize the unstructured contents of the medical data using a standard medical ontology, and fuzzification to fuzzify some numerical features.
Moreover,
a
standard
ontology
needs
to
be
created
from
the
huge
SCT
ontology
to
be
used
as
the
domain
background
knowledge
in
similarity
calcula-
tion
process.
We
utilize
our
previous
studies
to
accomplish
this
Fig. 6. Research structure. Existing utilized works: SNOMED CT, HL7 RIM, OWL2, Fuzzy OWL2 plugin, crisp ontology reasoners (Pellet), fuzzy ontology reasoners (FuzzyDL), CBROnto, IKARUS-Onto, our SCT ontology, our standard diabetes data model, our EHR raw cases, our diabetes case base crisp ontology, our ontology engineering method, our encoding methodology, our pre-processing methodology, and our fuzzification methodology. Paper's specific work: in-depth interviews with domain experts; case base pre-processing, encoding, and fuzzification; fuzzy ontology construction and population; design of the fuzzy KI-CBR framework; design of a fuzzy semantic case retrieval algorithm; system implementation; system testing (each module and as a whole).
step
[9,18,19,35].
The
next
step
involves
the
construction
and
pop-
ulation
of
the
case-base
fuzzy
ontology.
As this step is complex, we extend the previously proposed crisp ontology [31] using a high-level methodology [40] and create a fuzzy OWL2 ontology using the Protégé tool.
As
far
as
we
know,
there
are
no
fuzzy
case-base
ontolo-
gies
for
medical
CBR
systems.
In
the
next
step,
we
propose
a
fuzzy
KI–CBR
framework.
This
framework
is
an
integrated
set
of
modules.
Each
module
is
for
a
specific
purpose,
and
each
one
has
inputs
and
outputs.
The
framework
will
be
detailed
in
the
next
section.
Next,
for
the
fuzzy
case
base
ontology,
and
for
handling
the
supported
feature
types,
we
design
a
hybrid
semantic
retrieval
algorithm.
The
next
step
is
the
implementation
of
our
framework
using
JAVA
pro-
gramming
language.
Finally,
we
test
the
implemented
system
using
case
base
of
real
diabetics.
5.
The
proposed
fuzzy
KI-CBR
framework
for
diabetes
diagnosis
This
section
provides
a
description
of
our
proposed
fuzzy-
ontology
based
CBR
system
for
diabetes
diagnosis.
The
architecture
of
this
system
is
shown
in
Fig.
7.
It
has
six
modules:
Case
source
preparation,
case
base
ontology
engineering,
terminology
server,
fuzzy
case-base
ontology
population,
case
retrieval
engine,
and
case
query
parser.
The
main
steps
of
the
framework
are
case-base
preparation
and
case
retrieval.
The
case-base
preparation
step
is
achieved
by
the
case
source
preparation,
case-base
ontology
engineering,
terminology
server,
and
fuzzy
case-base
ontology
population
modules
as
fol-
lows:
1.
The
case-source
preparation
module
takes
the
EHR
raw
data
and
converts
it
into
pre-processed,
encoded,
and
fuzzified
relational
database.
2.
The
encoding
process
is
based
on
SCT
codes
from
the
terminology
server
module.
3.
The
case-base
ontology-engineering
module
builds
the
case-base
crisp
ontology
and
extends
it
to
a
fuzzy
ontology.
4.
The
fuzzy
case-base
ontology
population
module
populates
the
resulting
fuzzy
ontology
in
step
3
with
the
fuzzy
relational
database
in
step
1.
The
case
retrieval
step
is
achieved
by
the
terminology
server,
case
retrieval
engine,
and
case
query
parser
modules.
1.
The
case
query-parser
module
takes
the
user
query
vector
and
converts
it
to
a
semantic
query
vector
according
to
the
case
base
fuzzy
ontology
terminologies.
2.
The
case
retrieval-engine
module
takes
the
created
semantic
query
vector
generated
in
step
1
and
searches
for
the
most
similar
k
cases
in
the
fuzzy
case-base
ontology.
3.
The
clinical
similarity
between
medical
concepts
of
semantic
features
is
based
on
the
SCT
ontology
in
the
terminology
server
module.
5.1.
Case
source
preparation
module
This
module
prepared
the
EHR
raw
data
to
a
case-base
structure
and
content.
It
collected
the
patient’s
features
related
to
a
diabetes
diagnosis
from
distributed
EHR
systems
and
stored
it
in
an
opera-
tional
data
store
(ODS).
We
have
collected
60
cases,
which
describe
Fig.
7.
The
proposed
CBR
framework.
Fig.
8.
The
proposed
SCT
reference
set
for
diabetes
diagnosis.
diabetic
patients,
as
shown
in
Table
1.
These
cases
are
descriptive
of
all
types
of
cases
as
in
[64],
which
used
47
cases
only.
Next,
these
data
were
anonymized,
cleaned,
and
normalized.
Features’
weights
were
calculated
using
machine
learning
algorithms
includ-
ing
genetic
algorithm,
decision
tree,
and
others.
El-Sappagh
et
al.
[35]
have
proposed
a
case-base
preparation
process
and
applied
it
to
the
used
case-base
data.
Moreover,
the
data
were
converted
to
a
case
base
structure
using
our
proposed
standard
data
model
[9].
In
addition,
the
prepared
case-base
was
coded
according
to
SCT
reference
set
that
was
created,
which
is
specialized
for
dia-
betes
diagnosis
[18].
Finally,
the
encoded
case-base
was
fuzzified
in
a
fuzzy
relational
database
using
our
proposed
methodology
in
another
work.
The
works
of
El-Sappagh
et
al.
in
[18,35,64]
are
uti-
lized
in
this
module
to
prepare
the
used
EHR
medical
data.
The
resulting
database
is
the
source
of
instances
(ABOX)
for
our
pro-
posed
fuzzy
case-base
ontology.
5.2.
Terminology
server
module
This
module
creates
the
domain
background
ontology.
This
knowledge
is
critical
in
two
places:
(1)
in
semantic
similarity
mea-
surement,
and
(2)
in
query
formulation.
The
domain
knowledge
ontology
can
be
built
locally,
or
it
can
depend
on
a
standard
medical
ontology
such
as
SCT
[56].
Unfortunately,
ontologies
are
typically
created
in
an
ad-hoc
manner,
which
may
influence
the
accuracy
of
the
similarity
calculations
[64].
The
second
choice
is
better
because
clinical
ontologies
are
mature,
and
they
include
all
required
medical
concepts
and
relationships.
Moreover,
this
standardiza-
tion
enhances
the
interoperability,
reuse,
sharing,
and
integration
with
the
EHR
environment.
SCT
was
the
terminology
used
in
this
study.
Building
a
complete
ontology
is
not
realistic
and
using
the
whole
SCT
in
CBR
affects
the
retrieval
algorithm
because
it
is
a
very
large
ontology
(i.e.,
it
contains
361,800
concepts).
We
have
collected
all
SCT
concepts
related
to
diabetes
according
to
our
pro-
posed
methodology
[18],
and
built
its
OWL
2
ontology
(TBOX),
as
shown
in
Fig.
8.
This
ontology
only
contains
550
concepts.
The JCOLIBRI API measures semantic similarity between concept instances; however, SCT contains only concepts. We have solved this problem by creating an instance for each selected concept with the same name (ABOX).
Moreover,
we
have
represented
the
selected
concepts
using
its
conceptIDs.
Fully
specified
names,
symptoms,
and
preferred
names
can
be
added
as
annotations
with
their
corresponding
names.
As shown in Fig. 8, this ontology is not user readable. We will resolve this issue in our future work. Each concept name begins with the pattern “C” so that the JCOLIBRI API³ reads it as a concept and differentiates it from instances.
The
resulting
ontology
is
a
directed
acyclic
graph
(DAG),
which
supports
single
inheritance
only,
but
the
whole
SCT
supports
multiple
inheritances.
An
ontology
has
a
structured
format
with
relationships
between
concepts.
The
“IS
A”
relationship
between
a
parent
and
a
child
is
the
core
relationship,
whereas
other
semantic
relationships
pro-
vide
additional
associations
between
terms
(such
as
“part-of”
or
“active-ingredient-of”).
Our
ontology
concentrates
on
the
“IS
A”
relationship
only
to
form
a
taxonomy
of
concepts.
Enriching
the
ontology
with
other
relationships
and
axioms
will
be
considered
in
future
work.
5.3.
Case-base
ontology
engineering
module
This
module
converts
our
crisp
case
base
ontology
created
in
our
previous
work
[31]
into
a
fuzzy
case-base
ontology.
We
apply
the
procedural
steps
of
IKARUS-Onto
[40]
methodology
for
con-
verting
a
crisp
ontology
to
a
fuzzy
ontology.
IKARUS-Onto is a high-level, abstract methodology for adding fuzzification aspects to a crisp ontology. We customize this methodology according to our requirements. We consider it the most accurate and complete methodology.
Moreover,
the
resulting
ontology
is
represented
by
Bobillo
and
Straccia
syntax
as
OWL
2
ontology
using
Fuzzy
OWL2
2.1.1
plug-in
in
Protégé
4.1
[63].
This
syntax
adds
the
fuzzy
compo-
nents
as
annotations
for
concepts
and
relationships
(i.e.,
datatype
and
object
properties).
Moreover,
it
allows
the
creation
of
hedges
and
fuzzy
data
types.
The
default
reasoners
such
as
Pellet
[62]
and
default
modeling
tools
such
as
protégé
can
be
used
with
the
result-
ing
ontology
because
all
fuzzy
aspects
are
coded
as
annotations
(i.e.,
FuzzyLabel
annotation).
Every
annotation
is
delimited
by
a
start
tag
<fuzzyOwl2>
and
an
end
tag
</fuzzyOwl2>,
with
an
attribute
fuzzyType
specifying
the
fuzzy
element
being
tagged.
3http://gaia.fdi.ucm.es/research/colibri/jcolibri
Fig.
9.
The
crisp
case
base
ontology.
5.3.1.
Crisp
ontology
customization
Before
starting
the
fuzzification
process,
our
previously
created
crisp
case-base
ontology
[31]
is
customized
according
to
our
case-
base’s
fuzzy
database
contents
and
the
CBROnto
standard
ontology
of
JCOLIBRI
2
API
[52].
Fig.
9
shows
our
crisp
case
base
ontology
after
customization.
This customization includes:
1. There is no outcome concept in our new ontology.
2. We have removed the temporal aspect because we do not provide the treatment plan for the diabetic patient, and our data set does not have multiple values over time for case features.
3. The context has been removed, and we will propose an indexing methodology in another work.
4. The diagnoses are Normal, Prediabetic, Prediabetic Gestational, Diabetic, and Diabetic Gestational only. Our data set cannot determine the type of diabetes (i.e., Type 1 or Type 2) or the type of pre-diabetes (i.e., IFG, IGT).
5. In our data set, many of the problem description features are new and not modeled in the previous ontology [31].
6. The hierarchy of the ontology is simplified as much as possible to be compatible with CBROnto.
7. Dealing with rules in the form of SWRL is left as future work to enhance the semantics of our case base.
As
shown
in
Fig.
9,
CASE
INDEX
subsumes
all
of
the
case
fea-
tures,
CBRCASE
subsumes
case
instances,
and
HAS-COMPONENT
subsumes
the
two
parts
of
the
case.
This
way,
we
utilize
OntoBridge
API
of
JCOLIBRI2
to
address
ontology
storage,
retrieval,
and
manip-
ulation
in
a
straightforward
way
[52].
In
ontology-based
CBR,
cases
are
represented
as
concept
instances
and
their
attributes
are
repre-
sented
as
ontology
relations
or
properties.
The
values
that
relation
attributes
may
take
are
instances
defined
within
some
domain
ontology.
For
example,
consider
a
small
fragment
of
our
case
base
containing
only
age,
gender,
cancer,
and
labTest.
In
Fig.
10,
all
of
the
case
base
data
and
structure
are
inside
the
case-base
ontology.
We
may
implement
this
ontology
as
two
sepa-
rate
components:
A
case
base
structure
stored
in
an
OWL2
ontology,
and
instances
of
cases
and
features
stored
in
a
database,
as
shown
in
Fig.
11.
Each
choice
has
its
advantages
and
limitations,
and
we
chose
the
first
one.
5.3.2.
Case-base
ontology
fuzzification
process
Our
proposed
fuzzy
KI-CBR
(FKI-CBR)
framework
operates
on
two
axes,
namely
the
ontology-based
representation
of
imprecise
knowledge
and
the
utilization
of
this
knowledge
for
effective
case
retrieval.
For
the
first
axis,
a
fuzzy
ontology
is
proposed.
For the second axis, case retrieval, an algorithm that utilizes both ontological and fuzzy knowledge is proposed.
Fig.
10.
A
small
fragment
of
crisp
case
base
instantiation
structure.
S.
El-Sappagh
et
al.
/
Artificial
Intelligence
in
Medicine
65
(2015)
179–208
191
Fig.
11.
Case
instances
stored
in
database.
An ontology may be defined as a set of concepts, instances, properties (data type properties), and relations (object properties). A concept represents a set or class of entities within a domain, and the entities that belong to a concept are called instances of this concept. A relation links a concept instance to another instance, while a property links an instance to a value of a standard data type such as string, integer, float, or Boolean.
5.3.2.1.
The
case-base
fuzzy
ontology
design.
In this paper, we handle only vagueness (i.e., imprecision); uncertainty (i.e., probability, ambiguity, or inexactness) is not handled. A fuzzy ontology may be informally defined as an ontology that expresses vague knowledge using fuzzy sets (fuzzy concepts), termed degree-vagueness, and fuzzy relations and properties, termed combinatory-vagueness [40]. Because a crisp ontology is a special case of a fuzzy ontology in which all relation and property degrees are equal to 1, fuzzy ontology-based CBR retains the characteristics of the traditional ontology-based CBR paradigm.
Crisp elements that can be fuzzified include data types, object properties (through fuzzy modifiers), and data properties (through fuzzy modified data types). In other words, fuzzifying an ontology includes modeling the following [40]:
1. Fuzzy concepts: concepts whose instances may belong to them to a certain degree, such as youngPatient. Because young is a vague predicate, the concept is also vague and can therefore be represented as a fuzzy one; this allows fuzzy concept assertions such as "patient X is an instance of youngPatient at a degree of 0.7."
2. Fuzzy relations, of which there are two main types:
(2.1) Fuzzy object relations, which link concept instances at a certain degree and allow fuzzy role assertions such as "patient X has-Disease Y at a degree of 0.8."
(2.2) Fuzzy data type relations, which either assign a literal value to a concept instance at a certain degree (e.g., patient X has-Residence "Rural" at a degree of 0.4) or assign a fuzzy datatype to a concept instance (e.g., patient X has-Fuzzy-Age young), which includes the age fuzzy predicate.
There are many fuzzy ontology construction methodologies, such as IKARUS-Onto [40], UFOC [65], UPFON [66], and OntoMethodology [67]. Moreover, fuzzy ontology representation languages have been proposed in [63,68]. Fuzzy ontology reasoners include FuzzyDL, FiRE, and DeLorean. Fuzzy reasoners use fuzzy description logics such as fuzzy SROIQ(D), F-ALC, fuzzy SHIN, and fuzzy SHOID(D).
As shown in Table 2, the construction process of our fuzzy case-base ontology, which stores fuzzy cases about diabetic patients, used the IKARUS-Onto methodology, the OWL 2 fuzzy extension [63], the FuzzyDL⁴ reasoner with the fuzzy DL SROIQ(D) [60], and the Protégé tool with the Fuzzy OWL plug-in [69]. The plug-in does not translate fuzzy representations into OWL 2; rather, it eases their representation by allowing specification of the type of fuzzy logic used and the definition of fuzzy data types, fuzzy modified concepts, weighted concepts, weighted sum concepts, fuzzy nominals, fuzzy modifiers, fuzzy modified roles and data types, and fuzzy axioms. Table 2 shows the execution steps of the IKARUS-Onto methodology in our case study.
5.3.2.2.
The
case-base
fuzzy
ontology
implementation.
For the fuzzification of our crisp case-base ontology, we use the Fuzzy OWL2 2.1.1 plug-in⁵ in Protégé 4.1⁶. In the following, we detail fuzzy concepts, relations, and data types. A fuzzy data type $\mathbf{D}$ is a pair $\langle \Delta_{\mathbf{D}}, \Phi_{\mathbf{D}} \rangle$, where $\Delta_{\mathbf{D}}$ is a concrete interpretation domain and $\Phi_{\mathbf{D}}$ is a set of fuzzy concrete predicates $d$ with an arity $n$ and an interpretation $d^{\mathbf{D}} : \Delta_{\mathbf{D}}^{n} \to [0, 1]$, which is an n-ary fuzzy relation over $\Delta_{\mathbf{D}}$ [63]. For fuzzy data types, the functions allowed in Fuzzy OWL 2, defined over an interval $[k_1, k_2] \subseteq \mathbb{Q}$, are
$d \to \{$left($k_1, k_2, a, b$) (Fig. 13c), right($k_1, k_2, a, b$) (Fig. 13d), triangular($k_1, k_2, a, b, c$) (Fig. 13b), trapezoidal($k_1, k_2, a, b, c, d$) (Fig. 13a), linear($k_1, k_2, c$) (Fig. 13e), mod($d$)$\}$.
The formalization of each element in the ontology is conducted as follows:
5.3.2.2.1. Fuzzy data types and fuzzy concrete roles (data properties). For each of the numerical features in our case base, our domain experts have defined the range, the fuzzy membership functions, and the shapes and parameters of those functions. To fuzzify these values, we define two things: (1) a fuzzy data type and (2) a fuzzy concrete role. Because we have 70 features, most of which are numerical, we only give examples here. In cooperation with our domain experts, we used MATLAB to define the fuzzy membership functions and their ranges, shapes, and equations, as shown in Fig. 12. Experience suggests that the overlap of triangle-to-triangle and trapezoid-to-triangle fuzzy regions averages somewhere between 25% and 50% of the fuzzy set base [70]. In our case, our domain expert recommended fixing the normal ranges and overlapping the low and high ranges with the normal range by 50%; see Fig. 12b.
⁴ FuzzyDL Reasoner: http://gaia.isti.cnr.it/straccia/software/fuzzyDL/fuzzyDL.html.
⁵ Fuzzy OWL2 2.1.1 plug-in: http://www.straccia.info/software/FuzzyOWL/.
⁶ Protégé 4.1: http://protege.stanford.edu/.

Table 2 The fuzzy case-base ontology construction process.

Considering HbA1c lab test values, let us assume its range is [71,71] and its linguistic terms are lowA1c (left-shoulder (5.7, 6.05)), normalA1c (triangle (5.7, 6.05, 6.4)), and highA1c (right-shoulder (6.05, 6.4)). First, we create a fuzzy data type for each of these vague terms. As shown in Fig. 14, we used the Protégé plug-in [65] to create a datatype lowA1c and then annotated it as a fuzzy datatype. This action is repeated for every linguistic term of each fuzzy variable in our case-base ontology. Next, for each numerical feature, we define a concrete role for each of its linguistic values. The previously defined fuzzy datatypes are used as the ranges of these roles. Continuing with HbA1c, we define three fuzzy concrete roles: hasLowA1c, hasNormalA1c, and hasHighA1c. For example, hasLowA1c is modeled as hasLowA1c(HbA1c, lowA1c), where HbA1c is a crisp concept and lowA1c is a fuzzy data type.
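As an illustration only, the three HbA1c linguistic terms can be evaluated with piecewise-linear membership functions. The sketch below is written in Python rather than in the MATLAB and Fuzzy OWL 2 tooling used in this work; the function names and the `fuzzify_hba1c` helper are hypothetical, and the only values taken from the text are the shoulder and triangle parameters (5.7, 6.05, 6.4).

```python
def left_shoulder(x, a, b):
    """Membership is 1 up to a, then falls linearly to 0 at b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def right_shoulder(x, a, b):
    """Membership is 0 up to a, then rises linearly to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def triangular(x, a, b, c):
    """Membership rises from a to the peak b, then falls to 0 at c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify_hba1c(value):
    """Return the degree of each HbA1c linguistic term (hypothetical helper)."""
    return {
        "lowA1c": left_shoulder(value, 5.7, 6.05),
        "normalA1c": triangular(value, 5.7, 6.05, 6.4),
        "highA1c": right_shoulder(value, 6.05, 6.4),
    }


if __name__ == "__main__":
    for v in (5.5, 5.875, 6.05, 7.0):
        print(v, fuzzify_hba1c(v))
```

With these parameters the low and high shoulders each reach zero at the peak of the normal triangle, which is one way to read the 50% overlap recommended by the domain expert.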
5.3.2.2.2.
Fuzzy
modifiers,
fuzzy
modified
data
types,
and
fuzzy
modified
roles.
Modifiers can improve the expressiveness of the ontology and of semantic queries. The degree of membership of fuzzy data types may be changed using fuzzy modifiers. A fuzzy modifier is a function $f_{mod} : [0, 1] \to [0, 1]$ that is applied to a fuzzy set to change its membership function; it can be linear(c) (Fig. 13e) or triangular(a, b, c) (Fig. 13b). With the help of a domain expert, we defined modifiers such as very, slightly, and somewhat. For example, we defined the fuzzy modifier very as linear(0.85). In our work, these modifiers have two purposes (Fig. 13):
To modify a data type, such as the new data type veryLowA1c, which is a modified version of lowA1c, as shown next:
<fuzzyOwl2 fuzzyType="datatype">
<Datatype type="modified" modifier="very" base="lowA1c"/>
</fuzzyOwl2>
To modify a fuzzy concrete role, as shown next.

Fig. 12. An example of fuzzy datatypes.
Fig. 13. Membership functions in [16] for fuzzy data type definition (i.e., fuzzy concrete domains) and fuzzy modifier functions: (a) trapezoidal function; (b) triangular function; (c) left-shoulder function; (d) right-shoulder function; and (e) linear.
5.3.2.2.3.
Fuzzy
modified
data
type
properties.
The other type of fuzzy data type properties exhibits degree-vagueness, such as the has-Disease and lived-In attributes shown in Table 2; they are modeled as fuzzy modified roles. For example, the has-Disease role can be modified by the very modifier into a new role very-has-Disease as follows:
<fuzzyOwl2 fuzzyType="role">
<Role type="modified" modifier="very" base="has-Disease"/>
</fuzzyOwl2>
5.3.2.2.4.
Fuzzy
logic
of
the
ontology.
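We have selected Zadeh fuzzy logic for our ontology, where the t-norm is $\alpha \otimes \beta = \min\{\alpha, \beta\}$, the t-conorm is $\alpha \oplus \beta = \max\{\alpha, \beta\}$, the negation is $\ominus \alpha = 1 - \alpha$, and the implication is $\alpha \Rightarrow \beta = \max\{1 - \alpha, \beta\}$. This annotation is made at the ontology level as follows:
<fuzzyOwl2 fuzzyType="ontology">
<FuzzyLogic logic="zadeh"/>
</fuzzyOwl2>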
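The four Zadeh operators are simple enough to sketch directly. The Python functions below are an illustrative sketch with hypothetical names, not part of the Fuzzy OWL 2 plug-in or the fuzzyDL reasoner; they only restate the definitions given for the selected logic.

```python
def t_norm(alpha, beta):
    """Zadeh conjunction: the minimum of the two degrees."""
    return min(alpha, beta)


def t_conorm(alpha, beta):
    """Zadeh disjunction: the maximum of the two degrees."""
    return max(alpha, beta)


def negation(alpha):
    """Zadeh negation: the complement of the degree."""
    return 1.0 - alpha


def implication(alpha, beta):
    """Zadeh (Kleene-Dienes) implication: max(1 - alpha, beta)."""
    return max(1.0 - alpha, beta)


if __name__ == "__main__":
    # Degrees reused from the assertion examples in the text:
    # youngPatient at 0.7 and has-Disease at 0.8.
    young, has_disease = 0.7, 0.8
    print(t_norm(young, has_disease))
    print(t_conorm(young, has_disease))
```

For instance, under this logic the conjunction of "patient X is a youngPatient at degree 0.7" and "patient X has-Disease Y at degree 0.8" holds at degree min(0.7, 0.8) = 0.7.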
The resulting fuzzy ontology structure (TBOX) contains 63 classes, 54 object properties, 138 (fuzzy) datatype properties, and 105 fuzzy datatypes. After creating the fuzzy ontology structure, the next step is to create the ontology instances. The instances of the cases and of their describing features are populated from our fuzzy case-base relational database. We populate the ontology with 60 real-world diabetes diagnosis cases.
5.4.
Fuzzy
case-base
ontology
population
module
Fuzzy ontology population from a fuzzy relational database has been studied [42]. Moreover, there are Protégé plug-ins that automate the process, such as FRDB2FOnto [42], which converts the fuzzy database schema and content into a fuzzy ontology structure and instances. On the other hand, large fuzzy ontologies can be stored in semantics-preserving databases [73]. We selected the first choice to be compatible with the JCOLIBRI2 framework.
Fig. 14. An example of fuzzy data type definition.
Fig. 15. Correspondence between fuzzy relational database and fuzzy ontology.

Inspired by ontology population approaches, we developed our own procedure to fill the resulting case-base fuzzy ontology with cases (i.e., instances) from our previously modeled case-base fuzzy relational database. We show the process on a single fuzzy table, f_Age, which includes the fuzzy components of the feature age. The crisp ER model of the case base in Fig. 5 was previously fuzzified and implemented as a fuzzy relational database. As shown in Fig. 15, the Age feature in table Patient_Case (Fig. 15a) has been fuzzified into the f_Age table (Fig. 15b). According to our resulting fuzzy ontology, we can map between fuzzy concrete properties such as has-Young-Age and attributes of the f_Age relation such as youngAge. Moreover, the has-Age object property connects instances of the classes Case and Age as follows: ClassAssertion(Case C1); ClassAssertion(Age A1); ObjectPropertyAssertion(has-Age C1 A1). The same process shown in Fig. 10 for the whole crisp case-base ontology was performed for the fuzzy ontology. Our rules for mapping database instances to ontology instances are guided by the W3C rules⁷ and the rules of Zhang et al. [74], which we adapted to work with the fuzzy relational database (e.g., table, tuple, attribute, primary key, and foreign key are mapped to concept, instance, data type property, axiom, and object property, respectively). The resulting ontology is now ready, with its TBOX and ABOX.
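The tuple-to-instance mapping described here can be sketched as a small generator of OWL 2 functional-style assertions. This Python sketch is not the authors' population procedure: the function name, the ":" prefix, the xsd:double typing, and the assumption that each membership column becomes one data property assertion are all illustrative choices. Only the Case and Age classes, the has-Age object property, and the has-Young-Age property name come from the text.

```python
def row_to_assertions(case_id, age_id, memberships):
    """Map one fuzzy-table tuple to OWL 2 functional-style assertions.

    memberships maps a fuzzy concrete property name (e.g. "has-Young-Age")
    to the membership degree stored in the corresponding table column.
    """
    axioms = [
        f"ClassAssertion(:Case :{case_id})",
        f"ClassAssertion(:Age :{age_id})",
        f"ObjectPropertyAssertion(:has-Age :{case_id} :{age_id})",
    ]
    for prop, degree in memberships.items():
        axioms.append(
            f'DataPropertyAssertion(:{prop} :{age_id} "{degree}"^^xsd:double)'
        )
    return axioms


if __name__ == "__main__":
    # Hypothetical tuple: case C1 whose age A1 is young to degree 0.2.
    for axiom in row_to_assertions("C1", "A1", {"has-Young-Age": 0.2}):
        print(axiom)
```

Generating plain strings keeps the sketch independent of any ontology API; in practice the assertions would be created through whichever OWL library loads the case-base ontology.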
After populating the ontology with 60 cases, it contains 2640 concept instances. The resulting case base is a collection of case instances in the ontology. Case attributes are represented as fuzzy data properties and fuzzy object properties, as follows:
$CB = \bigcup_{i=1}^{n} CBRCase_i$, where n is the number of cases, and each $CBRCase_i$ is described by the following restrictions:
(Has_ID.string)
(CASE_COMPONENT.CBR_DESCRIPTION)
(has_Solution.CBR_Solution)
(Has_Residence.string)
(Has_Age.Age)
(Has_BMI.BMI)
(Has_Occupation.string)
(Has_Disease.Disease)
(Has_FemaleHistory.FemaleHistory)
(Has_GlobalSymptom.GlobalSymptom)
(Has_HematologicalProfile.HematologicalProfile)
(Has_KidneyFunctionTest.KidneyFunctionTest)
(Has_LabTest.LabTest)
(Has_LipidProfile.LipidProfile)
(Has_LiverFunctionTest.LiverFunctionTest)
(Has_RadiologicalExamination.RadiologicalExamination)
(Has_UrinationSymptom.UrinationSymptom)
Has_Gender.{female, male}
. . .
Moreover, the diagnosis part of a case is a nominal concept of the form DIAGNOSIS = {diabetic, preDiabetic, normal, diabeticGestational, prediabeticGestational}. As a result, in the next module (the case query parser module), the query case is modeled in the same format as CBRCase_i, and in the case retrieval module (Section 5.6), a systematic comparison between the predicates of the cases yields the similarity levels between cases.
5.5.
Case
query
parser
module
For a new patient diagnosis problem, the physician enters the new patient description in the query form; this forms the new case without a solution. We have asserted before that our cases have a homogeneous structure. Implementing a heterogeneous case structure will be discussed in another study. As a result, all of the necessary patient features are known in advance, but the physician may not know the values of all of these features when describing the patient, and entering them may be time-consuming. Ontologies, especially standard medical ontologies, support the integration of a CBR system with an EHR [18]. The query module can search the patient record for the necessary fields. Moreover, we can implement a rule base to link features and infer the missing ones. Next, the query is fuzzified and coded with the same methods used for the case-base ontology to facilitate similarity calculation and mapping. The new problem structure is transformed into the fuzzy case-base ontology vocabulary by some strategy; then, the semantic query is sent to the Case Retrieval Engine to compute the similarity between the query concepts and the concepts of the new semantic-query problem.

⁷ http://www.w3.org/2001/sw/rdb2rdf/wiki/Database-Instance-Only and Database-Instances-and-Schema Mapping.
semantic
query
is
a
DL
conjunctive
query
of
the
logic
form
ˆi(Øi)
˛,
where
Øiis
a
conjunction
of
terms
of
the
form
A(x),R(x,y),
for
atomic
concept
A,
and
atomic
role
R;
x,
y
are
either
individuals
or
variables
names
˛
(0,1],
and
{>,,,<}.
To this end, let us take a semantic query example. After the query case Q is acquired from the physician, it is represented as a vector Q = <attribute_i = value_i>, where i ranges over the features. Our cases are represented with 70 features, so writing semantic queries using all of these features would create a long and complicated query. A very small fragment of these features is
Q = <Age = 38, Residence = "Rural", Fatigue = "++", Gender = "Male", disease = "Malignant tumor involving left ovary by direct extension from endometrium", ...>.
This vector goes through two main preparation steps: fuzzification of numerical data and coding of unstructured data. After the fuzzification process, the vector is
Q = <(young = 0.2, middleAged = 0.8, old = 0, fuzzyLabel = middleAged, Age = 38), Residence = "Rural", Fatigue = "++", Gender = "Male", disease = "Malignant tumor involving left ovary by direct extension from endometrium", ...>.
The query is then encoded using our SNOMED CT domain OWL2 ontology; this step encodes unstructured data into standard codes. The resulting vector is
Q = <(young = 0.2, middleAged = 0.8, old = 0, fuzzyLabel = middleAged, Age = 38), Residence = "Rural", Fatigue = "++", Gender = "Male", disease = "369524001", ...>.
The other ordinal and categorical features remain the same. The vector Q then needs to be transformed into a semantic query.
This query is a conjunction of a set of predicates $P_1 \wedge P_2 \wedge \ldots \wedge P_n$, where each $P_i$ is a predicate of one of four forms: a (fuzzy) concept assertion $a : C_i \bowtie \alpha$; a (fuzzy) object property assertion $(a, b) : R_i \bowtie \alpha$; a (fuzzy) data property assertion $(a, v) : T_i \bowtie \alpha$, for $a$, $b$ abstract individuals and $v$ a literal value; or a (fuzzy) data property assertion $(a, v) : T_i \bowtie \alpha$, for $v$ a fuzzy linguistic term defined using a fuzzy datatype. According to the vocabulary of our fuzzy case-base ontology, the vector Q is transformed into a semantic query containing OWL individuals and property (i.e., data and object) instances of the forms <concept instance, object property, concept instance; α>, <concept instance, data property, fuzzy value; α>, and <concept instance, data property, literal; α>, as shown in Fig. 15c. Fig. 16 shows an example using the OWL2 Functional-Style Syntax⁸ for the previous query vector Q; we concentrate on the representation of fuzzy values and assume that α = 1.

Fig. 16. A part of the semantic representation of fuzzy query case.
Fig. 17. A part of a semantic query over the case-base.
At this point, we have two options. First, we search for an exact match between the query case and a case in the fuzzy ontology. In this case, a SPARQL-DL query can be used to query the case-base ontology and retrieve the diagnosis of the matched case, as shown in Fig. 17. The other, more general option is partial similarity between the query case and the cases in the case base. In this case, we use a set of APIs, including the JCOLIBRI, OntoBridge, Pellet, and fuzzyDL APIs, in a Java project that implements the retrieval algorithm. The algorithm uses the similarity function proposed in the next section to retrieve the k most suitable cases from the fuzzy case-base ontology. It calculates the clinical similarity between the query case and all cases in the case base using the inference capabilities of the Pellet and fuzzyDL reasoners. The solutions of the k most similar cases are selected and returned to the physician to guide the decision process.
5.6.
Case
retrieval
engine
module
We can state that a case is equivalent to another case if both cases have exactly the same structure and attribute values. In crisp ontology-based CBR, the retrieval of cases exploits the structure and content of the ontology to compute the semantic similarity between the attribute values and, consequently, between the cases. There are some ontology-specific similarity functions that utilize ontological knowledge in different ways [72]. None of these measures utilizes imprecise knowledge in any way.

⁸ http://www.w3.org/TR/owl2-syntax/.
Case retrieval can be implemented with a neural network (NN), a rule base (RB), case indexing (CI), or a decision tree (DT). However, it is hard to determine the appropriate structure and parameters of an NN or a DT; in addition, the extraction and choice of rules and indexes depend largely on the experience of the knowledge engineers [75]. In this paper, we propose a case retrieval algorithm that combines the reasoning capabilities of classical ontologies (i.e., semantic similarity) with fuzzy similarity for numerical features in order to create a powerful hybrid reasoning mechanism. We assume that all cases share a single structure (i.e., the same set of attributes). The performance of a similarity measure depends heavily on the type and importance of the features. We used a set of machine learning algorithms to calculate feature weights, as in another study [35]. First, we calculate the local similarity of each feature according to its type [12]; next, we use a global similarity function based on a distance function such as Euclidean or Minkowski. According to feature types, our proposed similarity algorithm has two stages. The first stage uses syntactic features only to retrieve a set of potentially similar cases, and the second uses the remaining semantic features to select the most similar case.
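The two-stage scheme can be summarized as a shortlist-then-rerank procedure. The Python sketch below is an assumption-laden illustration, not the JCOLIBRI2-based implementation: the `retrieve` function, its parameters, and the toy similarity callables are hypothetical, and the two similarity functions are passed in so that any syntactic or semantic measure can be plugged in.

```python
def retrieve(query, cases, sim_syntactic, sim_semantic, m, k):
    """Two-stage retrieval sketch.

    Stage 1 keeps the m stored cases with the highest syntactic similarity.
    Stage 2 re-ranks that shortlist by semantic similarity and returns the
    top k cases.
    """
    shortlist = sorted(
        cases, key=lambda case: sim_syntactic(query, case), reverse=True
    )[:m]
    ranked = sorted(
        shortlist, key=lambda case: sim_semantic(query, case), reverse=True
    )
    return ranked[:k]


if __name__ == "__main__":
    # Toy data and toy similarity functions, for illustration only.
    cases = [
        {"id": 1, "age": 30, "disease": "a"},
        {"id": 2, "age": 40, "disease": "b"},
        {"id": 3, "age": 80, "disease": "b"},
    ]
    query = {"age": 38, "disease": "b"}

    def by_age(q, c):
        return 1.0 - abs(q["age"] - c["age"]) / 100.0

    def by_disease(q, c):
        return 1.0 if q["disease"] == c["disease"] else 0.0

    print(retrieve(query, cases, by_age, by_disease, m=2, k=1))
```

Keeping the shortlist size m separate from k makes the trade-off explicit: a larger m gives the semantic stage more candidates at the cost of more ontology reasoning.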
5.6.1.
Similarity
calculation’s
first
stage
Consider a query case $C_q$, stored cases $C_i$ for $i = 1, \ldots, n$, where $n$ is the number of cases in the case base, and feature weights $w_i$. All instance features have weight $w_i = 0$. The first layer calculates $SIM_{syntactic}(C_q, C_i)$. This global similarity function returns the most similar cases according to the similarity between $C_q$ and $C_i$, using the syntactic similarity of the syntactic features (fuzzy and not fuzzy); see the following equation:
\[
SIM_{syntactic}(C_q, C_i) = \frac{\sum_{j=1}^{n} w_i \times sim(f_{qj}, f_{ij})}{\sum_{j=1}^{n} w_i}
= \frac{\sum_{j=1}^{m} w_i \times sim_L(A_j, Z_j)}{\sum_{j=1}^{n} w_i}
+ \frac{\sum_{j=m+1}^{k} w_i \times sim_O(B_j, Z_j)}{\sum_{j=1}^{n} w_i}
+ \frac{\sum_{j=k+1}^{r} w_i \times sim_F(C_j, Z_j)}{\sum_{j=1}^{n} w_i}
+ \frac{\sum_{j=r+1}^{n} w_i \times sim_N(D_j, Z_j)}{\sum_{j=1}^{n} w_i}
\tag{4}
\]
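A, B, C, and D are the sets of nominal, ordinal, fuzzy, and numerical features of the query case, respectively, and Z contains the corresponding features. Moreover, we can assign an importance to each type of feature by using weights $w_1, w_2, w_3, w_4$, as shown in Eq. (5), where $w_1, w_2, w_3, w_4 \in (0, 1]$ and $w_1 + w_2 + w_3 + w_4 = 1$:
\[
SIM_{syntactic} = w_1 \times \frac{\sum_{j=1}^{m} w_i \times sim_L(A_j, Z_j)}{\sum_{j=1}^{n} w_i}
+ w_2 \times \frac{\sum_{j=m+1}^{k} w_i \times sim_O(B_j, Z_j)}{\sum_{j=1}^{n} w_i}
+ w_3 \times \frac{\sum_{j=k+1}^{r} w_i \times sim_F(C_j, Z_j)}{\sum_{j=1}^{n} w_i}
+ w_4 \times \frac{\sum_{j=r+1}^{n} w_i \times sim_N(D_j, Z_j)}{\sum_{j=1}^{n} w_i}
\tag{5}
\]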
Depending on the type of a feature, the local similarity sim is selected as follows:
If the feature is nominal, exact matching is used, as in the following equation:
\[
sim_L(A_j, Z_j) = \begin{cases} 1 & \text{if } A_j = Z_j \\ 0 & \text{if } A_j \neq Z_j \end{cases}
\tag{6}
\]
If the feature is ordinal, our domain experts proposed a similarity matrix for each ordinal feature, and the similarity $sim_O(B_j, Z_j)$ is calculated based on this matrix. Due to space restrictions, we do not show the matrices.
If the feature is fuzzy, we have two options:
(1) The feature value is numerical. Our proposed fuzzy similarity measure utilizes all of the fuzzy sets of the compared features in calculating similarity. Because the case-base fuzzy ontology stores cases with fuzzified features, the numerical features of the input query are fuzzified using the same fuzzy sets, and the stored and query fuzzy values are compared. The normalized Euclidean distance between the fuzzy sets of a feature is used to calculate similarity, as in the following equation:
\[
Dist_F(C_j, Z_j) = \sqrt{\frac{\sum_{k=1}^{n} (c_{jk} - z_{jk})^2}{n}}
\tag{7}
\]
where $C_j$ is the crisp value of a feature in the query, $Z_j$ is the crisp value of the feature in a stored case, $n$ is the number of fuzzy sets for feature $f$, and $c_{jk}$ and $z_{jk}$ are the $k$th fuzzy values of the feature in the query and stored cases, respectively. The similarity is calculated using the following equation:
\[
sim_F(C_j, Z_j) = 1 - Dist_F(C_j, Z_j)
\tag{8}
\]
After testing this function, we found it insensitive to extreme values, because all membership functions but one are equal to zero there; for example, for ages 60 and 74, $Dist_F(60, 74) = 0$ and $sim_F(60, 74) = 1$. To solve this problem, we calculate the fuzzy similarity as the average of the crisp similarity (Eq. (9)) and the fuzzy one.
(2) The feature value is a vague term. A patient can be described using vague terms for numerical features (e.g., Age = young, BMI = obese, FPG = low). Our case-base fuzzy ontology supports all of these types of similarity. As shown in Fig. 15, the has-Fuzzy-Age data type property stores the linguistic term young for the numerical age = 36. When a patient is described by a linguistic term, the similarity matrices proposed by our domain expert are used (see Table 3). In addition, fuzzy hedges such as "very", "quite", "somewhat", "not", or "extremely" are possible in the query case description. As shown in Section 4.3.1, it is possible to define hedges in the case-base ontology. The stored and entered hedges can be compared using similarity matrices proposed by our domain expert.
If the feature is a simple numerical one, the similarity is calculated using the following equation:
\[
sim_N(D_j, Z_j) = 1 - \frac{|D_j - Z_j|}{Max - Min}
\tag{9}
\]
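The local and global similarity functions of this stage can be put together in a few lines. The Python sketch below is illustrative rather than the system's Java code: all function names are hypothetical, Eq. (7) is read as a root-mean-square difference of membership degrees (an assumption about the garbled original), feature weights are taken per feature, and the value ranges and membership vectors in the example are made up. The linguistic-term matrix reproduces the Table 3 values for the age feature.

```python
import math

# Table 3: expert-defined similarity between linguistic terms of the age feature.
AGE_TERM_SIM = {
    ("young", "young"): 1.0, ("young", "middle-aged"): 0.5, ("young", "old"): 0.1,
    ("middle-aged", "young"): 0.5, ("middle-aged", "middle-aged"): 1.0,
    ("middle-aged", "old"): 0.6,
    ("old", "young"): 0.1, ("old", "middle-aged"): 0.6, ("old", "old"): 1.0,
}


def sim_nominal(a, z):
    """Eq. (6): exact match for nominal features."""
    return 1.0 if a == z else 0.0


def sim_numeric(d, z, lo, hi):
    """Eq. (9): 1 - |d - z| / (Max - Min)."""
    return 1.0 - abs(d - z) / (hi - lo)


def sim_fuzzy(c_memberships, z_memberships):
    """Eqs. (7)-(8), with Eq. (7) assumed to be a root-mean-square difference."""
    n = len(c_memberships)
    squared = sum((c - z) ** 2 for c, z in zip(c_memberships, z_memberships))
    return 1.0 - math.sqrt(squared / n)


def sim_fuzzy_numeric(d, z, c_memberships, z_memberships, lo, hi):
    """Average of the crisp (Eq. 9) and fuzzy (Eq. 8) similarities."""
    return 0.5 * (sim_numeric(d, z, lo, hi) + sim_fuzzy(c_memberships, z_memberships))


def sim_syntactic(local_sims, weights):
    """Eq. (4): weighted mean of the local similarities."""
    return sum(w * s for w, s in zip(weights, local_sims)) / sum(weights)


if __name__ == "__main__":
    # Hypothetical query/stored pair: gender, age (fuzzy numeric), BMI (numeric).
    local = [
        sim_nominal("male", "male"),
        sim_fuzzy_numeric(60, 74, [0.0, 0.0, 1.0], [0.0, 0.0, 1.0], lo=0, hi=100),
        sim_numeric(25.0, 30.0, lo=15.0, hi=45.0),
    ]
    print(sim_syntactic(local, weights=[1.0, 2.0, 1.0]))
    print(AGE_TERM_SIM[("young", "old")])
```

In the example, ages 60 and 74 have identical membership vectors, so the purely fuzzy similarity is 1; averaging with the crisp similarity is what lets the 14-year gap lower the score.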
5.6.2.
Similarity
calculation’s
second
stage
The similarity of medical concepts can be computed non-semantically (lexically) or semantically using standard ontologies such as SCT [16,64]. We selected the second choice to measure the similarity in meaning between concepts. The m cases retrieved by the first layer (i.e., by $SIM_{syntactic}(C_q, C_i)$) enter another evaluation based on the semantic similarity between the instance features. Lexical or exact similarity cannot be used to compare ontology concepts. In this stage, all syntactic features have $w_i = 0$. The semantic similarity function utilizes our proposed SCT domain ontology to calculate the semantic similarity between the compared SCT concepts [18]. Instance features have the data type = I in Table 1. Unlike relatedness, semantic similarity measures how similar the meanings of concepts are based on the IS-A relationship only [55]. These measures include edge-based, node-based (i.e., information content and feature-based), and hybrid measures. Garla and Brandt [27] have provided a recent survey of existing measures. Most of these measures are suitable for WordNet nouns only. We do not utilize information content (IC), whether corpus-based or intrinsic, because none of its calculation methods is applicable to SCT, its calculation is time consuming [55], and it is inaccurate due to shallow annotations [27]. The most popular methods for intrinsic IC are those of Seco et al. [76], using Eq. (10), and Sánchez et al. [77], using Eq. (11):
\[
IC_{Seco}(u) = 1 - \frac{\log |D(u)|}{\log |C|}
\tag{10}
\]
\[
IC_{Sanchez}(u) = -\log \left( \frac{\frac{Leaves(u)}{|A(u)|} + 1}{Max\_leaves + 1} \right)
\tag{11}
\]
with $D(u) = \{v \mid v \preceq u\}$, $C$ the set of all concepts in the ontology, $Leaves(u)$ the number of leaves subsumed by the concept $u$, and $Max\_leaves$ the number of terminal concepts of the ontology.
Table 3 The fuzzy sets similarity matrix for age feature.
Query \ Stored case    Young    Middle-aged    Old
Young                  1        0.5            0.1
Middle-aged            0.5      1              0.6
Old                    0.1      0.6            1

Fig. 18. The structure of the system implementation.

We propose a new hybrid measure based on path length and concept features. First, for path length, our similarity is based on the depth of the Least Common Ancestor (LCA) of the two concepts and the closeness of the concepts to their LCA in our SCT sub-ontology, considering the IS-A relationship only. In other words, (1) the deeper the LCA, the more specific it is considered and, thus, the more similar the compared concepts are assumed to be; (2) the closer the two concepts are to their LCA, the more similar they are. Second, to quantify similarity for concept features, the commonalities and differences between concepts must be considered [55]. The JCOLIBRI API⁹ provides four semantic similarity measures: path-based measures such as fdeep_basic and fdeep, and feature-based measures such as cosine and detail [52]. These measures have been tested in the API's tutorial; however, the path-based measures do not take into account the depth of concepts from their LCA, and the feature-based measures depend only on the commonalities between compared concepts. Our proposed measure overcomes these limitations and integrates the path-based and feature-based approaches. The proposed composite similarity measure uses the following equation:
\[
SIM_{Semantic}(u, v) = w_1 \, sim_{path}(u, v) + w_2 \, sim_{feature}(u, v)
\tag{12}
\]
where $w_1, w_2 \in (0, 1]$ are weights with $w_1 + w_2 = 1$, and $sim_{path}(u, v)$ (Eq. (13)) is an adapted version of the Wu and Palmer measure [69] (Eq. (14)), adapted because $sim_{wu\&palmer}(u, v) < 1$, which violates the identity of indiscernibles (IOI) property [55].
\[
sim_{path}(u, v) = \begin{cases} 1 & \text{if } u = v \\ sim_{wu\&palmer}(u, v) & \text{otherwise} \end{cases}
\tag{13}
\]
\[
sim_{wu\&palmer}(u, v) = \frac{2 \times depth(lca(u, v))}{shortest\_path(u, lca(u, v)) + shortest\_path(v, lca(u, v)) + 2 \times depth(lca(u, v))}
\tag{14}
\]
In addition, $sim_{Feature}(u, v)$ is based on Batet et al. [26], Eqs. (15) and (16):
\[
sim_{Feature}(u, v) = 1 - Dist_{Batet}(u, v)
\tag{15}
\]
\[
Dist_{Batet}(u, v) = \log_2 \left( 1 + \frac{|A(u) \setminus A(v)| + |A(v) \setminus A(u)|}{|A(u) \setminus A(v)| + |A(v) \setminus A(u)| + |A(u) \cap A(v)|} \right)
\tag{16}
\]
where $A(u)$ is the set of ancestors of $u$, i.e., $A(u) = \{v \mid u \preceq v\}$, $A(u) \setminus A(v)$ is the specificity of $u$, and $A(u) \cap A(v)$ is the commonality between $u$ and $v$.
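Eqs. (12) to (16) can be traced on a toy IS-A hierarchy. The Python sketch below is illustrative only and makes several assumptions that are not in the text: the taxonomy is a single-parent tree (SNOMED CT allows multiple parents), the root has depth 1, the "disease" root and "liver disease" sibling are invented for the example, and the default weights are 0.5 each. The megacalycosis, caliectasis, and kidney disease chain follows the hierarchy mentioned in the next paragraph.

```python
import math

# Toy IS-A tree (child -> parent); "disease" is a hypothetical root.
PARENT = {
    "kidney disease": "disease",
    "caliectasis": "kidney disease",
    "megacalycosis": "caliectasis",
    "liver disease": "disease",
}


def ancestors(u):
    """Return u followed by its IS-A ancestors, nearest first: A(u)."""
    chain = [u]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def depth(u):
    """Number of nodes on the path from the root to u (root has depth 1)."""
    return len(ancestors(u))


def lca(u, v):
    """Least common ancestor: the deepest concept subsuming both u and v."""
    ancestors_of_v = set(ancestors(v))
    for candidate in ancestors(u):
        if candidate in ancestors_of_v:
            return candidate
    raise ValueError("concepts share no ancestor")


def sim_path(u, v):
    """Eqs. (13)-(14): Wu and Palmer similarity, forced to 1 when u == v."""
    if u == v:
        return 1.0
    d = depth(lca(u, v))
    path_u = depth(u) - d
    path_v = depth(v) - d
    return 2 * d / (path_u + path_v + 2 * d)


def sim_feature(u, v):
    """Eqs. (15)-(16): 1 minus the Batet et al. ancestor-set distance."""
    a_u, a_v = set(ancestors(u)), set(ancestors(v))
    different = len(a_u - a_v) + len(a_v - a_u)
    common = len(a_u & a_v)
    return 1.0 - math.log2(1 + different / (different + common))


def sim_semantic(u, v, w1=0.5, w2=0.5):
    """Eq. (12): weighted combination of path and feature similarity."""
    return w1 * sim_path(u, v) + w2 * sim_feature(u, v)


if __name__ == "__main__":
    print(sim_semantic("caliectasis", "megacalycosis"))
    print(sim_semantic("kidney disease", "liver disease"))
```

On this toy tree, the parent and child pair scores higher than the two siblings directly under the root, which matches the stated intuition that a deeper LCA and shorter paths to it mean higher similarity.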
We aimed to calculate the clinical similarity between two concepts rather than the semantic distance. Clinical similarity is influenced by the clinical granularity of concepts. For example, if we consider the hierarchy megacalycosis is-a caliectasis is-a kidney-disease from SCT, semantic similarity(kidney disease, kidney disease) = 1, but clinical similarity(kidney disease, kidney disease) < 1, because kidney disease is a more general and abstract concept that covers other diseases as well. Moreover, clinical similarity(caliectasis, caliectasis) > clinical similarity(kidney disease, kidney disease). The main rule is that the deeper the concept, the more specific it is. Finally, the solution of the most similar case is suggested for the new problem.

⁹ http://gaia.fdi.ucm.es/research/colibri/jcolibri.
6.
Implementation
and
evaluation
6.1.
System
implementation
A CBR system was developed in Java by extending the APIs of the JCOLIBRI2 CBR framework [52]. As shown in Fig. 18, the proposed customization has three layers, and each layer has specific tasks. Due to space restrictions, we do not discuss this framework in detail. The persistence layer prepares the fuzzy case-base ontology. The CBR application layer is the core of the framework, as it contains the whole CBR cycle. The interface layer accepts a query from the physician and returns the most similar case. We have implemented the case representation and retrieval steps only; case adaptation and retention are out of scope. Due to space restrictions, we select only seven of our 70 features to implement our system. These features are representative of the data set because they include fuzzy features such as Age, HbA1c, and BMI; instance features such as lipid disease, liver disease, and nephropathy; and nominal features such as gender. The fuzzification of numerical features was done in MATLAB; due to space limitations, we do not discuss this process. Moreover, instance features have been encoded using our SCT reference set. Fig. 19 shows the query screen used to collect patient attributes. For instance features, the user selects an instance from the displayed ontology. Fig. 20 shows the similarity configuration window; it allows the dynamic selection of similarity functions and weights for each feature and the selection of the number of cases to retrieve (i.e., k). We have implemented all of the similarity functions proposed in Section 5.6, including the fuzzy and semantic ones. A spinner lets the user choose from a range of values to control the number of retrieved cases, and a slider is used to set the weight. Fig. 21 shows the retrieved cases with their levels of similarity.
6.2.
Evaluation
of
the
proposed
CBR
system
Each component of the proposed system was evaluated upon completion. These evaluations provided a proof of concept, illuminated system strengths and weaknesses, and guided system development. The proposed framework is the first to integrate the capabilities of standard medical ontologies (i.e., SCT), fuzzy logic, ontology, and CBR in one hybrid system. This combination brings powerful benefits to CBR functionality. In the next sub-sections, we evaluate the proposed fuzzy case-base ontology (Section 6.2.1). Moreover, we evaluate the proposed semantic retrieval functions on a small fragment of the SCT medical ontology (Section 6.2.2). In addition, the overall performance of the system is evaluated using the case-base ontology, the overall retrieval algorithm, and the domain standard ontology (Section 6.2.3).

Fig. 19. A new case description.
6.2.1.
Case-base
fuzzy
ontology
evaluation
6.2.1.1.
Our
fuzzy
ontology
evaluation
includes
three
dimensions.
First,
the
ontology
consistency
has
been
checked
using
a
set
of
Fig.
20.
Similarity
measures
setting.
S.
El-Sappagh
et
al.
/
Artificial
Intelligence
in
Medicine
65
(2015)
179–208
199
Fig.
21.
Retrieved
similar
cases.
reasoners
including
HermiT
1.3.8,
Fact++,
and
Pellet
2.3.0.
More-
over,
its
fuzziness
consistency
has
been
checked
by
fuzzyDL
1.1.
Pitfalls
found
in
the
ontology
modeling
process
were
detected
using
the
OOPS!
Pitfall
Scanner
[78]
and
carefully
corrected.
Second,
checking
correctness,
accuracy,
and
completeness
is
typically
manual
[48].
Our
domain
experts
have
validated
the
correctness,
accuracy,
and
completeness
of
the
built
case-base
fuzzy
ontology.
Regarding
the
correctness,
our
two
domain
experts
reviewed
each
fuzzy
element
(i.e.
fuzzy
datatype,
fuzzy
object
property,
and
fuzzy
data
property)
and
asserted
that
these
elements
convey
a
meaning,
which
is
indeed
vague
in
diabetes
diagnosis
domain.
Regarding
the
accuracy,
our
domain
experts
have
unified
the
fuzzification
process.
They asserted that, for each fuzzy variable (such as the HbA1c lab test), the normal range is modeled using a triangular fuzzy set, and the other fuzzy values, such as low and high, are modeled using left and right shoulder functions; these shoulder functions overlap the normal range by 50%.
All
these
aspects
have
been
reviewed
by
the
domain
experts,
and
they
asserted
that
the
vagueness
has
been
modeled
in
an
intuitively
accurate
way.
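The fuzzification scheme described above (a triangular set for the normal range, with left and right shoulder functions for low and high that overlap it by 50%) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the breakpoints 4.0, 5.0 and 6.0 are hypothetical values chosen only to show the shapes of the membership functions, not clinical thresholds.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def left_shoulder(x, a, b):
    """Left shoulder: 1 up to a, falling linearly to 0 at b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def right_shoulder(x, a, b):
    """Right shoulder: 0 up to a, rising linearly to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


# Hypothetical breakpoints for an HbA1c-like variable (illustration only).
LOW_END, PEAK, HIGH_START = 4.0, 5.0, 6.0


def fuzzify(x):
    """Return membership degrees in the low / normal / high fuzzy sets."""
    return {
        "low": left_shoulder(x, LOW_END, PEAK),
        "normal": triangular(x, LOW_END, PEAK, HIGH_START),
        "high": right_shoulder(x, PEAK, HIGH_START),
    }
```

With these assumed breakpoints, each shoulder covers half of the support of the normal set, so a value midway between LOW_END and PEAK belongs equally to low and normal.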
Regarding
the
completeness,
this
fuzzy
ontology
is
an
extension
of
our
crisp
ontology
[31].
First,
the
crisp
ontology
is
complete
because
it
has
been
tested
using
a
set
of
competency
questions
and
using
all
medical
concepts
of
diabetes
diagnosis
domain.
In
other
words,
our
domain
experts
have
collected
328
medical
terms
from
some
diabetes
diagnosis
CPGs
such
as
Canadian
Diabetes
Guideline,
and
they
have
tested
the
coverage
of
the
ontology
for
all
of
these
terms.
The
ontology
has
100%
concept
coverage
for
all
medical
concepts
required
to
describe
diabetic
patient
cases.
Second,
domain
experts
have
checked
the
completeness
of
vagueness.
We
used
a
set
of
SPARQL
and
Protégé
DL
queries
to
verify
the
ability
of
the
ontology
to
answer
any
fuzzy
queries
defined
by
domain
experts;
experts
have
verified
that
all
the
vagueness
needed
for
diabetes
domain
has
been
represented
in
the
ontology.
Third,
we
have
evaluated
our
ontology
using
criteria-based
and
data-driven
approaches.
Brewster
et
al.
[79]
argued
that
precision
and
recall
are
not
appropriate
for
ontology
evaluation
because
they
depend
on
a
comparison
between
concepts
of
evaluated
ontology
and
a
standard
one.
There
are
no
standard
ontology
evaluation
mechanisms
[80].
To
measure
the
quality
of
our
ontology,
we
can
use
criteria-based
or
data-driven
evaluation
mechanisms.
Regarding
criteria-based
evaluation
mechanisms,
we
need
to
compare
it
with
other
ontologies
in
the
same
domain.
There
are
no
other
(fuzzy)
case-base
ontologies
in
the
medical
domain
against which to compare our ontology.
Alexopoulos
et
al.
[21]
have
proposed
a
fuzzy
case-base
ontology
for
electricity
market
CBR
system.
On
the
other
hand,
there
are
some
crisp
case-base
ontologies
such
as
ArgCBROnto
[81]
for
argumentation,
[82]
for
mould
design,
and
[83]
for
resource
management.
There
are
many
proposed
criteria
to
quantify
the
quality
of
ontologies
[84].
Some
of
these
criteria
such
as
consistency
can
be
successfully
determined
using
semantic
reasoners.
Other criteria, such as clarity, are difficult to evaluate because there are no means in place to measure them.
Most
of
the
proposed
criteria
overlap.
From
the
set
of
criteria
proposed
in
the
literature,
we
depend
on
criteria
proposed
by
Djedidi
et
al.
[84],
where
each
criterion
can
be
measured
by
metrics.
These
criteria
include:
1.
Complexity
criterion,
which
assesses
structural
and
semantic
links
between
ontology
entities
and
the
navigability
in
ontology
structure,
2.
Cohesion
criterion,
which
takes
into
account
the
connected
ontology
components
(i.e.
classes),
3.
Conceptualization
criterion,
which
corresponds
to
design
richness
of
the
ontology
content,
4.
Abstraction
criterion,
which
indicates
class
abstraction
level
(generalization/specialization)
by
measuring
the
depth
of
subsumption
hierarchies,
Table 4
Ontology evaluation quality metrics.
Criteria | Metrics | The proposed ontology | ArgCBROnto
Complexity | The average number of paths to reach a class from the root | 5 | 3
Complexity | Average number of semantic relations (object properties) per class | 1.4 | 1.3
Abstraction | Average depth of the ontology | 2 | 2
Cohesion | Average number of connected components (classes) | 63 | 27
Conceptualization | Semantic richness: ratio of the total number of semantic relations assigned to classes, divided by the total number of ontology relations (object properties and subsumption relations) | 58/(58 + 59) = 0.495 | 38/(38 + 23) = 0.62
Conceptualization | Attribute richness: ratio of the ontology’s total number of attributes (i.e., the data properties), divided by the total number of classes | 138/62 = 2.26 | 24/26 = 0.92
Conceptualization | Inheritance richness: average number of subclasses per class | 5.0 | 2.875
Comprehension | Documentation of the properties | 2.04% | 0.0%
Comprehension | Documentation of the classes | 88.71% | 0.0%
Table 5
Ontology evaluation measures for three ontologies (knowledge coverage measures).
The measure | Number of classes | Number of properties | Number of axioms | Maximum number of parents | Documentation of the classes | Documentation of the properties | Properties with domain | Properties with range | Number of individuals
The proposed ontology | 62 | 196 | 1316 | 31 | 88.71% | 2.04% | 98.47% | 98.98% | 2640
ArgCBROnto [81] | 26 | 62 | 446 | 13 | 0% | 0% | 85.48% | 77.41% | 0
Alexopoulos et al. [21] | 10 | 18 | N/A | N/A | N/A | N/A | N/A | N/A | N/A
[Figure: SCT sub-graph; node labels include Kidney disease, Glomerular disease, Glomerulonephritis, Glomerulosclerosis, Acute, Chronic, Acute proliferative, Idiopathic crescentic, Type I, Type II, Chronic focal, Membranous, Stage I, Stage II, Stage III, Focal segmental, Hyperfiltration, Classical, Autosomal recessive, Structural and functional abnormalities, Cystic disease of kidney, Congenital, Medullary sponge kidney, With nephrocalcinosis, Without nephrocalcinosis, Nephrocalcinosis, Macroscopic, Microcystic, Sclerosing, Neonatal, Cortical, Cortical cystic, Medullary, Lipomatosis renis, Uremia, Uremic acidosis, Uremic neuropathy.]
Fig. 22. A sub-graph of SCT for kidney disease.
5.
Completeness
criterion,
which
evaluates
if
the
ontology
covers
domain
relevant
properties;
this
criterion
has
been
evaluated
previously
by
the
ontology
concept
coverage,
6.
Comprehension
criterion,
which
assesses
the
ease of understanding the ontology.
ArgCBROnto
is
the
most
complete
ontology,
and
the
other
studies
have
no
OWL2
ontologies.
Table
4
presents
a
comparison
between
ArgCBROnto
and
our
ontology
regarding
these
metrics.
For
calculating
metrics,
we
have
used
the
equations
proposed
by
Zhang
et
al.
[85].
The
ontology
parameters
used
for
metrics
are
calculated
using
an evaluation plugin for the Protégé 4.3 ontology editor (i.e., Ontology Evaluation [footnote 10]).
Protégé
has
other
automatic
evaluation
plugins
such
as
OntoClean
and
AEON
(Automatic
Evaluation
of
ONtologies) [footnote 11].
Regarding data-driven evaluation mechanisms, Fernández et al. [86] have proposed another set of measures.
They
measure
the
structure
of
the
ontology
including
number
of
classes,
properties,
axioms,
and
individuals.
The Protégé Ontology Evaluation plugin calculates these metrics and others, including naming conventions, class hierarchy, object properties hierarchy, datatype properties hierarchy, documentation, properties domain and range, disjointness restrictions, and lexically similar concepts and properties. The application of these measures is summarized in Table 5, where our ontology outperforms the compared ontologies.
10 http://protegewiki.stanford.edu/wiki/Ontology Evaluation.
11 http://code.google.com/p/aeon-project/.
6.2.2.
Evaluation
of
the
proposed
semantic
retrieval
algorithm
The
proposed
retrieval
algorithm
supports
five
types
of
features
including
numerical,
nominal,
fuzzy,
ordinal,
and
semantic.
The
last
type
(i.e.
semantic
type)
measures
the
clinical
distance
between
the
compared
SCT
standard
medical
concepts.
In
this
section,
the
proposed
semantic
similarity
algorithm
is
evaluated
by
comparing
it
with
the
most
popular
semantic
similarity
algorithms
in
CBR
(i.e.
with
JCOLIBRI2
[52]).
This is done through experiments on a sub-ontology for kidney diseases taken from our SCT ontology (Fig. 22), assuming that w1 and w2 are 0.5 in Eq. (9).
We
argue
that
there
is
a
difference
between
the
lexical,
semantic,
and
clinical
similarity.
Lexical
similarity
depends
on
the
level
of
textual
similarity
between
the
two
concepts.
Therefore, the lexical similarity SimLexical(Chronic focal, Membranous) is equal to 0, which is not accurate because 197618004|chronic focal glomerulonephritis and 77182004|membranous glomerulonephritis are both kinds of 20917003|chronic glomerulonephritis.
The
semantic
similarity
adds
some
intelligence
to
this
process.
If
we
compare
two
patients
P1
and
P2
with
diseases
D1
=
“kidney
disease”
and
D2
=
“renal
disorder”,
then
the
semantic
similarity
SimSemantic (D1,
D2)
=
1.
Table 6
The comparison between JCOLIBRI semantic similarity methods and our proposed one.
Case no. | Similarity | Fdeep basic | Fdeep | Cosine | Detail | Proposed method
1 | Sim(Type I, Type I) | 7/7 | 7/7 | 7/7 | 13/14 | 1
2 | Sim(“lipomatosis renis”, uremia) | 0 | 0 | 1 | 1/2 | 0.11
3 | Sim(Cortical, Classical) | 0 | 0 | 1/5 | 1/2 | 0.04
4 | Sim(Chronic, Stage II) | 4/7 | 4/6 | 2/6 | 8 | 0.66
5 | Sim(Glomerulosclerosis, Type I) | 2/7 | 2/7 | 2/21 | 4 | 0.47
6 | Sim(Glomerulosclerosis, acute) | 2/7 | 1/2 | 2/12 | 4 | 0.383
7 | Sim(Acute, chronic) | 3/7 | 3/4 | 3/4 | 5/6 | 0.889
8 | Sim(Type I, chronic) | 3/7 | 3/7 | 3/21 | 6 | 0.423
As another example, if D1 = “autosomal dominant focal segmental glomerulosclerosis” and D2 = “hyperfiltration focal segmental glomerulosclerosis”, then SimSemantic(D1, D2) = 1.
Semantic
similarity
depends
on
the
ontology
structure
to
infer
the
level
of
similarity
between
two
concepts.
However,
the
two
patients
in
the
second
example
are
more
similar
than those in the first example.
This is because, although both patients with “kidney disease” have the same concept from a semantic perspective, and therefore a semantic distance of zero, when these concepts are applied to a patient case, “kidney disease” could mean many different disease entities, including “Medullary sponge kidney”, “medullary cystic disease OS”, “caliectasis”, “amyloid nephropathy”, “hypertensive renal disease”, and others.
On
the
other
hand,
in
the
second
example,
444977005|autosomal
dominant
focal
segmental
glomerulosclerosis
and
236405006|hyperfiltration
focal
segmental
glomerulosclerosis
refer
to
a
specific
disease
entity.
We
propose
to
handle
this
issue
by
using
the
clinical
similarity
measurement.
In
clinical
similarity,
the
SimClinical (“kidney
disease”,
“kidney
disease”)
<
SimClinical (“autosomal
dominant
focal
segmental
glomerulosclerosis”,
“hyperfiltration
focal
segmental
glomerulosclerosis”).
As
a
result,
the
three
similarities
are
not
equal
regarding
accuracy,
i.e. SimLexical ≠ SimClinical ≠ SimSemantic.
Our
proposed
similarity
measure
takes
into
account
the
level
of
specificity
of
a
concept
that
subsumes
the
two
compared
concepts
and
the
level
of
commonality
between
the
compared
concepts.
As
a
result,
as
shown
in
Table
6,
the
similarity
Sim
(Type
I,
Type
I)
=
1
because
Type
I
and
Type
I
are
very
specific
in
the
ontology.
The
similarity
Sim
(Acute,
Chronic)
=
0.889
because
these
concepts
are
not
specific;
they
contain
many
sub-concepts.
Our
algorithm
is
very
sensitive
to
the
level
of
similarity
between
the
compared
concepts.
As
we
can
see
in
Table
6,
Fdeep
basic
and
Fdeep
do
not
take
in
to
account
the
depth
of
concepts
from
their
LCA
(i.e.,
the
closeness
between
concepts)
as
in
cases
7,
8.
Moreover,
Cosine
and
Detail
do
not
account
for
the
differences
between
concepts
such
as
cases
5,
6.
What is more, there are scattered inconsistencies, such as Detail(Type I, Type I) ≠ 1 and Cosine(“lipomatosis renis”, Uremia) = 1.
On the other hand, the proposed similarity measure provides logically consistent results for all types of problems because it takes into account the depth of the compared concepts from their LCA, as well as the differences and commonalities between the compared concepts.
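Eqs. (17)–(19) are the implementation of the semantic equations in the JCOLIBRI OntoBridge API environment, where \setminus is the set difference and \cap is the set intersection.
\[
\mathrm{Sim}_{WU+Palmer}(u,v) = \frac{2 \times \mathrm{maxProfLCS}(u,v)}{(\mathrm{profConcept}(u) - \mathrm{maxProfLCS}(u,v)) + (\mathrm{profConcept}(v) - \mathrm{maxProfLCS}(u,v)) + (2 \times \mathrm{maxProfLCS}(u,v))} \qquad (17)
\]
\[
\mathrm{Sim}_{feature}(u,v) = 1 - \frac{\mathrm{Math.log}(1 + x)}{\mathrm{Math.log}(2)} \qquad (18)
\]
\[
x = \frac{(\mathrm{super}(u,CN) \setminus \mathrm{super}(v,CN)) + (\mathrm{super}(v,CN) \setminus \mathrm{super}(u,CN))}{(\mathrm{super}(u,CN) \setminus \mathrm{super}(v,CN)) + (\mathrm{super}(v,CN) \setminus \mathrm{super}(u,CN)) + (\mathrm{super}(u,CN) \cap \mathrm{super}(v,CN)) + (\mathrm{super}(v,CN) \cap \mathrm{super}(u,CN))} \qquad (19)
\]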
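To make the two measures concrete, the following minimal sketch evaluates a path-based (Wu and Palmer style) similarity and a feature-based similarity on a toy taxonomy. It is an illustration under stated assumptions rather than the JCOLIBRI OntoBridge code: the taxonomy is a hypothetical fragment and not the actual SCT sub-ontology, depth is counted in nodes from the root, super(·, CN) is taken to be the set containing a concept and all of its ancestors, and the last two denominator terms of Eq. (19) are assumed to be intersections.

```python
import math

# Hypothetical toy taxonomy (child -> parent); not the actual SCT fragment.
PARENT = {
    "kidney disease": None,
    "glomerular disease": "kidney disease",
    "glomerulonephritis": "glomerular disease",
    "acute": "glomerulonephritis",
    "chronic": "glomerulonephritis",
    "cystic disease of kidney": "kidney disease",
}


def supers(concept):
    """Return the concept together with all of its ancestors."""
    found = set()
    while concept is not None:
        found.add(concept)
        concept = PARENT[concept]
    return found


def depth(concept):
    """Depth counted in nodes from the root (the root has depth 1)."""
    return len(supers(concept))


def wu_palmer(u, v):
    """Path-based similarity following the shape of Eq. (17)."""
    lcs_depth = max(depth(c) for c in supers(u) & supers(v))
    return (2 * lcs_depth) / (
        (depth(u) - lcs_depth) + (depth(v) - lcs_depth) + 2 * lcs_depth
    )


def feature_sim(u, v):
    """Feature-based similarity following the shape of Eqs. (18)-(19)."""
    a, b = supers(u), supers(v)
    diff = len(a - b) + len(b - a)
    # Assumption: the last two denominator terms are intersections.
    common = len(a & b) + len(b & a)
    x = diff / (diff + common)
    return 1 - math.log(1 + x) / math.log(2)
```

In this toy hierarchy, two sibling leaves such as "acute" and "chronic" share a deep common subsumer, so both measures return high but not maximal values, while comparing a concept with itself returns 1 for both.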
6.2.3.
Performance
of
the
proposed
system
Because domain experts' knowledge is known to be the most relevant for evaluating CDSS performance, one measure of the performance of our system is the extent to which its decisions match the domain experts' decisions [87].
After
system
development,
our
domain
experts
have
conducted
realistic
experiments
to
test
the
accuracy,
correctness,
flexibility,
applicability,
and
ease
of
use
of
the
proposed
diabetes
diagnosis
CDSS
framework.
The testing environment was the Mansura University Hospitals, and the experts reported the results.
The
results
show
that
our
implemented
CDSS
is
a
realistic
model
of
the
real
world
of
diabetes
diagnosis.
Patient
symptoms
and
tests
are
collected
in
real-time
using
crisp,
fuzzy,
text,
and
semantic
values.
If
the
crisp
value
of
an
attribute
is
not
available,
the
domain
expert
can
set
a
descriptive
vague
value
according
to
the
patient’s
description
of
these
conditions.
The patient's current diseases (e.g., kidney, liver, cancer, etc.) are selected from the SCT ontology form.
SCT
provides
the
most
comprehensive
and
standard
interface
for
selecting
concepts
that
describe
patient
diseases.
These
values
form
a
new
query
case,
and
this
case
is
formatted
in
the
form
of
semantic
query
according
to
the
case
base
fuzzy
ontology.
The
CBR
retrieval
engine
retrieves
the
most
similar
k
cases.
The
value
of
k
is
selected
by
the
domain
expert.
The
system’s
produced
decisions
are
compared
with
the
experts’
diagnoses
of
the
case.
After running the system many times, the domain experts rated it at 100% for flexibility, adequacy, and ease of use.
Figs.
14
and
15
illustrate
the
screen
shots
of
our
prototype
application
in
a
testing
scenario.
We
have
applied
this
study
on
a
case-base
containing
60
cases
from
Mansura
University
Hospitals.
Our
method
shows
promising
results.
These
results
can
be
considered
as
a
first
step
for
real
world
testing
of
our
proposed
system.
We evaluated our system
using
a
set
of
measures.
First,
we
used
the
leave-one-in
evaluation
technique
to
check
the
accuracy
of
our
system
to
retrieve
existing
cases.
Our
system
was
100%
accurate
when
retrieving
existing
cases.
Second,
we
used
the
leave-one-out
technique
to
measure
the
performance
for
non-existing
cases.
Namely, cases are taken out of the case-base one by one, and the similarity of each removed case to all the remaining cases in the case-base is computed.
It
is
a
particular
case
of
cross-validation.
It
has
been
used
to
evaluate
many
CBR
systems
including
radiotherapy
planning
system
[88],
diabetes
management
system
[89],
and
Fuzzy
CBR
systems
[90].
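The leave-one-out procedure described above can be sketched as follows. This is a minimal illustration, not the evaluated system: the case structure, the single numeric feature, and the toy similarity function are hypothetical, and the decision for a held-out case is assumed to be the diagnosis of its most similar retrieved case.

```python
def retrieve(query, case_base, similarity, k=3):
    """Return the k cases most similar to the query, best first."""
    ranked = sorted(
        case_base,
        key=lambda case: similarity(query, case["features"]),
        reverse=True,
    )
    return ranked[:k]


def leave_one_out_accuracy(cases, similarity, k=3):
    """Hold each case out in turn and retrieve from the remaining cases."""
    hits = 0
    for i, held_out in enumerate(cases):
        rest = cases[:i] + cases[i + 1:]
        best = retrieve(held_out["features"], rest, similarity, k)[0]
        hits += best["diagnosis"] == held_out["diagnosis"]
    return hits / len(cases)


# Hypothetical toy case base with one numeric feature per case.
CASES = [
    {"features": 90, "diagnosis": "Normal"},
    {"features": 95, "diagnosis": "Normal"},
    {"features": 110, "diagnosis": "Pre-diabetic"},
    {"features": 115, "diagnosis": "Pre-diabetic"},
    {"features": 150, "diagnosis": "Diabetic"},
    {"features": 160, "diagnosis": "Diabetic"},
]


def toy_similarity(a, b):
    """Illustrative similarity in (0, 1]; closer values score higher."""
    return 1 / (1 + abs(a - b))
```

Calling leave_one_out_accuracy(CASES, toy_similarity) returns the fraction of held-out cases whose nearest remaining case carries the same diagnosis.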
The
domain
experts
evaluate
the
performance
of
the
implemented
framework
by
organizing
a
set
of
43
experiments.
The
test
cases
are
selected
in
a
manner
that
allowed
them
to
span
the
majority
of
topics
and
content
represented
in
the
case
base.
Each test query is fed into the system, and the corresponding response is recorded.
Table 7
The system performance evaluation (K = 3).
Test case | FKI-CBR decision: retrieved cases decision (confidence %) | Expert decision
Case 1 | Diabetic (90.1); Diabetic (88.2); Diabetic (88) | Diabetic
Case 2 | Pre-diabetic (92); Diabetic (91); Pre-diabetic gestational (90) | Diabetic
Case 3 | Normal (98.1); Normal (97.7); Normal (95.7) | Normal
Case 4 | Normal (98.2); Normal (97); Normal (94.1) | Normal
Case 5 | Pre-diabetic (99.4); Diabetic gestational (93); Diabetic (91) | Pre-diabetic
Case 6 | Pre-diabetic (99); Diabetic gestational (92); Diabetic (91) | Pre-diabetic
Case 7 | Diabetic gestational (100); Diabetic (92); Pre-diabetic (91) | Diabetic gestational
Case 8 | Diabetic (97); Diabetic (96); Diabetic (93) | Diabetic
Case 9 | Diabetic (95); Diabetic (94); Diabetic (92) | Diabetic
Case 10 | Diabetic (94); Diabetic gestational (90); Pre-diabetic (89) | Diabetic
Case 11 | Diabetic (97); Diabetic (95); Diabetic (94) | Diabetic
Case 12 | Diabetic (98); Diabetic (92); Diabetic (91) | Diabetic
Case 13 | Diabetic (98); Pre-diabetic (95); Diabetic (91.6) | Diabetic
Case 14 | Diabetic (94); Diabetic (93); Diabetic (93) | Diabetic
Case 15 | Diabetic (87); Diabetic (85); Diabetic (84) | Diabetic
Case 16 | Diabetic (86); Diabetic (84); Diabetic (83) | Diabetic
Case 17 | Diabetic (93); Diabetic (91); Normal (87) | Diabetic
Case 18 | Diabetic (93); Diabetic (92.5); Diabetic (91) | Diabetic
Case 19 | Pre-diabetic (84); Diabetic (82); Pre-diabetic (81) | Pre-diabetic
Case 20 | Pre-diabetic (92); Diabetic (91); Diabetic (90) | Pre-diabetic
Case 21 | Normal (91.5); Normal (91); Pre-diabetic (90) | Normal
Case 22 | Diabetic (94); Diabetic gestational (92); Pre-diabetic (91) | Diabetic
Case 23 | Diabetic (98.2); Diabetic (93.5); Diabetic (92.6) | Diabetic
Case 24 | Diabetic (96.5); Pre-diabetic (94); Diabetic gestational (94) | Diabetic
Case 25 | Diabetic (92); Diabetic (91); Diabetic (87) | Diabetic
Case 26 | Diabetic (98); Pre-diabetic (97); Diabetic (93) | Diabetic
Case 27 | Diabetic (93); Diabetic (92); Diabetic (91.9) | Diabetic
Case 28 | Diabetic (95.43); Diabetic (95.2); Diabetic (95) | Diabetic
Case 29 | Diabetic (95); Pre-diabetic (93.7); Diabetic (92.6) | Diabetic
Case 30 | Normal (97.74); Normal (97.6); Normal (94.6) | Normal
Case 31 | Pre-diabetic (91.97); Diabetic (90.8); Diabetic (89.01) | Pre-diabetic
Case 32 | Diabetic (92.1); Diabetic (89.9); Diabetic (87.7) | Diabetic
Case 33 | Normal (91.5); Normal (90.3); Diabetic (90) | Normal
Case 34 | Diabetic (95.5); Diabetic (87.5); Diabetic (87.2) | Diabetic
Case 35 | Normal (93.05); Pre-diabetic (92.2); Pre-diabetic (91) | Normal
Case 36 | Diabetic (87); Pre-diabetic (86); Diabetic (85.9) | Diabetic
Case 37 | Pre-diabetic (92.06); Diabetic (90.8); Diabetic (90.3) | Pre-diabetic
Case 38 | Diabetic (90.9); Diabetic (90); Pre-diabetic (88) | Diabetic
Case 39 | Pre-diabetic (90.4); Diabetic (89); Diabetic (87.84) | Pre-diabetic
Case 40 | Normal (95.79); Normal (94.63); Normal (94.1) | Normal
Case 41 | Diabetic (97.52); Diabetic (97.03); Diabetic (95.1) | Diabetic
Case 42 | Normal (92.27); Pre-diabetic (90.5); Normal (90.3) | Normal
Case 43 | Diabetic (97.5); Diabetic (94.5); Diabetic (93.8) | Diabetic
The proposed system's decisions are compared with those of the domain experts [21,71], and the “system's effectiveness” refers to the number of right answers, that is to say, the answers that agree with what the expert said.
In
other
words,
the
accuracy
is
inversely
proportional
to
the
amount
of
the
system’s
failures.
As shown in Table 7, our CDSS makes the same decisions as the domain expert for all but one of the cases in the test set (the exception is case 2). The table contains three main columns: the proposed system's decisions, the confidence of these decisions, and the corresponding domain expert decisions.
This
study
tested
the
performance
of
the
proposed
CBR
approach
through
experiment.
The system results are contrasted with the domain expert decisions to determine whether or not they match the diagnosis expected by the expert. With these data, the precision, recall, accuracy, and F-measure of the system can be measured.
Table 8
The 2 × 2 ROC confusion matrix.
System decision | Domain expert decision: Positive | Domain expert decision: Negative
Positive | TP | FP
Negative | FN | TN
We have selected k = 3 to assess the system behavior. For example, in case 1, all three retrieved cases have a Diabetic diagnosis, with similarities of 90.1, 88.2, and 88.
The system performs correctly for most types of diagnosis, e.g. Pre-diabetic, Diabetic (cases 2, 3, 5, 10), and Normal.
As
we
can
see
in
Table
7,
there
is
only
one
false
decision
in
case
2,
where
the
patient
is
diabetic,
but
the system diagnoses the case as pre-diabetic.
The
semantic
performance
of
the
system
is
97.67%,
compared
to
66%
using
Node
Distance
(ND)
metrics
only,
79%
using
IC
similarity
metric
only,
and
82%
using
combination
of
both
IC
and
ND
[91].
Based
on
results
in
Table
7,
we
use
a
2
×
2
ROC
confusion
matrix
to
calculate
the
evaluation
metrics
of
our
system.
For
Diabetic
decisions
only,
the
values
of
TP,
FP,
FN,
TN
in
Table
8
can
be
interpreted
as:
TP = the CBR system decides a diabetic case, and the domain expert decides a diabetic case.
FP = the CBR system decides a diabetic case, but the domain expert does not.
FN = the CBR system decides not a diabetic case, but the domain expert decides it is diabetic.
TN = the CBR system decides not a diabetic case, and the expert decides not a diabetic case.
The
above
parameters
can
be
evaluated
for
Pre-diabetic
and
Normal
as
well.
Owing to space restrictions, we calculate Precision (P), Recall (R), Accuracy (A), Specificity (S), Effectiveness (E), and Negative Prediction Value (NPV) for Diabetic decisions only. The metrics E and NPV are calculated using the following equations:
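\[
\mathrm{Effectiveness}\ (E) = F\text{-}\mathrm{Measure\ (Score)} = \frac{1}{\frac{1}{2P} + \frac{1}{2R}} \qquad (20)
\]
\[
\mathrm{Negative\ Prediction\ Value}\ (NPV) = \frac{TN}{TN + FN} \qquad (21)
\]
Table 9
Diabetic decision confusion matrix.
System decision | Domain expert decision: Positive | Domain expert decision: Negative
Positive | 27 | 0
Negative | 1 | 15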
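From Table 7, we have calculated the values in Table 9 for the proposed system. The P, R, A, S, E, and NPV regarding diabetic diagnosis are
\[
P = \frac{27}{27 + 0} = 100\%, \quad R = \frac{27}{27 + 1} = 96.43\%, \quad A = \frac{27 + 15}{27 + 15 + 0 + 1} = 97.67\%,
\]
\[
S = \frac{15}{15 + 0} = 100\%, \quad E = \frac{1}{\frac{1}{2 \times 1} + \frac{1}{2 \times 0.9643}} = 98.18\%, \quad NPV = \frac{15}{15 + 1} = 93.75\%.
\]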
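The figures above can be re-derived directly from the confusion matrix in Table 9. The following sketch is only a convenience for checking the arithmetic; the function name and dictionary keys are ours, and S is computed as TN/(TN + FP), which is the form used in the calculation above.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute the evaluation metrics from a 2 x 2 confusion matrix.

    Note: no guard against empty rows or columns (division by zero).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    specificity = tn / (tn + fp)
    # Eq. (20): harmonic mean of precision and recall.
    effectiveness = 1 / (1 / (2 * precision) + 1 / (2 * recall))
    # Eq. (21): negative prediction value.
    npv = tn / (tn + fn)
    return {
        "P": precision,
        "R": recall,
        "A": accuracy,
        "S": specificity,
        "E": effectiveness,
        "NPV": npv,
    }


# Values from Table 9 (Diabetic decisions).
metrics = diagnostic_metrics(tp=27, fp=0, fn=1, tn=15)
percentages = {name: round(value * 100, 2) for name, value in metrics.items()}
```

Here percentages holds the six metrics as percentages rounded to two decimals, which can be compared with the values reported in the text.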
Although
the
pre-diabetic
and
normal
patients
form
less
than
half
of
the
case-base,
the
proposed
system
accuracy
for
predicting
them
is
100%.
The
performance
of
our
proposed
system
is
enhanced
because
its
similarity
measures
take
into
account
the
nature
of
all
features.
6.2.4.
A
comparison
between
the
proposed
and
other
CBR
systems
Most
of
the
existing
diabetes
diagnosis
CBR
systems
are
traditional,
and
they
did
not
provide
adequate
evaluations
[6,8].
Fig.
23
shows
a
comparison
with
two
diabetes
diagnosis
systems
[92],
and
it
asserts
that
our
system
has
a
better
performance
than
these
systems.
Montani
et
al.
[93]
proposed
a
traditional
CBR
system
for
diabetes
care
with
an
accuracy
of
83%.
The
4DSS
hybrid
CBR–RBR
(Rule
Based
Reasoning)
system
proposed
by
Marling
[89]
has
retrieval
accuracy
of
80%.
Fuzzy
case-based
reasoning
has
not
been
used
for
diagnosis
of
diabetes
before;
however,
it
has
been
used
for
developing other medical systems, such as for the diagnosis of stress [94].
The
results
of
this
system
are
Precision
=
79.16%
and
Recall
=
79.96%.
Utilizing
fuzzy
ontology
with
a
rule-based
system
for
diabetes
management
in
[23]
has
enhanced
the
accuracy
to
be
91.2%.
However,
Lee
and
Wang
[23]
used
the
rule-based
reasoning
technique,
which
is
not
suitable
for
experience-based
problems
such
as
diabetes
diagnosis.
Moreover,
Lee
and
Wang’s
study
used
the
Pima
Indians
Dataset,
but
we
use
real
cases
from
Mansura
University
Hospitals
in
Egypt.
Fig.
23.
A
comparison
between
the
proposed
system
and
traditional
ones.
Table 10
A comparison between the accuracy of proposed system and other studies.
Reasoning type | Domain | System name | Purpose | Accuracy (%)
Fuzzy CBR | Medical and semantic | The proposed system | Diabetes diagnosis | 97.67
Fuzzy CBR | Medical | ConFuCiuS [95] | Diabetes diagnosis | 75.53
Fuzzy CBR | Medical | CBFDT [96] | Diagnosis of liver disorder | 85
Fuzzy CBR | Medical | Begum et al. [97] | Diagnosis of stress | 80
Fuzzy CBR | Medical | Petrovic et al. [88] | Radiotherapy planning | 84.72
Fuzzy CBR | Non-medical | Li et al. [98] | Financial application | 92.36
Fuzzy CBR | Non-medical | Arias-Aranda et al. [99] | Knowing the relationship between flexibility and operations strategy | 89.23
Fuzzy CBR | Non-medical | Khanum et al. [22] | Facial expression recognition | 85
Fuzzy CBR | Non-medical | Han et al. [100] | Endpoint prediction of basic oxygen furnace (BOF) | 91.98
Fuzzy CBR | Non-medical | Sushmita et al. [101] | Financial application | 75
Fuzzy CBR | Non-medical | Xiong et al. [102] | Hybrid rule-CBR | 93.25
Fuzzy CBR | Non-medical | Martins-Bede et al. [103] | Classifying the prevalence of Schistosomiasis in the state of Minas Gerais in Brazil | 71
Fuzzy CBR | Non-medical | Jin et al. [104] | Customer-driven design | 92
Traditional CBR | Medical | System implemented on our dataset | Diabetes diagnosis | 57.14
Traditional CBR | Medical | T-IDDM [93] | Diabetes treatment and monitoring using conventional insulin therapy | 84
Traditional CBR | Medical | Marling et al. [105] | Type 1 diabetes management in insulin pump therapy | 77.5
Traditional CBR | Medical | Balakrishnan et al. [106] | Predictive system for diabetic retinopathy | 85
Traditional CBR | Medical | Bellazzi et al. [107] | Diabetes therapy | 90
Traditional CBR | Medical | Marling et al. [89] | 4DSS system for diabetes diagnosis | 80
Based
on
our
case-base
knowledge,
we
have
implemented
a
CBR
system
without
any
semantic
capabilities
(i.e.
neither
case
base
fuzzy
ontology
nor
domain
standard
ontology).
The
resulting
system
has
achieved
precision
=
85.7%,
recall
=
42.85%,
accuracy
=
57.14%,
specificity
=
85.7%,
effectiveness
=
57.13%,
and
NPV
=
42.9%.
Our
system
has
achieved
a
better
performance,
which
explains
the
effects
of
case
base
knowledge
preparation
and
semantic
case
retrieval
algorithms.
Moreover,
our
system
has
only
one
false
case
(Case
2
in
Table
7).
One
of
the
most
important
features
of
CBR
is
the
ability
to
retrieve
k
similar
cases
to
the
current
problem.
In
our
system,
if
we
just
consider
k
=
2,
then
our
system
will
have
no
false
negative
cases,
and
the
accuracy
will
be
100%.
To
compare
our
system
with
previous
studies,
Table
10
compares
our
system’s
performance
with
a
set
of
existing
medical
and
non-medical
CBR
studies.
6.2.5.
A
comparison
of
the
proposed
system
and
machine
learning
classifiers
Shankaracharya
et
al.
[108]
presented
a
review
of
diabetes
diagnosis
techniques.
Techniques
such
as
artificial
neural
networks
(ANN),
support
vector
machines
(SVMs),
neuro-fuzzy
systems
and
expert
systems
developed by different authors have been discussed. All of these studies have lower performance than ours.
However,
these
systems
mostly
depend
on
the Pima Indians Dataset [footnote 12].
To
compare
our
system
with
these
techniques,
it
is
better
to
run
these
algorithms
on
our
dataset.
This
dataset
has
been
prepared
before,
and
all
noise
and
missing
data
have
been
handled
[35].
For comparison purposes, we apply several machine learning classifiers, including C4.5, k-NN, SVM, a Bayesian classifier, and ANN, to our dataset and measure their performance.
We use 2-fold, 3-fold, 4-fold, ..., 10-fold cross-validation in the evaluation process.
Cross-validation
is
a
statistical
technique
useful
in
determining
the
robustness
of
a
model.
The
n-fold
cross
validation
divides
the
whole
data
set
into
n
folds.
Then n − 1 folds are used for training, and one fold is used for testing. This process is repeated until each of the n folds has been used for testing.
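The n-fold procedure can be sketched in a few lines. This is a generic illustration rather than the tooling used for the experiments reported here: the function name is ours, and the folds are formed by simple striding over the data (in practice the data would usually be shuffled or stratified first).

```python
def n_fold_splits(items, n):
    """Yield (train, test) pairs; each fold serves as the test set once."""
    folds = [items[i::n] for i in range(n)]
    for i in range(n):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# Example: 60 case identifiers split into 5 folds.
splits = list(n_fold_splits(list(range(60)), 5))
```

Each of the five splits trains on four folds and tests on the remaining one, so every case is used for testing exactly once.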
12 https://archive.ics.uci.edu/ml/datasets/Pima+Indians+Diabetes.
The
overall
performance
of
these
algorithms
is
presented
in
Table
11.
For
the
k-NN
algorithm,
we
select
k
=
3
as
done
in
our
system;
however,
its
performance
is
low.
C4.5
achieves
the
best
performance
(about
89.19%)
among
machine-learning
techniques;
however,
our
system
outperforms
it.
After
testing
the
machine
learning
algorithms
using
from
2-fold
to
10-fold
cross-validation
techniques,
we
calculate
the
average
performance
of
each
fold,
and
we
make
a
comparison
of
different
folds’
results.
Fig.
24
shows
that
the
best
performance
is
achieved
with
5-fold
cross
validation.
We
calculate
the
average
precision,
recall,
accuracy,
f-measure,
and
specificity
for
all
folds.
These
averages
are
compared
with
the
proposed
system,
the
5-fold
cross
validation,
and
the
traditional
(i.e.
not
fuzzy
and
not
semantic)
system,
as
shown
in
Fig.
25.
Our
findings
show
that
the
fuzzy
KI-CBR
can
classify
data
more
accurately
than
the
other
machine
learning
techniques
and
conventional
CBR.
It
can
be
seen
in
Fig.
25
that
the
machine
learning
classifiers
have
better
performances
than
conventional
CBR
systems.
This
means
that
our
study
makes
a
high
improvement
in
the
CBR
performance.
The average accuracies of C4.5, conventional CBR, and the proposed system are 88.88%, 57.14%, and 97.67%, respectively.
The proposed approach demonstrates a major improvement over machine learning techniques and the conventional CBR system.
The
results
of
this
study
clearly
indicate
that
the
hybridization
of
CBR
with
fuzzy
ontology
and
medical
ontologies
is
the
most
suitable
technique
for
solving
medical
diagnosis
problems.
The
enhanced
performance
of
our
system
is
a
result
of
several reasons.
Firstly,
the
proposed
CBR
framework
is
integrated
and
complete.
All
components
have
been
fully
implemented
and
tested.
The
knowledge
representation
formalism
using
fuzzy
ontology
integrates
the
reasoning
capabilities
of
fuzzy
logic,
description
logic,
and
CBR.
There
are
many
studies,
which
use
each
of
these
reasoning
mechanisms
individually,
but
they
have
not
achieved
high
accuracy.
The
second
reason
is
the
preparation
of
case-base
data.
These
data
have
been
pre-processed,
fuzzified,
and
encoded
before being populated
into
the
case-base
knowledge.
As
a
result,
accurate
data
will
produce
accurate
decisions.
The
third
reason
is
the
usage
of
a
suitable
weight
vector
for
the
used
case
features;
the
global
similarity
function
has
produced
suitable
similarities.
The
fourth
reason
is
the
proposed
semantic
retrieval
algorithm.
We
have
handled
most
of
the
possible
datatypes,
which
appear
in
the
medical
domain.
The
fuzzy
types
Table 11
Performance of machine learning algorithms on our dataset.
Fold | Algorithm | Precision (%) | Recall (%) | Accuracy (%) | F-measure (%) | Specificity (%)
2-Fold | C4.5 | 93.1 | 93.1 | 93.33 | 93.1 | 93.54
2-Fold | k-NN (k = 3) | 63.3 | 63.3 | 63.33 | 63.3 | 64.5
2-Fold | SVM | 6.3 | 58.6 | 63.33 | 60.7 | 67.74
2-Fold | Naive Bayes | 81.8 | 62.1 | 75 | 70.6 | 87.09
2-Fold | ANN | 65.5 | 65.5 | 66.66 | 65.5 | 67.74
3-Fold | C4.5 | 90 | 93.1 | 91.66 | 91.5 | 90.32
3-Fold | k-NN (k = 3) | 60 | 60 | 60 | 59.9 | 64.51
3-Fold | SVM | 71 | 75.9 | 73.33 | 73.3 | 70.96
3-Fold | Naive Bayes | 65.4 | 58.6 | 65 | 61.8 | 70.96
3-Fold | ANN | 72.4 | 72.4 | 73.33 | 72.4 | 74.19
4-Fold | C4.5 | 89.7 | 89.7 | 90 | 89.7 | 90.32
4-Fold | k-NN (k = 3) | 68.7 | 68.3 | 68.33 | 68 | 77.41
4-Fold | SVM | 69 | 69 | 70 | 69 | 70.96
4-Fold | Naive Bayes | 77.3 | 58.6 | 71.66 | 66.7 | 83.87
4-Fold | ANN | 75.9 | 75.9 | 76.66 | 75.9 | 77.41
5-Fold | C4.5 | 92.9 | 89.7 | 91.66 | 91.2 | 93.54
5-Fold | k-NN (k = 3) | 68.3 | 68.3 | 68.33 | 68.3 | 70.96
5-Fold | SVM | 78.6 | 75.9 | 78.33 | 77.2 | 80.64
5-Fold | Naive Bayes | 77.3 | 58.6 | 71.66 | 66.7 | 83.87
5-Fold | ANN | 78.6 | 75.9 | 78.33 | 77.2 | 80.64
6-Fold | C4.5 | 89.3 | 86.2 | 88.33 | 87.7 | 90.32
6-Fold | k-NN (k = 3) | 61.7 | 61.7 | 61.66 | 61.5 | 67.74
6-Fold | SVM | 67.7 | 72.4 | 70 | 70 | 67.74
6-Fold | Naive Bayes | 61.5 | 55.2 | 61.66 | 58.2 | 67.74
6-Fold | ANN | 73.3 | 75.9 | 75 | 74.6 | 74.19
7-Fold | C4.5 | 89.7 | 89.7 | 90 | 89.7 | 90.32
7-Fold | k-NN (k = 3) | 73.6 | 73.3 | 73.33 | 73.2 | 80.64
7-Fold | SVM | 69.7 | 79.3 | 73.33 | 74.2 | 67.74
7-Fold | Naive Bayes | 70.4 | 65.5 | 70 | 67.9 | 74.19
7-Fold | ANN | 71.9 | 79.3 | 75 | 75.4 | 70.96
8-Fold | C4.5 | 89.7 | 89.7 | 90 | 89.7 | 93
8-Fold | k-NN (k = 3) | 68.7 | 68.3 | 68.33 | 68 | 77.41
8-Fold | SVM | 74.2 | 79.3 | 76.66 | 76.7 | 74.19
8-Fold | Naive Bayes | 82.6 | 65.5 | 76.66 | 73.1 | 87.09
8-Fold | ANN | 70 | 72.4 | 71.66 | 71.2 | 70.96
9-Fold | C4.5 | 89.3 | 86.2 | 88.33 | 87.7 | 90.32
9-Fold | k-NN (k = 3) | 66.8 | 66.7 | 66.66 | 66.4 | 74.19
9-Fold | SVM | 75 | 82.8 | 78.33 | 78.7 | 74.19
9-Fold | Naive Bayes | 79.2 | 65.5 | 75 | 71.7 | 83.87
9-Fold | ANN | 77.4 | 82.8 | 80 | 80 | 77.41
10-Fold | C4.5 | 74.2 | 79.3 | 76.66 | 76.7 | 90.32
10-Fold | k-NN (k = 3) | 73.1 | 65.5 | 71.66 | 69.1 | 70.96
10-Fold | SVM | 77.4 | 82.8 | 80 | 80 | 77.41
10-Fold | Naive Bayes | 79.2 | 65.5 | 75 | 71.7 | 83.87
10-Fold | ANN | 74.2 | 79.3 | 76.66 | 76.7 | 74.19
Average (%) | | 73.88 | 73.39 | 75.1 | 74.04 | 78.04
Conventional CBR system | | 85.7 | 42.85 | 57.14 | 57.13 | 85.7
Proposed fuzzy KI-CBR system | | 100 | 96.43 | 97.67 | 98.18 | 100
Fig.
24.
A
comparison
between
the
n-folds
cross
validation
results.
Fig.
25.
Classification
results
comparison.
support
the
reasoning
using
linguistic
terms
and
enhance
the
similarity
calculation.
Ordinal
features’
similarity
is
based
on
the
expert
domain
knowledge
in
the
form
of
similarity
matrices.
Semantic
features
support
the
calculation
of
clinical
similarities
between
SCT
concepts.
In
addition
to
its
enhanced
performance,
the
proposed
system
is
tested
for
problems
that
are
complex
and
cannot
be
solved
by
traditional
systems.
For
example,
If
the
case
base
contains
a
case
C1
=
(age
=
20,
disease
=
“Acute
proliferative”,
uri-
nation
frequency
=
“++”.)
and
the
query
case
is
(age
=
young,
disease
=
“Idiopathic
crescentic”,
urination
frequency
=
“Nil”.);
in
the
traditional
CBR
systems,
these
cases
are
not
similar
and
C1
will
not
be
returned.
For
fuzzy
systems,
the
age
is
matched
right
as
age
=
20
is
the
same
as
age
=
young
(i.e.,
Young(20)
=
1).
How-
ever,
the
comparison
of
semantic
and
ordinal
features
fails
to
get
the
similarity.
In
semantic
CBR
systems,
they
fail
to
get
the
similar-
ity
of
fuzzy
and
ordinal
features.
Due
to
these
conditions,
the
results
of
these
systems
might
prove
to
be
not
accurate.
In
our
proposed
system,
we
have
proposed
algorithms
to
handle
all
of
these
types.
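The C1 example above can be made concrete with a minimal sketch that scores each feature type with its own measure and then aggregates the scores. Everything specific in it is an assumption made for illustration: the membership function for "young", the ordinal similarity matrix, the toy is-a hierarchy (which is not actual SCT content), the equal feature weights, and the use of the Wu-Palmer path-based measure as a stand-in for the paper's semantic similarity algorithm.

```python
from typing import Callable, Dict, List


def young(age: float) -> float:
    """Hypothetical membership function: 1 up to age 30, falling to 0 at 40."""
    if age <= 30:
        return 1.0
    if age >= 40:
        return 0.0
    return (40 - age) / 10


# Linguistic terms available for the fuzzy feature "age" (assumption).
FUZZY_TERMS: Dict[str, Callable[[float], float]] = {"young": young}

# Hypothetical expert similarity matrix for urination frequency levels.
LEVELS: List[str] = ["Nil", "+", "++", "+++"]
ORDINAL_SIM: List[List[float]] = [
    [1.0, 0.6, 0.3, 0.0],
    [0.6, 1.0, 0.6, 0.3],
    [0.3, 0.6, 1.0, 0.6],
    [0.0, 0.3, 0.6, 1.0],
]

# Toy is-a hierarchy (child -> parent), standing in for a terminology subset.
PARENT: Dict[str, str] = {
    "Glomerulonephritis": "Disease",
    "Acute proliferative": "Glomerulonephritis",
    "Idiopathic crescentic": "Glomerulonephritis",
}


def ordinal_similarity(a: str, b: str) -> float:
    """Look up the expert-defined similarity of two ordinal levels."""
    return ORDINAL_SIM[LEVELS.index(a)][LEVELS.index(b)]


def path_to_root(concept: str) -> List[str]:
    """Return the concept followed by its ancestors up to the root."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def wu_palmer(a: str, b: str) -> float:
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    lcs = next(c for c in path_a if c in path_b)  # deepest common ancestor
    depth = lambda c: len(path_to_root(c))  # the root has depth 1
    return 2 * depth(lcs) / (depth(a) + depth(b))


def case_similarity(case: dict, query: dict) -> float:
    """Average the per-feature scores with equal weights (an assumption)."""
    scores = [
        FUZZY_TERMS[query["age"]](case["age"]),  # fuzzy feature
        wu_palmer(case["disease"], query["disease"]),  # semantic feature
        ordinal_similarity(case["urination"], query["urination"]),  # ordinal
    ]
    return sum(scores) / len(scores)


c1 = {"age": 20, "disease": "Acute proliferative", "urination": "++"}
query = {"age": "young", "disease": "Idiopathic crescentic", "urination": "Nil"}
print(round(case_similarity(c1, query), 3))
```

With these assumed values, C1 receives a non-zero score on every feature, whereas an exact-match comparison would score all three features as zero and would not retrieve C1.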
7. Conclusion

This paper proposes a fuzzy ontology-based semantic CBR system and implements it as a decision support system for diabetes diagnosis. The system enhances the decision maker's efficiency in the diagnostic process. The proposed approach makes several contributions: (1) it builds a case-base fuzzy ontology compatible with the most famous CBR framework, i.e., JCOLIBRI; (2) it builds and uses a standard medical terminology subset for diabetes diagnosis from SCT, which is the most complete medical ontology; and (3) it proposes a fuzzy-semantic similarity algorithm for case retrieval. The implemented fuzzy ontology was built following a formal methodology and is represented in the fuzzy OWL2 language. The proposed fuzzy-semantic retrieval algorithm outperforms all of the JCOLIBRI algorithms and overcomes their limitations. Integrating path-based and feature-based similarity measures enhances the accuracy of the calculated clinical distances between concepts. Our system achieved an accuracy of 97.67%. These results show that the proposed system is highly accurate and that physicians can consult it when diagnosing patients.
In the future, we will implement the remaining CBR steps, especially case adaptation. We will utilize the fuzzy ontology in the other steps of CBR, such as case adaptation, retention, and case-base maintenance. Moreover, we will try to integrate multiple medical ontologies into our system, because SCT has limitations in many aspects, such as the representation of lab tests and genes. Fortunately, there are many standard medical ontologies for these domains, such as LOINC for lab tests and GO for gene representation. Integrating CBR with the EHR environment will enhance the automation of the decision support process. Finally, we will use a relational database for storing and querying the case-base fuzzy ontology. A relational database supports the storage of a large case base using a semantics-preserving method.
Acknowledgments

This project was supported by King Saud University, Deanship of Scientific Research, College of Sciences, Research Centre. The authors would like to thank Dr. Farid Badria, Prof. of the Pharmacognosy Department and head of the Liver Research Lab, Mansoura University, Egypt; and Dr. Hosam Zaghloul, Prof. at the Clinical Pathology Department, Faculty of Medicine, Mansoura University, Egypt, for their efforts in this work.
... Fuzzy extensions of DLs have also been developed [35]. These extensions have found application in human activity modeling for ambient intelligence systems [36], diabetes diagnosis systems [37], and database systems [38], to mention a few. Authors in [9] proposed a consistency data representation for IoT healthcare systems, transforming health data obtained from heterogeneous IoT devices into a semantic data model that supports logical reasoning using OWL. ...
Article
Full-text available
The Internet of Things (IoT) has become one of the most popular technologies in recent years. Advances in computing capabilities, hardware accessibility, and wireless connectivity make possible communication between people, processes, and devices for all kinds of applications and industries. However, the deployment of this technology is confined almost entirely to tech companies, leaving end users with only access to specific functionalities. This paper presents a framework that allows users with no technical knowledge to build their own IoT applications according to their needs. To this end, a framework consisting of two building blocks is presented. A friendly interface block lets users tell the system what to do using simple operating rules such as “if the temperature is cold, turn on the heater.” On the other hand, a fuzzy logic reasoner block built by experts translates the ambiguity of human language to specific actions to the actuators, such as “call the police.” The proposed system can also detect and inform the user if the inserted rules have inconsistencies in real time. Moreover, a formal model is introduced, based on fuzzy description logic, for the consistency of IoT systems. Finally, this paper presents various experiments using a fuzzy logic reasoner to show the viability of the proposed framework using a smart-home IoT security system as an example.
... An Ant Colony based approach for extracting fuzzy rules is presented in (Ganji & Abadeh, 2011) for diabetes diagnosis. A fuzzy ontology-based case-based reasoning method is discussed in (El-Sappagh, Elmogy, & Riad, 2015) which depicts expert thinking behavior. Razavian et al. (Razavian et al., 2015) used a high dimensional dataset comprising of 4.1 million individuals and 42000 variables to develop a predictive model for diabetes patient prediction, based on regression. ...
Article
Full-text available
The recent advancements in the field of health sciences have produced substantial amount of data such as clinical information that is generated by patient records which is used in AI applications for better diagnosis and predictions. Diabetes belongs to a group of metabolic disorders that affects 422 million people worldwide. This is primarily due to lack of predictive and forecasting measures. Research on several aspects of diabetes has generated huge amounts of data which makes it suitable for application of AI based methods. Presently, several methods have been used for predicting diabetes on the basis of certain factors. However, results of this study show that Support Vector Machine (SVM) and Linear regression when combined with statistical methods, provide much better results compared to AI methods.
... Fuzzy Ontology converts crips numbers into Categorical numbers. The categorical numbers are low, medium, and high Similarity (El-Sappagh, 2015;Zhai, et al., 2008;Abou-of, 2020). ...
Article
Full-text available
The ambiguous sentences Homonyms and Homophones become a big problem when processed by computers. From these problems, a Novelty was found; the Novelty created a system that was able to recognize ambiguous sentences of Homonyms and Homophones. The process that the system runs for the first time is to test the proximity of the ambiguous sentences entered with the data set; from this process, the ambiguous sentences entered can already be recognized as the meaning of the sentence. The resulting result is how many per cent the level of similarity. Then the results are processed with the fuzzy ontology method. The results of the Fuzzy Ontology are low similarity level, moderate similarity level, and high similarity level. The method used to analyze this research is the confusion matrix, the precision results obtained were 92%, recall was 100%, and accuracy was 96%. In the future, this research can be used to refine translation results in a translation system.
... Other studies in the literature have shown that PPG-related features and other physiological parameters of subjects can accurately predict DM-2 using machine learning (ML) and deep learning (DL) techniques . Various research studies have used Pima Indians Diabetes Dataset (PIMADD), a publicly-accessible dataset, to estimate the DM-2 using different ML algorithms [7][8][9][10][11]. This dataset includes information like insulin level, pregnancy, age, BMI, and so on. ...
Article
Full-text available
Type-2 diabetes mellitus (DM-2) is a complicated endocrine and metabolism condition recognized as the most major non-communicable disease in the world. The complications associated with DM-2 involve cardiovascular disease, diabetic retinopathy and neuropathy. This article proposes the Fourier decomposition method for non-invasive automated type-2 diabetes detection using photoplethysmography (PPG) signals. The proposed research work comprises three major phases. In the first phase, the 5-min duration of the toe PPG signal is split into 10-s segments and decomposed into frequency subbands known as Fourier intrinsic band functions (FIBFs). Two features from each FIBF are extracted in the second phase, including kurtosis and log energy entropy. The last stage involves passing the features on to various machine learning techniques. The least-square support vector machine (radial basis function) algorithm yielded better classification results with an accuracy of 98.61%, a sensitivity of 98.96%, and a selectivity of 98.26%.
... A feature selection based on Entropy Ensemble of Neural Networks [27] is designed with Recursive Feature Elimination and better results are achieved . Classification was based on Support Vector Machine with Recursive Feature Elimination and got an accuracy of 85.66% [28].In [29] classification based on Deep Belief Network was designed and got accuracy of 83.9%. Table 1 shows some important class of drugs and its adverse effects. ...
Article
Full-text available
In this paper, various data mining algorithms for pharmacovigilance is analyzed and a decision support system for hospital is proposed.. Overall analysis of adverse events of a specific drug helps in finding the potential danger of using the specific drug. Decision support system with good classification accuracy to improve its use in hospital for computer aided diagnosis by doctors is also analyzed
... The proposed TFAMT -TWGA paradigm achieves 83.7% accuracy when evaluated using the UCI registry and the DR Dataset. El-Sappagh et al [36] propose a fuzzy ontology-based case-based justification paradigm for the treatment of diabetes. It presents a fuzzy semantic retrieval technique as well as a fuzzy case-based OWL2 ontology for managing various types of functionalities. ...
Article
Full-text available
Early diabetes diagnosis allows patients to begin treatment on time, reducing or eliminating the risk of serious consequences. In this paper, we propose the Neutrosophic-Adaptive Neuro-Fuzzy Inference System (N-ANFIS) for the classification of diabetes. It is an extension of the generic ANFIS model. Neutrosophic logic is capable of handling the uncertain and imprecise information of the traditional fuzzy set. The suggested method begins with the conversion of crisp values to neutrosophic sets using a trapezoidal and triangular neutrosophic membership function. These values are fed into an inferential system, which compares the most impacted value to a diagnosis. The result demonstrates that the suggested model has successfully dealt with vague information. For practical implementation, a single-value neutrosophic number has been used; it is a special case of the neutrosophic set. To highlight the promising potential of the suggested technique, an experimental investigation of the well-known Pima Indian diabetes dataset is presented. The results of our trials show that the proposed technique attained a high degree of accuracy and produced a generic model capable of effectively classifying previously unknown data. It can also surpass some of the most advanced classification algorithms based on machine learning and fuzzy systems.
Article
Nowadays, the prevalence of metabolic syndromes (MSs) has attracted increasing concerns as it is closely related to overweight and obesity, physical inactivity and overconsumption of energy, making the diagnosis and real-time monitoring of the physiological range essential and necessary for avoiding illness due to defects in the human body such as higher risk of cardiovascular disease, diabetes, stroke and diseases related to artery walls. However, the current sensing techniques are inconvenient and do not continuously monitor the health status of humans. Alternatively, the use of recent wearable device technology is a preferable method for the prevention of these diseases. This can enable the monitoring of the health status of humans in different health domains, including environment and structure. The use wearable devices with the purpose of facilitating rapid treatment and real-time monitoring can decrease the prevalence of MS and long-time monitor the health status of patients. This review highlights the recent advances in wearable sensors toward continuous monitoring of blood pressure and blood glucose, and further details the monitoring of abnormal obesity, triglycerides and HDL. We also discuss the challenges and future prospective of monitoring MS in humans.
Conference Paper
Building ontologies is very important for diverse domains, and especially for the Semantic Web. The literature offers many methods and tools for building them. However, the fuzzy aspect has not been studied sufficiently in these methods and tools, even though information systems can include uncertainties and imperfections. The goal of defining fuzzy ontologies is to integrate these characteristics. We must therefore be able to model uncertainties, on the one hand, and to produce representations that are accessible and understandable by machines, on the other. While many building methods and editors currently exist for classic (i.e., crisp or exact) ontologies, no such methods exist for fuzzy ontologies. This paper describes our work on building fuzzy ontologies and presents our fuzzy ontology building method, "Fuzzy OntoMethodology".
Chapter
Nowadays, smartphones have become indispensable items for everybody. Thanks to them, people can communicate and access the Internet at any time, regardless of where they are located. New smartphones from a large number of brands, with different features and prices, appear constantly in the market. Hence, there is a need for tools that help buyers select and buy the smartphone that best fits their needs. In this article, a decision support system built on a fuzzy ontology has been designed to help people select the right smartphone for them. Linguistic labels are used to give buyers a comfortable way of expressing themselves.
Conference Paper
The evaluation of ontologies is vital for the growth of the Semantic Web. We consider a number of problems in evaluating a knowledge artifact like an ontology. We propose in this paper that one approach to ontology evaluation should be corpus or data driven. A corpus is the most accessible form of knowledge and its use allows a measure to be derived of the ‘fit’ between an ontology and a domain of knowledge. We consider a number of methods for measuring this ‘fit’ and propose a measure to evaluate structural fit, and a probabilistic approach to identifying the best ontology.
Article
A Clinical Decision Support System (CDSS) can be used to prepare a diagnosis from a patient's details, which physicians or nurses can then review to improve the final decision. Because of the lack of CDSSs for diabetes and related diseases in the Sultanate of Oman, an ontology-based CDSS is proposed here. The key components of the system are an Adaptive Questionnaire Ontology, the patient's semantic profile, a guideline ontology, and a risk assessment reasoner. We propose a model for gathering the patient's medical history based on a dynamic questionnaire ontology. An ontology is among the most powerful tools for encoding medical knowledge semantically. It is an abstract model that represents a common and shared understanding of a domain. The model is explained and implemented for the diabetes domain.
Article
In the context of the Semantic Web, fuzzy extensions to OWL (the W3C standard ontology language) and Description Logics (DLs, the logical foundation of OWL) have been extensively investigated, as introduced in Chap. 4, and many real knowledge bases based on fuzzy DLs and fuzzy OWL tend to become very large. How to store fuzzy knowledge bases has therefore become an important issue. Building on the widespread investigation of fuzzy relational databases, this chapter briefly introduces how to store fuzzy knowledge bases in fuzzy relational databases. To date, only a few papers discuss the storage of fuzzy DL or ontology knowledge bases, which remains an open problem. Much more work on fuzzy DL and ontology knowledge base storage may be needed to support fuzzy knowledge management in the Semantic Web.
Article
COLIBRI is an open source platform for the development of Case-based reasoning (CBR) systems. It supports the development of different families of specialized CBR systems: from Textual CBR to Knowledge Intensive applications. This chapter provides a functional description of the platform, its capabilities and tools. These features are illustrated with real examples of working systems that have been developed using COLIBRI. This overview should serve to motivate and guide those readers that plan to develop CBR systems and are looking for a tool that eases this task.
Article
Case retrieval and case revision (reuse) are core parts of case-based reasoning (CBR). To address two problems, namely that the weights of condition attributes are difficult to evaluate in case retrieval and that there are few effective strategies for case revision, this paper introduces an improved case-based reasoning method based on fuzzy c-means clustering (FCM), mutual information, and support vector machines (SVM). Fuzzy c-means clustering is used to partition the case base and so improve the efficiency of the algorithm. In the case retrieval process, mutual information is used to calculate the weight of each condition attribute and to evaluate its contribution to the reasoning result accurately. Because the support vector machine handles limited samples well, it is adopted to build an optimal regression model for case revision. The proposed method is applied to endpoint prediction of a Basic Oxygen Furnace (BOF), and simulation experiments based on a set of actual production data from a 180 t steelmaking furnace show that the model based on the improved CBR achieves high prediction accuracy and good robustness.
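The retrieval step described in this abstract, where mutual information supplies attribute weights, can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited paper's implementation: attributes are assumed to be already discretized, similarity is a simple weighted exact-match score, and the FCM partitioning and SVM revision stages are omitted. All function names and the toy case base are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def attribute_weights(cases, labels):
    """Weight each attribute by its mutual information with the outcome."""
    mi = [
        mutual_information([case[j] for case in cases], labels)
        for j in range(len(cases[0]))
    ]
    total = sum(mi)
    if total == 0:
        # No attribute is informative: fall back to uniform weights.
        return [1 / len(mi)] * len(mi)
    return [m / total for m in mi]


def retrieve(query, cases, labels, weights):
    """Return the stored case (and its label) with the highest weighted match."""
    def similarity(case):
        return sum(w for w, q, v in zip(weights, query, case) if q == v)

    best = max(range(len(cases)), key=lambda i: similarity(cases[i]))
    return cases[best], labels[best]


# Toy case base (assumed data): the first attribute determines the label,
# the second is statistically independent of it.
cases = [("high", "low"), ("high", "high"), ("low", "low"), ("low", "high")]
labels = ["A", "A", "B", "B"]

weights = attribute_weights(cases, labels)
print(weights)  # [1.0, 0.0]
print(retrieve(("low", "high"), cases, labels, weights)[1])  # B
```

In this toy data the uninformative second attribute receives zero weight, so retrieval is driven entirely by the first attribute.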
Chapter
Nowadays, most people can get enough energy to sustain a day's activity, yet few know whether they eat healthily. Analyzing the nutritional facts of the foods eaten is especially important for those who are losing weight or suffering from chronic diseases such as diabetes. However, diet is a problem with high uncertainty, and it is widely pointed out that a classical ontology is not sufficient to deal with imprecise and vague knowledge in some real-world applications such as diet. A fuzzy ontology, on the other hand, can effectively help handle and process uncertain data and knowledge. This chapter proposes a type-2 fuzzy set and fuzzy ontology for a diet application and uses the type-2 fuzzy markup language (T2FML) to describe the knowledge base and rule base of the diet, including the ingredients and the contained servings of six food categories for some common foods in Taiwan. The experimental results show that the type-2 fuzzy logic system (FLS) performs better than the type-1 FLS, indicating that a type-2 FLS can provide a powerful paradigm for handling the high level of uncertainty present in diet.
Article
In many industrial contexts, the knowledge and data provided by experts are imprecise, as there seems to be an understanding that “experts do not need precise details as they understand anyway what is meant”. The imprecision inherent in the knowledge that experts acquire in their practice requires decision support tools that can be tailored to specific application contexts to aid complex decisions. As a specific example, expert knowledge expressed in linguistic terms is not precisely structured, and concepts are not defined specifically enough to be easy to use and process. If we want to represent and use expert knowledge in knowledge-based systems at a general, easily adaptable level, we need to find ways to represent and process knowledge elements; our approach is to use interval-valued fuzzy sets, a fuzzy ontology, and aggregation operators. We show that these instruments offer a novel approach for aggregating imprecise data to obtain actionable knowledge that aids complex decisions. The framework is described and the approach is demonstrated in the context of a fuzzy wine ontology; the problem formulation shares many features with important and complex decision-making problems found in different industries. We describe the potential application of the framework to paper machine maintenance. A web-based application is introduced to better demonstrate the benefits decision-makers can obtain from the proposed framework. Additionally, we present an approach for using the framework to find consensual solutions in situations involving several experts.