CORE: A Tool for Collaborative Ontology Reuse and
Evaluation
Miriam Fernández, Iván Cantador, Pablo Castells
Escuela Politécnica Superior
Universidad Autónoma de Madrid
Campus de Cantoblanco, 28049 Madrid, Spain
{miriam.fernandez, ivan.cantador, pablo.castells}@uam.es
ABSTRACT
Ontology evaluation can be defined as assessing the quality and the adequacy of an ontology for being used in a specific context, for a specific goal. In this work, a tool for Collaborative Ontology Reuse and Evaluation (CORE) is presented. The system receives an informal description of a semantic domain and determines which ontologies, from an ontology repository, are the most appropriate to describe the given domain. For this task, the environment is divided into three main modules. The first component receives the problem description represented as a set of terms and allows the user to refine and enlarge it using WordNet. The second module applies multiple automatic criteria to evaluate the ontologies of the repository and determine which ones best fit the problem description. A ranked list of ontologies is returned for each criterion, and the lists are combined by means of rank fusion techniques. A third component of the system uses manual user evaluations of the ontologies in order to incorporate a human, collaborative assessment of their quality.
Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information
Search and Retrieval – information filtering, retrieval models,
selection process.
General Terms
Algorithms, Measurement, Human Factors.
Keywords
Ontology evaluation, ontology reuse, rank fusion, collaborative
filtering, WordNet.
1. INTRODUCTION
The Semantic Web is envisioned as a new flexible and structured Web that takes advantage of explicit semantic information, understandable by machines, and therefore classifiable and suitable for sharing and reuse in a more efficient, effective and satisfactory way. In this vision, ontologies are proposed as the backbone technology to supply the required explicit semantic information. Developing ontologies from scratch is a high-cost process that requires major engineering efforts, even for a medium-scale ontology. In order to properly face this problem, we believe efficient ontology reuse and evaluation techniques and methodologies are needed. The lack of appropriate support tools, and of automatic measurement techniques for certain ontology features, is often a barrier to the implementation of successful ontology reuse methods.
In this work, we present CORE, a Collaborative Ontology Reuse and Evaluation system. This tool provides automatic similarity measures for comparing a certain problem, or Golden Standard, to a set of available ontologies, retrieving not only those most similar to the domain described by the Golden Standard, but also those best rated by prior ontology users, according to the selected criteria. For similarity assessment, a user of CORE selects a subset from a list of comparison techniques that the tool provides. Based on this, the tool retrieves a ranked list of ontologies for each criterion. Finally, a unique ranking is defined by means of a global aggregated measure which combines the different selected criteria, using rank fusion techniques [1].
Once the system retrieves those ontologies closely related to the Golden Standard, it supports an additional step in the evaluation process by implementing a Collaborative Filtering approach [12][14][19]. Since some ontology features can only be assessed by humans, the last evaluation step takes into consideration the manual feedback provided by users of the ontologies to re-rank the list of ontologies, thus retrieving not only the ontologies that best fit the Golden Standard, but also the best qualified ones according to human evaluations.
The paper is organized as follows. In Section 2 we present the work related to our research; Section 3 describes the system architecture; Sections 4 and 5 present the automatic evaluation measures used by the system; and conclusions are given in Section 6.
2. RELATED WORK
Our research addresses problems in three different research areas, where we draw from prior related work. These are: ontology evaluation and reuse, which is the primary goal of our work; rank fusion, which is used to combine the ratings provided by different ontology evaluation criteria; and collaborative filtering, by which we get further evaluation measures for ontology features that are better assessed by human judgment.
2.1 Ontology Evaluation
Different methodologies for ontology evaluation have been proposed in the literature, considering the characteristics of the ontologies and the specific goals or tasks that the ontologies are intended for. An overview of ontology evaluation approaches is presented in [2], where four different categories are identified:
- Those that evaluate an ontology by comparing it to a Golden Standard, which may itself be an ontology [11] or some other kind of representation of the problem domain for which an appropriate ontology is needed.
- Those that evaluate the ontologies by plugging them into an application, and measuring the quality of the results that the application returns [16].
- Those that evaluate ontologies by comparing them to unstructured or informal data (e.g. text documents [3]) which represent the problem domain.
- Those based on human interaction to measure ontology features not recognizable by machines [10].
In each of the above approaches, a number of different evaluation levels might be considered to provide as much information as possible. Several levels can be identified in the literature:
- The lexical level [3][11][21], which measures quality by comparing the words (lexical entries) of the ontology with a set of words that represent the problem domain.
- The taxonomy level [11], which considers the hierarchical connection between concepts using the is-a relation.
- Other semantic relations besides hierarchical ones [6][8].
- The syntactic level [7], which considers the syntactic requirements of the formal language used to describe the ontology.
- The context or application level [4], which considers the context of the ontology, such as the ontologies that reference or are referenced by the one being evaluated, or the application it is intended for.
- The structure, architecture and design levels [10], which take into account the principles and criteria involved in the ontology construction itself.
Table 1 summarizes all these approaches [2].
Table 1. An overview of approaches to ontology evaluation

Level                                       | Golden Standard | Application-based | Data-driven | Assessment by humans
Lexical entries, vocabulary, concept, data  |        X        |         X         |      X      |          X
Hierarchy, taxonomy                         |        X        |         X         |      X      |          X
Other semantic relations                    |        X        |         X         |      X      |          X
Context, application                        |                 |         X         |             |          X
Syntactic                                   |        X        |                   |             |          X
Structure, architecture, design             |                 |                   |             |          X
In the present paper, two novel evaluation measures are proposed. The first one is based on a Golden Standard approach and the lexical level measure proposed by Maedche and Staab [11]. The second one is based on assessment by humans in a collaborative filtering approach.
2.2 Rank Fusion
Rank fusion has been a widely addressed research topic in the field of Information Retrieval [1][5][9]. Given a set of rankings which apply to a common universe of information objects, the task of rank aggregation consists of combining these rankings so as to optimize the performance of the combination. Examples where rank fusion takes place include metasearch [18], distributed search from heterogeneous sources, personalized retrieval, classification based on multiple evidence, etc.
Fusion techniques typically bring better recall, better precision, and more consistent performance than the individual systems being combined [1]. Fusion techniques can be characterized by:
- The input data they require: ranks, scores, or full information about the objects.
- Whether or not training data is used, which usually consists of manual relevance judgments on the information objects.
- The degree of overlap between the sets of rated objects, ranging from total overlap (a.k.a. data fusion) to completely disjoint sets (a.k.a. collection fusion), with arbitrary overlap in between.
- The application level, which can be a) external, if autonomous rating systems are integrated into a new meta layer, or b) internal, if the combination takes place at the heart of a retrieval system, where different subsystems collect evidence from several sources or different criteria.
In our work, rank fusion techniques are used to combine the individual ontology lists retrieved by each partial evaluation criterion into an aggregated ontology ranking. This can be understood as a metasearch problem where a) the input data are the evaluation ratings from different criteria, b) no training data is used (there are no prior manual ratings or reference judgments for comparison), c) the overlap is complete (all the evaluation criteria are applied on the same ontology repository), and d) the level of application is internal (the rating sources are components within the CORE system).
2.3 Collaborative Filtering
Collaborative filtering strategies make automatic predictions (filter) about the interests of a user by collecting taste information from many users (collaborating). This approach usually consists of two steps: 1) look for users that have a rating pattern similar to that of the active user (the user for whom the prediction is made), and 2) use the ratings of the users found in step 1 to compute the predictions for the active user. These predictions are specific to the user, unlike those given by simpler approaches that provide average scores for each item of interest, for example based on its number of votes.
Collaborative filtering is a widely explored field. Three main aspects typically distinguish the different techniques reported in the literature [14]: user profile representation and management, the filtering method, and the matching method.
User profile representation and management can be divided into five different tasks:
- Profile representation. Accurate profiles are vital for the content-based component (to ensure recommendations are appropriate) and the collaborative component (to ensure that users with similar profiles are in fact similar). The type of profile chosen in this work is the user-item ratings matrix (ontology evaluations based on specific criteria).
- Initial profile generation. The user is not usually willing to spend too much time defining her/his interests in order to create a personal profile. Moreover, user interests may change dynamically over time. The type of initial profile generation chosen in this work is a manual selection of values for only five specific evaluation criteria.
- Profile learning. User profiles can be learned or updated using different sources of information that are potentially representative of user interests. In our work, profile learning techniques are not used.
- The source of user input and feedback from which to infer user interests. Information used to update user profiles can be obtained in two different ways: using information explicitly provided by the user, and using information implicitly observed in the user's interaction. Our system uses no feedback to update the user profiles.
- Profile adaptation. Techniques are needed to adapt the user profile to new interests and to forget old ones as user interests evolve with time. In our approach, profile adaptation is done manually (manual update of ontology evaluations).
Filtering method. Products or actions are recommended to a user taking into account the available information (items and profiles). There are three main information filtering approaches for making recommendations:
- Demographic filtering: descriptions of people (e.g. age, gender) are used to learn the relationship between a single item and the type of people who like it.
- Content-based filtering: the user is recommended items based on the descriptions of items previously evaluated by other users. Content-based filtering is the approach chosen in our work (the system recommends ontologies using previous evaluations of those ontologies).
- Collaborative filtering: people with similar interests are matched and then recommendations are made.
Matching method. This defines how user interests and items are compared. Two main approaches can be identified:
- User profile matching: people with similar interests are matched before making recommendations.
- User profile-item matching: a direct comparison is made between the user profile and the items. The degree of appropriateness of the ontologies is computed by taking into account previous evaluations of those ontologies.
In CORE, a new ontology evaluation measure based on collaborative filtering is proposed, considering the user's interests and previous assessments of the ontologies.
3. SYSTEM ARCHITECTURE
In this section we describe the architecture of CORE, our Collaborative Ontology Reuse and Evaluation environment. Figure 1 shows an overview of the system. We distinguish three different modules. The first one, the left module, receives the Golden Standard definition as a set of initial terms and allows the user to modify and extend it using WordNet [13]. The second one, represented in the center of the figure, allows the user to select a set of ontology evaluation techniques provided by the system to recover the ontologies closest to the given Golden Standard. The third one, the right module, is a collaborative module that re-ranks the list of recovered ontologies, taking into consideration previous feedback and evaluations of the users.
Figure 1. CORE architecture
3.1 Golden Standard Definition Phase
The Golden Standard Definition module receives an initial set of terms. These terms are supposed to be obtained by an external Natural Language Processing (NLP) module from a set of documents related to the specific domain in which the user is interested. This NLP module would receive the repository of documents and return a list of pairs (lexical entry, part of speech) that roughly represents the domain of the problem. This phase is part of future work; in our experiments, the list of initial (root) terms has been manually assigned.
The module allows the user to expand the root terms using WordNet [13] and some of the relations it provides: hypernym, hyponym and synonym. The new terms added to the Golden Standard using these relations may themselves be extended again and added to the problem definition.
The final representation of the Golden Standard can be defined as a set of terms T(LG, POS, LGP, R, Z) where:
- LG is the set of lexical entries defined for the Golden Standard.
- POS corresponds to the different parts of speech considered by WordNet: noun, adjective, verb and adverb.
- LGP is the set of lexical entries of the Golden Standard that have been extended.
- R is the set of relations between terms of the Golden Standard: synonym, hypernym, hyponym and root (if a term has not been obtained by expansion, but is one of the initial terms).
- Z is an integer number that represents the depth, or distance, of a term from the root term from which it has been derived.
Example:
T1 ("pizza", noun, "", ROOT, 0). T1 is one of the root terms of the Golden Standard. The lexical entry it represents is "pizza", its part of speech is "noun", it has not been expanded from any other term so its lexical parent is the empty string, its relation is ROOT and its depth is 0.
T2 ("pizza pie", noun, "pizza", Synonym, 1). T2 is a term expanded from T1. The lexical entry it represents is "pizza pie", its part of speech is "noun", the lexical entry of its parent is "pizza", it has been expanded via the synonym relation, and the number of relations separating it from the root term T1 is 1.
The left part of Figure 2 shows the interface of the Golden Standard Definition phase. At the top level we can see the list of root terms. The user is allowed to manually insert new root terms, giving their lexical entries and selecting their parts of speech. The correctness of these insertions is controlled by verifying that all the considered lexical entries belong to the WordNet [13] repository. At the bottom level we can see the final Golden Standard definition: the final list of (root and expanded) terms that represent the domain of the problem. The intermediate level shows how the user can perform a term expansion. The user selects one of the previous terms from the Golden Standard definition and the system shows him all its meanings contained in WordNet [13]. After he chooses one, the system automatically presents him three different lists with the synonyms, hyponyms and hypernyms of the term. The user can choose one or more elements of these lists, and they will automatically be added to the expanded terms list. For each expansion the depth of the new term is increased by one unit. This will be used later to measure the importance of the term within the Golden Standard: the greater the depth of a derived term with respect to its root term, the lower its relevance will be.
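As an illustration of the expansion step, the sketch below uses NLTK's WordNet interface (our choice for the example; the paper does not name a specific WordNet API) to collect the synonym, hypernym and hyponym candidates presented to the user for a chosen sense:

```python
from nltk.corpus import wordnet as wn  # requires nltk and its WordNet corpus

def expansion_candidates(lexical_entry, pos="n", sense_index=0):
    """Hypothetical helper: expansion candidates for one chosen WordNet sense."""
    senses = wn.synsets(lexical_entry, pos=pos)
    if not senses:
        return {"synonyms": [], "hypernyms": [], "hyponyms": []}
    synset = senses[sense_index]  # the meaning the user picked

    def names(synsets):
        return [lemma.name() for s in synsets for lemma in s.lemmas()]

    return {
        "synonyms": [l.name() for l in synset.lemmas() if l.name() != lexical_entry],
        "hypernyms": names(synset.hypernyms()),
        "hyponyms": names(synset.hyponyms()),
    }

# e.g. expansion_candidates("pizza") -> {'synonyms': ['pizza_pie'], ...}
```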
3.2 System Recommendation Phase
In this phase the system retrieves the ontologies that best conceptualize the Golden Standard domain. The middle module of Figure 1 represents the structure of the recommendation phase of the system. First, the user selects a set of evaluation criteria to be applied. Considering the selected criteria, and taking into account the Golden Standard and the ontologies of the repository, the system retrieves a ranked list of ontologies (ordered by their similarity to the Golden Standard) for each criterion. Then all these lists are merged using rank fusion techniques [1] to obtain a global measure.
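The paper does not specify which fusion model from [1] is applied; as one plausible instance, the sketch below implements a weighted Borda count, in which each criterion awards points by rank position, scaled by the user-assigned relevance (1 to 5):

```python
def weighted_borda_fusion(rankings, weights):
    """rankings: one ranked list of ontology ids per criterion (best first);
    weights: the user-assigned relevance (1-5) of each criterion."""
    scores = {}
    for ranking, weight in zip(rankings, weights):
        n = len(ranking)
        for position, onto in enumerate(ranking):
            # Borda points: n for the top ontology, down to 1 for the last
            scores[onto] = scores.get(onto, 0) + weight * (n - position)
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["onto_a", "onto_b", "onto_c"]
taxonomic = ["onto_b", "onto_a", "onto_c"]
print(weighted_borda_fusion([lexical, taxonomic], weights=[5, 3]))
```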
The middle part of Figure 2 shows the user interface of the System Recommendation module. At the upper level we distinguish the criteria selection phase. Currently, two content evaluation criteria can be selected to retrieve the most similar ontologies: 1) the lexical criterion, which measures the similarity between the lexical entries of the Golden Standard and the lexical entries of the ontologies, and 2) the taxonomic criterion, which evaluates the hierarchical structure shared between them. The user can also select the relevance of each criterion in the rank aggregation process, using a range of discrete values [1, 2, 3, 4, 5], where 1 symbolizes the lowest relevance and 5 the highest. Moreover, different kinds of lexical and taxonomic similarity measures have been implemented and tested using this tool, and may be selected in this phase; these measures are explained in Section 4. The intermediate level of Figure 2 shows a different ranked list for each criterion and the final fused list. In each of these tables, two different ratings are displayed for each ontology. The first one refers to the similarity between the ontology and the Golden Standard. The second rating, the score, shows the similarity value normalized by the sum of all the values. The score exhibits the distribution of the ratings and allows us to better evaluate the different techniques.
Figure 2. CORE user interface
Once the final ranked list has been retrieved, the system allows the user to select a subset of ontologies that he considers adequate for the Collaborative Evaluation phase.
3.3 Collaborative Evaluation Phase
This module has been designed to confront the challenge of evaluating those ontology features that are, by their nature, more difficult for machines to address. Where human judgment is required, the system attempts to take advantage of Collaborative Filtering techniques [12][14][19]. Some approaches to ontology development concerning collaboration techniques have been presented in the literature [20]. However, to our knowledge, Collaborative Filtering strategies have not yet been used in the context of ontology reuse.
The collaborative module ranks and presents the best ontologies for the user, taking into consideration previous manual evaluations.
Several issues have to be considered in a collaborative system. The first one is the representation of user profiles. The type of user profile selected for our system is a user-item rating matrix (ontology evaluations based on specific criteria). The initial profile is designed as a manual selection of five predefined criteria [15]:
- Correctness: specifies whether the information stored in the ontology is true, independently of the domain of interest.
- Readability: indicates the non-ambiguous interpretation of the meaning of the concept names.
- Flexibility: points out the adaptability or capability of the ontology to change.
- Level of formality: highly informal, semi-informal, semi-formal, or rigorously formal.
- Type of model: upper-level (for ontologies describing general, domain-independent concepts), core ontologies (for ontologies describing the most important concepts of a specific domain), domain ontologies (for ontologies describing some domain of the world), task ontologies (for ontologies describing generic types of tasks or activities) and application ontologies (for ontologies describing some domain in an application-dependent manner).
The above criteria can be divided into two different groups: 1) the discrete criteria (correctness, readability and flexibility), which are represented by discrete numeric values [0, 1, 2, 3, 4, 5], where 0 indicates that the ontology does not fulfill the criterion and 5 indicates that the ontology completely satisfies it, and 2) the boolean criteria (level of formality and type of model), which are represented by a specific value that is either satisfied by an ontology or not.
The collaborative system does not implement any profile learning technique or relevance feedback to update user profiles, but the profiles may be modified manually.
After the user profile has been defined, it is important to select an appropriate type of filtering. For this work, a content-based filtering technique has been chosen; that is, ontologies (our content items) are recommended based on previous users' evaluations.
Finally, a type of matching must also be chosen for the recommendation process. In this work, a novel user profile-item matching technique is proposed. To evaluate the relevance of the ontologies, the technique compares the user's interests with the ontology evaluations stored in the system. This is explained in Section 5.
The right portion of Figure 2 shows the Collaborative Evaluation module. At the top level, the user's interests can be selected as a subset of criteria with associated values representing thresholds that the manual evaluations of the ontologies should fulfil. For example, when a user sets a value of 3 for the correctness criterion, the system recognizes that he is looking for ontologies whose correctness value is greater than or equal to 3. Once the user's interests have been defined, the set of manual evaluations stored in the system is used to compute which ontologies fit those interests best. The intermediate level shows the final ranked list of ontologies returned by the Collaborative Filtering module. To add new evaluations to the system, the user must select an ontology from the list and choose one of the predetermined values for each of the five aforementioned criteria. The system also allows the user to add comments to the ontology evaluation in order to provide more feedback.
One more action has to be performed in order to visualize the evaluation results of a specific ontology. Figure 3 shows the user evaluations module. On the left side we can see the summary that the system provides of the existing ontology evaluations with respect to the user's interests. In Figure 3, 3 of 6 evaluations of the ontology have fulfilled the correctness criterion, 5 of 6 evaluations have fulfilled the readability criterion, and so on. On the right side, we can see how the system enables the user to inspect all the evaluations stored in the system about a specific ontology. This may be of interest, since we may trust some users more than others during the Collaborative Filtering process.
4. CONTENT ONTOLOGY EVALUATION
In order to obtain similarities between the Golden Standard and the stored ontologies, two different content ontology evaluation levels have been considered: the lexical and the taxonomic. Several measures have been developed and tested for each level. In the following sections we present the approaches that have shown the best performance.

Figure 3. CORE user's evaluations
4.1 Lexical Evaluation Measures
The lexical evaluation assesses the similarity between the domain of the problem, as described by the Golden Standard, and an ontology by comparing the lexical entries, or words, that represent them. A new lexical evaluation measure based on the work of Maedche and Staab [11] is explained in this section. Some definitions must first be introduced.
Definition 1 (Lexical entry). A lexical entry $l$ represents a string or word.

Definition 2 (Golden Standard Lexicon). The Golden Standard Lexicon $L^G$ is defined as the set of lexical entries extracted from the terms of the Golden Standard, where each term has a single lexical entry that represents it.

Definition 3 (Ontology Lexicon). The Ontology Lexicon $L^O$ is defined as the set of lexical entries extracted from the concepts of the ontology. Each concept is represented by one or more lexical entries that are extracted from the concept name, the rdfs:label property value, or other properties that could be added to the lexical extraction process considering the characterization of each ontology.
Definition 4 (Levenshtein distance). The Levenshtein distance $ed(l_i, l_j)$ between two lexical entries $l_i$ and $l_j$ measures the minimum number of token insertions, deletions and substitutions needed to transform $l_i$ into $l_j$, computed with a dynamic programming algorithm.

Example: $ed(\text{"pizzapie"}, \text{"pizza\_pie"}) = 1$
Maedche and Staab [11] propose a lexical similarity measure for strings called String Matching (SM). This method compares two lexical entries, taking into account the Levenshtein distance relative to the length of the shortest lexical entry:

$$\mathrm{SM}(l_i, l_j) = \max\!\left(0,\ \frac{\min(|l_i|, |l_j|) - ed(l_i, l_j)}{\min(|l_i|, |l_j|)}\right) \in [0, 1]$$

SM returns a degree of similarity between 0 and 1, where 0 is a null match and 1 represents a perfect match.

Example: SM("pizzapie", "pizza_pie") = 7/8.
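The measure translates directly into code; the sketch below is a minimal version of ours (the Levenshtein routine is the textbook dynamic-programming algorithm, not necessarily CORE's implementation):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def string_matching(li: str, lj: str) -> float:
    """SM(li, lj) in [0, 1]: 1 is a perfect match, 0 a null match."""
    shortest = min(len(li), len(lj))
    return max(0.0, (shortest - levenshtein(li, lj)) / shortest)

print(string_matching("pizzapie", "pizza_pie"))  # 7/8 = 0.875
```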
Based on String Matching, they propose a lexical similarity measure to compare an ontology to a Golden Standard, computed as the average string matching between the set of Golden Standard lexical entries and the set of ontology lexical entries:

$$\mathrm{SM}(L^G, L^O) = \frac{1}{|L^G|} \sum_{l_i \in L^G} \max_{l_j \in L^O} \mathrm{SM}(l_i, l_j)$$

$\mathrm{SM}(L^G, L^O)$ is an asymmetric measure that determines the extent to which the lexical level of the Golden Standard is covered by the lexical level of the ontology. Future work must be done in order to penalize those ontologies which contain all the strings of the Golden Standard but also many others.
There is one principal difference between that approach and ours: Maedche defines the Golden Standard as an ontology, while we use our own model. This provides us with the capability to use all the additional information stored in the Golden Standard in order to improve the content evaluation measures.
When a domain is modeled as a set of lexical entries, some lexical entries have greater relevance than others in defining the semantics. Under this assumption, we have decided to differentiate the importance of the Golden Standard terms. The root terms are considered the most representative ones, while the relevance of the expanded terms depends on the number of relations that separate them from a root term. With this modification we emphasize the main semantics and relegate the complementary ones to the background. In this work we define the Golden Standard lexical weight measure to evaluate the importance of each term.
Definition 5 (Golden Standard lexical weight). Given a list of lexical entries $L = \{l_i\}$ expanded from a common root lexical entry, we define the weight of $l \in L$ as:

$$w(l) = \begin{cases} 1 + \dfrac{\max_i \mathrm{Depth}(l_i) - \mathrm{Depth}(l)}{\max_i \mathrm{Depth}(l_i)} \in [1, 2] & \text{if } |L| > 1 \\[1.5ex] 2 & \text{otherwise} \end{cases}$$

The value returned represents a degree of relevance between 1 (the farthest distance to the root lexical entry) and 2 (no distance to the root lexical entry). If the root lexical entry has not been expanded, we assign it the weight value 2.
Figure 4 shows an example of this measure, where T1 is the root term and consequently has the greatest weight. T3 is the most remote term and has the smallest weight. Intermediate terms like T2 have a weight between the maximum and the minimum, relative to their distance from the root term.
Figure 4. Golden Standard lexical weight measure
In our approach, we have modified the previous lexical measure to take into account the weight, or relevance, of each term in representing the semantics of the domain:

$$\mathrm{SM}(L^G, L^O) = \frac{1}{|L^G|} \sum_{l_i \in L^G} \max_{l_j \in L^O} \mathrm{SM}(l_i, l_j) \cdot w(l_i)$$
In our experiments, this new measure has been shown to discriminate better between ontologies, giving a higher similarity value to the ontologies that are closer to the Golden Standard and a lower rating to the ontologies that fit the problem domain worst. Future work is needed to give more or less relevance to the derived terms of the Golden Standard using not only their distances to the root terms but also the kind of relation through which they have been derived: synonym, hypernym or hyponym.
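A sketch of ours combining the weight and the modified measure (helper names are our own; string_matching() is the function from the sketch above, and for simplicity all terms are assumed to descend from a single root, whereas CORE computes the maximum depth per root family):

```python
def term_weight(depth: int, max_depth: int) -> float:
    """w(l) in [1, 2]: 2 for the root term, approaching 1 for the deepest terms."""
    if max_depth == 0:  # the root entry was never expanded
        return 2.0
    return 1.0 + (max_depth - depth) / max_depth

def weighted_lexical_similarity(gs_terms, ontology_lexicon):
    """SM(LG, LO) with each Golden Standard entry weighted by its depth.
    gs_terms: list of (lexical_entry, depth) pairs."""
    max_depth = max(depth for _, depth in gs_terms)
    total = sum(
        max(string_matching(li, lj) for lj in ontology_lexicon)
        * term_weight(depth, max_depth)
        for li, depth in gs_terms)
    return total / len(gs_terms)

gs = [("pizza", 0), ("pizza pie", 1), ("dish", 2)]
print(weighted_lexical_similarity(gs, ["PizzaPie", "Topping", "Dish"]))
```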
4.2 Taxonomic Evaluation Measures
The taxonomic evaluation assesses the degree of overlap between the hierarchical structure of the ontology, defined by the is-a relation, and the Golden Standard structure, defined by the derivations of terms that complete the domain representation. The following notation and definitions will be used to define our measure:
- $T_i^G \in T^G$ represents a Golden Standard term.
- $C_i^O \in C^O$ represents an ontology concept.
Definition 6 (Semantic Cotopy of a Golden Standard term). The Semantic Cotopy $SC(T_i^G)$ of a Golden Standard term is defined as the set of lexical entries of the terms derived from the same root term as $T_i^G$, including the lexical entries of $T_i^G$ itself.

Definition 7 (Semantic Cotopy of an Ontology Concept). The Semantic Cotopy $SC(C_i^O)$ of an ontology concept is defined as the set of lexical entries of the concepts related to $C_i^O$ in the ontology by a direct relation, including the lexical entries of $C_i^O$ itself.
Starting from the measure of Maedche and Staab [11], an adaptation is performed to obtain a similarity between an ontology concept and a Golden Standard term relative to the new Golden Standard definition. The similarity is computed as the intersection between the Semantic Cotopy of the Golden Standard term and the Semantic Cotopy of the ontology concept, normalized by the total possible overlap:

$$\mathrm{TS}(T_i^G, C_i^O) = \frac{|SC(T_i^G) \cap SC(C_i^O)|}{|SC(T_i^G) \cup SC(C_i^O)|}$$

For two lexical entries to be considered a match, their similarity must be greater than a threshold, empirically estimated as 0.2. For similarities below this value we have observed no significant morphological resemblance between terms.
The taxonomic similarity measure considers all the overlaps between the ontology and the Golden Standard. In order to optimize the evaluation, only a subset of terms and concepts is used to assess the taxonomic similarity. This subset is obtained through the lexical measurement, by selecting only the terms and concepts that have matched with a similarity value greater than 0.2:

$$\mathrm{TS}(T^G, C^O) = \frac{1}{|T^G|} \sum_{\substack{T_i^G \in T^G,\ C_j^O \in C^O \\ \exists\, l_i \in T_i^G,\ \exists\, l_j \in C_j^O:\ \mathrm{SM}(l_i, l_j) > 0.2}} \mathrm{TS}(T_i^G, C_j^O)$$
5. COLLABORATIVE FILTERING FOR ONTOLOGY REUSE
In this section, a new automatic evaluation measure that exploits the advantages of Collaborative Filtering is proposed. It matches the set of ontologies, or items, that best fulfill the user's interests by exploring the set of manual evaluations stored in the system. As explained in Section 3, user evaluations are represented as a set of five defined criteria and their respective values, manually determined by the user who makes the evaluation. User interests, on the other hand, are expressed as a subset of those criteria and their respective values, meaning a threshold or restriction to be satisfied by the user evaluations.
This measure involves two main steps. The first one describes how the degree of similarity between a user evaluation criterion and a user interest threshold for the same criterion is assessed. The second one describes how the final rankings of the ontologies are calculated.
5.1 Collaborative Evaluation Measures
As explained in Section 3.3, the user evaluation of a specific ontology is made considering five different criteria. These five criteria are divided into two groups: 1) the discrete criteria (correctness, readability and flexibility), which take discrete numeric values [0, 1, 2, 3, 4, 5], where 0 means that the ontology does not fulfill the criterion and 5 means that the ontology completely satisfies it, and 2) the boolean criteria (level of formality and type of model), which are represented by specific values that either are or are not satisfied by the ontology. User interests are defined as a subset of those criteria and their respective values, representing a set of thresholds that the ontologies should fulfill. The user's interests are compared against the respective values of those criteria in the user evaluations, or user profiles.
For the boolean case, a value of 0 is returned if the value of criterion $n$ in evaluation $m$ does not fulfill the user's requirement for that criterion, and 2 otherwise:

$$\mathrm{similarity}_{bool}(\mathrm{criterion}_{mn}) = \begin{cases} 0 & \text{if } \mathrm{evaluation}_{mn} \neq \mathrm{threshold}_{n} \\ 2 & \text{if } \mathrm{evaluation}_{mn} = \mathrm{threshold}_{n} \end{cases}$$
For the discrete case, the measure includes two aspects: a similarity assessment and a penalty assessment. The similarity assessment is based on the distance between the value of criterion $n$ in evaluation $m$ and the threshold specified in the user's interests for that criterion: the more the value of the criterion exceeds the threshold, the greater the similarity value. The penalty assessment considers how difficult it is to surpass the threshold: the more difficult the threshold is to surpass, the lower the penalty value.

$$\mathrm{similarity}_{num}(\mathrm{criterion}_{mn}) = \mathrm{similarity}(\mathrm{criterion}_{mn}) \cdot \left(1 + \mathrm{penalty}(\mathrm{threshold}_{n})\right)$$
This measure also returns values between 0 and 2. The decision to return a similarity value between 0 and 2 is taken from other collaborative matching measures [19], in order to avoid managing negative numbers. In the case of this collaborative measure, negative similarity values would otherwise be returned when the value of the criterion in a user evaluation does not surpass the threshold required by the user's interests.
5.2 Collaborative Evaluation Ranking
The user's interests and the user profiles, i.e. the evaluations of the ontologies stored in the system, are used to produce the final ranking of the ontologies. The similarity between an ontology evaluation and the user's requirements is measured as the average of its $N$ criterion similarities:

$$\mathrm{similarity}(\mathrm{evaluation}_m) = \frac{1}{N} \sum_{n=1}^{N} \mathrm{similarity}(\mathrm{criterion}_{mn})$$
The similarity of a specific ontology to the user's requirements is measured as the average of the $M$ evaluation similarities for that specific ontology:

$$\mathrm{similarity}(\mathrm{ontology}) = \frac{1}{M} \sum_{m=1}^{M} \mathrm{similarity}(\mathrm{evaluation}_m) = \frac{1}{MN} \sum_{m=1}^{M} \sum_{n=1}^{N} \mathrm{similarity}(\mathrm{criterion}_{mn})$$
In case of ties, the final collaborative ranking sorts the ontologies taking into account not only the average similarity between the ontologies and the evaluations stored in the system, but also the total number of evaluations of those ontologies, giving more relevance to the ontologies that have been rated more times:

$$\mathrm{similarity}(\mathrm{ontology}) \cdot \frac{M}{M_{total}}$$

where $M$ is the number of evaluations of the given ontology and $M_{total}$ is the total number of evaluations stored in the system.
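The aggregation and tie-breaking steps can be sketched as follows (names are ours; per-criterion similarities are assumed to be computed as in Section 5.1):

```python
def ontology_similarity(evaluations):
    """evaluations: one list of criterion similarities per stored evaluation."""
    per_eval = [sum(criteria) / len(criteria) for criteria in evaluations]
    return sum(per_eval) / len(per_eval)

def rank_ontologies(ontology_evaluations, m_total):
    """ontology_evaluations: {ontology_id: list of criterion-similarity lists};
    m_total: total number of evaluations stored in the system.
    Ties on average similarity are broken by how often an ontology was rated."""
    def key(onto):
        evals = ontology_evaluations[onto]
        return (ontology_similarity(evals), len(evals) / m_total)
    return sorted(ontology_evaluations, key=key, reverse=True)
```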
6. CONCLUSIONS AND FUTURE WORK
In this work a new tool for ontology evaluation and reuse has been presented, including several notable features: a new Golden Standard model, new lexical evaluation criteria, the use of rank fusion techniques to combine different content ontology evaluation measures, and a novel Collaborative Filtering strategy that takes advantage of users' opinions in order to automatically evaluate features that can only be assessed by humans.
Some initial experiments, not described in this paper, have been conducted using a set of ontologies from the Protégé OWL repository [17], obtaining positive results, but more detailed and rigorous experimentation must be done in order to reach firm conclusions.
7. ACKNOWLEDGMENTS
Thanks to Enrico Motta, the "great director" of the SSSW2005, and to Aldo Gangemi and Elke Michlmayr, also members of this fantastic school, for our 2 a.m. conversations about ontology evaluation. Thanks to Alexander Gomperts for his help reviewing the paper, and special thanks to Denny Vrandecic for encouraging us to do this work. This research was supported by the Spanish Ministry of Science and Education (TIN2005-0685).
8. REFERENCES
[1] Aslam, J. A., Montague, M. Models for metasearch. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2001), New Orleans, Louisiana, 2001, pp. 276-284.
[2] Brank, J., Grobelnik, M., Mladenić, D. A survey of ontology evaluation techniques. In Proceedings of SIKDD 2005 at the multiconference IS 2005, Ljubljana, Slovenia, 17 October 2005.
[3] Brewster, C. et al. Data driven ontology evaluation. In Proceedings of the International Conference on Language Resources and Evaluation, Lisbon, 2004.
[4] Ding, L. et al. Swoogle: A search and metadata engine for the semantic web. In Proceedings of CIKM 2004, pp. 652-659.
[5] Fox, E. A., Koushik, M. P., Shaw, J., Modlin, R., Rao, D. Combining evidence from multiple searches. In Proceedings of the 1st Text REtrieval Conference (TREC-1), Gaithersburg, Maryland, March 1992, pp. 319-328.
[6] Gangemi, A., Catenacci, C., Ciaramita, M., Lehmann, J. A theoretical framework for ontology evaluation and validation. In Proceedings of SWAP 2005.
[7] Gómez-Pérez, A. Some ideas and examples to evaluate ontologies. In Proceedings of the 11th Conference on Artificial Intelligence, Los Angeles, CA, February 1995, pp. 299-305.
[8] Guarino, N., Welty, C. Evaluating ontological decisions with OntoClean. Communications of the ACM, 45(2):61-65. New York: ACM Press, 2002.
[9] Lee, J. H. Analyses of multiple evidence combination. In Proceedings of the 20th ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 1997), New York, 1997, pp. 267-276.
[10] Lozano-Tello, A., Gómez-Pérez, A. Ontometric: A method to choose the appropriate ontology. Journal of Database Management, 15(2):1-18, 2004.
[11] Maedche, A., Staab, S. Measuring similarity between ontologies. In Proceedings of EKAW 2002, LNAI vol. 2473, pp. 251-263.
[12] Masthoff, J. Group modeling: Selecting a sequence of television items to suit a group of viewers. User Modeling and User-Adapted Interaction, 14:37-85, 2004.
[13] Miller, G. WordNet: A lexical database. Communications of the ACM, 38(11):39-41, 1995.
[14] Montaner, M., López, B., De la Rosa, J. L. A taxonomy of recommender agents on the Internet. Artificial Intelligence Review, 19:285-330, 2003.
[15] Paslaru, E. Using context information to improve ontology reuse. Doctoral Workshop at the 17th Conference on Advanced Information Systems Engineering (CAiSE'05), 2005.
[16] Porzel, R., Malaka, R. A task-based approach for ontology evaluation. In Proceedings of the ECAI 2004 Workshop on Ontology Learning and Population.
[17] Protégé OWL Ontology Repository. protege.stanford.edu/plugins/owl/owl-library/index.html
[18] Renda, M. E., Straccia, U. Web metasearch: rank vs. score based rank aggregation methods. In Proceedings of the ACM Symposium on Applied Computing, Melbourne, Florida, 2003, pp. 841-846.
[19] Resnick, P. et al. GroupLens: An open architecture for collaborative filtering of netnews. Internal Research Report, MIT Center for Coordination Science, March 1994.
[20] Sure, Y., Erdmann, M., Angele, J., Staab, S., Studer, R., Wenke, D. OntoEdit: Collaborative ontology development for the Semantic Web. In Proceedings of ISWC 2002.
[21] Velardi, P. et al. Evaluation of OntoLearn, a methodology for automatic learning of domain ontologies. In Ontology Learning from Text: Methods, Evaluation and Applications, IOS Press, 2005.
The evaluation of ontologies is vital for the growth of the Semantic Web. We consider a number of problems in evaluating a knowledge artifact like an ontology. We propose in this paper that one approach to ontology evaluation should be corpus or data driven. A corpus is the most accessible form of knowledge and its use allows a measure to be derived of the ‘fit’ between an ontology and a domain of knowledge. We consider a number of methods for measuring this ‘fit’ and propose a measure to evaluate structural fit, and a probabilistic approach to identifying the best ontology.