
Learning Disjointness

June 2007

4519:175-189

DOI:10.1007/978-3-540-72667-8_14


Conference: The Semantic Web: Research and Applications, 4th European Semantic Web Conference, ESWC 2007, Innsbruck, Austria, June 3-7, 2007, Proceedings

Authors:

Johanna Völker

Universität Mannheim

Denny Vrandečić

Google Inc.

York Sure-Vetter

Karlsruhe Institute of Technology

An increasing number of applications benefits from light-weight ontologies, or to put it differently “a little semantics goes a long way”. However, our experience indicates that more expressiveness can offer significant advantages. Introducing disjointness axioms, for instance, greatly facilitates consistency checking and the automatic evaluation of ontologies. In an extensive user study we discovered that proper modeling of disjointness is a difficult and very time-consuming task. We therefore developed an approach to automatically enrich learned or manually engineered ontologies with disjointness axioms. This approach relies on several methods for obtaining syntactic and semantic evidence from different sources which we believe to provide a solid base for learning disjointness. After thoroughly evaluating the implementation of our approach we think that in future ontology engineering environments the automatic discovery of disjointness axioms may help to increase the richness, quality and usefulness of any given ontology.


Learning Disjointness

Johanna Völker¹, Denny Vrandečić¹, York Sure¹ and Andreas Hotho²

¹ Institute AIFB, University of Karlsruhe, Germany

² University of Kassel, Germany

{voelker,vrandecic,sure}@aifb.uni-karlsruhe.de,

hotho@cs.uni-kassel.de


1 Introduction

An increasing number of applications benefit from light-weight ontologies, or, to put it differently, “a little semantics goes a long way” (Jim Hendler). Our experience in building ontology-based systems indicates, however, that adding more expressivity in a controlled manner can reap further benefits. Introducing disjointness axioms, for example, greatly facilitates consistency checking and the automatic evaluation of individuals in a knowledge base with regard to a given ontology.

In description logics two classes are considered disjoint iff their taxonomic overlap, i.e. the set of common individuals, must be empty. This does not include classes whose actual extensions merely happen to have no common individuals, for instance Woman and US President, but only those whose common subset must be empty in all possible worlds, for example Woman and Car.

Disjointness allows for far more expressive and meaningful ontologies, as the following example shows. An ontology language with the expressivity of RDFS does not constrain the possible assertions in any way. Even after we set up an ontology defining terms like Book, Student and University, stating that John is both a Student and a University is logically perfectly viable and would not be recognized as an error by a reasoner. Only if we define these classes as disjoint will the reasoner be able to detect the error in the above ontology, guaranteeing that particular constraints are met by the knowledge base and that a certain quality of facts is achieved, thus raising the quality of the whole ontology-based system [17].
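The Student/University example can be illustrated with a minimal sketch. It is not an OWL reasoner: it ignores subclass inference and only checks asserted class memberships against declared disjointness axioms. The function name and data layout are assumptions made for this illustration.

```python
from itertools import combinations


def find_disjointness_violations(assertions, disjoint_pairs):
    """Return (individual, class_a, class_b) triples that violate an axiom.

    assertions: dict mapping an individual to the set of classes it is
        asserted to belong to (hypothetical layout).
    disjoint_pairs: iterable of 2-tuples of class names declared disjoint.
    """
    disjoint = {frozenset(pair) for pair in disjoint_pairs}
    violations = []
    for individual, classes in assertions.items():
        # Check every unordered pair of classes asserted for the individual.
        for a, b in combinations(sorted(classes), 2):
            if frozenset((a, b)) in disjoint:
                violations.append((individual, a, b))
    return violations


assertions = {"John": {"Student", "University"}}

# Without disjointness axioms nothing can be flagged.
no_axioms = find_disjointness_violations(assertions, [])

# Once Student and University are declared disjoint, the error is detectable.
with_axioms = find_disjointness_violations(
    assertions, [("Student", "University")]
)
```

With no axioms the result is empty; with the single axiom the assertion about John is reported.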

Despite the obvious importance of stating disjointness among classes, many of today’s ontologies do not contain any disjointness axioms. In fact, a survey of 1,275 ontologies [22] recently found only 97 of them to include disjointness axioms. We can only speculate about the reasons, but it is very likely that ontology engineers often forget to introduce disjointness axioms simply because they are not aware that classes which are not explicitly declared to be disjoint will be considered as potentially overlapping. In particular, inexperienced users usually assume the semantics of partitions, or even complete partitions, when they build a subsumption hierarchy (see [15]). Also, since the size of an ontology is a major cost driver [2], manually engineering and adding the axioms costs additional time, and thus money.

Therefore, we believe that an approach to automatically introduce disjointness axioms into an ontology would be a valuable addition to any ontology learning or engineering framework. The feasibility in principle of learning disjointness based on simple lexical evidence has already been shown by [9]. However, our experiments indicate that a single heuristic is not suitable for detecting disjointness with sufficiently high precision, i.e. better than an average human could do.

For this paper, we performed an extensive survey in order to collect experience with modeling disjoint classes, and identified several problems frequently encountered by users who try to introduce disjointness axioms. Based on the results of our survey we developed a variety of methods to automatically extract lexical and logical features which we believe to provide a solid basis for learning disjointness. These methods take into account the structure of the ontology, associated textual resources, and other types of data sources in order to compute the likelihood of two classes being disjoint. The features obtained from these methods are used to build an overall classification model which we evaluated against more than 10,000 disjointness axioms provided by 30 human annotators. Given the encouraging evaluation results we are confident that our implementation can be used, for example, to extend state-of-the-art ontology learning systems, to support ontology debugging [17], or to evaluate manually added disjointness axioms.

The survey also showed that deciding whether two classes are disjoint is far from trivial. Although experts have a higher agreement on disjointness than non-expert users, their agreement is still lower than we expected. Discussing these problematic formalizations, we uncovered a number of problems humans have with formal disjointness.

In Section 2 we first present the features we have used in order to automatically learn disjointness axioms. Section 3 describes the setup and execution of the experiments we conducted in order to train a classifier and evaluate the results of our implementation (Section 4). We close with an overview of related work in Section 5 and a summary of the key contributions and remaining open questions in Section 6.

2 Features for Learning Disjointness

Assuming that there is no single best approach to determining the disjointness of two classes in an ontology, we developed a variety of methods to obtain evidence for or against disjointness from different sources. The features delivered by these methods are used to train a classifier which is able to distinguish between disjoint and non-disjoint classes.

Preliminaries: In this paper we adopt the OWL ontology model, although we do not restrict our approach to OWL. Any ontology model that allows stating disjointness between two classes can be used with all the methods described in this paper.

The methods are provided with an unsorted list of all the pairs previously tagged by human annotators. In the following the set of pairs will be denoted by $P = \{p_1, \ldots, p_n\}$ for $0 \le n \le |C|^2$, where $C$ is the set of all classes in the ontology. Each pair $p_k = (c_{k1}, c_{k2})$ consists of two classes $c_{k1}, c_{k2} \in C$ with $c_{k1} \ne c_{k2}$. The confidence of the system in $c_{k1}$ and $c_{k2}$ being disjoint or not disjoint is denoted by $\mathrm{conf}(p_k, +)$ or $\mathrm{conf}(p_k, -)$, respectively.

All methods are allowed to look up these classes within their semantic context, i.e. the domain ontology they have been extracted from (see Section 3.1). Finally, as an additional source of background knowledge, the methods may make use of a corpus of textual resources associated with the ontology. We automatically selected a subset of 957 documents from the Reuters corpus¹ [16]. For efficiency reasons we only chose those documents with at least 20 occurrences of any of the classes in the ontology.

It is important to mention that we assume ‘meaningful’ labels for all classes in the ontology, i.e. labels which may be understood by humans even without knowing the whole taxonomy. This assumption is particularly relevant for all methods which make use of textual resources, such as the pattern-based disjointness extraction (cf. Section 2.4), the computation of extensional overlap with respect to Del.icio.us², and the algorithms for learning taxonomic relationships (see Section 2.1).

2.1 Taxonomic Overlap

In description logics two classes are disjoint iff their taxonomic overlap, i.e. the set of common individuals, must be empty. Because of the open world assumption in OWL, these individuals do not necessarily have to exist in the ontology. The taxonomic overlap of two classes is considered non-empty as long as there could be common individuals within the domain of interest which is modeled by the ontology.

We developed three methods which determine the likelihood of two classes being disjoint by considering their overlap with respect to (i) individuals and subclasses in the ontology, or learned from a corpus of associated textual resources, and (ii) Del.icio.us documents tagged with the corresponding class labels.

Ontology. Both individuals and subclasses can be imported from an ontology (see Section 3.1) or from a given corpus of text documents. In the latter case, subclass-of and instance-of relationships are extracted by different algorithms provided by the Text2Onto³ ontology learning framework. A detailed description of these algorithms can be found in [4]. All taxonomic relationships, learned and imported ones alike, are associated with rating annotations $r_{\text{subclass-of}}$ (or $r_{\text{instance-of}}$, respectively) indicating the certainty $x > 0$ of the underlying ontology learning framework in the correctness of its results. For imported relationships the confidence is 1.0.

¹ http://trec.nist.gov/data/reuters/reuters.html

² http://del.icio.us/

³ http://ontoware.org/projects/text2onto/

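\[ r_{\text{subclass-of}}(c_1, c_2) = \begin{cases} x & \text{if } c_1 \text{ subclass-of } c_2 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]

The following formula defines the confidence $\mathrm{conf}(p, -)$ for a pair $p = (c_1, c_2)$ to be not disjoint based on the taxonomic overlap of $c_1$ and $c_2$ with respect to common subclasses (and analogously for instances):

\[ \mathrm{conf}(p, -) = \frac{\sum_{c \in \mathit{sub}_1 \cap \mathit{sub}_2} \bigl( r_{\text{subclass-of}}(c, c_1) \cdot r_{\text{subclass-of}}(c, c_2) \bigr)}{\sum_{c \in \mathit{sub}_1} r_{\text{subclass-of}}(c, c_1) + \sum_{c \in \mathit{sub}_2} r_{\text{subclass-of}}(c, c_2)} \tag{2} \]

where $\mathit{sub}_i$ denotes the set of subclasses of $c_i$.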
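Equation (2) can be sketched in a few lines of Python. The dictionary layout (subclass name mapped to its rating) and the choice to return 0.0 when neither class has any subclass are assumptions made for this illustration, not part of the original method description.

```python
def overlap_confidence(sub1, sub2):
    """Equation (2): confidence that two classes are NOT disjoint.

    sub1, sub2: dicts mapping each subclass of c1 (resp. c2) to its rating
        r_subclass-of in (0, 1]; imported relationships have rating 1.0.
    """
    shared = sub1.keys() & sub2.keys()
    numerator = sum(sub1[c] * sub2[c] for c in shared)
    denominator = sum(sub1.values()) + sum(sub2.values())
    # Assumption: with no subclasses at all there is no evidence, so 0.0.
    return numerator / denominator if denominator else 0.0


# Hypothetical ratings: SheepDog is a learned subclass of the first class
# (rating 0.5) and an imported subclass of the second (rating 1.0).
sub_c1 = {"Poodle": 1.0, "SheepDog": 0.5}
sub_c2 = {"SheepDog": 1.0, "Collie": 1.0}
example = overlap_confidence(sub_c1, sub_c2)  # 0.5 / 3.5
```

Because each product of ratings is at most the mean of the two ratings, the value under this definition cannot exceed 0.5; it reaches 0.5 when both classes have exactly the same subclasses, all with rating 1.0.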

Del.icio.us. Del.icio.us is a server-based system with a simple-to-use interface that allows users to organize and share bookmarks on the internet. It associates each URL with a description, a note, and a set of tags (i.e. arbitrary class labels). For our experiments, we collected |U| = 75,242 users, |T| = 533,191 tags and |R| = 3,158,297 resources, related by a total of |Y| = 17,362,212 triples. The idea underlying the use of del.icio.us in this case is that two labels which are frequently used to tag the same resource are likely to be disjoint, because users tend to avoid redundant labeling of documents.

\[ \mathrm{conf}(p, -) = \frac{|\{d \mid c_1 \in t(d),\ c_2 \in t(d)\}|}{\sum_{c \in C} |\{d \mid c_1 \in t(d),\ c \in t(d)\}| + \sum_{c \in C} |\{d \mid c_2 \in t(d),\ c \in t(d)\}|} \tag{3} \]

where $t(d)$ is the set of del.icio.us tags associated with document $d$. The normalized number of co-occurrences of $c_1$ and $c_2$ (their respective labels, to be precise) as del.icio.us tags aims at capturing the degree of association between the two classes.
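The normalized co-occurrence count of Equation (3) can be sketched as follows. The sketch reads the formula literally, so the sums over all classes also include the class itself; the data layout and the tiny tag set are hypothetical.

```python
def cooccurrence_confidence(c1, c2, tagged_docs, classes):
    """Equation (3), read literally.

    tagged_docs: dict mapping a document id to its set of tags t(d).
    classes: the class labels of the ontology (the set C).
    """
    def co(a, b):
        # Number of documents tagged with both a and b.
        return sum(1 for tags in tagged_docs.values() if a in tags and b in tags)

    numerator = co(c1, c2)
    denominator = sum(co(c1, c) for c in classes) + sum(co(c2, c) for c in classes)
    # Assumption: return 0.0 if neither label is ever used as a tag.
    return numerator / denominator if denominator else 0.0


# Hypothetical bookmarks and tags.
docs = {
    "d1": {"city", "capital"},
    "d2": {"city"},
    "d3": {"capital", "river"},
}
labels = ["city", "capital", "river"]
example = cooccurrence_confidence("city", "capital", docs, labels)  # 1 / 7
```

For the pair (city, capital) the numerator is 1 and the two sums are 3 and 4, giving 1/7.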

2.2 Subsumption

If one class is a subclass of the other, we assume the two classes of a pair $p = (c_1, c_2)$ to be not disjoint with a confidence equal to the likelihood associated with the subclass-of relationship (cf. Section 2.1).

\[ \mathrm{conf}(p, -) = \max\bigl( r_{\text{subclass-of}}(c_1, c_2),\ r_{\text{subclass-of}}(c_2, c_1) \bigr) \tag{4} \]

2.3 Semantic Similarity

The assumption that the semantic similarity of two classes is directly related to their likelihood of being disjoint led to the development of three further methods. The first one implements the similarity measure described by [24] to compute the semantic similarity $\mathrm{sim}$ of two classes $c_1$ and $c_2$ with respect to WordNet [6]:

\[ \mathrm{conf}(p, -) = \mathrm{sim}(s_1, s_2) = \frac{2 \cdot \mathrm{depth}(\mathrm{lcs}(s_1, s_2))}{\mathrm{depth}(s_1) + \mathrm{depth}(s_2)} \tag{5} \]

where $s_i = \mathrm{first}(c_i)$ denotes the first sense of $c_i$, $i \in \{1, 2\}$, with respect to WordNet, and $\mathrm{lcs}(s_1, s_2)$ is the least common subsumer of $s_1$ and $s_2$. The depth of a node $n$ in WordNet is recursively defined as follows: $\mathrm{depth}(\mathit{root}) = 1$, $\mathrm{depth}(\mathrm{child}(n)) = \mathrm{depth}(n) + 1$.
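Equation (5) can be tried out on a toy taxonomy. The sketch below does not access WordNet; it assumes a single-rooted tree in which every node has at most one parent (WordNet itself allows multiple hypernyms), and the taxonomy is invented for illustration.

```python
def depth(node, parent):
    """depth(root) = 1, depth(child(n)) = depth(n) + 1."""
    d = 1
    while node in parent:
        node = parent[node]
        d += 1
    return d


def ancestors(node, parent):
    """The node itself followed by its ancestors, ordered towards the root."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def similarity(s1, s2, parent):
    """Equation (5) on a tree given as a child -> parent mapping."""
    anc2 = set(ancestors(s2, parent))
    # The first shared node on the way up from s1 is the least common subsumer.
    lcs = next(a for a in ancestors(s1, parent) if a in anc2)
    return 2 * depth(lcs, parent) / (depth(s1, parent) + depth(s2, parent))


# Hypothetical taxonomy: entity > animal > {dog, cat}, entity > car.
parent = {"animal": "entity", "dog": "animal", "cat": "animal", "car": "entity"}
sim_dog_cat = similarity("dog", "cat", parent)  # 2 * 2 / (3 + 3)
sim_dog_car = similarity("dog", "car", parent)  # 2 * 1 / (3 + 2)
```

Siblings deep in the taxonomy score higher than nodes whose only shared ancestor is the root.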

The second method measures the distance of $c_1$ and $c_2$ with respect to the given background ontology (see Section 3.1) by computing the minimum length of a path $q$ of subclass-of relationships connecting $c_1$ and $c_2$:

\[ \mathrm{conf}(p, +) = \min_{q \in \mathrm{paths}(c_1, c_2)} \mathrm{length}(q) \tag{6} \]
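A breadth-first search is one straightforward way to obtain the minimum path length of Equation (6). The sketch assumes that subclass-of edges may be traversed in both directions (the text does not say so explicitly) and returns None when the two classes are not connected; the small hierarchy is hypothetical.

```python
from collections import deque


def min_path_length(c1, c2, subclass_of):
    """Length of the shortest path of subclass-of edges between c1 and c2.

    subclass_of: iterable of (subclass, superclass) pairs.
    Assumption: edges are treated as undirected.
    """
    neighbours = {}
    for sub, sup in subclass_of:
        neighbours.setdefault(sub, set()).add(sup)
        neighbours.setdefault(sup, set()).add(sub)

    queue = deque([(c1, 0)])
    seen = {c1}
    while queue:
        node, dist = queue.popleft()
        if node == c2:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # no connecting path


# Hypothetical hierarchy.
edges = [
    ("City", "Location"),
    ("Mountain", "Location"),
    ("Location", "Entity"),
    ("Person", "Entity"),
]
siblings = min_path_length("City", "Mountain", edges)  # via Location
distant = min_path_length("City", "Person", edges)     # via Location, Entity
```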

Finally, the third method computes the similarity of $c_1$ and $c_2$ based on their lexical context. Following the ideas described in [5], we exploit Harris’ distributional hypothesis [10], which claims that two words are semantically similar to the extent to which they share syntactic contexts.

For each occurrence of a class label in a corpus of textual documents (see the preliminaries of this section) we consider all the lemmatized tokens in the same sentence (except for stop words) as potential features in the context vector of the corresponding class. After the context vectors for both classes have been constructed, we assign weights to all features using a modified version of the tf-idf formula:

Let $v_i = (f^i_1, \ldots, f^i_n)$, $n \ge 1$, be the context vector of class $c_i$, where each $f^i_j$ is the frequency of token $j$ in the context of $c_i$. Then we define $\mathrm{TF}(f^i_j) = \sum_{d \in \mathrm{doc}(c_i)} \mathrm{freq}(f^i_j, d)$, $N = |\mathrm{doc}(c_i)|$ and $\mathrm{DF} = |\mathrm{doc}(c_i) \cap \mathrm{doc}(f^i_j)|$, where $\mathrm{doc}(t)$ is the set of documents containing term $t$ and $\mathrm{freq}(t, d)$ is the frequency of term $t$ in document $d$. Finally, we get $\mathrm{TFIDF}(f^i_j) = \mathrm{TF}(f^i_j) \cdot \log(N / \mathrm{DF})$.

Given the weighted context vectors $v'_1$ and $v'_2$, the confidence in $c_1$ and $c_2$ being not disjoint is defined as $\mathrm{conf}(p, -) = \cos(v'_1, v'_2)$.
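The weighting and the cosine comparison can be sketched as below. The sparse dictionary representation, the natural logarithm and the handling of tokens with a document frequency of zero (they are skipped) are assumptions of this sketch; the text does not specify them.

```python
import math


def tfidf_vector(tf, df, n_docs):
    """Weight a context vector with TF * log(N / DF).

    tf: dict token -> TF value for the class.
    df: dict token -> DF value for the class.
    n_docs: N, the number of documents containing the class label.
    """
    # Assumption: tokens with DF == 0 are dropped to avoid division by zero.
    return {tok: tf[tok] * math.log(n_docs / df[tok]) for tok in tf if df.get(tok)}


def cosine(v1, v2):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v2.get(tok, 0.0) for tok, w in v1.items())
    norm1 = math.sqrt(sum(w * w for w in v1.values()))
    norm2 = math.sqrt(sum(w * w for w in v2.values()))
    # Assumption: an all-zero vector carries no evidence, so return 0.0.
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


# Hypothetical counts: the token occurs 4 times in 2 of 8 documents.
weights = tfidf_vector({"bank": 4}, {"bank": 2}, 8)
identical = cosine({"a": 1.0, "b": 2.0}, {"a": 1.0, "b": 2.0})
unrelated = cosine({"a": 1.0}, {"b": 1.0})
```

A token that occurs in every document containing the class label receives weight log(1) = 0 and therefore does not contribute to the cosine.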

2.4 Patterns

Since we found that disjointness of two classes is often reflected in human language, we defined a number of lexico-syntactic patterns to obtain evidence for disjointness relationships from a given corpus of textual resources. The first type of pattern is based on enumerations as described in [9]. The underlying assumption is similar to the idea described in Section 2.1, i.e. terms which are listed separately in an enumeration mostly denote disjoint classes. Therefore, from the sentence

The pigs, cows, horses, ducks, hens and dogs all assemble in the big barn, thinking that they are going to be told about a dream that Old Major had the previous night.

we would conclude that pig, cow, horse, duck, hen and dog are disjoint classes. This is because we believe that, except for some idiomatic expressions, it would be rather unusual to enumerate overlapping classes such as dogs and sheep dogs separately, since this would result in semantic redundancy. More formally:

Given an enumeration of noun phrases $NP_1, NP_2, \ldots, (\text{and} \mid \text{or})\ NP_n$, we conclude that the concepts $c_1, c_2, \ldots, c_k$ denoted by these noun phrases are pairwise disjoint, where the confidence for the disjointness of two concepts is obtained from the number of pieces of evidence found for their disjointness in relation to the total number of pieces of evidence for the disjointness of these concepts with other concepts.

The second type of pattern is designed to capture more explicit expressions of disjointness in natural language by phrases such as “either $NP_1$ or $NP_2$” or “neither $NP_1$ nor $NP_2$”. For both types of patterns we compute the confidence for the disjointness of two classes $c_1$ and $c_2$ as follows:

\[ \mathrm{conf}(p, +) = \frac{\mathrm{freq}(c_1, c_2)}{\sum_{j \ne 1} \mathrm{freq}(c_1, c_j) + \sum_{i \ne 2} \mathrm{freq}(c_i, c_2)} \tag{7} \]

where $\mathrm{freq}(c_i, c_j)$ is the number of patterns providing evidence for the disjointness of $c_i$ and $c_j$, with $0 \le i, j \le |C|^2$ and $i \ne j$.
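Assuming the pattern matcher has already mapped each matched enumeration to a list of class labels (that step is not shown), the confidence of Equation (7) can be sketched as follows. The function names and the sample enumerations are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pattern_counts(enumerations):
    """Count, for every unordered pair of classes, the matches listing both."""
    freq = Counter()
    for concepts in enumerations:
        for a, b in combinations(sorted(set(concepts)), 2):
            freq[frozenset((a, b))] += 1
    return freq


def disjointness_confidence(c1, c2, freq, classes):
    """Equation (7): evidence for (c1, c2) relative to all evidence that
    involves c1 or c2."""
    def f(a, b):
        return freq[frozenset((a, b))]  # Counter yields 0 for unseen pairs

    numerator = f(c1, c2)
    denominator = (sum(f(c1, c) for c in classes if c != c1)
                   + sum(f(c, c2) for c in classes if c != c2))
    # Assumption: no matching pattern at all means no evidence, so 0.0.
    return numerator / denominator if denominator else 0.0


# Hypothetical pattern matches, already reduced to class labels.
matches = [["pig", "cow", "horse"], ["pig", "cow"], ["cow", "dog"]]
classes = ["pig", "cow", "horse", "dog"]
freq = pattern_counts(matches)
conf_pig_cow = disjointness_confidence("pig", "cow", freq, classes)  # 2 / 7
```

Here pig and cow are enumerated together twice, pig takes part in 3 pieces of evidence and cow in 4, so the confidence is 2/7.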

2.5 OntoClean

In [20] we introduced AEON, an approach to automatically evaluate ontologies according to the OntoClean methodology [8]. The basic idea is to use a pattern-based approach on top of the Web (and other textual data sources) for annotating classes of a given ontology with OntoClean properties such as unity, identity and rigidity. Parts of the approach can be reused for learning disjointness axioms.

Two classes are disjoint if they have incompatible unity or identity criteria. This implies that a class carrying anti-unity (∼U) must be disjoint from a class carrying unity (+U), and similarly for identity. Since we use the same subset of the PROTON ontology as in our AEON experiments, we can rely on the manual OntoClean taggings we collected earlier for the evaluation of AEON.

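\[ \mathrm{conf}(p, +) = \begin{cases} 1 & \text{if } c_1 \text{ is tagged with } \varphi\Omega \text{ and } c_2 \text{ with } \psi\Omega, \text{ for } \Omega \in \{U, I\},\ \varphi, \psi \in \{\sim, +\},\ \varphi \ne \psi \\ 0 & \text{otherwise} \end{cases} \tag{8} \]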
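Equation (8) amounts to a simple comparison of meta-property tags. In the sketch below the tags are stored in a dictionary keyed by 'U' (unity) and 'I' (identity), with the ASCII characters '+' and '~' standing for the carrying and anti markers; this encoding is an assumption made for illustration.

```python
def ontoclean_confidence(tags1, tags2):
    """Equation (8): 1 if the classes carry opposite unity or identity tags.

    tags1, tags2: dicts mapping 'U' and/or 'I' to '+' or '~'
        (hypothetical encoding of the OntoClean taggings).
    """
    for prop in ("U", "I"):
        t1, t2 = tags1.get(prop), tags2.get(prop)
        if t1 in ("+", "~") and t2 in ("+", "~") and t1 != t2:
            return 1
    return 0


opposite_unity = ontoclean_confidence({"U": "+"}, {"U": "~"})
same_tags = ontoclean_confidence({"U": "+", "I": "+"}, {"U": "+", "I": "+"})
```

Classes for which a property is untagged simply contribute no evidence for that property.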

2.6 Meta Algorithm

The meta algorithm considers superclasses known to be disjoint (from previously computed confidence values) and propagates this information downwards in the taxonomic hierarchy⁴. For $p = (c_1, c_2)$ the confidence for $c_1$ and $c_2$ being disjoint is computed as follows:

\[ \mathrm{conf}(p, +) = \frac{\sum_{p^s} \bigl( \mathrm{conf}(p^s, +) - \mathrm{conf}(p^s, -) \bigr)}{|\mathrm{super}(c_1)| \cdot |\mathrm{super}(c_2)|} \tag{9} \]

where $p^s = (c^s_1, c^s_2)$ with $c^s_i \in \{c \mid \text{subclass-of}(c_i, c)\}$ for $i \in \{1, 2\}$ and $\text{subclass-of}(c_i, c_j)$ being the subclass-of relationship between $c_i$ and $c_j$. Moreover, $\mathrm{super}(c)$ denotes the set of superclasses of $c$.

⁴ This algorithm was not used in the final evaluation, since early experiments indicated that it introduces too much noise. However, we report on it for reasons of completeness, and we still believe that it constitutes a potentially interesting direction of future work, because it allows for integrating subsumption information into any other algorithm.
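The propagation step of Equation (9) can be sketched as an average over all pairs of superclasses. The data structures, the default of 0.0 for superclass pairs without a previously computed confidence, and the return value for classes without superclasses are assumptions of this sketch.

```python
def propagated_confidence(c1, c2, superclasses, conf_plus, conf_minus):
    """Equation (9): average net disjointness evidence of superclass pairs.

    superclasses: dict class -> set of its superclasses.
    conf_plus, conf_minus: dicts keyed by frozenset({a, b}) holding the
        previously computed conf(p, +) and conf(p, -) values.
    """
    supers1 = superclasses.get(c1, set())
    supers2 = superclasses.get(c2, set())
    if not supers1 or not supers2:
        return 0.0  # assumption: nothing to propagate
    total = 0.0
    for s1 in supers1:
        for s2 in supers2:
            pair = frozenset((s1, s2))
            total += conf_plus.get(pair, 0.0) - conf_minus.get(pair, 0.0)
    return total / (len(supers1) * len(supers2))


# Hypothetical hierarchy and confidence values.
supers = {"Poodle": {"Dog", "Animal"}, "Sedan": {"Car"}}
plus = {frozenset(("Dog", "Car")): 0.9, frozenset(("Animal", "Car")): 0.7}
minus = {frozenset(("Dog", "Car")): 0.1}
example = propagated_confidence("Poodle", "Sedan", supers, plus, minus)
```

In the example the two superclass pairs contribute 0.8 and 0.7, which are averaged to 0.75.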

3 Experiment: Human Annotation of Disjointness

We thoroughly evaluated our approach by comparing learned disjointness axioms against a large number of manually created ones to calculate (among other things) the degree of overlap. This section describes the generation of the evaluation dataset consisting of 2000 pairs of classes tagged by 30 annotators and discusses methodological aspects related to the manual creation of disjointness axioms. The complete dataset is available from http://www.aifb.uni-karlsruhe.de/WBS/jvo/data/disjointness-111206.zip.

3.1 Ontology

As a basis for the creation of the evaluation datasets and as background knowledge for the ontology learning algorithms we took a subset (system, top and upper module) of the freely available PROTON ontology (PROTo ONtology)⁵. In total our subset of PROTON contains 266 classes, 77 object properties, 34 datatype properties and 1388 siblings.

PROTON is a basic upper-level ontology to facilitate the use of background or pre-existing knowledge for automatic metadata generation. PROTON covers the general concepts necessary for a wide range of tasks, including semantic annotation, indexing, and retrieval of documents. The design principles can be summarized as follows (as described in [19]): (i) domain-independence; (ii) light-weight logical definitions; (iii) alignment with popular standards; (iv) good coverage of named entities and concrete domains (i.e. people, organizations, locations, numbers, dates, addresses).

3.2 Evaluation Setting: Manual Taggings

To be able to compare the results of our trained model with the results generated by manual annotation we created a dataset consisting of 2000 pairs of classes as follows: First, we manually selected 200 (potentially) non-disjoint pairs from the ontology, since we assumed the set of non-disjoint pairs to constitute a weak minority class (which would have hampered the construction of a good model for our classifier). Then, we randomly chose 500 pairs of siblings, a subset of the data which is of particular interest from both a practical and a theoretical point of view. Finally, we added another 1300 pairs chosen randomly without any selection criteria.

Once the dataset was complete, each pair was randomly assigned to 6 different people, 3 from each of two groups: the first consisting of PhD students from our institute (all of them professional “ontologists”), the second composed of undergraduate students without profound knowledge of ontological engineering. Each of the annotators was given between 385 and 406 pairs along with natural language descriptions of the classes whenever those were available. Possible taggings for each pair were + (disjoint), − (not disjoint) and ? (unknown). The result was two datasets A and B for “ontologists” and “students”. A third dataset C was created by merging A and B (cf. Table 1a). Dataset D is a subset of C consisting of all siblings, whereas E contains all those pairs of classes which were randomly selected.

⁵ PROTON is available from http://proton.semanticweb.org.

In order to get cleaner and less ambiguous training data for our classification model (see Section 4) we computed the majority votes for all the above-mentioned datasets by considering the individual taggings for each pair (3 in the case of A and B, and 6 for C). If at least 50% (or 100%, respectively) of the human annotators agreed upon + or −, this decision was assumed to be the majority vote for that particular pair. In case of equally many positive and negative taggings, the majority vote was defined as ? (unknown). These pairs were not used for training purposes. In this way we reduced the noise the classifier had to deal with in the training phase, and obtained a better overall model. Some statistical properties of the majority vote datasets are given in Table 2.
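One possible reading of the majority vote rule is sketched below. The ASCII characters '+', '-' and '?' stand for the three tags, and the handling of borderline cases (for example, a tag that wins but stays below the threshold yields '?') is our interpretation rather than a documented detail.

```python
def majority_vote(tags, threshold=0.5):
    """Majority vote over the individual taggings of one pair.

    tags: list of '+', '-' or '?' (3 tags for A and B, 6 for C).
    threshold: 0.5 for the 50% vote, 1.0 for the 100% vote.
    """
    plus, minus = tags.count("+"), tags.count("-")
    if plus == minus:
        return "?"  # equally many positive and negative taggings
    winner, votes = ("+", plus) if plus > minus else ("-", minus)
    # The winning tag must reach the required share of all annotators.
    return winner if votes / len(tags) >= threshold else "?"


two_of_three = majority_vote(["+", "+", "-"])           # 50% vote
strict = majority_vote(["+", "+", "-"], threshold=1.0)  # 100% vote
```

The same three taggings therefore count as disjoint under the 50% vote but as unknown under the 100% vote.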

3.3 Analysis of Human Annotations

In order to determine how difficult it is for humans to tag pairs of classes as being disjoint or not, we measured the human agreement within and across the different subsets of the data. Table 1b shows the average agreement among the individual taggers, i.e. the average maximum ratio of annotators who agreed upon the same tag for a pair of classes. Analysing the figures, we find that the average agreement for D is significantly lower than the agreement for any of the other datasets, which seems to imply that pairs of siblings (classes with a common direct superclass) are much more difficult for human annotators to tag than randomly chosen pairs of classes. This might be due to the fact that it is comparatively hard to determine the differences between the intension and extension of classes which are semantically very close.
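The agreement measure described above, the average over all pairs of the largest share of annotators choosing the same tag, can be sketched as follows. The tag encoding and the sample taggings are hypothetical.

```python
from collections import Counter


def average_agreement(taggings):
    """Average, over all pairs, of the maximum ratio of annotators who
    agreed upon the same tag.

    taggings: list with one list of tags ('+', '-', '?') per pair.
    """
    ratios = [max(Counter(tags).values()) / len(tags) for tags in taggings]
    return sum(ratios) / len(ratios)


# Three pairs with three taggings each: full, two-thirds and one-third agreement.
sample = [["+", "+", "+"], ["+", "-", "+"], ["+", "-", "?"]]
agreement = average_agreement(sample)  # (1 + 2/3 + 1/3) / 3
```

With three annotators per pair the measure ranges from 1/3 (all tags differ) to 1 (unanimous).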

Table 1. a) Evaluation Datasets b) Tagged Pairs (Individual)

a)
ID  Dataset   Annotators  Tags per Pair  Pairs
A   Experts   15          3              2000
B   Students  15          3              2000
C   All       30          6              2000
D   Siblings  30          6              541
E   Random    30          6              1300

b) Individual Taggings
Dataset  +       −       ?     all    −/+    avg. agree.
A        3849    2007    144   6000   0.521  0.869
B        3881    2106    13    6000   0.543  0.858
avg.     3865.0  2056.5  78.5  6000   0.532  0.864
C        7730    4113    157   12000  0.532  0.824
D        1362    1822    62    3246   1.338  0.754
E        6166    1554    80    7800   0.252  0.853

In addition to computing the agreement within each of the datasets, we also tried to capture commonalities and differences between the taggings of people from the two groups of annotators, ontologists (A) and students (B).

First, we measured the average agreement of the individual taggings of the experts with the 100% majority vote of the students and vice versa. The figures, 0.852 for the agreement between A and the majority vote of B, and a slightly lower value of 0.834 for the agreement between B and the majority vote of A, indicate that, maybe due to the relatively higher disagreement among the students (see Table 1b), the students tend to agree mainly on very evident cases of disjointness.

The hypothesis that there is a considerable number of pairs which are comparatively easy to tag, thus provoking a high agreement, is supported by the figures we get for the agreement among the 100% majority votes (0.964) and the 50% majority votes (0.793) of A and B.

Finally, we completed our analysis of the annotation results by inspecting concrete examples of differently tagged pairs. Table 3 lists all pairs of classes which were assigned different tags by the 100% majority votes of experts and students (which means that all 3 annotators of A or B agreed upon each tag). An extensive discussion of the differences, which tries to explain some of the problems the human annotators encountered, can be found in the following section.

3.4 Discussion

During the creation of the human annotations, we had the chance to study the problems humans face when using disjointness. Even in the taggings of the expert group, consisting of post-graduates all involved in Semantic Web research, the overlap of the taggings was lower than expected (cf. Section 3.3). Table 3 shows all pairs where all experts agreed on one tagging and all students agreed on the other. Based on an analysis of the taggings and subsequent discussions with the taggers, we identified several types of problems regarding disjointness:

1. The label and comment of a class often do not provide an unambiguous idea of what is meant by this class.

2. Some disjointness axioms may depend on the context: whereas Dog and Livestock may be disjoint in most parts of Europe, in the Chinese Wordnet⁶ the latter is actually a hypernym of the former.

3. Classes can have abstract individuals, like Money, Message or Idea.

4. Often the extensions of two classes are disjoint although their intensions are not, e.g. US President and Woman. Annotators struggle with this difference.

5. Conversely, the extensions of two classes might not be disjoint even though their intensions are: although Weapon and Pitchfork are disjoint intensionally (in the literal sense), their extensions do not need to be.

6. Roles and so-called basic classes are often mixed, e.g. the role Professor and the Person that plays the role, which may be defined as disjoint (depending on how roles are modeled [11]).

7. Mereological and instantiation relations can be mixed: a Week is part of a Month, so are these two classes disjoint? What about Delta and River?

8. Mixing other types of relations with instantiation relations may lead to misunderstandings: see for example the pairs Movie/TVCompany, Government/Parliament, or Patent/AirplaneModel, where the instances are closely related and thus seem to confuse the annotators.

9. Instantiations can occur at different levels of abstraction. E.g., when describing animals, Eagle may be the label of both an individual (e.g. of the class Species) and of a class itself. Are the two classes Species and Eagle then disjoint? Note that the individual Eagle is not the same as the class Eagle, but they may be connected via an axiom like Class:Eagle ≡ ∃species.{Individual:Eagle}.

10. Sometimes, lexical information is mixed with ontological information. The PROTON ontology contains concepts like Alias that represent lexical information. Is a JobTitle disjoint from a Job, or from the Person having the Job or JobTitle?

Note that this list does not address problems of disjointness with regard to its definition in description logics, but rather the problems our annotators had when they had to decide whether two classes are disjoint or not. Many of the above problem types have a well-defined answer with regard to the formal semantics of disjointness, e.g. #7, where Week and Month are disjoint as they do not have common instances (a week consists of seven days, whereas a month consists of around 28 to 31 days). Note that the definition of week and month can change, but this basically means that we introduce new concepts which may or may not have the same name.

⁶ http://www.keenage.com/

Table 2. Tagged Pairs (Majority Vote)

         Majority Vote 50%                    Majority Vote 100%
Dataset  +       −      ?     all   −/+      +      −      ?      all   −/+
A        1297    649    54    2000  0.500    931    330    739    2000  0.354
B        1346    648    6     2000  0.481    846    307    847    2000  0.363
avg.     1321.5  648.5  30.0  2000  0.490    888.5  318.5  793.0  2000  0.359
C        1276    537    187   2000  0.421    616    194    1190   2000  0.315
D        188     274    79    541   1.457    28     96     417    541   3.429
E        1072    140    88    1300  0.131    588    35     677    1300  0.060

Table 3. Differences between Majority Votes 100% of A (Experts) and B (Students)

Vote A  Vote B  Disjoint Classes?
−       +       RailroadFacility / Pipeline
−       +       Order / Abstract
−       +       Newspaper / HomePage
−       +       School / MineSite
−       +       TelecomFacility / Monument
−       +       ReligiousLocation / Canal
−       +       InternationalOrganization / StockExchange
−       +       WaterRegion / PoliticalRegion
−       +       InternetDomain / EntitySource
−       +       ReligiousOrganization / Airline
−       +       RecreationalFacility / Capital
−       +       City / Archipelago
−       +       Pipeline / LaunchFacility
−       +       AstronomicalObject / Mountain
−       +       GovernmentOrganization / AmusementPark
−       +       AmusementPark / Galaxy
−       +       LaunchFacility / Bridge
+       −       Canal / Harbor
+       −       OfficialPoliticalMeeting / Parliament
+       −       Week / Month
+       −       Mountain / Peninsula
+       −       Island / Valley
+       −       Government / Parliament
+       −       Service / Telecom
+       −       Park / Festival
+       −       OilField / Province
+       −       Patent / AirplaneModel
+       −       Ministry / Location
+       −       Delta / River
+       −       TVCompany / Movie

Recognizing the problem type would allow an ontology development environment to offer much more appropriate help than just a general description of the meaning of the disjointness axiom, which can be hard to apply at times.

Often the decision whether two classes are disjoint or not will uncover underspecified or ambiguous classes, i.e. moot points in the description of one or both classes. Instead of simply adding (or, which is far harder to track, not adding) a disjointness axiom, the rationale behind this decision should also be documented, following an ontology lifecycle methodology like DILIGENT [21] for the continuous evolution and refinement of the ontology.

4 Evaluation: Learning a Classiﬁer

In this section we present the evaluation procedure and analyse the results of the comparison between the classifier which has been trained on the features described in Section 2 and the sets of manual annotations (see Section 3).

4.1 Experimental Settings

To train the classiﬁer we skipped pairs of classes tagged with ?since the deﬁnition of

disjointness only distinguishes between disjoint and not disjoint classes. For the rest

of the evaluation we will consider this two-class problem. We evaluate our learned

classiﬁer against two baseline: the random and majority baseline.

Random Baseline: The idea of the random baseline is to choose the target class of the classifier at random. As we have a two-class problem, we distribute the pairs equally over the two classes. This results in a 50% baseline for accuracy, since 50% of the + examples will be assigned to +, which means that these examples are classified correctly. The same holds for the − class.

Majority Baseline: The majority baseline is determined by taking the largest class as the default classification. This yields a high accuracy if the classes are unequally distributed, in which case the majority baseline is much more difficult to beat than the random baseline. Since in the experiments at hand we only have to deal with two classes (+ or −), which are not equally distributed, the majority baseline should be considered more realistic than the random baseline.
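The two baselines can be illustrated with a minimal Python sketch. The function names and the toy label distribution are hypothetical and are not taken from the experiments; the sketch only assumes that each pair carries a gold tag of + or −.

```python
from collections import Counter


def random_baseline_accuracy(labels):
    """Expected accuracy when each pair is assigned '+' or '-' with probability 0.5."""
    if not labels:
        return 0.0
    counts = Counter(labels)
    # Each class contributes (share of that class) * (0.5 chance of a correct guess).
    return sum((count / len(labels)) * 0.5 for count in counts.values())


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        return 0.0
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Hypothetical toy distribution: two thirds of the pairs are tagged disjoint ('+').
toy_labels = ["+"] * 200 + ["-"] * 100
print(random_baseline_accuracy(toy_labels))    # 0.5, independent of the distribution
print(majority_baseline_accuracy(toy_labels))  # 2/3, the share of the larger class
```

The random baseline stays at 0.5 whatever the class distribution is, whereas the majority baseline grows with the imbalance, which is why it is the harder one to beat on skewed data.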

Classifier settings: In order to classify each pair of classes as being disjoint (+) or not (−), we trained a classifier on the manual taggings created by human annotators. The features for the classifier are the confidence values obtained from various sources as described in Section 2.

We tested several classifiers made available by the Weka package (see footnote 7). In general, decision trees outperformed all other classifiers, possibly because of the highly selective character of our features, while the performance of different types of decision trees was more or less comparable. We therefore chose the ADTree classifier [7] with default settings for our experiments, as it shows very good performance while at the same time providing interpretable results.

First, we performed a 10-fold cross-validation against the majority votes 100% and 50% of the datasets A (ontologists), B (students), C (all) and E (random) (cf. Table 1a). The results for the random dataset are included to show the performance of our approach on an unbiased dataset (E contains examples chosen randomly from the set of all possible pairs without any selection criteria). To obtain the results for dataset D (siblings), we split dataset C into two independent parts, one for evaluation and one for training. The training set for the evaluation with dataset D consists of all manually tagged pairs except for the siblings.
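The evaluation protocol described above (10-fold cross-validation, plus a separate hold-out of the sibling pairs) might be sketched as follows. This is a stdlib-only illustration under stated assumptions: the learner is a simple one-feature threshold rule that stands in for the ADTree classifier from Weka, and all function names, the toy feature vectors and the sibling flags are hypothetical.

```python
import random


def k_fold_indices(n_examples, k=10, seed=0):
    """Shuffle example indices and split them into k nearly equal folds."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_stump(examples):
    """Stand-in learner (NOT ADTree): pick the best single-feature threshold rule."""
    best = None
    n_features = len(examples[0][0])
    for f in range(n_features):
        for threshold in sorted({x[f] for x, _ in examples}):
            for above in ("+", "-"):
                below = "-" if above == "+" else "+"
                correct = sum(
                    (above if x[f] >= threshold else below) == y
                    for x, y in examples
                )
                if best is None or correct > best[0]:
                    best = (correct, f, threshold, above, below)
    _, f, threshold, above, below = best
    return lambda x: above if x[f] >= threshold else below


def cross_validate(examples, k=10, seed=0):
    """Return the accuracy pooled over the k held-out folds."""
    correct = 0
    for held_out in k_fold_indices(len(examples), k, seed):
        held_out_set = set(held_out)
        train = [ex for i, ex in enumerate(examples) if i not in held_out_set]
        model = train_stump(train)
        correct += sum(model(examples[i][0]) == examples[i][1] for i in held_out)
    return correct / len(examples)


def sibling_holdout(examples, is_sibling):
    """Train on all non-sibling pairs, evaluate on the sibling pairs only."""
    train = [ex for ex, sib in zip(examples, is_sibling) if not sib]
    evaluation = [ex for ex, sib in zip(examples, is_sibling) if sib]
    return train, evaluation


# Hypothetical toy data: (confidence-value feature vector, gold tag).
toy = [((i / 40, 0.5), "+" if i >= 20 else "-") for i in range(40)]
cv_accuracy = cross_validate(toy, k=10)
print(f"10-fold accuracy on toy data: {cv_accuracy:.3f}")

# Hypothetical sibling flags: every fourth pair is treated as a sibling pair.
is_sibling = [i % 4 == 0 for i in range(40)]
train_set, evaluation_set = sibling_holdout(toy, is_sibling)
print(len(train_set), len(evaluation_set))
```

Keeping the sibling pairs entirely out of the training set means that the score on them reflects how well the classifier generalises to semantically close pairs it has never seen.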

4.2 Results

Tables 4 and 5 list the results of our evaluation experiments in terms of Precision (P), Recall (R), F-Measure (F) and Accuracy (Acc) (for definitions cf. [23]). The tables show that we easily beat the baselines for the datasets A (experts), B (students) and C in both cases, majority vote 50% and 100%. With an accuracy of over 90%, the performance of our system for dataset C is remarkable, especially in the case of the total majority vote. These results are comparable with the human inter-annotator agreement for experts and students, and even better for dataset C (90.9%) in comparison to the human agreement of 86.4%.
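For readers who want to recompute such figures, the following Python sketch shows the standard per-class definitions of precision, recall and F-measure together with accuracy. The function names and the toy tags are hypothetical. The unweighted averaging of the + and − values is an assumption, although it appears consistent with the reported "avg." columns (for dataset A under majority vote 50%, (0.815 + 0.638) / 2 ≈ 0.727).

```python
def per_class_metrics(gold, predicted, target):
    """Precision, recall and F-measure for one target class ('+' or '-')."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == target and p == target for g, p in pairs)
    fp = sum(g != target and p == target for g, p in pairs)
    fn = sum(g == target and p != target for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure as the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def accuracy(gold, predicted):
    """Share of pairs whose predicted tag equals the gold tag."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def macro_average(gold, predicted, classes=("+", "-")):
    """Unweighted mean of the per-class precision, recall and F-measure."""
    rows = [per_class_metrics(gold, predicted, c) for c in classes]
    return tuple(sum(column) / len(classes) for column in zip(*rows))


# Hypothetical toy tags for five class pairs.
gold_tags = ["+", "+", "+", "-", "-"]
predicted_tags = ["+", "+", "-", "-", "+"]
print(per_class_metrics(gold_tags, predicted_tags, "+"))  # P = R = F = 2/3
print(per_class_metrics(gold_tags, predicted_tags, "-"))  # P = R = F = 1/2
print(accuracy(gold_tags, predicted_tags))                # 3 of 5 pairs correct
```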

Dataset D, which only contains pairs of siblings, is certainly the most difficult to handle, for the classifier but also for the human annotators, because, as explained in Section 3.3, siblings are semantically close, so that differences between their intensions and extensions may often be hard to grasp. As dataset D shows a relatively low average agreement compared to the other datasets (cf. Table 1b), the classifier seems to have more difficulty learning it. This is also reflected in the very low classification accuracy of 37.9% for majority vote 100%.

7 http://www.cs.waikato.ac.nz/ml/weka/

Table 4. Evaluation against Majority Vote 50% (ADTree)

Dataset  P(+)   P(−)   P(avg) R(+)   R(−)   R(avg) F(+)   F(−)   F(avg) Acc    Acc_random  Acc_majority
A        0.815  0.638  0.727  0.823  0.626  0.725  0.819  0.632  0.726  0.757  0.500       0.666
B        0.807  0.642  0.725  0.844  0.580  0.712  0.825  0.609  0.717  0.758  0.500       0.675
avg.     0.811  0.640  0.726  0.834  0.603  0.719  0.822  0.621  0.722  0.758  0.500       0.671
C        0.854  0.682  0.768  0.874  0.644  0.759  0.864  0.663  0.764  0.806  0.500       0.704
D        0.558  0.628  0.593  0.255  0.861  0.558  0.350  0.726  0.538  0.615  0.500       0.593
E        0.910  0.761  0.836  0.990  0.250  0.620  0.948  0.376  0.662  0.904  0.500       0.884

Table 5. Evaluation against Majority Vote 100% (ADTree)

Dataset  P(+)   P(−)   P(avg) R(+)   R(−)   R(avg) F(+)   F(−)   F(avg) Acc    Acc_random  Acc_majority
A        0.896  0.720  0.808  0.903  0.703  0.803  0.899  0.712  0.806  0.851  0.500       0.738
B        0.866  0.790  0.828  0.942  0.599  0.771  0.903  0.681  0.792  0.851  0.500       0.734
avg.     0.881  0.755  0.818  0.923  0.651  0.787  0.901  0.697  0.799  0.851  0.500       0.736
C        0.934  0.823  0.879  0.946  0.789  0.868  0.940  0.805  0.873  0.909  0.500       0.760
D        0.237  0.806  0.522  0.786  0.260  0.523  0.364  0.394  0.379  0.379  0.500       0.774
E        0.977  0.955  0.966  0.998  0.600  0.799  0.987  0.737  0.862  0.976  0.500       0.944

An investigation of the learned classifier revealed that the rather important taxonomic feature (see Section 2.2) is not well populated in the siblings part of the dataset. To analyse the influence of this feature we constructed a dataset without it. As expected, the accuracy for the training dataset drops, whereas for the evaluation set it improves considerably, from 37.9% to 74.2%. Moreover, the results for the majority vote 50% rise to 76.6%, which can be interpreted as an indication of the noise introduced by this feature.

Our approach also seems to work very well for the random dataset E, as we obtained an accuracy above both baselines in both cases (majority vote 50% and 100%). The difference to the majority baseline is much smaller than for A, B and C, but a baseline of around 90% is very difficult to beat. To conclude, the results, not only for the random dataset, are very promising and allow us to set up a competitive classifier to support ontology engineering.

In order to find out which classification features contributed most to the overall performance of the classifier, we analysed our initial feature set with respect to the gain ratio measure [14]. The ranking produced for dataset C clearly indicates an exceptionally good performance of the features taxonomic overlap (Section 2.1), similarity based on WordNet and lexical context (Section 2.3), and del.icio.us (Section 2.1). The contribution of other features, such as the one presented in Section 2.4 relying on lexico-syntactic patterns, seems to be less substantial. However, as the classification accuracy obtained with any single feature is always below the overall performance, the combination of all features is necessary to achieve a very good overall result.
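The gain ratio used for this ranking divides the information gain of a feature by its split information. A minimal Python sketch is given below. It assumes that the feature values have already been discretized (the confidence values in the experiments are numeric, and how they were binned is not specified here), and the function names and toy values are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of the feature divided by its split information."""
    n = len(labels)
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    information_gain = entropy(labels) - conditional
    # Entropy of the feature's own value distribution.
    split_information = entropy(feature_values)
    return information_gain / split_information if split_information > 0 else 0.0


# Hypothetical toy data: four class pairs with gold tags and two binned features.
tags = ["+", "+", "-", "-"]
informative = ["high", "high", "low", "low"]    # separates the tags perfectly
uninformative = ["high", "low", "high", "low"]  # independent of the tags
print(gain_ratio(informative, tags))    # 1.0
print(gain_ratio(uninformative, tags))  # 0.0
```

Ranking the features by this value, highest first, gives an ordering of the kind described above; normalising by the split information keeps features with many distinct values from being favoured automatically.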

5 Related Work

Several ontology learning frameworks have been designed and implemented in the last decade. The Mo’K workbench [1], for instance, relies on unsupervised machine learning methods to induce concept hierarchies from text collections. In particular, the framework focuses on agglomerative clustering techniques and allows ontology engineers to easily experiment with different parameters. OntoLT [3] is an ontology learning plug-in for the Protégé ontology editor. It is targeted at end users and relies heavily on linguistic analysis, i.e. it makes use of the internal structure of noun phrases to derive ontological knowledge from texts. JATKE (see footnote 8) is a Protégé-based unified platform for ontology learning which allows for the inclusion of modules for ontology learning. The OntoLearn framework [13] mainly focuses on the problem of word sense disambiguation, i.e. of finding the correct sense of a word with respect to a general ontology or lexical database. TextToOnto [12] is a framework implementing a variety of algorithms for diverse ontology learning subtasks. In particular, it implements diverse relevance measures for term extraction, different algorithms for taxonomy construction as well as techniques for learning relations between concepts. The recent RelExt approach [18] focuses on the extraction of triples, i.e. classes connected by a relation. None of the mentioned approaches deals with disjointness.

6 Conclusion and Future Work

Learning of disjointness axioms is an intuitive and useful extension of existing ontology learning frameworks. We have motivated the need for richer ontologies which include disjointness axioms and presented an approach consisting of a number of methods to extract expressive features for learning disjointness from different sources of evidence. In a thorough evaluation our learning approach performed competitively with human annotators. As a by-product we captured lessons learned from human annotators with respect to their difficulties when modeling disjointness axioms.

Future work includes a combination with ontology evaluation approaches for richly

axiomatized ontologies such as [17]. Moreover, we want to integrate the novel methods

into the Text2Onto [4] framework for ontology learning from texts.

Acknowledgments: Research reported in this paper has been partially financed by the EU in the IST project SEKT (IST-2003-506826) (http://www.sekt-project.com). We would like to thank our colleagues, especially Peter Haase, for fruitful discussions, and all the students and colleagues at the AIFB and the University of Kassel for providing us with more than 10,000 taggings.

References

1. G. Bisson, C. Nedellec, and L. Canamero. Designing clustering methods for ontology building - The Mo’K workbench. In Proc. of the ECAI Ontology Learning Workshop, pages 13–19, 2000.

2. E. P. Bontas, C. Tempich, and Y. Sure. ONTOCOM: A cost estimation model for ontology engineering. In I. Cruz et al., editors, Proceedings of the 5th International Semantic Web Conference (ISWC 2006), volume 4273 of LNCS, pages 625–639. Springer-Verlag Berlin Heidelberg, 2006.

3. P. Buitelaar, D. Olejnik, and M. Sintek. OntoLT: A Protégé plug-in for ontology extraction from text. In Proc. of the 2nd Int. Semantic Web Conference (ISWC2003), 2003.

4. P. Cimiano and J. Völker. Text2Onto – a framework for ontology learning and data-driven change discovery. In Proc. of the 10th Int. Conf. on Applications of Natural Language to Information Systems (NLDB’05), June 2005.

8 http://jatke.opendfki.de/

5. P. Cimiano and J. Völker. Towards large-scale, open-domain and ontology-based named entity classification. In G. Angelova, K. Bontcheva, R. Mitkov, and N. Nicolov, editors, Proc. of the International Conference on Recent Advances in Natural Language Processing (RANLP), pages 166–172, Borovets, Bulgaria, September 2005. INCOMA Ltd.

6. C. Fellbaum. WordNet, an electronic lexical database. MIT Press, 1998.

7. Y. Freund and L. Mason. The alternating decision tree learning algorithm. In ICML, pages 124–133, 1999.

8. N. Guarino and C. A. Welty. A formal ontology of properties. In Knowledge Acquisition, Modeling and Management, pages 97–112, 2000.

9. P. Haase and J. Völker. Ontology learning and reasoning - dealing with uncertainty and inconsistency. In Proc. of the Workshop on Uncertainty Reasoning for the Semantic Web (URSW), pages 45–55, 2005.

10. Z. Harris. Distributional structure. In J. Katz, editor, The Philosophy of Linguistics, pages 26–47, New York, 1985. Oxford University Press.

11. K. Kozaki, E. Sunagawa, Y. Kitamura, and R. Mizoguchi. Fundamental considerations of role concepts for ontology evaluation. In Proc. of the Workshop EON – Evaluation of Ontologies for the Web, 2006.

12. A. Maedche and S. Staab. Ontology learning for the semantic web. IEEE IS, 16(2), 2001.

13. R. Navigli, P. Velardi, A. Cucchiarelli, and F. Neri. Extending and enriching WordNet with OntoLearn. In Proc. of the GWC 2004, pages 279–284, 2004.

14. J. R. Quinlan. C4.5 Programs for Machine Learning. Morgan Kaufmann, California, 1993.

15. A. Rector, N. Drummond, M. Horridge, J. Rogers, H. Knublauch, R. Stevens, H. Wang, and C. Wroe. OWL pizzas: Practical experience of teaching OWL-DL – common errors & common patterns. In Proc. of EKAW 2004, pages 63–81, 2004.

16. T. Rose, M. Stevenson, and M. Whitehead. The Reuters corpus volume 1 - from yesterday's news to tomorrow's language resources. In Proc. of the Third International Conference on Language Resources and Evaluation, pages 29–31, 2002.

17. S. Schlobach. Debugging and semantic clarification by pinpointing. In Proc. of the 2nd European Semantic Web Conference (ESWC2005), volume 3532 of LNCS, pages 226–240. Springer, 2005.

18. A. Schutz and P. Buitelaar. RelExt: A tool for relation extraction in ontology extension. In Proc. of the 4th International Semantic Web Conference (ISWC2005), 2005.

19. I. Terziev, A. Kiryakov, and D. Manov. Base upper-level ontology (BULO) guidance. SEKT deliverable 1.8.1, Ontotext Lab, Sirma AI EAD (Ltd.), 2004.

20. J. Völker, D. Vrandečić, and Y. Sure. Automatic evaluation of ontologies (AEON). In Proc. of the 4th International Semantic Web Conference (ISWC2005), volume 3729 of LNCS, pages 716–731. Springer, 2005.

21. D. Vrandečić, H. S. Pinto, Y. Sure, and C. Tempich. The DILIGENT knowledge processes. Journal of Knowledge Management, 9(5):85–96, 2005.

22. T. D. Wang. Gauging ontologies and schemas by numbers. In Proc. of the Workshop EON – Evaluation of Ontologies for the Web, 2006.

23. I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Series in Data Management Sys. Morgan Kaufmann, 2nd edition, June 2005.

24. Z. Wu and M. Palmer. Verb semantics and lexical selection. In 32nd Annual Meeting of the Ass. for Computational Linguistics, pages 133–138, New Mexico, 1994.
