An overview of the entity resolution process.

Source publication
Chapter
The problem of entity resolution is central in the field of Digital Humanities. It is also one of the major issues in the Golden Agents project, which aims at creating an infrastructure that enables researchers to search for patterns that span across decentralised knowledge graphs from cultural heritage institutes. To this end, we created a method...

Contexts in source publication

Context 1
... extend the method for identifying duplicate entities previously presented in [20], of which Figure 1 gives an overview. This method makes use of an embedding, which in our case is an n-dimensional Euclidean space where each (non-unique) person resource is assigned a coordinate based on its neighbouring nodes in the RDF graph. ...
Context 2
... then take the k (approximate) nearest neighbours of each entity i ∈ V, based on the Euclidean distance between embedding vectors, denoted by the set N_i^k, and create an edge between that entity and its neighbour j ∈ N_i^k if their cosine similarity u_ij = cosim(v_i, v_j) exceeds some threshold θ. The graphs constructed in this way, which are illustrated in panel A of Figure 1, consist of a number of connected components. The number and size of these components depend on the choice of θ: high values of θ ≈ 1 result in many small components, and lower values result in fewer but larger components. ...
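The graph construction described in Context 2 can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes small, non-zero embedding vectors held in memory, swaps the approximate nearest-neighbour search for an exact brute-force search, and uses hypothetical names (`cosine_similarity`, `knn_threshold_components`) chosen only for illustration.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors (assumed non-zero here)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_threshold_components(vectors, k, theta):
    """Link each entity to those of its k nearest neighbours (Euclidean
    distance) whose cosine similarity exceeds theta, then return the
    connected components as sorted lists of entity indices.

    Hypothetical helper: exact brute-force search stands in for the
    approximate nearest-neighbour search mentioned in the text.
    """
    n = len(vectors)
    parent = list(range(n))  # union-find forest over entity indices

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        # N_i^k: the k closest other entities by Euclidean distance.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(vectors[i], vectors[j]),
        )
        for j in others[:k]:
            # Create the edge (i, j) only if u_ij exceeds the threshold.
            if cosine_similarity(vectors[i], vectors[j]) > theta:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Toy embedding (made-up coordinates, for illustration only).
toy_vectors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
strict = knn_threshold_components(toy_vectors, k=1, theta=0.999)
moderate = knn_threshold_components(toy_vectors, k=1, theta=0.9)
loose = knn_threshold_components(toy_vectors, k=3, theta=0.0)
```

On this toy input, the strict threshold leaves every entity in its own component, the moderate threshold yields two pairs, and the loose setting merges everything into one component, which mirrors the dependence on θ described above.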
Context 3
... total, this produces a data set that contains 7,339 events and 22,073 persons, of whom 3,839 have been disambiguated (i.e. they participate in at least two events: one from the Occasional Poetry data set, and one from the City Archives' data sets). An example of the resources involved in a single event can be seen in Figure 1. The resources follow a basic format that models each resource as part of an event in a particular role. ...