1939-1382 (c) 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more
information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TLT.2017.2787659,
IEEE Transactions on Learning Technologies
Linked data in Education: a survey and a
synthesis of actual research and future
challenges
Crystiam Kelle Pereira, Sean Wolfgand Matsui Siqueira, Bernardo Pereira Nunes, Stefan Dietze
Abstract—Linked Data principles and technologies are being investigated in various areas. In the Educational context, many
studies are using Linked Data trying to solve problems of interoperability of educational data and resources, enriching
educational content, and personalizing and recommending educational content and practices. This article presents a systematic
mapping of proposals which have been adopting Linked Data for supporting education, and, based on analysis of these
proposals, we discuss the tools, vocabularies, and datasets being used, providing a research landscape of the area. We also
present challenges and trends which can foster future research in this area.
Index Terms— Semantic Web, Semantic technology, Educational technology
1 INTRODUCTION
Linked Data (LD) can be summarized as the use of the
Web to create connections between data which may
be originally stored in various databases maintained by
different organizations and distributed across different
geographic locations [1]. One of its primary objectives is
to extend the Web of Documents, where HTML documents
are interconnected through hyperlinks, to a Web of Data
[1], where data may be connected directly, following the
LD principles [2]. The LD principles can be summarized
as follows:
1. use URIs as names for things;
2. use HTTP URIs so that people can search for these names;
3. provide useful information when someone searches for a URI, using recommended standards (RDF - Resource Description Framework [3], SPARQL - SPARQL Protocol and RDF Query Language);
4. add links to other URIs so that additional information can be discovered.
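As a minimal sketch of these four principles, the following Python fragment names a thing with an HTTP URI, describes it with RDF-style triples, and links it to an external URI. The example.org identifier and the course data are hypothetical, and the N-Triples serialization is deliberately simplified (literals are not escaped).

```python
# Hypothetical identifier: the example.org URI is a placeholder.
COURSE = "http://example.org/course/semantic-web-101"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

# Each triple is (subject URI, predicate URI, object); an object is a URI or a literal.
triples = [
    (COURSE, RDFS_LABEL, ("literal", "Introduction to the Semantic Web")),
    # Principle 4: link to another URI so that more information can be discovered.
    (COURSE, RDFS_SEE_ALSO, ("uri", "http://dbpedia.org/resource/Semantic_Web")),
]

def to_ntriples(triples):
    """Serialize triples as simplified N-Triples lines (no literal escaping)."""
    lines = []
    for s, p, (kind, value) in triples:
        obj = f"<{value}>" if kind == "uri" else f'"{value}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)

print(to_ntriples(triples))
```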
A few years after these first concepts were presented,
the “5 stars of Linked Data” were defined by Tim Bern-
ers-Lee [2], classifying solutions so that the more stars a
solution has, the more powerful and easy to use it is.
Although this area presents many limitations and challenges, the use of LD has brought numerous benefits, including transparency, reusability, knowledge discovery, and interoperability for various application areas.
For instance, LD has been used in various fields, in-
cluding the area of Education. Linked Data has potential
for use in Education due to the open and accessible na-
ture of the educational data and resources produced by
many institutions [4]. It can be an alternative for sharing
and reusing educational data and resources, providing
interoperability among repositories, enriching content,
exploring large datasets relevant to education perfor-
mance analysis, individualizing and personalizing learn-
ing, as well as other issues which are still challenges. The
ability to share and connect data has encouraged many
educational institutions to begin a movement to use
Linked Data principles and technologies in education.
Some notable projects in this regard are the LinkedUp project¹, Linked Education Cloud², mEducator³, Open University⁴, LAK Dataset⁵ and LAK Data Challenge⁶.
Different initiatives seek to share data such as courses
offered by universities [5][6], statistical data [7], organiza-
tional data [8][9], educational resources such as videos,
presentations, lectures, books and games [10][11][12], as
well as the publication of tools to support educational
practices [13][14].
In addition to providing data in a structured way, the main objectives of the works analyzed are the integration of data with other linked datasets and the reuse of data for enrichment [15][16][17][18][19][20], recommendation (and customization) of educational content available in the Web of Data [21][22][23][24][25], and expansion of search terms [26][27]. The increasing exposure of educational data through Linked Data technologies also creates the opportunity for the development of applications in the learning analytics (LA)/educational data mining (EDM) field [28][29][30].
¹ https://linkedup-project.eu/
² http://data.linkededucation.org/linkedup/catalog/
³ http://www.meducator3.net/
⁴ http://data.open.ac.uk/
⁵ https://solaresearch.org/initiatives/dataset/
⁶ http://meco.l3s.uni-hannover.de:9080/wp2/?page_id=18
Crystiam Kelle Pereira and Sean Wolfgand Matsui Siqueira and Bernardo
Pereira Nunes are with the Department of Applied Informatics (DIA), Fed-
eral University of the State of Rio de Janeiro (UNIRIO), Av. Pasteur, 458,
Urca, 22290-240 Rio de Janeiro, RJ, Brazil. E-mail: {sean, crystiam.kelle,
bernardo.nunes}@uniriotec.br
Bernardo Pereira Nunes is also with the Department of Informatics - PUC-
Rio, Rua Marquês de São Vicente, 455, Gávea, Rio de Janeiro, RJ, Brazil.
Email: bnunes@inf.puc-rio.br
Stefan Dietze is a Research group leader at the L3S Research Center of the
Leibniz University Hanover, Germany. Email: dietze@l3s.de
The research and application of LD principles and technologies are continually evolving. For this reason, it is important that current initiatives are mapped to indicate opportunities, identify the technologies, approaches and tools available, report experiences and address challenges and trends for future research.
To map out this research landscape, this article conducts a systematic mapping of the state of the art in LD in Education. It collects the various ways in which LD is being used in the context of Education, for instance, the tools supporting the processes of data publishing and reuse, as well as the relevant vocabularies and datasets. In addition, the challenges currently faced in each stage of the process are raised. Finally, by providing an overview of the current developments in the area, this study can also serve to guide future work and to suggest possible avenues for further research.
The remainder of this paper is organized as follows.
Section 2 reviews the literature. Section 3 presents the
systematic review and mapping. Section 4 overviews the
use of LD in Education. Section 5 presents datasets and
vocabularies commonly used. Section 6 presents the chal-
lenges faced and, finally, Section 7 concludes the paper.
2 RELATED WORK
The application of LD in Education was discussed as
early as 2009 [31]. On that occasion, the authors reviewed
the status of this research area, examining the pioneering
initiatives, the main challenges, and the datasets, tools,
and prominent applications.
The application of semantic technologies in education-
al environments on the Web was also the subject of a
survey in 2011 [32]. The central axis of that investigation
was a discussion of ontologies and vocabularies geared
towards the area of Education.
LD in Education is also discussed by D’Aquin et al. [4] as
a potentially revolutionary Web technology for problems
in the area of distance learning and open education. The
authors argue that the use of LD can help with naviga-
tion, delivery, discovery and recommendation of educa-
tional resources, and cite possibilities for customizing
learning and establishing social connections. Likewise,
D’Aquin et al. [4] describe how organizations are using
LD technologies. They also provide a brief overview of
the data publishing process, as well as key tools and
technologies involved in this process.
Another relevant survey on LD in Education is presented by Dietze et al. [33], where the authors state that the application of LD principles in education offers excellent potential for addressing problems involving interoperability of educational data and resources. They also discuss various scenarios that can benefit from the use of LD in Education and present an overview of the data publication and interlinking process, the various tools and technologies adopted so far, and the main challenges faced in the use of LD in Education.
In another survey, conducted by Navarrete and Lujan-Mora [34], the topic of LD in Education is presented from the perspective of the enrichment of open educational resources, prioritizing the description of technologies and proposals related to this specific topic.
Although there are other studies [33][34][35] that also reviewed the use of Linked Data in Education, this paper aims not only to provide an overview of the works in the area, but mainly to understand the objectives of using Linked Data in Education. It also analyzes outdated and trending Linked Data tools and vocabularies in the Educational context. This paper supplements previous works by presenting a comprehensive survey with a total of 114 papers reviewed out of 1,384 pre-selected papers in the field (see selection criteria in Section 3). The analysis of a larger volume of papers provides more evidence regarding which tools, vocabularies, and datasets are being used by LD initiatives in Education. Additionally, the present paper provides an overview of the current challenges faced in each stage of the process of applying LD in Education.
3 SYSTEMATIC REVIEW AND SYSTEMATIC
MAPPING
The selection of papers reviewed in this study was per-
formed using a systematic mapping process. According to
Kitchenham and Charters [36], a systematic mapping
should allow guiding the focus of future systematic re-
views while also identifying areas for further primary
studies to be conducted. As stated by Kitchenham and
Charters [36], “the well-defined methodology makes it
less likely that the results of the literature are biased, alt-
hough it does not protect against publication bias in the
primary studies”.
Therefore, following strictly the recommendations
proposed by Kitchenham and Charters [36], the present
survey was conducted using the following steps: planning,
implementation, and reporting.
3.1. Planning
According to Biolchini et al. [37], in the planning phase,
the research objectives are listed, and the research ques-
tions are formulated, thereby generating a search string.
The present systematic mapping was designed to an-
swer two primary research questions:
(RQ01) What are the objectives that Linked Data is
being used for in the context of Education?
(RQ02) What are the Linked Data datasets used in
the context of Education?
Based on analysis of the studies performed in this area,
we also verified some secondary questions:
(SQ01) What ontologies and vocabularies are being
used in the publication of Linked Data in the con-
text of Education?
(SQ02) What tools are being employed to support
the process of using Linked Data in Education?
Although the initial objective was to answer questions
associated with the goals, datasets, ontolo-
gies/vocabularies, and tools related to Linked Data in the
context of Education, the review of 114 papers, as well as
the numerous research references, allowed us to amplify
the discussion, resulting in an up-to-date survey of the
area.
So, the search queries were formulated in three stages:
(i) creating search questions based on PICOC (Population,
Intervention, Comparison, Outcome, and Context) [36];
(ii) identifying synonyms for the keywords; and (iii) con-
structing a search string.
Based on the definition of the research question, it was
possible to extract the keywords (linked data, education)
and their synonyms (linked open data, open data, learning).
The combination of these terms resulted in the following
search string:
(“Linked data” OR “Open data” OR “Linked Open Data”) AND (“Education” OR “Learning”)
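The boolean structure of this search string can be sketched in Python as follows. The matching rule (case-insensitive substring search over a title or abstract) and the sample records are assumptions for illustration only; each digital library applies its own matching semantics.

```python
# Term groups taken from the search string; the matching rule is an assumption.
DATA_TERMS = ("linked data", "open data", "linked open data")
EDU_TERMS = ("education", "learning")

def matches_search_string(text):
    """Return True if text contains at least one term from each group."""
    lowered = text.lower()
    has_data = any(term in lowered for term in DATA_TERMS)
    has_edu = any(term in lowered for term in EDU_TERMS)
    return has_data and has_edu

# Hypothetical records, not taken from the mapped studies.
records = [
    "Publishing university course catalogs as Linked Data for education",
    "A relational schema for payroll systems",
    "Open data portals and machine learning",
]
selected = [r for r in records if matches_search_string(r)]
print(selected)
```

The third record matches even though it is not about education, which suggests why a broad string of this kind retrieves many studies that are later discarded during selection.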
During the development of the protocol, inclusion criteria were also defined (proposals using LD in the context of Education), as well as criteria for exclusion of articles, as follows: (1) derivative works; (2) works not available in English, Portuguese or Spanish; (3) articles lacking online access; (4) short papers (articles having fewer than 4 pages); (5) studies having technical problems in cataloging or retrieval; (6) descriptions of workshops and/or conferences; (7) studies not in the field of learning technologies; (8) studies not related to LD; (9) studies not covering the use of datasets in education; and (10) studies on LD where the data was not actually linked.
In order to verify the quality of the search string, con-
trol articles were considered [9][38][39][40][41]. The ex-
pectation was that the control articles should be retrieved
using the defined search string. All the control articles
were indeed retrieved. Some of them were discarded in
the selection steps because they were secondary papers.
The tools used to support the implementation of the systematic mapping were: (1) StArt⁷ (State of the Art through Systematic Review): a tool responsible for supporting the entire mapping process, from planning to final reporting, as well as the stages of filtering and selecting articles; (2) Mendeley⁸: used to support reading and marking the papers retrieved and managing the references; (3) Excel: to support extraction and manipulation of data, and generation of reports.
3.2. Implementation of the Systematic Mapping
Once the protocol was defined, the systematic mapping was started by applying the search string to the following six databases: ACM Digital Library⁹, IEEE Xplore Digital Library¹⁰, Springer¹¹, Scopus¹², ScienceDirect¹³ and Web of Science¹⁴. Adjustments in the search string were needed for some databases to follow their specific formats, without, however, changing the terms.
1,957 papers were retrieved, of which 573 were dupli-
cates, thus leaving 1,384 articles to be analyzed. The dis-
tribution of articles retrieved from each database is shown
in Table 1.
⁷ http://lapes.dc.ufscar.br/tools/start_tool
⁸ https://www.mendeley.com/
⁹ http://dl.acm.org/advsearch.cfm
¹⁰ http://ieeexplore.ieee.org/search/advsearch.jsp
¹¹ https://www.springer.com
¹² https://www.scopus.com/
¹³ http://www.sciencedirect.com/science/search
¹⁴ https://webofknowledge.com/
In the first step of the article selection process, the titles
and abstracts of the papers were read, resulting in 1,157
being discarded, and 227 being selected for the subse-
quent step of the more detailed analysis. A summary of
the number of studies at each step is shown in Table 2.
In the second stage, a closer analysis of the papers was
performed, by reading the abstracts and the whole paper.
111 studies were rejected using the exclusion criteria al-
ready mentioned. Also, two duplicate studies were identi-
fied. Thus, 114 papers were selected for the data extrac-
tion step.
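The counts reported for the retrieval and the two selection stages can be cross-checked with simple arithmetic. The sketch below uses only the figures stated above; the variable names are illustrative.

```python
retrieved = 1957
duplicates_initial = 573
to_analyze = retrieved - duplicates_initial        # papers left after removing duplicates

discarded_stage1 = 1157
selected_stage1 = to_analyze - discarded_stage1    # papers kept after reading titles and abstracts

rejected_stage2 = 111
duplicates_stage2 = 2
selected_final = selected_stage1 - rejected_stage2 - duplicates_stage2

print(to_analyze, selected_stage1, selected_final)  # prints: 1384 227 114
```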
An overview of the studies by date shows (Figure 1)
that the topic started being researched in 2009, with sig-
nificant growth in the following years through 2012. The
same behavior can be observed in the study by Vega-
Gorgojo et al. [35]. The topic reached a new level of
growth in 2015. However, it is not possible to state
whether there is a growth trend since the search cut-off
point for studies in 2016 was April of that year.
4 OVERVIEW OF LINKED DATA IN EDUCATION
According to Dietze et al. [33], the adoption of LD princi-
ples can benefit the broad integration of educational data,
which involves two steps:
1. Educational Data Integration: this deals with the
integration of heterogeneous educational data,
through its exposure and interconnection with
other data sets.
2. Educational Services Integration: this seeks to re-
solve heterogeneity between standards adopted by
different APIs (Application Programming Inter-
faces) and repositories.
Linked Data has potential for use in Education due to the open and accessible nature of the educational data and resources produced by many universities [4]. While the content of open educational resources is, by definition, accessible and reusable, publishing its corresponding metadata as LD can make the content of different repositories more discoverable, accessible and connectable.
FIGURE 1 - NUMBER OF PAPERS PER YEAR (x-axis: year, 2009 to 2016; y-axis: number of papers)
TABLE 1
DISTRIBUTION OF PAPERS BY RESEARCH DATABASE
Research database
ACM
IEEE
Springer
Scopus
ScienceDirect
Web of Science
TABLE 2
NUMBER OF STUDIES ANALYZED AT EACH STAGE
STUDIES | SELECTION | EXTRACTION
INITIAL NUMBER OF STUDIES | 1,957 | 227
ACCEPTED | 227 | 114
DISCARDED | 1,157 | 111
DUPLICATES | 573 | 2
Various initiatives for publishing data on the Web in an open, structured and connected manner are noteworthy in the context of education:
1. LinkedEducation.org¹⁵ is a project focused on sharing resources and information related to LD in education.
2. The University of Southampton¹⁶ provides public access to some of its administrative data to allow the construction of tools leveraging this data, with the primary objective being transparency of its actions.
3. The Open University in the UK¹⁷ was the first educational organization to create an LD platform exposing information from all its departments, courses, multimedia and scientific publications [42].
Table 3 shows the main institutions, ordered by the
number of publications retrieved and analyzed, that are
researching and applying linked data technologies in the
educational context.
In this research, we examined 114 proposals to identify the objectives for which Linked Data is being used in the context of Education (RQ01), that is, how linked data are contributing to solving problems in the educational field.
The analysis of the proposals indicates that the use of
Linked Data in Education has three main objectives:
1. Availability of educational data as Linked Data;
2. Educational data and systems integration and in-
teroperability from different sources; and,
3. Consumption of Linked Data for different educa-
tional purposes.
¹⁵ http://linkededucation.org/
¹⁶ http://data.southampton.ac.uk/
¹⁷ http://data.open.ac.uk
A general discussion about the objectives is presented with examples of Linked Data use. Furthermore, this section also provides an overview of “What tools are being employed to support the process of using Linked Data in Education?” (SQ02).
TABLE 3
SOME INSTITUTIONS THAT ARE RESEARCHING AND APPLYING LINKED DATA TECHNOLOGIES IN THE EDUCATIONAL CONTEXT.
INSTITUTION | COUNTRY
L3S Research Center | Germany
National Research Council of Italy Inst. for Educational Technology | Italy
Departamento de Ciencias de la Computación Universidad Técnica Particular de Loja, UTPL Loja | Ecuador
Democritus University of Thrace/School of Medicine, Alexandroupoli | Greece
Facultad de Informática Universidad Politécnica de Madrid | Spain
Dept of Electronics and Information Systems – Multimedia Lab Ghent University – IBBT | Belgium
Olayan School of Business American University of Beirut | Lebanon
Medical Informatics Laboratory, School of Medicine, Aristotle University of Thessaloniki | Greece
Web Science Program School of Mathematics Aristotle University of Thessaloniki | Greece
The Open University | UK
Dipartimento di Ingegneria Elettrica, Elettronica e Informatica, Università di Catania | Italy
School of Medicine, Democritus University of Thrace | Greece
Information Engineering Research Unit, Computer Science Department, University of Alcalá | Spain
School of Telecommunication Engineering, University of Valladolid | Spain
Computer Science Department of the University of Alcalá | Spain
4.1 Availability of educational data as Linked Data
The area of Technology Enhanced Learning has long been
striving for wider reuse and sharing of resources, and the
use of Linked Data has been an alternative.
Note that more than 63% of the studies are focusing on
issues related to the sharing and publishing of education-
al resources as Linked Data. This trend is quite under-
standable when considering two important aspects of this
process. The first aspect is that the initiative of providing
open data, in a structured format, using a semantic repre-
sentation, is still very recent. Many of the studies sur-
veyed are still in the initial stages of the process of pub-
lishing and using (consuming) LD. The second aspect is
that many data repositories, from different domains, must
go through a series of standardization, analysis and ad-
justment processes to be made available using the LD
principles. The challenges involved in this data prepara-
tion process motivate various studies.
According to the literature mapping, the nature of the
data being published is diverse: data from courses offered
by universities [5][6], statistical data [7], organizational
data [8][9], educational resources such as videos, presen-
tations, lectures, books and games [10][11][12], as well as
the publication of tools to support educational practices
[13][14].
The publication of data can follow the 5-star deployment scheme suggested by Tim Berners-Lee [2]. Some tools can support the 5-star Linked Data publishing process; they are organized here according to their function: tools for converting and mapping data to RDF, tools for storing and accessing LD, and tools for data interlinking.
A. Tools for converting and mapping data to RDF
Many tools have been developed and made available to assist data publishers in the process of transforming and publishing data following LD standards. The transition from data represented in spreadsheets, e-mails, or even relational databases to LD is not trivial and demands that the publishers know vocabularies that can describe their data as well as how to connect their data to other datasets.
The D2RQ¹⁸ platform [43] is the most used in the studies analyzed. The D2RQ platform provides tools for transforming a relational database to RDF. These include the D2R Server for publishing relational databases on the Semantic Web and the D2RQ mapping tool for the creation of mappings between a relational database and a specific RDF schema. For example, Rajabi et al. [44] use the D2RQ platform to expose some learning object elements of the GLOBE repository, aiming to evaluate the interlinking results between the GLOBE repository and several educational datasets on the Web of Data.
Like D2RQ, Triplify¹⁹ and OpenRefine²⁰ are used to publish RDF and Linked Data from relational databases. Triplify [17] is used in the MetaMorphosis+ environment [45] for publishing, sharing and repurposing educational content in medical education. OpenRefine is used by Penteado [30] to extract and convert the data from the data sources, reconcile them with DBpedia, and generate RDF triples.
¹⁸ http://d2rq.org/
¹⁹ http://semanticweb.org/wiki/Triplify.html
²⁰ http://openrefine.org/
Although there are tools to support the data publication step, they often do not capture the complex domain semantics required by many educational applications. A lot of work is needed to create mappings manually, as discussed by Sahoo et al. [46]. In September 2012, W3C published the R2RML recommendation [47], a standard language to describe mappings between a relational database and an equivalent RDF dataset, encouraging RDB-to-RDF tool developers to comply with a standard mapping language.
Because of the difficulty in finding solutions focused on the educational context, some proposals have developed customized tools. Pirrotta [7] developed a tool to convert and publish statistical data about University student activities. Metadata mapping processes were proposed by Thangsupachai et al. [48] to convert a database to RDF and then to Linked Open Data in the particular case of mathematics courses, using a variety of tools.
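The kind of relational-to-RDF conversion that these tools perform can be illustrated with a minimal sketch. It is not D2RQ, Triplify, OpenRefine, or R2RML: the table, the example.org URI pattern, and the column-to-property mapping are all hypothetical, and only the Python standard library is used.

```python
import sqlite3

# Hypothetical mapping: one table row becomes one subject, one column becomes one property.
BASE = "http://example.org/course/"
COLUMN_TO_PROPERTY = {
    "title": "http://purl.org/dc/terms/title",
    "language": "http://purl.org/dc/terms/language",
}

def rows_to_triples(conn):
    """Yield (subject, predicate, literal) triples for every row of the course table."""
    cursor = conn.execute("SELECT id, title, language FROM course ORDER BY id")
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{record['id']}"
        for column, prop in COLUMN_TO_PROPERTY.items():
            yield (subject, prop, record[column])

# In-memory database with made-up rows, used only to exercise the mapping.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT, language TEXT)")
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [(1, "Linear Algebra", "en"), (2, "Calculo I", "pt")],
)
triples = list(rows_to_triples(conn))
for s, p, o in triples:
    print(f'<{s}> <{p}> "{o}" .')
```

In practice such a mapping would typically be declared in a mapping language rather than hard-coded, which is the gap a standard such as R2RML is meant to address.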
B. Tools for storing and accessing Linked Data
After the data have been converted, the next step is to store them. A triplestore²¹ is a type of database management system that uses RDF as its data model. These tools often provide access to their data via SPARQL endpoints to support queries. Sesame²² and OpenLink Virtuoso²³ were the most mentioned tools in the studies analyzed. Other triplestore systems mentioned were: AllegroGraph²⁴, Mulgara²⁵, ARC2²⁶, Apache Jena Fuseki²⁷ and BigOWLIM²⁸.
Note that triplestores are general-purpose systems designed to store and retrieve data through semantic queries. Accordingly, none of the papers analyzed mentioned specific educational needs regarding them. A comparative analysis between triplestore tools can be found in [49][50].
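To make the idea of a semantic query concrete, the sketch below matches a single triple pattern against an in-memory list of triples, which is the basic building block of a SPARQL query. It is a toy illustration, not the API of any triplestore mentioned above; the data, the prefixed names, and the `match` helper are hypothetical.

```python
# Toy in-memory "store": a list of (subject, predicate, object) tuples.
STORE = [
    ("ex:course1", "rdf:type", "ex:Course"),
    ("ex:course1", "dcterms:title", "Linear Algebra"),
    ("ex:course2", "rdf:type", "ex:Course"),
    ("ex:video1", "rdf:type", "ex:Video"),
]

def match(store, pattern):
    """Return variable bindings for one triple pattern.

    Pattern terms starting with '?' are variables; other terms must match exactly.
    """
    results = []
    for triple in store:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # Reached only when no term failed to match.
            results.append(binding)
    return results

# Roughly equivalent to: SELECT ?c WHERE { ?c rdf:type ex:Course }
courses = match(STORE, ("?c", "rdf:type", "ex:Course"))
print(courses)
```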
Despite the increase of tools for publishing and storing Linked Data on the Web, their use is not a trivial task and requires technical knowledge and understanding of Semantic Web concepts and technologies. Most of the papers analyzed needed to use several different types of tools or even create tools to support the data publishing process. Alcantara et al. [51] propose the development of a user-friendly interface embedded in data provider systems to support the transformation of data into different formats.
On the other hand, data publication demands knowledge of the education domain to guide the creation of mappings that encompass educational issues. The combination of efforts from both areas (education and technology) can enhance the quality of specific tools and approaches for educational data publication. A listing of other tools can be found in several studies [31][32][33][35][52].
4.2 Educational data and systems integration and
interoperability
One of the major challenges in Learning Technologies is the integration and interoperability of educational systems, resources and data. This challenge is due to the variety of systems being used around the world and the need for reuse of educational resources and data.
²¹ http://www.dataversity.net/introduction-to-triplestores/
²² https://www.w3.org/2001/sw/wiki/Sesame
²³ https://virtuoso.openlinksw.com/
²⁴ http://franz.com/agraph/allegrograph/
²⁵ http://www.mulgara.org/
²⁶ https://github.com/semsol/arc2/wiki
²⁷ http://jena.apache.org/documentation/fuseki2/index.html
²⁸ [Bishop et al. 2011]
In recent years, many initiatives have sought to create
vocabularies and standards that allow the integration and
interoperability of educational data. The multiple results
of these efforts eventually created different patterns, us-
ing different models, technologies, and languages.
As explained by Zouaq et al. [53], Linked Data is a potential solution, as it allows mapping between various data models. Thus, semantically similar models that are represented differently can still be aligned using links to establish meaningful connections between concepts from different models.
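The alignment idea can be sketched as a set of mapping links from the properties of two metadata models to a shared vocabulary. The model prefixes (`modelA`, `modelB`), their property names, and the choice of Dublin Core terms as the shared vocabulary are assumptions made for this illustration.

```python
# Hypothetical property names from two metadata models that describe the same things differently.
MAPPINGS = {
    # source property -> shared property
    "modelA:title": "dcterms:title",
    "modelB:name": "dcterms:title",
    "modelA:lang": "dcterms:language",
    "modelB:idiom": "dcterms:language",
}

def align(record):
    """Rewrite a record's properties to the shared vocabulary; unmapped properties are kept."""
    return {MAPPINGS.get(prop, prop): value for prop, value in record.items()}

record_a = {"modelA:title": "Linear Algebra", "modelA:lang": "en"}
record_b = {"modelB:name": "Linear Algebra", "modelB:idiom": "en"}

# After alignment, the two differently modeled records become directly comparable.
print(align(record_a) == align(record_b))
```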
Some proposals specifically investigated the possibility of system and data interoperability using LD technologies [13][54][55][56][57], through the interlinking of educational data and resources across different datasets [54][55][57], the generation of links between different datasets, the definition of mappings between different educational datasets [56], and the evaluation of the interlinking of two datasets [44][58].
A. Tools for Data interlinking
Once data are made available, it is recommended to connect them with other datasets to achieve the highest level of maturity in data provision [2]. Some of the tools mentioned in the studies analyzed that can support the data interlinking process are: Silk²⁹, LIMES³⁰, UCI [59] and RDF-IA [60].
According to Lehmann et al. [61], Silk and LIMES are the most popular frameworks among LD publishers. Silk is used by Konstantinidis et al. [62] to discover relationships between the metadata of educational resources and other data sources on the Web and to create RDF links between them. LIMES is used by Rajabi et al. [44] to interlink the GLOBE repository to 20 datasets in the Linked Open Data (LOD) cloud. When a new dataset is published on the Web of Data, it is important, as recommended by [2], to identify other possible datasets to interlink with. However, identifying these datasets is not a trivial task. For instance, Leme et al. [63] and Wölger et al. [64] suggest techniques for recommending candidate datasets based on the vocabularies, classes, and properties commonly used. A comparative analysis of the interlinking tools is provided by Wölger et al. [65] and Rajabi et al. [66]. Techniques and common issues for the data interlinking process can be found in [20][52][58][67][68][69].
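The core idea behind link discovery, comparing descriptions of resources in two datasets and emitting a link when they are similar enough, can be sketched as follows. This is not how Silk or LIMES are implemented or configured; the similarity measure (`difflib.SequenceMatcher` over lowercased labels), the 0.85 threshold, and the sample resources are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical resources: URI -> label.
local = {
    "http://example.org/lo/1": "Introduction to Photosynthesis",
    "http://example.org/lo/2": "Organic Chemistry Basics",
}
external = {
    "http://example.net/res/a": "introduction to photosynthesis",
    "http://example.net/res/b": "Medieval History",
}

def discover_links(source, target, threshold=0.85):
    """Return sameAs triples for label pairs whose similarity reaches the threshold."""
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            score = SequenceMatcher(None, s_label.lower(), t_label.lower()).ratio()
            if score >= threshold:
                links.append((s_uri, OWL_SAME_AS, t_uri))
    return links

links = discover_links(local, external)
print(links)
```

Label similarity alone is a weak signal, so links produced this way would normally be reviewed or combined with further evidence before publication.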
4.3 Consumption of Linked Data in Education
Once the data are made available, there are significant opportunities for the development of innovative solutions, either through analysis of the current educational context, leading to better-informed decisions directed at the problems revealed, or through the development of applications that explore the available data and resources to improve the quality of education.
Some of the papers analyzed in the systematic review consume Linked Data from different datasets for different purposes, such as enrichment, recommendation and reuse. Some examples are the enrichment of books through association with other data [15][16][17][18][19][20]; the use of a museum or library collection, made available as LD, to support the explanation of a particular subject in the classroom, or the preparation of questionnaires from datasets of questions and tasks [70][71]; the recommendation (and customization) of educational content available in the Web of Data [21][22][23][24][25]; and the expansion of search terms [26][27].
29 http://silkframework.org/
30 http://aksw.org/Projects/LIMES.html
To semantically enrich an educational resource, systems often extract structured information from unstructured data and link it to external knowledge bases in the Linked Open Data (LOD) cloud, such as DBpedia. This annotation process can benefit systems by enhancing information retrieval and improving interoperability. Information retrieval is improved by the ability to perform searches which exploit the ontology to make inferences about data from heterogeneous resources; furthermore, annotations based on a common ontology can provide a common framework for the integration of information from heterogeneous sources [72].
A. Linked Data for Learning Analytics
The increase in exposing educational data through Linked
Data technologies creates the opportunity for retrieving
information and knowledge about educational resources,
learning processes, data interlinking between different
sources and analysis of educational content. These many
possibilities may contribute to the development of appli-
cations in the learning analytics (LA)/educational data
mining (EDM) field.
Zouaq et al. [53] discuss some opportunities and chal-
lenges associated with the use of Linked Data in educa-
tional data integration and analysis. They argue that the
use of LD is particularly relevant for the LA field because
it enables the collection and management of learner and
content data from a variety of sources (applications and
services) used in informal and lifelong learning.
The Learning Analytics and Knowledge (LAK) dataset³¹ [28] exposes a collection (over five years) of bibliographic resources about Learning Analytics and Educational Data Mining. The dataset reuses vocabularies and provides links to schemas and entity coreferences in related datasets. Penteado [30] and Maturana et al. [29] created applications that allow users to perform data analysis and to explore data about researchers, conferences, and publications available in the LAK dataset.
The application of LD in learning analytics (LA) is the subject of several proposals. Reflection on and prediction of trends in Personal Learning Environments (PLE) is a primary focus of [23], which uses semantic context modeling and the creation of Linked Data from logs. Meymandpour and Davis [73] developed a metric to rank universities based on educational Linked Data. The proposed metric uses the quality and value of some semantic properties to identify the relative position of universities worldwide. Penteado [30] uses both educational and economic indicators to analyze the performance of Brazilian schools. The datasets containing the indicators were made available as Linked Data and enriched with indicator information extracted from DBpedia, allowing a comparative analysis.
31 http://lak.linkededucation.org/
B. Tools for Consumption of Linked Data
DBpedia offers some tools to support the document enrichment process. DBpedia Spotlight³² annotates text by adding markup with semantic information around atomic elements (entities). These entities are the ones found in the DBpedia dataset, and each one contains structured information extracted from Wikipedia. The DBpedia Lookup Service can be used to look up DBpedia URIs by related keywords.
In the educational context, tools are used to extract and
enrich entities found in educational forums [74]. Mudu et
al. [10] present a software architecture for linking the
SCORM standard to the LOD cloud. Zenaminer [10] al-
lows users to enrich the SCORM resources with com-
ments. Unstructured comments are automatically anno-
tated with DBpedia Spotlight linking them to the LOD
cloud through DBpedia. Chae et al. [75] present a system
that can help users to provide various types of multime-
dia data as linked data and considers RelFinder [5] as a
tool for exploration of the semantic relationships between
the multimedia resources.
Consumption may also happen directly inside applications, using APIs or SPARQL endpoints. One example is the SPARQL endpoint for DBpedia³³, which allows queries to be performed against the data, taking advantage of connections between different datasets.
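As an illustration of consumption through a SPARQL endpoint, the following sketch builds an HTTP GET request following the SPARQL 1.1 Protocol (the query is passed in the `query` parameter) and parses a result document in the standard SPARQL JSON results format. The query itself, the function names and the sample response are assumptions made for this example; the sample response is hand-written so that the parsing step can be shown without network access, and it should not be read as actual DBpedia output.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"

# Illustrative query: English label of one DBpedia resource.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Linked_data> rdfs:label ?label .
  FILTER (lang(?label) = "en")
} LIMIT 1
"""


def build_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def parse_bindings(payload: str, variable: str) -> list:
    """Extract the values bound to `variable` from a SPARQL JSON result."""
    data = json.loads(payload)
    return [row[variable]["value"]
            for row in data["results"]["bindings"] if variable in row]


def run_query(endpoint: str, query: str, variable: str) -> list:
    """Send the query over the network (not called in this sketch)."""
    with urlopen(build_request(endpoint, query), timeout=30) as response:
        return parse_bindings(response.read().decode("utf-8"), variable)


request = build_request(DBPEDIA_ENDPOINT, QUERY)

# Hand-written response in the SPARQL JSON results format (an assumption,
# used only to demonstrate parsing offline).
sample = ('{"head": {"vars": ["label"]}, "results": {"bindings": '
          '[{"label": {"type": "literal", "xml:lang": "en", '
          '"value": "Linked data"}}]}}')
labels = parse_bindings(sample, "label")
print(request.full_url[:40], labels)
```

An application would call `run_query` instead of parsing a canned string; given the availability problems reported for public endpoints, a timeout and error handling around that call are advisable.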
Although enrichment has been seen as an alternative in the educational domain, its costs are relatively high. Enrichment typically requires finding suitable Linked Data entities [76], and fully automatic annotation is still challenging [33][35].
C. Semantic Web Browsers
Consumption of LD is discussed by Dadzie and Rowe [77], who address the approaches and challenges involved in data visualization. In this scenario, semantic web browsers play an important role in supporting access to data. The authors divide semantic browsers into two types: text-based browsers and browsers offering data visualization options.
Text-based browsers display the data using text structures such as tables and lists and may have more advanced features, allowing navigation through the data. Semantic web browsers of this type include Marbles³⁴ and URIburner³⁵.
Browsers offering data visualization options use structures such as images, charts, graphs and timelines to present the data. Examples include DBpedia Mobile³⁶, IsaViz³⁷, OpenLink Data Explorer (ODE)³⁸ and Tabulator³⁹.
32 http://www.dbpedia-spotlight.org/
33 http://dbpedia.org/sparql
34 https://sourceforge.net/projects/marbles/files/marbles/Marbles%20Linked%20Data%20Engine%201.0/
35 http://linkeddata.uriburner.com/fct/
36 http://wiki.dbpedia.org/projects/dbpedia-mobile
37 https://www.w3.org/2001/11/IsaViz/ (latest version 2007)
38 http://ode.openlinksw.com/
39 https://www.w3.org/2005/ajar/tab.html
5. EDUCATIONAL DATASETS AND VOCABULARIES
5.1 Datasets
Datasets are particularly important for applications which need to consume data and for publishers and data providers who want to connect their data with other datasets. In the context of education, two types of datasets should be considered [31]: (1) those directly related to educational information, containing, for example, educational resources, institutional data and educational indicators; and (2) datasets from different domains (museums, agronomy, libraries, encyclopedias, etc.) which may be used in educational settings.
The number of educational datasets is continuously growing, making it difficult to provide an exhaustive listing in the present study. In this article, the datasets mentioned are those most frequently cited in the proposals analyzed and in the secondary studies retrieved by the search string. The list was also complemented by a search in the Open Data Inception⁴⁰ dataset repository.
The extraction of the main datasets used in the map-
ping studies provides support for answering the research
question (RQ02): What are the Linked Data datasets used in
the context of Education?
Based on analysis of the datasets, it was found that
there was significant use of datasets where the data is not
from a specific domain.
Some of the most cited datasets are DBpedia⁴¹, Freebase⁴² and GeoNames⁴³. The DBpedia project [78] provides data automatically extracted from Wikipedia, using structured information such as infobox tables, categorization information, geo-coordinates and information from external links. DBpedia contains many links to other datasets, such as Wikidata, OpenCyc⁴⁴, UMBEL⁴⁵, GeoNames, MusicBrainz⁴⁶, UniProt⁴⁷ and Bio2RDF⁴⁸. Freebase was a collaborative knowledge base acquired by Google Inc. in 2010. It has since been discontinued, and its data was transferred to Wikidata⁴⁹, a collaborative knowledge base project of the Wikimedia Foundation. GeoNames is a geographical database containing over 8 million geographical names, including cities, towns, currencies, mountains, etc. A comparison of some large datasets used in LD can be found in [79].
The data in these datasets is used for enriching educational content [10][12][80][81][82][83]; discovering new information which can help educational practices [22][84][26][24][85]; and connecting local datasets to the LOD cloud [86][87][88][89][90]. DBpedia, for instance, is used in [73] to analyze university rankings based on the universities' structured information. Robinson et al. [91] make use of the categories provided by DBpedia to select the most suitable categories for describing learning objects.
40 http://opendatainception.io/
41 http://wiki.dbpedia.org/
42 https://developers.google.com/freebase/
43 http://www.geonames.org/
44 http://sw.opencyc.org/
45 http://umbel.org/
46 https://musicbrainz.org/
47 http://www.uniprot.org/
48 http://bio2rdf.org/
49 https://www.wikidata.org/
Note that the most cited datasets are also known as hubs in the LOD cloud, i.e., datasets that are interlinked with many others. Interlinking a dataset with DBpedia and other hubs is essential to make it findable by others.
Datasets from different domains which may be used in educational settings include the TED Talks dataset⁵⁰ [92], which covers talks on a wide range of topics, and data from museum collections (British Museum⁵¹), cultural data (Europeana⁵²), and libraries (Bibliothèque Nationale de France⁵³).
Among the studies analyzed, some addressed the construction and operation of datasets in the fields of medicine, agriculture [93][94] and tourism [88]. Studies around agriculture cited organic.edunet⁵⁴, the NCBI Gene Expression Omnibus dataset⁵⁵, Agris⁵⁶, AGROVOC⁵⁷, ASFA⁵⁸ and JITA⁵⁹.
Datasets cited in the medical field included PubMed⁶⁰ and mEducator [62]. PubMed is a service of the US National Library of Medicine which includes citations from MEDLINE⁶¹ and other scientific journals in life sciences. The mEducator Linked Educational Resources dataset is intended to provide educational resources in an LD format. Resources for mEducator are focused on the medical field, covering content ranging from traditional teaching to open learning, as well as experimental studies.
Several datasets specific to Education were also frequently cited in the studies surveyed, such as that of the Open University [95]. The Open University publishes information about the courses it offers through its website and makes it available in RDF format.
OCW Consortium is one of the initiatives in the global
promotion of free and unrestricted access to academic
knowledge. To solve the problem of standardization and
interoperability of the linked datasets made available,
some studies [24][96] using LD are underway to create an
infrastructure to integrate and access content across dif-
ferent repositories of universities associated with OCW.
Another project cited by several studies [56][97][98][99][100][101] is LinkedUp⁶², which maintains a catalog and a data repository, both of which are relevant and useful in the educational setting. The objectives of the LinkedUp Dataset Catalog (or Linked Education Cloud) are to collect and make available all types of data sources relevant for Education, to provide a shared resource, and to develop the community interested in the Web of Data for Education [40]. The datasets are obtained from various sources, usually originating from a university which publishes its data, resource repositories or research. There are also government datasets which contribute significantly to linked datasets related to Education, for example, by providing statistics about the records and results of educational institutions.
50 http://data.linkededucation.org/request/ted/sparql
51 http://collection.britishmuseum.org/
52 http://labs.europeana.eu/api/linked-open-data-SPARQL-endpoint
53 http://data.bnf.fr/
54 http://data.organic-edunet.eu/sparql
55 https://www.ncbi.nlm.nih.gov/gds
56 http://202.45.139.84:10035/catalogs/fao/repositories/agris
57 http://202.45.139.84:10035/catalogs/fao/repositories/agrovoc
58 http://www4.fao.org/asfa/asfa.htm
59 https://datahub.io/dataset/jita
60 http://pubmed.bio2rdf.org/sparql
61 https://www.nlm.nih.gov/bsd/pmresources.html
62 http://data.linkededucation.org/linkedup/catalog/
Universities are also increasingly publishing their data, including administrative, academic, scientific, and educational information, as well as other data. These initiatives generate linked datasets containing data which can be reused. These linked datasets include those of the University of Southampton⁶³, Greek University Open Data⁶⁴ and the Linking Italian University Statistics Project⁶⁵.
The LAK dataset⁶⁶ provides a collection of data extracted from publications in the field of learning analytics and educational data mining. It is regularly updated with data from the LAK and EDM conferences.
Although a wide variety and number of linked datasets are available, they vary significantly in correctness, accessibility, and quality [102], which poses challenges for the reuse, retrieval, and linkage of existing linked datasets. According to Auer et al. [103], approximately 70% of the linked datasets listed in DataHub present issues, such as unresponsive SPARQL endpoints, lack of descriptive metadata and outdated data, that make them inaccessible or unreliable.
Due to these problems, there is a demand for (automat-
ic) approaches that constantly check the availability, qual-
ity, and reliability of linked datasets. Moreover, addition-
al information about the status of the dataset, topics cov-
ered and the ephemerality of the information may be
considered to help data consumers and maintainers.
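A minimal sketch of such an automatic availability check is given below. It probes each endpoint with an HTTP request and maps the resulting status code to a coarse category. The category names, the function names and the example.org URLs are assumptions for illustration; a real monitoring service would also need to assess metadata quality and data freshness, which this sketch does not attempt. The probing function is injectable so that the classification logic can be exercised without network access.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def classify_status(status):
    """Map an HTTP status code (or None) to a coarse availability label."""
    if status is None:
        return "unreachable"
    if 200 <= status < 300:
        return "available"
    if status == 404:
        return "not found"
    if 500 <= status < 600:
        return "server error"
    return "other"


def probe(url, timeout=10):
    """Return the HTTP status code for `url`, or None if no response."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status
    except HTTPError as error:       # the server answered with an error code
        return error.code
    except (URLError, OSError):      # DNS failure, refused connection, timeout
        return None


def availability_report(urls, fetch=probe):
    """Classify every URL; `fetch` can be replaced for offline testing."""
    return {url: classify_status(fetch(url)) for url in urls}


# Canned status codes for hypothetical endpoints (no network access needed).
canned = {
    "http://example.org/a/sparql": 200,
    "http://example.org/b/sparql": 503,
    "http://example.org/c/sparql": 404,
    "http://example.org/d/sparql": None,
}
report = availability_report(canned, fetch=canned.get)
for url, label in report.items():
    print(f"{label:13s} {url}")
```

Running `availability_report(list_of_urls)` with the default `probe` would perform real requests; some servers do not support HEAD on SPARQL endpoints, so a GET with a trivial query may be a more reliable probe in practice.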
5.2 Vocabularies
Ontologies have become popular in the educational field primarily because of what they promise: a shared and common understanding between people and application systems, reusable domain knowledge, explicit domain assumptions, separation of domain knowledge from operational knowledge, and a comprehensive analysis of domain knowledge [104].
Tiropanis et al. [105] state that the Linked Data move-
ment is related to an alternative approach to the develop-
ing of semantic technologies. Briefly, LD gives priority to
exposing data sources using lightweight vocabularies
first, to enable data interoperability and integration before
considering more complex ontologies in the context of
specific applications.
Reusing a vocabulary or ontology is equivalent to adopting a standard, i.e., a common language which people and machines can understand. This kind of standardization is crucial to improving discovery and reuse of educational data and resources, as well as allowing interoperability between repositories [4]. The use of common vocabularies to describe the subjects being dealt with allows the contents of these repositories to be globally addressable, and resources to be retrieved based primarily on their relevance, without needing to consider their origin [4].
63 http://sparql.data.southampton.ac.uk/
64 http://www.auth.gr/sparql
65 http://sw.unime.it:8890/sparql
66 http://lak.linkededucation.org/
The extraction of the main ontologies/vocabularies used in the mapping studies provides support for answering the secondary question (SQ01): What ontologies and vocabularies are being used in the publication of Linked Data in the context of Education?
The Dublin Core⁶⁷ is the most used vocabulary in the analyzed proposals to describe educational resources. Other vocabularies for the same purpose are LRMI⁶⁸, LOCO⁶⁹, the ALOCoM Core Ontology⁷⁰ (fragmentation of educational resources) and the Learning Object Metadata Ontology (a mapping of IEEE LOM (Learning Object Metadata)⁷¹ elements to RDF based on the Linked Data principles).
Some proposals need to describe educational resources of a specific domain, such as medicine, agriculture, arts, chemistry or music. For this reason, many vocabularies have emerged to represent the characteristics of a domain, for example, AGROVOC⁷², Body System⁷³, BRO Biomedical Resource Ontology⁷⁴, ChEBI⁷⁵, Galen⁷⁶, MeSH⁷⁷, Music Ontology⁷⁸ and PubChem⁷⁹.
The AIISO⁸⁰ vocabulary provides classes and properties to describe the internal organizational structure of educational institutions. Similar to AIISO are the Bowlogna ontology⁸¹, the Italian University Project (SWIUP) Ontology⁸² and the IntelLEO Organization Ontology⁸³. Other vocabularies and ontologies can be found at Linked Open Vocabularies (LOV)⁸⁴, BioPortal⁸⁵, JoinUp⁸⁶, schema.org⁸⁷ and https://linkededucation.wordpress.com/data-models/schemas/.
Some recommendations are provided on the use of vocabularies [4][35]: reuse existing vocabularies instead of creating new ones; select vocabularies based on their popularity, simplicity, consistency, scope, dataset compatibility and availability of documentation; and, where necessary, create vocabulary mappings to reconcile terms.
Note that even though several educational datasets are published following the LD principles, there are still a few challenges concerning the vocabularies that may hinder the creation of links between datasets. For instance, D’Aquin et al. [106] show that many linked datasets use specific ad-hoc vocabularies that are hardly adopted by others. Moreover, general-purpose vocabularies, such as FOAF, Dublin Core and SKOS, may not provide descriptive information about the nature of the dataset. Thus, the vocabularies used to represent a dataset must be chosen carefully; otherwise the reuse, enrichment, and linkage of educational resources may become more difficult. Although many efforts [104][107][108][109] have been devoted to the development and use of ontologies in the educational field, there is no ontology that fully describes the different domains and pedagogical issues related to the teaching and learning process.
67 http://dublincore.org/specifications/
68 http://lrmi.net
69 http://jelenajovanovic.net/LOCO-Analyst/loco.html
70 http://jelenajovanovic.net/ontologies/loco/alocom-core/spec/#overview
71 http://kmr.nada.kth.se/static/ims/md-lomrdf.html
72 http://aims.fao.org/vest-registry/vocabularies/agrovoc-multilingual-agricultural-thesaurus
73 https://bioportal.bioontology.org/ontologies/ICD11-BODYSYSTEM
74 https://bioportal.bioontology.org/ontologies/BRO
75 http://www.ebi.ac.uk/chebi/
76 https://bioportal.bioontology.org/ontologies/GALEN
77 http://bioportal.bioontology.org/ontologies/MESH
78 http://musicontology.com/
79 https://pubchem.ncbi.nlm.nih.gov/rdf/rdf1.6.html
80 http://vocab.org/aiiso/schema
http://diuf.unifr.ch/main/xi/bowlogna
hema.rdf
84 http://lov.okfn.org/dataset/lov/
85 http://bioportal.bioontology.org/
86 https://joinup.ec.europa.eu/catalogue/repository
87 http://schema.org/docs/schemas.html
6. CHALLENGES
Dietze et al. [33] studied the challenges and approaches involved in connecting educational resources, investigating the wealth of LD which already exists on the Web. The study deals with approaches for publishing and connecting educational resources and data under the LD principles, and with the possibility of enriching their content via connections between linked datasets.
The authors discuss four main challenges to achieving interoperability of educational resources on the Web: (1) integrating data distributed among heterogeneous educational repositories; (2) dealing with continuous changes in the repositories; (3) mediating and transforming the metadata which describes resources; and (4) enriching and interlinking unstructured metadata.
Regarding the challenges mentioned above, four approaches are also presented for building what the authors call Linked Education⁸⁸: (P1) Linked Data principles, applied to modeling and exposing metadata of educational resources and educational services and APIs; (P2) services integration, i.e., the integration of heterogeneous and distributed learning repositories through their Web interfaces (services); (P3) schema matching, the task of automatically finding correspondences between elements or concepts of two or more data models, to allow interoperability between the wide variety of learning repositories and proprietary educational schemas available on the Web; and (P4) data interlinking, clustering and enrichment, in which automated enrichment and grouping mechanisms are investigated to interlink the data produced by (P3) with existing linked datasets as part of the LD cloud.
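The schema matching task in (P3) can be illustrated with a very simple element-level matcher, sketched below. It normalizes field names and uses a small synonym table to find correspondences between a repository's local metadata fields and Dublin Core terms. This is not the approach of Dietze et al. [33]; the repository field names and the synonym table are hypothetical, and real matchers usually combine several signals (names, data types, instances, structure).

```python
import re


def normalize(name: str) -> str:
    """Lower-case a field name, dropping any 'prefix:' and punctuation.

    Intended for prefixed names such as 'dcterms:title', not full URIs.
    """
    local = name.split(":")[-1]
    return re.sub(r"[^a-z0-9]", "", local.lower())


# Hypothetical synonym table: local term -> target term (normalized forms).
SYNONYMS = {"author": "creator", "lang": "language", "keywords": "subject"}


def match_schemas(source_fields, target_fields, synonyms=SYNONYMS):
    """Return (matches, unmatched) for the source schema's fields.

    `matches` maps each matched source field to a target field;
    `unmatched` lists the fields left for manual mapping.
    """
    index = {normalize(t): t for t in target_fields}
    matches, unmatched = {}, []
    for field in source_fields:
        key = normalize(field)
        key = synonyms.get(key, key)
        if key in index:
            matches[field] = index[key]
        else:
            unmatched.append(field)
    return matches, unmatched


# Invented field names for a local repository schema.
repository_fields = ["Title", "Author", "lang", "Typical_Age_Range"]
dublin_core = ["dcterms:title", "dcterms:creator",
               "dcterms:language", "dcterms:subject"]

matches, unmatched = match_schemas(repository_fields, dublin_core)
print(matches)
print(unmatched)
```

Fields that remain unmatched, such as the invented `Typical_Age_Range`, are the cases where a domain-specific vocabulary or a manually curated mapping would be needed.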
Many of the challenges mentioned in the studies analyzed here are in line with those presented by Dietze et al. [33]. These are common challenges in the use of Linked Data in various areas. In this paper we expand the discussion of these challenges by focusing on the educational context. Some other challenges reported by the proposals analyzed were also mapped.
Even with the significant increase in educational resources available on the Web, reuse is still a challenge. According to Dietze et al. [110], some of the reasons include the lack of quality of resources and of their annotations, the diversity of metadata standards and vocabularies, the accessibility of data, and the often poorly maintained metadata descriptions, all of which raise concerns with respect to trust and reliability when attempting to reuse third-party data and resources.
88 http://linkededucation.org
A common problem in various areas is the difficulty in defining what can or should be made available. There are privacy and ownership issues regarding educational data, with considerable concern about the exposure of data, because data linked with other data might reveal sensitive and confidential information. As addressed in [111][32][76][51], this question is still a challenge requiring further discussion and a major commitment by various stakeholders and sectors.
Although the privacy of educational data (e.g., personal data, academic performance and grades) is a concern, the availability of these data is extremely important to understand the educational context, examine successful and unsuccessful strategies and promote an innovative environment in the educational area. For this reason, it is essential to design solutions that allow educational data to be published while maintaining the anonymity of individuals, and/or that publish data aggregated over multiple users, making it more difficult to infer individual characteristics.
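One simple way to publish data from multiple users rather than from individuals is to release only group-level aggregates and to suppress groups that are too small. The sketch below illustrates this idea; it is an assumption made for illustration rather than a technique taken from the surveyed proposals, the records are invented, and the minimum group size of 5 is arbitrary. Suppressing small groups reduces, but does not eliminate, the risk of re-identification once the published data are linked with other datasets.

```python
from collections import defaultdict
from statistics import mean


def aggregate_grades(records, group_key, min_group_size=5):
    """Aggregate grades per group, dropping groups below the minimum size.

    Each record is a dict with a 'grade' value and a `group_key` value.
    Student identifiers never appear in the output.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record["grade"])
    return {
        group: {"students": len(grades), "mean_grade": round(mean(grades), 2)}
        for group, grades in groups.items()
        if len(grades) >= min_group_size
    }


# Invented records: five students in one course, two in another.
records = [
    {"student": "s1", "course": "math", "grade": 6.0},
    {"student": "s2", "course": "math", "grade": 7.0},
    {"student": "s3", "course": "math", "grade": 8.0},
    {"student": "s4", "course": "math", "grade": 9.0},
    {"student": "s5", "course": "math", "grade": 10.0},
    {"student": "s6", "course": "physics", "grade": 5.0},
    {"student": "s7", "course": "physics", "grade": 9.5},
]

published = aggregate_grades(records, group_key="course")
print(published)
```

The resulting per-course aggregates could then be converted to RDF and published, while the individual records stay in the institution's private systems.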
The effort required to publish data as LD, as well as
the computational costs involved in its consumption, is a
problem discussed by Vega-Gorgojo et al. [35]. Depend-
ing on the volume of the knowledge base, the potential
for interactivity in LD may require a high computational
cost for data processing and availability. The authors
argue that although the impact of this problem is still
considerable in the education sector, it is not limited to
this context. Therefore, improvement can be expected
because of advances in LD discussions including, for
example, advances in developing practical methods,
standards, and tools to assist the publishing and interlink-
ing processes. Low quality in publication and interlinking
of educational data and resources is still an open issue,
discussed by several authors [18][111][76][112][113][51].
Dealing with inaccurate, incomplete and unexpected
information from the Web of Data becomes a problem
when there is no control in the publishing process and the
nature of the data is very diverse.
In the educational context, continuous changes are oc-
curring in repositories. As the context is constantly chang-
ing throughout the learning activities, Navarrete and
Lujan-Mora [34] explain that adjustments are needed
continuously to correctly represent the state of how peo-
ple, groups and activities are organized. Controlling such
changes and keeping them updated in the linked data sets
is a major challenge in using Linked Data in Education.
Considering the continuous changes occurring in re-
positories, Vasconcellos et al. [114] discussed automatic
ontology refinement as one possible way to allow updates
to ontologies so that they can reflect updates to the do-
main. Ontology refinement is the process which aims to
keep ontology coherent with the domain that it repre-
sents.
The problem of dataset unavailability and several other challenging issues are discussed in [115][116]. The unavailability of linked datasets was also noticed while performing the mapping in the present study, since many linked datasets mentioned in the analyzed studies are no longer available. The most common status codes returned indicate a failure of the server to provide a valid response, such as ‘500 Internal Server Error’, ‘502 Bad Gateway’ and ‘503 Service Unavailable’, as well as ‘404 Not Found’.
Low quality of data results in additional complexity to
retrieving data even when it is openly available using
standards facilitating its discovery
[18][111][76][89][51][88].
Both the low quality of the data and the unavailability of some linked datasets can have important impacts in the educational context. The Learning Analytics area relies on reliable and available information to analyze educational performance, generate indicators and make decisions. An unavailable or outdated dataset may, for example, lead to misleading analyses.
The use of common vocabularies and completeness in
publication would be a way to alleviate the problems of
information retrieval, standardization in publishing and
interoperability between different educational reposito-
ries. However, the diversity of data in the context of edu-
cation often means that the vocabularies which are
known and available are not sufficient to ensure quality
in publication while also respecting the specificity of var-
ious subdomains in Education. Therefore, there is a need
for understanding the vocabularies available for Educa-
tion.
Another challenge in the use of vocabularies in the
context of Education [76] involves the need to build do-
main ontologies for different target domains. In addition,
it is important to make these vocabularies available in
service annotation knowledge bases, significantly im-
proving vocabulary reuse. Mapping between narrower
vocabularies and broader, more widely used, vocabular-
ies is also discussed as an alternative to address the spe-
cifics of educational data.
The use of metadata standards in educational resource repositories could be a facilitator in providing educational resources as Linked Data and ensuring interoperability of repositories. However, certain challenges remain, as discussed in [117][70][89]. While the main repositories adopt metadata standards (Learning Object Metadata, Dublin Core, etc.), other repositories describe their resources using their own methods, such as XML-based schemas and heterogeneous taxonomies. Consequently, the transformation of schema and data to RDF is a continuing problem. Thus, the translation of existing metadata standards to RDF vocabularies is one of the main challenges.
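To illustrate what such a translation involves at the level of a single record, the sketch below converts a metadata record, represented as a plain dictionary, into N-Triples statements that use Dublin Core terms. The field-to-property mapping, the record and the subject URI are hypothetical; fields without a known mapping are skipped, which is exactly where repository-specific schemas require additional mapping work. All values are emitted as plain string literals, a simplification that ignores datatypes and language tags.

```python
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical mapping from local metadata fields to Dublin Core properties.
FIELD_TO_PROPERTY = {
    "title": DCTERMS + "title",
    "description": DCTERMS + "description",
    "identifier": DCTERMS + "identifier",
}


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and newlines for N-Triples."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n"))


def record_to_ntriples(subject_uri, record, mapping=FIELD_TO_PROPERTY):
    """Return one N-Triples line per mapped field of `record`."""
    lines = []
    for field, value in record.items():
        prop = mapping.get(field)
        if prop is None:
            continue  # no known RDF property for this local field
        literal = escape_literal(str(value))
        lines.append(f'<{subject_uri}> <{prop}> "{literal}" .')
    return lines


# Invented record; 'local_rank' has no mapping and is therefore dropped.
record = {
    "title": 'Intro to "RDF"',
    "description": "A short course.",
    "local_rank": 3,
}
triples = record_to_ntriples("http://example.org/resource/42", record)
for line in triples:
    print(line)
```

In practice an RDF library would normally handle serialization; the manual escaping here only keeps the sketch dependency-free.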
The whole argument regarding new knowledge dis-
covery, interoperability between repositories and enrich-
ment of educational resources is supported by the possi-
bility of linking data from different linked data sets. The
data interlinking step also presents various challenges
which are related, for example, to identifying correspond-
ing instances, discovering semantic relations between
data, scalability, and need for developments in tools and
practices. These are challenges that still need to be ad-
dressed in the LD area. The evolution of research will
undoubtedly have an impact in the context of Education.
A far-ranging discussion regarding the challenges in-
volved in data interlinking can be found in [67][65]. As-
pects which are more specific to the context of Education
are addressed in [89][52]. The authors explain that the
existing metadata for educational resources is usually
provided based on informal and unstructured data, and
the use of controlled vocabularies is limited and frag-
mented. To enable machine processing and interoperabil-
ity, educational metadata needs to be enriched, trans-
forming it into structured and formal descriptions, and
linking it to LD vocabularies and linked datasets widely
established on the Web.
7. FINAL CONSIDERATIONS
The LD area is constantly evolving and many challenges continue to motivate the development of research in this field. The proposals analyzed in this work showed that there is an effort by different institutions to apply the LD principles and technologies to solve known problems around Educational Technologies, such as interoperability of educational data and resources, access to data for various analyses, enrichment of material, and recommendation and personalization of content.
The study conducted in the present work aimed to map the
Linked Data research scenario within the context of
Education. To this end, the literature review sought to
identify the proposals, the objectives for using LD, and
the tools, linked datasets and vocabularies used for LD in
Education. The primary goal was to provide an overview of
how this topic is being investigated, while also guiding
future decisions in projects seeking to adopt LD in
Education. Analyzing a large volume of papers (114 out of
1,384 pre-selected papers) also made it possible to obtain
evidence of how the results of recent research on Semantic
Technologies in education, more specifically the Linked
Data movement, are being applied in practice. Thus, it is
possible to understand which approaches, technologies,
tools, ontologies and linked datasets are being adopted
and what challenges the projects encounter.
In addition, the present study has identified challenges
currently faced in each stage of the process, while also
providing an overview of the current landscape of the
area, which can serve to guide future work and suggest
possible avenues for further research.
ACKNOWLEDGMENT
This work was partially supported by FAPERJ (through
grant E-26-102.256/2013 - BBP/Bursary Associa: Exploring
a Semantic and Social Learning-Teaching Environment),
CNPq (project: 312039/2015-8 DT/Bursary Integrating
Pedagogical Practices and Methods and Tools of
Educational Data Analysis) and the National Institute of
Science and Technology (INCT) on Web Science.
REFERENCES
[1] C. Bizer, T. Heath, and T. Berners-Lee, Linked Data - The
Story So Far, Int. J. Semant. Web Inf. Syst., vol. 5, no. 3, pp.
1-22, 2009.
[2] T. Berners-Lee, Linked Data, 2006. [Online]. Available:
http://www.w3.org/DesignIssues/LinkedData.html.
[3] O. Lassila and R. R. Swick, Resource Description
Framework (RDF) Model and Syntax Specification,
Feb. 1999.
[4] M. d'Aquin, Linked Data for Open and Distance
Learning, 2012.
[5] F. Zablith, M. Fernandez, and M. Rowe, Production and
consumption of university Linked Data, Interact. Learn.
Environ., vol. 23, no. 1, pp. 55-78, 2015.
[6] S. Kagemann, MOOCLink: Building and Utilizing Linked
Data from Massive Open Online Courses, pp. 373-380,
2015.
[7] G. Pirrotta, Linking Italian university statistics, Proc. 6th
Int. Conf. Semant. Syst. - I-SEMANTICS '10, p. 1, 2010.
[8] G. G. Juanes, A. R. Barrios, J. L. R. García, L. G. Medina, R.
D. Adán, and P. G. Yanes, Linked data strategy to achieve
interoperability in higher education, WEBIST 2014 - Proc.
10th Int. Conf. Web Inf. Syst. Technol., vol. 2, pp. 57-64, 2014.
[9] F. Zablith, Interconnecting and Enriching Higher
Education Programs Using Linked Data, Proc. 24th Int.
Conf. World Wide Web, pp. 711-716, 2015.
[10] E. Mudu, L. Schiatti, G. Rizzo, and A. Servetti, Zenaminer:
Driving the SCORM standard towards the web of data,
CEUR Workshop Proc., vol. 717, pp. 1-15, 2011.
[11] N. Piedra and J. Chicaiza, Using linked open data to
improve the search of open educational resources for
engineering students, pp. 6-8, 2013.
[12] H. Q. Yu, C. Pedrinaci, S. Dietze, and J. Domingue, Using
linked data to annotate and search educational video
resources for supporting distance learning, IEEE Trans.
Learn. Technol., vol. 5, no. 2, pp. 130-142, 2012.
[13] G. Gutiérrez-Carreón, T. Daradoumis, and J. Jorba,
Integrating Learning Services in the Cloud: An Approach
that Benefits Both Systems and Learning, Educ. Technol.
Soc., vol. 18, pp. 145-157, 2015.
[14] A. Ruiz-Calleja, G. Vega-Gorgojo, J. I. Asensio-Pérez,
M. L. Bote-Lorenzo, E. Gómez-Sánchez, and C. Alario-Hoyos,
A Linked Data approach for the discovery of
educational ICT tools in the Web of Data, Comput. Educ.,
vol. 59, no. 3, pp. 952-962, 2012.
[15] E. Kaldoudi, N. Dovrolis, D. Giordano, S. Dietze, and M.
Keynes, Educational Resources as Social Objects in
Semantic Social Networks, 2011.
[16] C. Kiourt, A. Koutsoudis, F. Arnaoutoglou, G. Petsa, S.
Markantonatou, and G. Pavlidis, A dynamic web-based
3D virtual museum framework based on open data, 2015
Digit. Herit., vol. 2, pp. 647-650, 2015.
[17] M. Fernandez, M. d'Aquin, and E. Motta, Linking data
across universities: An integrated video lectures dataset,
Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif.
Intell. Lect. Notes Bioinformatics), vol. 7032 LNCS, no. PART
2, pp. 49-64, 2011.
[18] E. Otero-García, J. C. Vidal, M. Lama, A. Bugarín, and J.
Domenech, A context-based algorithm for annotating
educational content with Linked Data, in 1st Workshop on
Mining the Future Internet, 2010, vol. 685.
[19] G. Á. Rey, I. Celino, P. Alexopoulos, D. Damljanovic, M.
Damova, N. Li, and V. Devedzic, Semi-automatic
generation of quizzes and learning artifacts from linked
data, CEUR Workshop Proc., vol. 840, pp. 6-8, 2012.
[20] A. Carbonaro, Interlinking e-learning resources and the
web of data for improving student experience, J.
E-Learning Knowl. Soc., vol. 8, no. 2, pp. 33-44, 2012.
[21] J. Y. Wu and C. Le Wu, The Study of User Model of
Personalized Recommendation System Based on Linked
Course Data, Appl. Mech. Mater., vol. 519-520, pp.
1609-1612, Feb. 2014.
[22] J. Chicaiza, N. Piedra, J. Lopez, and E. T. Caro, Promotion
of self-learning by means of Open Educational Resources
and semantic technologies, 2015 Int. Conf. Inf. Technol.
Based High. Educ. Training, ITHET 2015, pp. 1-6, 2015.
[23] S. Softic, L. De Vocht, B. Taraghi, M. Ebner, E. Mannens,
and R. Van de Walle, Leveraging Learning Analytics in a
Personal Learning Environment using Linked Data, Bull.
IEEE Tech. Comm. Learn. Technol., vol. 16, no. 4, pp. 10-13,
2014.
[24] N. Piedra, J. López, C. Jorge, and E. Tovar, Seeking Open
Educational Resources to Compose Massive Open Online
Courses in Engineering Education - An Approach Based on
Linked Open Data, J. Univers. Comput. Sci., vol. 21, no. 5,
pp. 679-711, 2015.
[25] M. Zotou, A. Papantoniou, K. Kremer, V. Peristeras, and E.
Tambouris, Implementing Rethinking Education:
Matching Skills Profiles with Open Courses through Linked
Open Data technologies, vol. 16, no. 4, 2014.
[26] J. Lopez-Vargas, N. Piedra, J. Chicaiza, and E. Tovar, OER
Recommendation for Entrepreneurship Using a Framework
Based on Social Network Analysis, Rev. Iberoam. Tecnol. del
Aprendiz., vol. 10, no. 4, pp. 262-268, 2015.
[27] N. Piedra, J. Chicaiza, J. López, and E. Tovar Caro,
Supporting openness of MOOCs contents through of an
OER and OCW framework based on Linked Data
technologies, in IEEE Global Engineering Education
Conference, EDUCON, 2014, pp. 1112-1117.
[28] D. Taibi and S. Dietze, Fostering analytics on learning
analytics research: The LAK dataset, CEUR Workshop Proc.,
vol. 974, pp. 5-7, 2013.
[29] R. A. Maturana, M. E. Alvarado, S. Lopez-Sola, M. J. Ibanez,
and L. R. Elosegui, Linked Data based applications for
Learning Analytics Research: Faceted searches, enriched
contexts, graph browsing and dynamic graphic
visualisation of data, CEUR Workshop Proc., vol. 974, 2013.
[30] B. E. Penteado, Correlational Analysis Between School
Performance and Municipal Indicators in Brazil Supported
by Linked Open Data, pp. 507-512, 2016.
[31] C. Keßler, M. d'Aquin, and S. Dietze, Linked Data for
Science and Education, Semant. Web, vol. 1, pp. 1-5, 2009.
[32] K. Krieger and D. Rösner, Linked Data in E-Learning: A
Survey, Semant. Web 0, vol. 0, pp. 1-9, 2011.
[33] S. Dietze, S. Sanchez-Alonso, H. Ebner, H. Q. Yu, D.
Giordano, I. Marenzi, and B. P. Nunes, Interlinking
educational resources and the web of data: A survey of
challenges and approaches, Program, vol. 47, no. 1, pp.
60-91, 2013.
[34] R. Navarrete and S. Lujan-Mora, Use of Linked Data to
enhance Open Educational Resources, Inf. Technol. Based
High. Educ. Train. (ITHET), 2015 Int. Conf. on. IEEE, pp. 1-6,
2015.
[35] G. Vega-Gorgojo, J. I. Asensio-Pérez, E. Gómez-Sánchez, M.
L. Bote-Lorenzo, J. A. Muñoz-Cristóbal, and A. Ruiz-Calleja,
A review of linked data proposals in the learning
domain, J. Univers. Comput. Sci., vol. 21, no. 2, pp. 326-364,
2015.
[36] B. Kitchenham and S. Charters, Guidelines for performing
Systematic Literature Reviews in Software Engineering,
Engineering, vol. 2, p. 1051, 2007.
[37] J. Biolchini, P. G. Mian, A. Candida, and C. Natali,
Systematic Review in Software Engineering, Engineering,
vol. 679, no. May, pp. 165-176, 2005.
[38] H. Qing, S. Dietze, D. Giordano, D. Taibi, E. Kaldoudi, and
N. Dovrolis, Linked education: interlinking educational
resources and the web of data, 27th Annu. ACM Symp.
Appl. Comput., pp. 366-371, 2012.
[39] D. Mouromtsev and M. d'Aquin, Eds., Open Data for
Education: Linked, Shared, and Reusable Data for Teaching and
Learning, vol. 9500. Cham: Springer International
Publishing, 2016.
[40] M. d'Aquin, On the Use of Linked Open Data in
Education: Current and Future Practices, Lect. Notes
Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect.
Notes Bioinformatics), vol. 9500, pp. 153-165, 2016.
[41] N. Piedra, J. Chicaiza, J. López, and E. Tovar, Using linked
open data to improve the search of open educational
resources for engineering students, Front. Educ. Conf., pp.
558-560, 2013.
[42] F. Zablith, M. d'Aquin, S. Brown, and L. Green-Hughes,
Consuming Linked Data within a Large Educational
Organization, in Second International Workshop on
Consuming Linked Data (COLD) at 10th International
Semantic Web Conference, 2011, pp. 23-27.
[43] C. Bizer and R. Cyganiak, D2RQ Lessons Learned, in
W3C Workshop on RDF Access to Relational Databases, 2007,
pp. 1-10.
[44] E. Rajabi, M.-A. Sicilia, and S. Sanchez-Alonso,
Discovering duplicate and related resources using an
interlinking approach: The case of educational datasets, J.
Intell. Mater. Syst. Struct., vol. 26, no. 5, pp. 599-613, 2015.
[45] N. Dovrolis, T. Stefanut, S. Dietze, H. Q. Yu, C. Valentine,
and E. Kaldoudi, Semantic annotation and linking of
medical educational resources, IFMBE Proc., vol. 37, pp.
1400-1403, 2011.
[46] S. S. Sahoo, W. Halb, S. Hellmann, K. Idehen, T. Thibodeau
Jr, S. Auer, J. Sequeda, and A. Ezzat, A survey of Current
approaches for mapping of relational databases to RDF,
Web Semant. Sci. Serv. Agents World Wide Web, vol. 6, no. 1,
p. 15, 2009.
[47] S. Das, S. Sundara, and R. Cyganiak, R2RML: RDB to RDF
Mapping Language, W3C, 2012. [Online]. Available:
https://www.w3.org/TR/2012/PR-r2rml-20120814/.
[48] N. Thangsupachai, S. Niwattanakul, and N. Chamnongsri,
Learning Object Metadata Mapping for Linked Open
Data, Emerg. Digit. Libr. - Res. Pract., vol. 8839, no. 2, pp.
122-129, 2014.
[49] K. Rohloff, M. Dean, I. Emmons, D. Ryder, and J. Sumner,
An Evaluation of Triple-Store Technologies for Large Data
Stores, Move to Meaningful Internet Syst. 2007 OTM 2007
Work., vol. 4806, pp. 1105-1114, 2007.
[50] W3C, LargeTripleStores, 2016. [Online]. Available:
https://www.w3.org/wiki/LargeTripleStores. [Accessed:
28-Nov-2016].
[51] W. Alcantara, J. Bandeira, A. Barbosa, and A. Lima,
Desafios no uso de Dados Abertos Conectados na
Educação Brasileira, An. do 4o DesafIE - Work. Desafios da
Comput. Apl. à Educ., 2015.
[52] S. Dietze, H. E. Salvador Sanchez-Alonso, H. Q. Yu, D.
Giordano, I. Marenzi, and B. P. Nunes, Interlinking
educational resources and the web of data: A survey of
challenges and approaches, Emerald insight, vol. 29, no. 5,
pp. 494-519, 2009.
[53] A. Zouaq, J. Jovanovic, S. Joksimovic, and D. Gasevic,
Linked Data for Learning Analytics: Potentials and
Challenges, Handb. Learn. Anal., pp. 347-355, 2017.
[54] D. Taibi, B. Fetahu, and S. Dietze, Towards integration of
web data into a coherent educational data graph, Proc.
22nd Int. Conf. World Wide Web companion, pp. 419-424, 2013.
[55] H. Q. Yu, S. Dietze, D. Taibi, D. Giordano, E.
Kaldoudi, and J. Domingue, A Linked Dataset of Medical
Educational Resources, Br. J. Educ. Technol., vol. 46, no. 5,
pp. 1123-1129, 2015.
[56] H. Hajri, Y. Bourda, and F. Popineau, Querying
Repositories of OER Descriptions: The Challenge of
Educational Metadata Schemas Diversity, Des. Teach.
Learn. a Networked World, vol. 9307, pp. 582-586, 2015.
[57] X. Ma, Knowledge point-based approach to interlink open
education resources, Lect. Notes Comput. Sci. (including
Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol.
7882 LNCS, pp. 717-721, 2013.
[58] E. Rajabi, M.-Á. Sicilia, and S. Sánchez-Alonso,
Interlinking Educational Data: An Experiment with
GLOBE Resources, TEEM '13 Proc. First Int. Conf. Technol.
Ecosyst. Enhancing Multicult., pp. 339-348, 2013.
[59] M. Hausenblas, W. Halb, and Y. Raimond, Scripting user
contributed interlinking, CEUR Workshop Proc., vol. 368,
2008.
[60] F. Scharffe, Y. Liu, and C. Zhou, RDF-AI: an architecture for
RDF datasets matching, fusion and interlink, Proc. IJCAI
2009 Work. Identity, Ref. , 2009.
[61] J. Lehmann, S. Auer, S. Capadisli, K. Janowicz, C. Bizer, T.
Heath, A. Hogan, and T. Berners-Lee, LDOW2017: 10th
Workshop on Linked Data on the Web, Proc. 26th Int. Conf.
World Wide Web Companion, pp. 1679-1680, 2017.
[62] S. T. Konstantinidis, L. Ioannidis, D. Spachos, C. Bratsas,
and P. D. Bamidis, MEducator 3.0: Combining semantic
and social web approaches in sharing and retrieving
medical education resources, Proc. - 2012 7th Int. Work.
Semant. Soc. Media Adapt. Pers. SMAP 2012, pp. 42-47, Dec.
2012.
[63] L. A. P. P. Leme, G. R. Lopes, B. P. Nunes, M. A. Casanova,
and S. Dietze, Identifying candidate datasets for data
interlinking, Int. Conf. Web Eng., vol. 7977 LNCS, pp.
354-366, 2013.
[64] G. Rabello Lopes, L. Paes Leme, B. Pereira Nunes,
M. Casanova, and S. Dietze, Two
Approaches to the Dataset Interlinking Recommendation
Problem, Web Inf. Syst. Eng. WISE 2014 SE - 25, vol. 8786,
pp. 324-339, 2014.
[65] S. Wölger, C. Hofer, K. Siorpaes, S. Thaler, E. Simperl, and
T. Bürger, Interlinking data-approaches and tools,
Innsbruck - Austria, 2011.
[66] E. Rajabi, M.-A. Sicilia, and S. Sanchez-Alonso, An
empirical study on the evaluation of interlinking tools on
the Web of Data, J. Inf. Sci., vol. 40, no. 5, pp. 637-648, Oct.
2014.
[67] K. Nguyen, R. Ichise, and B. Le, Interlinking Linked Data
Sources Using a Domain-Independent System, in Joint
International Semantic Technology Conference, 2013, pp.
113-128.
[68] L. von Ahn, Games with a Purpose, Computer (Long.
Beach. Calif)., vol. 39, no. 6, pp. 92-94, 2006.
[69] O. Hassanzadeh, R. Xin, R. J. Miller, A. Kementsietsidis, L.
Lim, and M. Wang, Linkage Query Writer, Proc. VLDB
Endow., vol. 2, no. 2, pp. 1590-1593, 2009.
[70] F. Zablith, M. d'Aquin, S. Brown, and L. Green-Hughes,
Consuming Linked Data within a large educational
organization, CEUR Workshop Proc., vol. 782, 2011.
[71] M. Foulonneau, Generating educational assessment items
from linked open data: The case of DBpedia, Lect. Notes
Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect.
Notes Bioinformatics), vol. 7117 LNCS, pp. 16-27, 2012.
[72] V. Uren, P. Cimiano, J. Iria, S. Handschuh, M. Vargas-Vera,
E. Motta, and F. Ciravegna, Semantic annotation for
knowledge management: Requirements and a survey of the
state of the art, Web Semant. Sci. Serv. Agents World Wide
Web, vol. 4, no. 1, pp. 14-28, 2006.
[73] R. Meymandpour and J. G. Davis, Ranking Universities
Using Linked Open Data, J. Stud. Int. Educ., vol. 18, no. 2,
pp. 318-327, 2007.
[74] B. P. Nunes, R. Kawase, B. Fetahu, M. A. Casanova, and G.
H. B. de Campos, Educational forums at a glance: Topic
extraction and selection, Lect. Notes Comput. Sci. (including
Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol.
8787, pp. 351-364, 2014.
[75] J. Chae, Y. Cho, M. Lee, S. Lee, M. Choi, and S. Park,
Design and implementation of a system for creating
multimedia linked data and its applications in education,
Multimed. Tools Appl., 2015.
[76] Z. Jeremić, J. Jovanović, and D. Gašević, Personal Learning
Environments on the Social Semantic Web, Semant. Web,
vol. 4, no. 1, pp. 23-51, 2013.
[77] A. S. Dadzie and M. Rowe, Approaches to visualising
Linked Data: A survey, Semant. Web, vol. 2, no. 2, pp.
89-124, 2011.
[78] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak,
and Z. Ives, DBpedia: A Nucleus for a Web of Open Data,
Semant. Web, vol. 4825, pp. 722-735, 2007.
[79] M. Färber, B. Ell, C. Menne, and A. Rettinger, A
Comparative Survey of DBpedia, Freebase, vol. 1, pp.
1-5, 2015.
[80] V. Vasiliev, F. Kozlov, D. Mouromtsev, S. Stafeev, and O.
Parkhimovich, ECOLE: An Ontology-Based Open Online
Course Platform, in Lecture Notes in Computer Science
(including subseries Lecture Notes in Artificial Intelligence and
Lecture Notes in Bioinformatics), vol. 9500, 2016, pp. 41-66.
[81] T. Kelkar, A. Ray, and V. Choppella, SangeetKosh: An
open web platform for music education, Proc. - IEEE 15th
Int. Conf. Adv. Learn. Technol. Adv. Technol. Support. Open
Access to Form. Informal Learn. ICALT 2015, pp. 5-9, 2015.
[82] F. Moritz, M. Siebert, and C. Meinel, Improving Search in
Tele-Lecturing: Using Folksonomies as Trigger to Query
Semantic Datasets to extract additional metadata, Proc. Int.
Conf. Web Intell. Min. Semant. - WIMS '11, 2011.
[83] J. Robinson, J. Stan, and M. Ribière, Using linked data to
reduce learning latency for e-book readers, Lect. Notes
Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect.
Notes Bioinformatics), vol. 7117 LNCS, no. January, pp.
28-34, 2012.
[84] C. Daraio and A. Bonaccorsi, Beyond university rankings?
Generating new indicators on universities by linking data
in open platforms, J. Assoc. Inf. Sci. Technol., vol. 14, no. 4,
p. n/a-n/a, Mar. 2016.
[85] J. Y. Wu and C. L. Wu, The study of user model of
personalized recommendation system based on linked
course data, Appl. Mech. Mater., vol. 519-520, no.
Computer and Information Technology, p. 1607, 2014.
[86] G. Kobilarov, T. Scott, Y. Raimond, S. Oliver, C. Sizemore,
M. Smethurst, C. Bizer, and R. Lee, Media meets semantic
web - How the BBC uses DBpedia and linked data to make
connections, Lect. Notes Comput. Sci. (including Subser. Lect.
Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 5554 LNCS,
pp. 723-737, 2009.
[87] J. Chicaiza, N. Piedra, J. López-Vargas, and E. Tovar,
Domain Categorization of Open Educational
Resources Based on Linked Data, Knowl. Eng. Semant. Web,
pp. 15-28, 2014.
[88] A. M. Fermoso, M. Mateos, M. E. Beato, and R. Berjón,
Open linked data and mobile devices as e-tourism tools. A
practical approach to collaborative e-learning, Comput.
Human Behav., vol. 51, pp. 618-626, 2015.
[89] S. Dietze, E. Kaldoudi, N. Dovrolis, D. Giordano, C.
Spampinato, M. Hendrix, A. Protopsaltis, D. Taibi, and H.
Q. Yu, Socio-semantic integration of educational resources
- The case of the mEducator project, J. Univers. Comput.
Sci., vol. 19, no. 11, pp. 1543-1569, 2013.
[90] D. Hladky, LOD Russia Enabling Russian National
Knowledge with Scientific Open Data Research challenges,
pp. 1-5, 2011.
[91] M. Lama, J. C. Vidal, E. Otero-García, A. Bugarín, and S.
Barro, Semantic linking of learning object repositories to
DBpedia, Educ. Technol. Soc., vol. 15, no. 4, pp. 47-61, 2012.
[92] D. Taibi, S. Chawla, S. Dietze, I. Marenzi, and B. Fetahu,
Exploring TED talks as linked data for education, Br. J.
Educ. Technol., vol. 46, no. 5, pp. 1092-1096, 2015.
[93] E. Rajabi, S. Sanchez-Alonso, M. A. Sicilia, and N.
Manouselis, A linked and open dataset from a network of
learning repositories on organic agriculture, Br. J. Educ.
Technol., 2015.
[94] M.-A. Sicilia, H. Ebner, S. Sánchez-Alonso, A. Abián, and E.
García-Barriocanal, Navigating learning resources through
linked data: a preliminary report on the re-design of
Organic.Edunet, Proc. Linked Learn., vol. 2011, p. 1st, 2011.
[95] H. Qing, S. Dietze, D. Giordano, D. Taibi, E. Kaldoudi, and
N. Dovrolis, Linked education: interlinking
educational resources and the web of data, in The 27th
ACM Symposium On Applied Computing (SAC-2012), Special
Track on Semantic Web and Applications, 2012.
[96] N. Piedra, E. Tovar, R. Colomo-Palacios, J. Lopez-Vargas,
and J. Alexandra Chicaiza, Consuming and producing
linked open data: the case of OpenCourseWare, Progr.
Electron. Libr. Inf. Syst., vol. 48, no. 1, pp. 16-40, 2014.
[97] D. Taibi, S. Dietze, B. Fetahu, and G. Fulantelli, Exploring
type-specific topic profiles of datasets: a demo for
educational linked data, 2011.
[98] M. Coccoli and I. Torre, Interacting with annotated objects
in a semantic web of things application, J. Vis. Lang.
Comput., vol. 25, no. 6, pp. 1012-1020, 2014.
[99] A. Mikroyannidis and J. Domingue, Interactive learning
resources and linked data for online scientific
experimentation, Proc. 22nd Int. Conf. World Wide Web
companion, pp. 431-434, 2013.
[100] S. Zerr, M. d'Aquin, I. Marenzi, D. Taibi, A. Adamou, and
S. Dietze, Towards analytics and collaborative exploration
of social and linked media for technology-enhanced
learning scenarios, CEUR Workshop Proc., vol. 1151, 2014.
[101] M. Coccoli and I. Torre, Interaction with objects and
objects annotation in the semantic web of things, Proc.
DMS 2014 - 20th Int. Conf. Distrib. Multimed. Syst., pp.
383-390, 2014.
[102] C. Guéret, P. Groth, C. Stadler, and J. Lehmann, Assessing
linked data mappings using network measures, Simperl,
E., Cimiano, P., Polleres, A., Corcho, O., Presutti, V., vol. 7295
LNCS, pp. 87-102, 2012.
[103] S. Auer, I. Ermilov, J. Lehmann, and M. Martin,
LODStats. [Online]. Available: http://stats.lod2.eu/.
[Accessed: 26-Apr-2017].
[104] D. Kanellopoulos, Ontology-based Learning Applications:
A Development Methodology, Conf. Softw. , no.
January, 2006.
[105] T. Tiropanis, H. C. Davis, and S. A. Cerri, Semantic
Technologies and Learning, pp. 3029-3032, 2012.
[106] M. d'Aquin, A. Adamou, and S. Dietze, Assessing the
educational linked data landscape, Proc. 5th Annu. ACM
Web Sci. Conf. - WebSci '13, pp. 43-46, 2013.
[107] P. Mulholland, Z. Zdrahal, J. Domingue, M.
Hatala, and A. Bernardi, A methodological
approach to supporting organizational learning, Int. J.
Hum. Comput. Stud., vol. 55, no. 3, pp. 337-367, 2001.
[108] R. Mizoguchi and G. H. Canada, Using Ontological
Engineering to Overcome Common AI-ED Problems, J.
Artif. Intell. Educ., vol. 11, pp. 107-121, 2000.
[109] V. Devedzic, Education and the Semantic Web, Int. J. Artif.
Intell. Educ., vol. 14, pp. 39-65, 2004.
[110] S. Dietze, D. Taibi, and P. Barker, Analysing and
Improving Embedded Markup of Learning Resources on
the Web, pp. 283-292, 2017.
[111] M. Svensson, A. Kurti, and M. Milrad, Enhancing
emerging learning objects with contextual metadata using
the linked data approach, 6th IEEE Int. Conf. Wireless, Mob.
Ubiquitous Technol. Educ., pp. 50-56, Apr. 2010.
[112] J. C. Vidal, M. Lama, E. Otero-García, and A. Bugarín,
Graph-based semantic annotation for enriching
educational content with linked data, Knowledge-Based
Syst., vol. 55, pp. 29-42, 2014.
[113] N. Bassiliades, Collecting University Rankings for
Comparison Using Web Extraction and Entity Linking
Techniques, Commun. Comput. Inf. Sci., vol. 469, pp. 23-46,
2014.
[114] S. Vasconcellos, K. Revoredo, and F. Bai, How Can
Ontology Design Patterns Help Ontology Refinement?,
vol. 12, pp. 4-16, 2014.
[115] A. Hogan, P. Hitzler, and K. Janowicz, Linked Dataset
Description Papers at the Semantic Web Journal: A Critical
Assessment, vol. 7, pp. 1-3, 2016.
[116] F. Radulovic, N. Mihindukulasooriya, R. García-Castro, and
A. Gómez Pérez, A comprehensive quality model for
Linked Data, Undefined, vol. 1, no. 0, pp. 1-5, 2015.
[117] H. Q. Yu, S. Dietze, N. Li, C. Pedrinaci, D. Taibi, N.
Dovrolis, T. Stefanut, E. Kaldoudi, and J. Domingue, A
linked data-driven & service-oriented architecture for
sharing educational resources, 1st Int. Work. eLearning
Approaches Linked Data Age (Linked Learn. 2011), 8th Ext.
Semant. Web Conf., 2011.
Crystiam Kelle Pereira e Silva
http://lattes.cnpq.br/1675106850769427
Crystiam is a PhD candidate in the
Graduate Program of Informatics at the
Federal University of the State of Rio
de Janeiro (UNIRIO). She also works as
a systems analyst at the Federal
University of Juiz de Fora (UFJF). Her
main interests lie in the Semantic Web
and Distance Learning fields.
Sean Wolfgand Matsui Siqueira
http://lattes.cnpq.br/2562652838103607
DSc in Computer Science, Sean is an
associate professor at the Federal
University of the State of Rio de
Janeiro (UNIRIO). He is editor-in-chief
of the RBIE: Brazilian Journal on
Computers in Education and co-editor
of the iSYS: Brazilian Journal of
Information Systems. He is the
founder and coordinator of the
Semantics and Learning research
group and is a member of the special committees on
Computers and Education (CEIE) and on Information
Systems (CESI), both from the Brazilian Computer Society
(SBC). His research interests are knowledge
representation, web science (including social and
semantic web) and advanced technologies for teaching
and learning.
Bernardo Pereira Nunes
http://lattes.cnpq.br/1728746187630338
DSc in Computer Science, Bernardo
is a researcher and adjunct professor
at the Pontifical Catholic University
of Rio de Janeiro (PUC-Rio) and a
collaborator in the Graduate
Program of Informatics at the
Federal University of the State of
Rio de Janeiro (UNIRIO). He also
works as a senior systems analyst at the Center of Distance
Learning, PUC-Rio, and is an associate editor of the RBIE:
Brazilian Journal on Computers in Education. He has
experience in Computer Science, focusing on Semantic
Web, Web Science and Distance Learning.
Stefan Dietze
https://stefandietze.wordpress.com/
Stefan is a research group leader at the L3S
Research Center of the Leibniz
University Hanover, Germany,
following previous positions at the Knowledge Media
Institute (KMI) of The Open University (UK) and the
Fraunhofer Institute for Software and Systems
Engineering (ISST, now part of Fraunhofer FOKUS).
Stefan’s interests are in the fields of web & data science,
artificial intelligence and knowledge engineering &
discovery, and in particular, the extraction and fusion of
knowledge from heterogeneous Web data in a variety of
application domains.
1939-1382 (c) 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more
information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TLT.2017.2787659,
IEEE Transactions on Learning Technologies
16 IEEE TRANSACTIONS ON JOURNAL NAME, MANUSCRIPT ID
... The authors of the cited study provided a methodical mapping of recommendations for using linked data to assist educational goals. (Pereira, Siqueira, Nunes, & Dietze, 2017) ...
Article
Full-text available
Big data technologies have facilitated innovation across various industries, including healthcare, technology, and education. The utilization of this innovation is now required within the education sector at all levels. It is a progressing framework named as "Education 4.0" to accommodate the numerous needs of this sector. Specifically, this article first describes the architecture of big data and its properties compatible to Education sector. The process of utilizing large educational data and data mining techniques in education are explained. To find solutions to diverse educational problems, an ever-increasing variety of mining tools have been assembled. This study also provides insights into these techniques and other data sources for gathering educational data. Several applications have been developed to promote the use of big data techniques in higher education and highlight its benefits for the sector, but there are still a number of issues that need to be considered. We will highlight the obstacles of using big data techniques in higher education.
... Struggling with confusing online content (e.g., learning content of low quality on social media), students face major challenges in acquiring significant knowledge efficiently. Therefore, researchers have focused on improving online learning environments by constructing education-efficient knowledge graphs (d 'Aquin, 2016;Pereira et al, 2017). For example, to facilitate online learning and establish connections between formal learning and social media, Zablith (Zablith, 2022) proposed to construct a knowledge graph by integrating social media and formal educational content, respectively. ...
Preprint
Full-text available
With the explosive growth of artificial intelligence (AI) and big data, it has become vitally important to organize and represent the enormous volume of knowledge appropriately. As graph data, knowledge graphs accumulate and convey knowledge of the real world. It has been well-recognized that knowledge graphs effectively represent complex information; hence, they rapidly gain the attention of academia and industry in recent years. Thus to develop a deeper understanding of knowledge graphs, this paper presents a systematic overview of this field. Specifically, we focus on the opportunities and challenges of knowledge graphs. We first review the opportunities of knowledge graphs in terms of two aspects: (1) AI systems built upon knowledge graphs; (2) potential application fields of knowledge graphs. Then, we thoroughly discuss severe technical challenges in this field, such as knowledge graph embeddings, knowledge acquisition, knowledge graph completion, knowledge fusion, and knowledge reasoning. We expect that this survey will shed new light on future research and the development of knowledge graphs.
... Recently, LD and open data techniques seem very promising in HE and propose notable research in this area. Since 2009, LD has been established by educational domains to be used in many aspects to overcome many challenges [13]. ...
Article
In recent years, educational institutions have worked hard to automate their work using emerging technologies that have proven successful in supporting decision-making processes. Most decisions in educational institutions rely on rating the academic research profiles of their staff. An enormous amount of scholarly data is produced continuously by online libraries that contain data about publications, citations, and research activities. This kind of data can improve the accuracy of academic decisions if it is linked with the local data of universities. In this study, the linked data technique is applied to link university semantic data with a scientific knowledge graph, in order to enrich the local data and improve academic decisions. As a proof of concept, a case study was conducted to allocate the best academic staff to teach a course based on their profiles, including research records. Further, the resulting data are available to be reused in the future for different purposes in the academic domain. Finally, we compared the results of this linking with previous work, as evidence of the accuracy gained by leveraging this technology to improve decisions within universities.
Article
This article proposes a trust system for e-commerce platforms based on semantic web technologies and rules over trust dimensions. We present a system that manages communication processes between e-commerce platforms and users in a trustworthy manner, so that data flows and transactions gain more trust across the entire process. This is achieved through the inference of rules expressed in the defined ontology, complemented by a cloud-based system with a microservices architecture. An e-commerce platform that implements the system can consume data from the microservices to obtain inferences about clients who want to buy or sell something within it. The system was built on rules defined by the ontology, and the microservices can also be used to register information about multiple e-commerce transactions. The result of this work is the ontology and the semantic web rules, defined and implemented in Protégé.
Article
In recent years, Learning Management Systems (LMSs) have acquired great importance in online education, since they offer flexible integration platforms for organising a vast amount of learning resources, as well as for establishing effective communication channels between teachers and learners in both directions. These online platforms are attracting an increasing number of users who continuously access, download and upload resources and interact with each other during their teaching and learning processes, a trend that has been accelerated by the outbreak of COVID-19. In this context, academic institutions are generating large volumes of learning-related data that can be analysed to support teachers in lesson, course or degree planning, as well as administrations at the university strategic level. However, managing such an amount of data, which usually comes from multiple heterogeneous sources with attributes that sometimes reflect semantic inconsistencies, constitutes an emerging challenge: the data require common definition and integration schemes so that they can be fused easily and fed efficiently into machine learning models. In this regard, semantic web technologies arise as a useful framework for the semantic integration of multi-source e-learning data, allowing consolidation, linkage and advanced querying in a systematic way. With this motivation, the e-LION (e-Learning Integration ONtology) semantic model is proposed for the first time in this work to operate as a data consolidation approach for different e-learning knowledge bases, thereby enriching the analyses built on top of them. For demonstration purposes, the proposed ontological model is populated with real-world private and public data sources from different LMSs, covering university courses of the Software Engineering degree of the University of Malaga (Spain) and the Open University Learning.
A set of four case studies is worked through for validation. These comprise advanced semantic querying of data to feed predictive modelling and time-series forecasting of students' interactions according to their final grades, as well as the generation of SWRL reasoning rules for classifying student behaviour. The results are promising and point to the possible use of e-LION as an ontological mediator scheme for the integration of future semantic models in the domain of e-learning.
Conference Paper
One of the fundamental concepts of Open Educational Resources (OER) is “the ability to freely adapt and reuse existing pieces of knowledge.” Applying the Semantic Web approach and Linked Data technologies to Open Education seeks to turn data and metadata from open educational repositories into actionable interoperability that improves the discovery, use and reuse of OER. Interoperability is not an end in itself. Instead, optimizing the level of interoperability has societal and educational value as a means to other purposes. Interoperability can have a positive impact on open innovation, user choice, ease of reusing and adapting educational materials, and global discovery of open and diverse content, among other things. This paper reports on the implementation of Linked Open Data for open access repositories in a new interoperable and global open educational ecosystem. The goal is to improve metadata interoperability between various collections of open material, so as to facilitate the discoverability and subsequent combining, remixing, or adapting of OER; that is, OER data should be easily accessible to any user, whether a human being or a machine agent. This work addressed two challenges in the OER ecosystem: providing evidence of the global discoverability and the reusability of academic resources. Although there is much further potential for teaching and learning to realize, linked open data is a critical enabler of a global and interoperable OER ecosystem.
Conference Paper
Improving the quality of public education is a challenge for public managers in contemporary society. Many studies point to the strong influence of socioeconomic factors on school performance, but selecting proper data for analysing this relationship is difficult. At the same time, large quantities of educational indicator data are increasingly being made available, but in isolation and by different agencies of the Brazilian government. In this work, we use both educational and economic indicators for analysis. The following socioeconomic indicators were selected: the municipal human development index (MHDI), the social vulnerability index (SVI), the Gini coefficient, and two variables extracted from DBpedia as part of connecting these data to the Web of data: GDP per capita and municipal population. These data were used as independent variables to examine their correlations with Brazilian Basic Education Development Index (IDEB) performance at the municipal level, supported by the application of linked open data principles. OpenRefine was used to extract the data from the different sources, convert them to RDF triples, and map the variables to existing ontologies and vocabularies in this domain, with the aim of reusing existing semantics. The correlational analysis of the variables was coherent with the literature on the topic, showing correlations of significant magnitude between IDEB performance and the indicators related to income and parent education (SVI and MHDI), and moderate relations with the other variables, except for municipal population. Finally, the consolidated dataset, enriched with information extracted from DBpedia, was made available through a SPARQL endpoint for queries by humans and software agents, allowing other applications and researchers to explore the data from other platforms.
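The correlational step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which coefficient was used, so Pearson's r is an assumption here, and the municipal values below are hypothetical placeholders, not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for four municipalities (illustration only).
mhdi = [0.62, 0.70, 0.75, 0.81]   # assumed independent variable
ideb = [4.1, 4.8, 5.2, 5.9]       # assumed IDEB scores

r = pearson(mhdi, ideb)
print(f"r = {r:.3f}")
```

In practice each indicator would be one column of the consolidated dataset, and the same function would be applied once per independent variable.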
Conference Paper
The 10th Linked Data on the Web workshop (LDOW2017) was held in Perth, Western Australia on April 3, 2017, co-located with the 26th International World Wide Web Conference (WWW2017). In its 10th anniversary edition, the LDOW workshop aims to stimulate discussion and further research into the challenges of publishing, consuming, and integrating structured data on the Web as well as mining knowledge from said data.
Conference Paper
Web-scale reuse and interoperability of learning resources have been major concerns for the technology-enhanced learning community. While work in this area traditionally focused on learning resource metadata, provided through learning resource repositories, the recent emergence of structured entity markup on the Web through standards such as RDFa and Microdata and initiatives such as schema.org, has provided new forms of entity-centric knowledge, which is so far under-investigated and hardly exploited. The Learning Resource Metadata Initiative (LRMI) provides a vocabulary for annotating learning resources through schema.org terms. Although recent studies have shown markup adoption by approximately 30% of all Web pages, understanding of the scope, distribution and quality of learning resources markup is limited. We provide the first public corpus of LRMI extracted from a representative Web crawl together with an analysis of LRMI adoption on the Web, with the goal to inform data consumers as well as future vocabulary refinements through a thorough understanding of the use as well as misuse of LRMI vocabulary terms. While errors and schema misuse are frequent, we also discuss a set of simple heuristics which significantly improve the accuracy of markup, a prerequisite for reusing learning resource metadata sourced from markup.
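To make the idea of LRMI markup more concrete, the following is a minimal sketch of how LRMI terms embedded as Microdata might be collected from a page. It is not the extraction pipeline used in the paper: the class name `LrmiItempropCollector`, the sample HTML, and the chosen subset of properties are assumptions for illustration, and the sketch ignores RDFa, nested items, and multi-valued `itemprop` attributes.

```python
from html.parser import HTMLParser

# A small subset of LRMI properties that are part of schema.org (illustrative).
LRMI_PROPERTIES = {
    "educationalUse",
    "learningResourceType",
    "typicalAgeRange",
    "timeRequired",
    "interactivityType",
    "educationalAlignment",
}


class LrmiItempropCollector(HTMLParser):
    """Collect (property, value) pairs for LRMI terms found in itemprop attributes."""

    def __init__(self):
        super().__init__()
        self.found = []
        self._pending = None  # property still waiting for its text value

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        prop = attr_map.get("itemprop")
        if prop in LRMI_PROPERTIES:
            if "content" in attr_map:
                # Value given directly, e.g. <meta itemprop="..." content="...">.
                self.found.append((prop, attr_map["content"]))
            else:
                # Value is the element's text, delivered later via handle_data.
                self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.found.append((self._pending, data.strip()))
            self._pending = None


# Hypothetical page fragment, not taken from the corpus described above.
SAMPLE = """
<div itemscope itemtype="https://schema.org/CreativeWork">
  <span itemprop="name">Intro to Fractions</span>
  <span itemprop="learningResourceType">lesson plan</span>
  <meta itemprop="typicalAgeRange" content="9-11">
</div>
"""

collector = LrmiItempropCollector()
collector.feed(SAMPLE)
print(collector.found)
```

A collector like this only reports what the markup states; judging whether a value is a misuse of the term would need additional checks of the kind the abstract alludes to.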
Article
With the increasing amount of Linked Data published on the Web, the community has recognised the importance of the quality of such data and a number of initiatives have been undertaken to specify and evaluate Linked Data quality. However, these initiatives are characterised by a high diversity in terms of the quality aspects that they address and measure. This leads to difficulties in comparing and benchmarking evaluation results, as well as in selecting the right data source according to certain quality needs. This paper presents a quality model for Linked Data, which provides a unique terminology and reference for Linked Data quality specification and evaluation. The mentioned quality model specifies a set of quality characteristics and quality measures related to Linked Data, together with formulas for the calculation of measures. Furthermore, this paper also presents an extension of the W3C Data Quality Vocabulary that can be used to capture quality information specific to Linked Data, a Linked Data representation of the Linked Data quality model, and a use case in which the benefits of the quality model proposed in this paper are presented in a tool for Linked Data evaluation.