A Systematic Review of Open Government Data Initiatives
Judie Attard, Fabrizio Orlandi, Simon Scerri, Sören Auer
University of Bonn
attard@iai.uni-bonn.de, orlandi@iai.uni-bonn.de, scerri@iai.uni-bonn.de,
auer@cs.uni-bonn.de
Abstract. We conduct a systematic survey with the aim of assessing open government data initiatives, that is,
any attempt, by a government or otherwise, to open data that is produced by a governmental entity. We describe
the open government data life-cycle and we focus our discussion on publishing and consuming processes
required within open government data initiatives. We cover current approaches undertaken for such initiatives,
and classify them. A number of evaluations found within related literature are discussed, and from them we
extract challenges and issues that hinder open government initiatives from reaching their full potential. In a
bid to overcome these challenges, we also extract guidelines for publishing data and provide an integrated
overview. This will enable stakeholders to start a new open government data initiative on a firm footing. We
also identify the impacts on the stakeholders involved in such initiatives.
Keywords: Open Data, Government Data, Data Portals, Publishing, Consuming, OGD Life-Cycle, Openness
1 Introduction
In recent years, a number of open data movements have sprung up around the world, with transparency and data reuse as
two of the major aims. To mention a few, there is the Public Sector Information (PSI) Directive¹ in 2003 in Europe,
U.S. President Obama’s open data initiative in 2009², the Open Government Partnership³ in 2011, and the G8 Open
Data Charter⁴ in 2013. Open government data portals resulting from such movements, such as data.gov.uk,
data.gov, and data.gov.sg, provide a means for citizens and stakeholders to obtain government information
about the locality or country in question.
While not the only motivation, corruption was initially one of the main issues that prompted the founding
of open government data initiatives such as the above. Corruption is a global issue that seriously harms the
economy and society as a whole, affecting people’s lives and often infringing fundamental human rights. The
democracy of many countries around the world is undermined by deep-rooted corruption, which also affects
economic development. While the total economic costs of corruption cannot be easily calculated, the 2014
European Commission Anti-Corruption Report⁵ states that corruption can be estimated to cost the European Union
economy 120 billion Euros per year. In places where there is a widespread belief that corruption prevails, people
end up losing faith and trust in those entrusted with power. As the Global Corruption Barometer 2013⁶ shows,
corruption can be identified running through the democratic and legal process in many countries. This results in
people losing trust in key institutions such as political parties, the judiciary and the police. While transparency
cannot be regarded as an end in itself [80], it can be regarded as a means to act as a disincentive to corruption.
¹ http://ec.europa.eu/digital-agenda/en/european-legislation-reuse-public-sector-information
² http://www.whitehouse.gov/open/documents/open-government-directive
³ http://www.opengovpartnership.org/
⁴ https://www.gov.uk/government/publications/open-data-charter
⁵ http://ec.europa.eu/dgs/home-affairs/what-we-do/policies/organized-crime-and-human-trafficking/corruption/anti-corruption-report/index_en.htm
Collectively, there are three main reasons for opening government data⁷:
1. Transparency - In order to have a well-functioning, democratic society, citizens and other stakeholders need
to be able to monitor government initiatives and their legitimacy. Transparency also means that stakeholders
can not only access the data, but are also enabled to use, reuse and distribute it. Successfully achieving
transparency results in a considerable increase in citizen social control;
2. Releasing social and commercial value - Governments are one of the largest producers and collectors of
data in many different domains [2]. All data, whether addresses of schools, geospatial data, environmental
data, transport and planning data, or budget data, has social and commercial value, and can be used for a
number of purposes other than those originally envisaged. By publishing such data
the government encourages stakeholders to innovate upon it and create new services; and
3. Participatory Governance - Through the publishing of government data, citizens are given the opportunity to
actively participate in governance processes, such as decision-taking and policy-making, rather than sporadically
voting in an election every few years. Through open government data initiatives such as portals,
stakeholders can also be more informed and be able to make better decisions [60].
The above motivations, while not being the sole ones, are the foundations for most open government data
initiatives. The latter act as a preventive policy and give stakeholders the opportunity to scrutinise and reuse the
available information in a number of ways, including identifying patterns in the data and creating new services.
This results in increased accountability, which in turn hinders corruption. Besides, by creating new services
based on open government data, users add value to the latter, which can also be commercialised. The participation
of citizens in decision-making processes is also a very important aspect of opening governmental data, as it
empowers citizens and thus enables governments to be more citizen-centred. However, citizen participation is not
limited to the decision-making process. Open government initiatives may also allow stakeholders to provide
feedback on government actions or collaborate in policy-making.
Although the number of public entities seeking to publicly disclose their data has seen a drastic increase, it
is still a major challenge to achieve the full potential of open government data and support all interested parties
with the publication and consumption of this data. A number of barriers, including technical, policy and legal,
economic and financial, organisational, and cultural barriers, also contribute to this challenge [12, 83]. Yet, a
major stumbling block for the full exploitation of open government initiatives remains the heterogeneous nature
of data formats used by public administrations, which include anything from images, PDF and CSV files and
Excel sheets, to more highly structured XML files and database records. This heterogeneity is a technical barrier to
both data providers and data consumers, and hinders society from realising government data transparency. Open
government data portals also suffer from the large number of diverse data structures that make the comparison and
aggregate analysis of government data practically impossible. The tools used to present, search, download
and visualise this government data are also nearly as diverse as the number of existing portals. Past efforts have
sought to overcome this situation by creating comprehensive and connected European transparency portals such
as publicdata.eu. However, the diversity of transparency standards across Europe, which proved to be a
bottleneck, highlighted that platforms beyond the state of the art need to be more than just direct
entry points to government data analysis. They also need to provide a platform for advocacy towards common
transparency standards at the highest level across several jurisdictions.
⁶ http://www.transparency.org/gcb2013
⁷ http://opengovernmentdata.org/
Government data portals also experience a number of cultural obstacles which hinder them from reaching their
full potential. For example, public entities might be unwilling to publish their data. This may be so for a number
of reasons, including the perception that it requires a lot of resources and effort, or that the release of government
data might backfire. This disposition is, however, slowly changing world-wide, mostly due to the advocacy of civil
society initiatives.
In this paper we aim to explore the state of open government data initiatives, as well as existing tools and
approaches. To this end, we conduct a systematic survey of the literature related to publishing and consuming open
data through government portals, data catalogues, or otherwise. We discuss different aspects of such open government
initiatives, including the above-mentioned barriers, their impact upon all the different processes within the
open government data life-cycle, any proposed standards, and challenges impeding the efforts. Our
aim is not to provide new interpretations of the existing literature. Instead, we give an overview of existing
interpretations, and provide an integrated and unified model that covers the relevant concepts, terminology, initiatives,
challenges, and guidelines, therefore portraying how each element forms part of the bigger picture. We start by
explaining the process of the systematic survey in Section 2, followed by short definitions of key terms used in
this paper in Section 3. In Section 4 we describe the open government data life-cycle as we envision it, followed
by an overview of open government data initiatives, including assessment frameworks, evaluations and challenges
in Section 5. We proceed with discussing different aspects of publishing and consuming open government data in
Section 6, whereas in Section 7 we identify the various impacts that publishing government data has on different
stakeholders. We provide our concluding remarks in Section 8.
2 Research Method
In this survey paper we followed a systematic literature review approach. By following this formal method with explicit
inclusion and exclusion criteria, we intend to provide a replicable research review with minimal bias arising from
the review process itself. Our approach is based on the guidelines proposed in [17] and [34]. The procedure we
undertake is as follows:
1. Define search terms;
2. Select sources (digital libraries) on which to perform search;
3. Apply the search terms to the sources; and
4. Select primary studies by applying inclusion and exclusion criteria to the search results.
2.1 Research Questions
Identifying the research questions is essentially what distinguishes a systematic review from a traditional review.
Asking pre-defined questions is not only required for determining the content and structure of the review, but
it also aids in guiding the review process. This includes the techniques used for identifying studies, the critical
reviewing of studies, and the ensuing analysis of the results.
The goal of this survey is to analyse existing open government data initiatives, tools, and approaches, for
publishing and consuming open government data. We therefore define the following as a generic research question:
What are existing approaches that enable the publishing and consumption of government data?
This generic question can be further divided into more specific sub-questions as follows:
1. What are existing approaches for publishing or consuming open government data, and how can they be
classified?
2. What are the supported technical aspects, features and functions in existing approaches?
3. Are there any defined guidelines for the publishing or consumption of open government data?
4. What are existing challenges with publishing or consuming open government data?
5. What are possible impacts of open government initiatives on relevant stakeholders?
2.2 Search Strategy
In order to cover the largest spectrum of relevant publications possible, we identified and used the most extensively
used electronic libraries, namely:
ACM Digital Library
IEEE Xplore Digital Library
Science Direct
Springer Link
ISI Web of Knowledge
Although we considered using Google Scholar for this systematic review, we decided against including
it, since its content is indirectly obtained through the listed electronic libraries, thus making the use of Google
Scholar redundant.
Based on the research questions, we carried out some pilot searches and consulted with experts in the field in order
to obtain a list of pilot studies. The latter were then used as a basis for the systematic review in order to find the
search terms which would best answer our research questions. The following are the search terms used in this
systematic review:
1. “government data portal”;
2. “government public portal”;
3. “government open data”;
4. “government open data portal”;
5. “government open data publishing”;
6. “government data publishing”;
7. “public government data”;
8. “consuming open government data”;
9. “consuming open data”;
10. “public open data”;
11. “open data consumption”;
12. “open data publication”;
13. “open data portal”; and
14. “consuming public data”.
To construct the search string, all the search terms were combined by using the "OR" Boolean operator. This
method of combining the terms was chosen to keep the query as simple
as possible, with as few Boolean operators as possible. This made the query more flexible to use in different
electronic library search tools.
The next step in defining the search strategy was to find suitable metadata fields on which to apply the search
string. Searching in the publication title field alone does not always provide the relevant publications, mostly
due to low precision and low recall rates. While the search on the title retrieves a potentially larger number of
results, the results might not all be relevant. Thus by adding the search on the abstract, irrelevant results would be
reduced, while other relevant publications which do not have the search terms in the title are also retrieved. We
therefore decided to conduct the search on both the title and abstract fields of publications.
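The query construction and field choice described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `build_query` and `matches`, and the dictionary layout of a publication record, are assumptions made here, and each electronic library has its own field-restriction syntax that is not reproduced.

```python
# Search terms exactly as listed in Section 2.2.
SEARCH_TERMS = [
    "government data portal",
    "government public portal",
    "government open data",
    "government open data portal",
    "government open data publishing",
    "government data publishing",
    "public government data",
    "consuming open government data",
    "consuming open data",
    "public open data",
    "open data consumption",
    "open data publication",
    "open data portal",
    "consuming public data",
]


def build_query(terms):
    """Combine quoted phrases using only the OR Boolean operator."""
    return " OR ".join(f'"{term}"' for term in terms)


def matches(record, terms):
    """Return True if any term occurs in the title or in the abstract.

    `record` is a hypothetical dict with optional 'title' and 'abstract'
    keys; the comparison is case-insensitive and done per field.
    """
    fields = (record.get("title", ""), record.get("abstract", ""))
    return any(term in field.lower() for field in fields for term in terms)


if __name__ == "__main__":
    print(build_query(SEARCH_TERMS))
```

Because the string uses a single operator and no nesting, it can be pasted into most library search forms with little adaptation, which is the flexibility the text refers to.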
2.3 Study Selection
Some of the results obtained using the above method might still be irrelevant for our research questions, even if the
search terms appear in either the title, abstract, or both. Therefore, a manual study selection has to be performed,
retaining only those results which are relevant to answering our research question. Thus we defined inclusion and
exclusion criteria. Publications that satisfy any of the following inclusion criteria are selected as primary studies:
I1. A study that focuses on open government portals, open government data, or its publishing or consumption;
and
I2. A study that describes open government data initiatives.
Publications that meet any of the following criteria are excluded from the review:
E1. A study that only mentions some of the search terms, but does not focus on government data or its
publishing or consumption;
E2. A study that focuses on open data in general (not limited to government data); and
E3. A study that describes portals that exploit only non-governmental data.
The procedure for selecting the primary studies for this review was conducted in October 2014. Consequently,
this review only includes studies that were either published or indexed before that date. We also limited our search
to publications that were written in English and published after 2002. This year was selected as a delimiter
since the preliminary search indicated that there were no relevant results before that date. As shown in Figure 1,
we started by applying the search string to each data source separately. Since the results included a couple of
proceedings, we resolved them by including all publications within the proceedings, resulting in 368 publications.
Subsequently, the results were merged, and duplicate studies were removed. This left us with 338 publications.
We then proceeded to manually go through the titles of the remaining studies, removing those entries whose title
indicated that they were not relevant to our review. This reduced the number of potential primary studies to 159.
The following step was to manually scan the abstracts, which further reduced the number of studies to 103.
Finally we went through the full text of the studies, applying the inclusion and exclusion criteria defined
above. This resulted in 75 studies, which represent our final set of primary studies.
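The selection funnel reported above can be checked with a short calculation. The sketch below only restates the counts given in the text; the stage labels and the helper name `removed_per_stage` are illustrative choices, not terminology from the review protocol.

```python
# Counts taken directly from the study selection procedure described above.
FUNNEL = [
    ("search results with proceedings expanded", 368),
    ("after merging and removing duplicates", 338),
    ("after title screening", 159),
    ("after abstract screening", 103),
    ("after full-text screening (primary studies)", 75),
]


def removed_per_stage(funnel):
    """Return (stage label, number of publications removed at that stage)."""
    return [
        (label, previous - count)
        for (_, previous), (label, count) in zip(funnel, funnel[1:])
    ]


if __name__ == "__main__":
    for label, removed in removed_per_stage(FUNNEL):
        print(f"{label}: {removed} removed")
```

The differences are 30 duplicates, 179 publications excluded by title, 56 by abstract, and 28 by full text, which together account for the reduction from 368 to 75.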
Fig. 1: Procedure for Identifying Primary Studies
Fig. 2: Resulting Primary Studies by Year
2.4 Overview of Included Studies
The goal of this publication is to conduct a systematic analysis of existing literature within the field of open
government data. We here discuss some statistics of the relevant literature resulting from the conducted systematic
analysis. As shown in Figure 2, the period between 2002 and 2009 did not yield any relevant literature; however,
the results increase significantly in the subsequent years. Even though a number of major open data initiatives
were already established, such as the ones indicated in the figure, the surge in open government data literature
may be linked to U.S. President Obama’s Open Government Directive at the end of 2009. As shown in
the figure, the year 2014 yielded the largest amount of related literature, indicating that the awareness of open
government initiatives is increasing at a fast pace.
3 Terminology
Fig. 3: Relationship Between Open, Government and Linked Data
In order to give some context to our discussion, we here define the most important concepts used within this
paper. Figure 3 visually represents the relationships between open data, government data, and linked data.
Open Data - The ‘Open’ definition⁸ sets out eleven requirements that Open Data should conform to. The
latter requirements basically indicate how to enable the free use, reuse, and redistribution of data. Moreover, open
data should not discriminate against any person and must not restrict the use of the data to a specific field or venture. Thus,
data published in an open data format would be "platform independent, machine readable, and made available to
the public without restrictions that would impede the re-use of that information"⁹. Hence open data only refers to
data that is available free of charge for the general public without any limitations [59]. Open data is considered to
be a key enabler of Open Government [35].
Public Data - It is important to note the distinction between public data and open data. While public data is
made freely available to the general public, it is not necessarily open. An extreme example of public data which
is not open is an archive of legal documents. While they are freely accessible, imagine the effort required to
identify and locate a specific document. On the other hand, if such data is digitised and made available online in
a standardised format (also indexed), then this public data is also open.
⁸ http://opendefinition.org/od/
⁹ http://www.whitehouse.gov/open/documents/open-government-directive
Open Government Data - Open Government Data is a subset of Open Data, and is simply government-related
data that is made open to the public [35]. Government data might contain multiple datasets, including budget
and spending, population, census, geographical, parliament minutes, etc. It also includes data that is indirectly
‘owned’ by public administrations (e.g. through subsidiaries or agencies), such as data related to climate/pollution,
public transportation, congestion/traffic, child care/education. Several countries have already demonstrated their
commitment to opening government data by joining the Open Government Partnership (OGP)¹⁰.
E-Government - While many different definitions of e-government exist in the literature, we here adopt the definition
of e-government as the government’s use of technology to enhance the services it offers to other entities, including citizens, business
partners, employees, and other government agencies [36]. Technologies used for this purpose are most often web
applications. Thus, by aiding the interaction between citizens and their government, e-government has the
potential to build better relationships and also to deliver information and services more efficiently. While initially
e-government referred simply to the presence of government on the Internet, mostly in the form of an informative
website, the concept has since evolved. With the introduction of the ‘Open Government’ concept, we now consider
open government data initiatives to be a subset or an extension of e-government [28].
Linked Data - Linking data is the process of following a set of best practices for publishing and connecting
structured data on the Web [6]. It is the final step in the five star deployment scheme¹¹ for open data. The term
‘linked data’ thus refers to data which is published on the Web and, apart from being machine readable, is also
linked to other external datasets. The increased rate of adoption of linked data best practices has led the Web to
evolve into a global information space containing billions of assertions, where both documents and data are linked.
The evolution of the Web enables the exploration of new relationships between data and the ensuing development
of new applications.
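To make the position of linking as the final step concrete, the sketch below rates a dataset against the five cumulative levels of the five star deployment scheme as summarised at 5stardata.info: available on the Web under an open licence, structured, in a non-proprietary format, identified with URIs, and linked to other data. The `Dataset` class and its field names are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (illustrative fields)."""
    open_licence_on_web: bool      # 1 star: on the Web under an open licence
    structured: bool               # 2 stars: machine-readable structured data
    non_proprietary_format: bool   # 3 stars: open format such as CSV
    uses_uris: bool                # 4 stars: URIs denote the things described
    linked_to_other_data: bool     # 5 stars: links to external datasets


def star_rating(dataset):
    """Count consecutive levels met; a level only counts if all lower ones do."""
    criteria = [
        dataset.open_licence_on_web,
        dataset.structured,
        dataset.non_proprietary_format,
        dataset.uses_uris,
        dataset.linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


if __name__ == "__main__":
    csv_file = Dataset(True, True, True, False, False)
    print(star_rating(csv_file))
```

The levels are treated as cumulative, so a dataset that links to external sources but is not openly licensed would still receive no stars in this sketch.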
Data Portal - The open data movement aims at opening public sector information with the purpose of
maximising its reuse. A typical implementation is to collect and publish datasets into central data portals or data
catalogues in order to provide a “one-stop-shop” for data consumers. While a data catalogue would most commonly
act as a registry of data sources [1], providing links, a portal is more commonly a single entry point hosting the
actual data, where end users can search and access the published data and explore or interact with it in some manner.
A key function of a data portal is the management of metadata for the datasets, possibly including metadata
harmonisation. Various tools are provided on government data portals, such as data format conversion, visualisations,
query endpoints, etc.
Publishing - Publishing data on the Web enables data providers to add their data to the global data space.
This allows data consumers to discover and use this data in various applications. Following linked data best
practices makes published data more accessible and eases its reuse. A large number of linked data publishing
tools exist; they either serve the content of RDF stores as linked data on the Web or otherwise provide linked
data views over non-RDF data sources [6]. The majority of these tools allow publishers to avoid dealing with the
technical details behind data publishing.
¹⁰ http://www.opengovpartnership.org/countries
¹¹ http://5stardata.info/
Consuming - The aim of publishing data on the Web is to enable its use, reuse, and distribution. Such data
is made more discoverable and accessible if the data publishers follow linked data best practices. For example,
if published data has good quality metadata [59], then consumers would more easily discover the contents of the
published dataset, and decide whether it is fit for the intended use. While the role of data consumers and data
publishers is distinct, it is also interchangeable in that a publisher can also be a consumer and vice versa. To
describe this, the authors of [2] coin the term prosumers. Data consumption can be either data exploration, where
a user visualises or scrutinises open data, or data exploitation, where a user adds value to the open data by creating
mashups, conducting analysis, or innovating upon the data itself. This is also known as the knowledge economy.
Data Quality - Since the concept of quality is cross-disciplinary, there is no single agreed-upon definition of
quality [35]. However, data quality is commonly perceived to be fitness for use [31]. Fitness for use is, however, a
multi-dimensional concept that has both subjective perceptions and objective measurements based on the dataset in
question [57]. Subjective data quality assessments reflect the requirements and experiences of the consumers of the
data. Let us take an example using restaurant reviews. What one person might consider to be a tasty dish, another
might find bland. These different perceptions result in varying reviews of the same dish. Objective assessments
can be task-dependent or task-independent. Task-independent quality assessment metrics reflect the properties of
the data without contextual knowledge of how it will be consumed. Continuing on the previous example, if a
restaurant uses fresh ingredients in its food, then it is considered to be a good restaurant. Task-dependent metrics,
on the other hand, reflect the requirements of the application at hand. For example, if a person who does not like
fish is served a fish dish, then of course they will not like it. Thus, even if a public entity publishes governmental data,
this data will not be exploited to its full potential unless it is of good quality with regard to its
consumers.
4 Open Government Data Life-Cycle
In this section we propose and explain the open government data life-cycle. Although a number of open data
life-cycles exist¹², none of them is tailored to the specific needs of open government data. Moreover, a number of
vital steps are omitted, and only the most common procedures for opening data are discussed. Therefore, using the
existing data life-cycles as a basis, we here attempt to cover all the processes in the life-cycle of open government
data, in order to provide a standard process which government open data stakeholders can follow.
The proposed life-cycle, shown in Figure 4, is made up of three sections, namely a pre-processing section
(rectangle), an exploitation section (oval), and a maintenance section (hexagon). These sections, in order, take
care of: preparing the data to be published, using the published data, and maintaining the published data so that it
remains sustainable. We proceed to give a short overview of each interdependent step in the life-cycle. This description
is not meant to be exhaustive, as each step can also require a number of different processes, and such an extensive
description is beyond the scope of this paper.
Fig. 4: Open Government Data Life-Cycle
¹² http://www.w3.org/2011/gld/wiki/GLD_Life_cycle
Data Creation - The open government data life-cycle typically starts with the creation of data. In public or
governmental entities, the creation of data is usually part of daily procedures; however, it is also possible to
collect data for the specific purpose of publishing it.
Data Selection - This is the process of selecting the data to be published. This requires removing
any private data or personal data, as well as identifying under which conditions this data will be published
(potentially through the specification of open government data policies) [80].
Data Harmonisation - This step involves preparing the data to be published in order to conform to publishing
standards, such as the Eight Open Government Data Principles (explained further in Section 6.1.2).
Data Publishing - This is the actual act of opening up the data by publishing it on government portals.
Data Interlinking - Data Interlinking is the final step in the Five Star Scheme for Linked Open Data. This
allows published data to have additional value, as the linking of data gives context to its interpretation.
Data Discovery - The publishing of data is not enough to enable its reuse. Data consumers must discover the
existence of open data in order to be able to consume it. Data discovery can be enhanced by actively raising
awareness of its existence (e.g. through organising hackathons).
Data Exploration - This step is the most trivial way of consuming data. Here, a user passively examines open
data by visualising or scrutinising it.
Data Exploitation - This step is a more advanced way of consuming data. Data Exploitation enables a user
to pro-actively use, reuse or distribute the open data by carrying out analysis, creating mashups, or innovating
upon the open data.
Data Curation - While not necessarily occurring at a fixed stage, data curation is vital in ensuring the
published data is sustainable. This involves a number of processes, including updating stale data, data and
metadata enrichment, data cleansing, etc.
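The steps listed above can be represented as a simple ordered structure. In the sketch below, the nine step names and the three section descriptions come from the text, whereas the assignment of each step to a section is an assumption inferred from those descriptions, since the authoritative grouping is given only in Figure 4. The names `Section`, `LIFE_CYCLE` and `steps_in` are illustrative.

```python
from enum import Enum


class Section(Enum):
    PRE_PROCESSING = "preparing the data to be published"
    EXPLOITATION = "using the published data"
    MAINTENANCE = "maintaining the published data"


# ASSUMPTION: the step-to-section grouping is inferred from the prose and
# may differ from the grouping drawn in Figure 4 of the paper.
LIFE_CYCLE = [
    ("Data Creation", Section.PRE_PROCESSING),
    ("Data Selection", Section.PRE_PROCESSING),
    ("Data Harmonisation", Section.PRE_PROCESSING),
    ("Data Publishing", Section.PRE_PROCESSING),
    ("Data Interlinking", Section.PRE_PROCESSING),
    ("Data Discovery", Section.EXPLOITATION),
    ("Data Exploration", Section.EXPLOITATION),
    ("Data Exploitation", Section.EXPLOITATION),
    ("Data Curation", Section.MAINTENANCE),
]


def steps_in(section):
    """Return the step names assigned to a section, in life-cycle order."""
    return [name for name, assigned in LIFE_CYCLE if assigned is section]


if __name__ == "__main__":
    for section in Section:
        print(f"{section.name}: {', '.join(steps_in(section))}")
```

Data Curation is kept in its own section because, as noted above, it does not necessarily occur at a fixed stage.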
Since the complete open government data life-cycle is out of the scope of this paper, we here proceed to
focus on the essential aspects of opening government data through portals, namely the processes of publishing
and consuming open data. Of course, while the aim of opening up government data is achieved, enabling the full
exploitation of the data is hardly possible in this manner.
5 Open Government Initiatives
The Open Government movement aims to achieve a government that enables cooperation between public
administrations and the general public, in order to become more transparent and democratic [51]. Open government
data does not only enhance the transparency and accountability of a government, but can also result in economic
benefits, innovative solutions for community advancement, and support for public administrations’
functions [4, 5, 21, 22, 33, 35, 37, 43, 47, 50, 55, 61, 70, 72]. Furthermore, these benefits can be achieved simply by
publishing and reusing data which has already been produced in the day-to-day administration of a governing
entity. We can thus assume that the two major motivations which prompt governments to jump on the open data
bandwagon are the (i) spirit of democracy, and (ii) economics [11]. Regarding the first motivation, governments
exploit open data initiatives in order to lift the veil of secrecy and become more transparent. The second
motivation, on the other hand, enables the growth of the information market by sharing government data. Whilst
sensitive or personal data cannot be shared, other data can have economic value to businesses or individuals if
exploited, and new uses for the particular data can also be discovered. The publishing of data, such as traffic,
meteorological, budgetary, geo-spatial, and geographical data, provides consumers with opportunities to create
new services, which, apart from being profitable, can also benefit the common good [5]. This, in turn, can
potentially contribute to economic growth. Other important benefits resulting from open government initiatives include
crowdsourcing for error reporting, increased public service employee motivation due to the reuse of published
data, more informed citizens, enhanced citizen participation, and job creation [55].
To date, 64 countries have joined the Open Government Partnership¹³ (OGP) to demonstrate their commitment
to making data free to use, reuse and redistribute according to Open Data principles. The OGP initiative aspires to
guarantee commitments from governments to promote transparency and accountability, empower citizens, and exploit
technologies to strengthen governance. In order to be eligible to participate in the OGP, countries (and their
respective governments) should meet the eligibility criteria and demonstrate their commitment to open government
principles in four key areas:
1. Fiscal transparency;
2. Access to information;
3. Income and asset disclosures; and
4. Citizen engagement.
¹³ http://www.opengovpartnership.org/
A typical approach to opening government data is to collect relevant datasets and their respective metadata
and publish them on an Open Government Data Portal. Open Government Data Portals can have different operators,
i.e. either an official government entity or a citizen initiative. Another difference between open government imple-
mentations is the scope, where a portal or catalog may publish data relevant to a specific administrative region, for
example, a city or a country. A large number of countries have created local or national government data portals
in order to provide access to open government datasets [46]. Four major sites to date are in the US (data.gov),
the UK (data.gov.uk), France (data.gouv.fr), and Singapore (data.gov.sg) [24]. Such portals act as one-stop-shops
and facilitate consumers’ access to government data, saving the trouble of collecting data from various authorities,
offices, or websites.
Fig. 5: Global Open Data Index by Place (Source: http://index.okfn.org)
While the main implementations of open government data initiatives are data portals, there exist a number of
different implementations with various characteristics. Government Data Catalogues or Metadata Portals/Repos-
itories are indexes which store structured descriptions (metadata) about the actual data (e.g. PublicData.eu).
Such tools have the potential of improving the discoverability of published datasets, as the discoverability of data
is directly dependent on the quality of the metadata [59]. An open government catalogue would contain a col-
lection of metadata records which describe open government datasets and also have the corresponding links to
the online resources [35, 44]. The implementation of a catalogue, however, raises an important question: What
metadata should be stored and how should it be represented? This question is especially significant when auto-
matic importing of metadata records (also known as harvesting) is performed, as metadata structure and meaning
are not usually consistent or self-explanatory [44]. Open data portal software such as CKAN14 or vocabularies
such as DCAT [43] provide solutions for this problem. Furthermore, the authors of [59] propose the implementa-
tion of metadata quality metrics on CKAN-powered government data catalogues with the aim of determining the
metadata’s adequacy for a user’s specific need.
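The harvesting problem described above can be illustrated with a minimal Python sketch. It assumes two hypothetical source portals (portal_a and portal_b) whose native field names differ, and maps both onto a small set of shared keys named after DCAT and Dublin Core terms (dct:title, dct:description, dcat:landingPage). The portal names, the native field names, and the harvest function are illustrative assumptions and are not taken from CKAN or from any of the surveyed catalogues.

```python
# Hypothetical per-portal mappings from native field names to shared keys.
FIELD_MAPS = {
    "portal_a": {"title": "dct:title", "notes": "dct:description", "url": "dcat:landingPage"},
    "portal_b": {"name": "dct:title", "summary": "dct:description", "link": "dcat:landingPage"},
}


def harvest(source, native_record):
    """Translate one native metadata record into the shared representation.

    Returns the normalised record together with the fields that had no
    mapping, so that inconsistent metadata is surfaced rather than dropped.
    """
    mapping = FIELD_MAPS[source]
    record = {"@type": "dcat:Dataset"}
    unmapped = {}
    for key, value in native_record.items():
        if key in mapping:
            record[mapping[key]] = value
        else:
            unmapped[key] = value
    return record, unmapped
```

Returning the unmapped fields separately is a design choice of this sketch: it makes visible exactly where the structure of a harvested record is not self-explanatory.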
Figure 5 shows the 2014 Global Open Data Index15 of a number of places (some places might not be officially
recognised as countries). This index tracks whether published data is actually released in a way which is accessible
to all stakeholders, and measures the openness level of data globally. The index represents the percentage of dataset
entries that are deemed to be open, based on the Open Definition16. The technical and legal dimensions of each
dataset available from the various places are assessed using the following nine questions:
1. Does the data exist?
2. Is the data in digital form?
3. Is the data available online?
4. Is the data machine-readable?
5. Is it available in bulk?
6. Is the data provided on a timely and up to date basis?
7. Is the data publicly available?
8. Is the data available for free?
9. Is the data openly licensed?
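As a rough illustration of how such a checklist could be turned into a percentage, the following Python sketch treats a dataset entry as open only when all nine questions are answered with yes, and reports the share of open entries. Both the all-or-nothing rule and the short key names are simplifying assumptions made here; the sketch does not reproduce the scoring actually used by the Global Open Data Index.

```python
from __future__ import annotations

# Short keys for the nine questions listed above (key names are assumptions).
QUESTIONS = (
    "exists", "digital", "online", "machine_readable", "bulk",
    "up_to_date", "public", "free", "openly_licensed",
)


def is_open(answers: dict[str, bool]) -> bool:
    """Assumed rule: an entry is open only if every question is answered yes.

    A missing answer is treated as "no".
    """
    return all(answers.get(question, False) for question in QUESTIONS)


def openness_percentage(entries: list[dict[str, bool]]) -> float:
    """Percentage of dataset entries that are deemed open."""
    if not entries:
        return 0.0
    open_count = sum(1 for entry in entries if is_open(entry))
    return 100.0 * open_count / len(entries)
```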
5.1 Assessment Frameworks
It is undeniable that with all the current open government initiatives a large amount of data has been released to the
public. This, however, does not mean that the targeted aims of promoting transparency and facilitating account-
ability have been achieved yet. For example, the authors of [3] point out that after interacting with transparency
websites (such as data portals), consumers do not consider that transparency and access to information have been
achieved. In such cases, while these portals would be complying with the law and following the requirements
for the publishing of information, they would not be promoting transparency in itself. Unfortunately, while being
aware of such deficiencies, several governments do not tackle them. This is because government initiatives are
evaluated according to whether they are complying with the law or not, and not based on the usefulness of the
information provided. A very apt example in this case is the publishing of data in PDF format, which makes the
data inconvenient for any intended use, reuse and redistribution.
14 http://ckan.org
15 http://index.okfn.org/place/
16 http://opendefinition.org/od/
Another contributor to the afore-mentioned deficiencies is the lack of an agreed-upon framework to evaluate
and assess the content provided on such data portals [3]. Whilst various authors propose and discuss recommen-
dations and requirements for evaluating data portals or catalogues, the contribution varies from publication to
publication. On the other hand, the majority of the proposed frameworks and recommendations reflect the Five
Star Scheme for Linked Open Data by Tim Berners-Lee17, such as [26], as well as the Eight Open Government
Data Principles18, such as in [40]. Table 1 gives an overview of the different aspects on which the following
literature focuses within the discussed assessment frameworks. In this table, by ‘Nature of Data’ we mean the
assessment of various data aspects according to the Five Star Scheme for Linked Open Data, and the Eight Open
Government Data Principles.
Data Portal: Nature of Data, Accountability, Transparency, Access to Information, Openness
External Factors: Legal Obligations, Institutional Arrangements
Public Engagement: Participation, Collaboration
[3] X X X
[7] X X X X X
[26] X
[40] X X X X
[64] X X X X X
[75] X X X X X
Table 1: Overview of Aspects Evaluated by Assessment Frameworks Proposed in Literature
The authors of [7] propose a model for assessing data openness by relying on the Eight Open Government
Data Principles. With the aim of automatically evaluating openness, the model was implemented as a web tool,
and it also aids in the process of building openness principles. The authors applied the model to seven data portals
with the aim of demonstrating its capabilities and results.
The authors of [40] also propose an assessment framework. They focus on accountability, aiming to propose
a set of requirements intended to assess whether portals are actually contributing to a higher degree of trans-
parency. The authors raise two essential questions regarding the effectiveness of portals in making data available
for accountability purposes, and how this can be evaluated. They analyse the related literature on internet-based
transparency and consider two dimensions, namely the type of public entities studied and the information types
that consumers looked for. The authors proceed to extract a set of requirements from the reviewed studies, and propose
them as part of an open government dataset portals assessment framework. They conclude that while dataset portals
were created with the intention of meeting open government strategies, as yet no evidence was found in open
government literature that publishing a large number of datasets actually contributes to promoting transparency
and facilitating accountability.
17 http://www.w3.org/DesignIssues/LinkedData.html
18 https://public.resource.org/8_principles.html
Another assessment model is proposed by [64], whereby the authors analyse existing assessment models and
then proceed to propose a new model catering for previous discrepancies in the older models. The authors base
their assessment on four conceptual pillars; focusing on collaboration, co-production, institutional arrangements,
legal obligations, and data openness. They proceed to test and analyse the proposed assessment model using actual
open government data portals.
In [75] the authors propose a benchmark for open government data initiatives. The benchmark is based on
data openness, transparency, participation, and collaboration. It assesses both the openness index as well as the
maturity of relevant initiatives.
5.2 Open Government Initiative Evaluations
The evaluations carried out on existing portals, catalogues, and other open data initiatives are nearly as varied
as the initiatives themselves. Furthermore, since there is no agreed-upon evaluation framework as yet [7], the
authors of such literature employ different approaches. Table 2 shows an overview of the approaches undertaken
within literature covered in the rest of this section. One should note that while all authors assess various aspects of
an initiative, they base their evaluation on one or two main aspects. Most of the evaluations assess the published
data properties of the initiative in question using the Five Star Scheme for Linked Open Data or the Eight Open
Government Data Principles; however, others also assess data availability, data content, and data accessibility. Two
other popular assessment approaches consider the features and functions of an initiative (usually in the form of a
portal). In contrast to the above approaches, some authors assess the maturity of an initiative as a whole, rather
than based on specific aspects such as data, functions or features. In the latter cases, the maturity is assessed
based on other aspects such as the amount of fulfilled objectives, compliance to existing laws and regulations, and
the usability from a stakeholder’s point of view. A number of approaches in literature also consider stakeholder
participation in the initiative in question, as well as their feedback.
Within the results of our systematic survey, one of the most popular approaches was to evaluate the functionality
of portals or catalogues. The authors of [44, 47, 77] follow this approach, for GovData.de, Brazilian anti-corruption
and transparency portals, and PublicData.eu respectively. Through the three publications, the
authors assess these portals by identifying the functions, limits, and challenges of the evaluated portals, and also
give recommendations towards avoiding or solving the challenges.
Another popular approach was to evaluate the features provided in data portals and their usability, such as
the number of data formats available and multilinguality. The authors of [38] and [23] both evaluate, for different
use-cases, how portals and catalogues actually enable the use, reuse, and distribution of data. They identify shortcomings
such as consumers’ difficulty in identifying the required datasets and the use of different formats. The
authors of [22] and [63], in a similar manner to the previous authors, evaluate the use of open data through mobile
applications.
Ref | Aspects (Data, Functionality, Features, Stakeholder Participation, Initiative Maturity, Stakeholder Feedback) | Evaluated Initiatives | Geographic Coverage
[1] | X X X | Various Portals | Greece
[3] | X X | n/a | Mexico
[20] | X | data.wien.gv.at | Vienna, Austria
[22] | X X | Various Mobile Applications | Mexico
[23] | X X | datosabiertos.df.gob.mx, labplc.mx/hackdf-2 | Mexico
[28] | X X | n/a | Stockholm and Skellefteå, Sweden
[37] | X X | Various Portals | Taiwan
[38] | X | Various Portals and Agencies | Australia
[44] | X X | GovData.de | Germany
[46] | X | PublicData.eu | Europe
[48] | X X X | Various Entities and Portals | Brazil
[47] | X | Mato Grosso, Paraíba, Piauí and Paraná | Brazil
[54] | X X | Various Portals | Italy
[55] | X | n/a | Vienna, Austria
[56] | X X X | Various Portals | European Union
[58] | X | www.datos.gov.co | Colombia
[60] | X | datosabiertoscolombia.cloudapp.net | Colombia
[62] | X X X | n/a | Colombia, Chile, Brazil
[63] | X | Various Mobile Applications | Various Countries
[65] | X X | Meu Congresso Nacional Application | Brazil
[66] | X X X | Rio Inteligente and Cidadão Recifense | Brazil
[67] | X X X | n/a | Worldwide
[73] | X X X | RASOI Mobile Application | India
[77] | X X | www.stat.gov.rs, PublicData.eu, INSIGOS | Europe, Serbia, Poland
[78] | X | n/a | Taiwan
Table 2: Overview of Evaluated Aspects in Open Government Initiative Evaluations
In [67], the authors carry out a preliminary exploration of the worldwide status of open government. The
authors analyse the open government data portals from 35 countries, reviewing the published data, the provided
features, and the level of stakeholder participation. They also provide a framework for assessing open government
data initiatives. The authors of [58] and [60] also evaluate the status of open government initiatives; however,
they focus directly on the Colombian government initiative as a whole, rather than on specific portals. Similarly,
the authors of [66] and [48] both discuss the current state of Brazilian open data initiatives. The authors of [37]
and [78] both carry out a study on the Taiwanese open data platforms with the aim of identifying their status. The
Greek open data movement is analysed in [1], where the authors analyse the current state of open data from three
different perspectives, namely the functionality, the semantics of the data, and the provided features.
The authors of [20] and [54] also evaluate the status of open government data initiatives; however, they have
more specific aims. The former attempt to analyse the relationship between the open data ambitions at the Euro-
pean level and those at the Austrian federal level (focusing mostly on Vienna), both from the data consumer and
producer side. The latter, on the other hand, strive to understand the link between the Italian Open Government
Data legislation and a newly enacted Transparency Act. The authors of [46] also have a specific aim; that is, to
assess the data openness level through metadata quality.
Publications [55, 65, 73] and [28] all assess stakeholders’ opinions to a certain degree. Through interviews and
online polls, the authors of [55] identify a number of factors that enabled the success of the open government
data strategy in Vienna. The authors of [65] and [73] discuss issues and challenges of developing applications
that implement open government data. While the former extract these challenges from evaluating and analysing
an application developed during an organised hackathon, the latter describe the challenges they faced during the
development of their own application. Finally, the authors of [28] analyse the interpretations and perspectives of
stakeholders with regards to opening municipalities’ data in Sweden, and strive to identify how the stakeholders’
opinion contributes to the implementation success of open data initiatives.
The authors of [56] do not focus on a single aspect for their evaluation. Rather, they carry out a comprehensive
analysis of open government data initiatives in the European Union, focusing on functions, data semantics, and
features. They collect and categorise a number of public data sources for each European Union member country,
and they assess their characteristics and provided services. The authors identify the differences in content, li-
cences, multilinguality, data accessibility, data provision, and data format. Finally, the authors point out that while
the quality of open government infrastructures is improving, there are still great differences between national open
data portals. The authors also identify two important challenges which are still not catered for, namely multilin-
guality and open licences. In a similar but downscaled manner to [56], the authors of [62] compare three South
American open government data initiatives (Brazil, Colombia, Chile); however, they focus instead on open
government policies, citizen involvement and the use of new technologies. The authors of [3] also focus their eval-
uation on different aspects, namely accountability, content, and usability. They review the evaluation literature of
Mexican e-transparency websites with the aim of defining a theoretical framework that could enable governments
to improve their portals’ contribution beyond standard transparency obligations.
5.3 Challenges
Even though there are numerous open government data initiatives, there still exist a number of setbacks which
prevent them from reaching their full potential. Through the evaluations conducted in the literature mentioned in the
previous section, and otherwise, we identified the most common challenges faced, and also propose possible
solutions. While the following challenges vary in domain, they are mostly barriers of a technical nature.
Data Formats - The whole point of opening and publishing data in portals is to enable its use, reuse, and
redistribution. Two of the Eight Open Government Data Principles, in fact, regard the format in which data is
published, and state that such data should be made open to the public in a machine processable data format which
is non-proprietary.
Unfortunately, while this is a guideline, it is not legally required by many open government initiatives (which
only require the publishing of data). Many governmental entities still publish data in a large variety of data formats
which can also be proprietary. This has resulted in a number of data silos which appear to be available for use but
which in reality require significant effort before being actually usable [13, 24, 30, 38, 43, 46, 68].
In an ideal world, in order to achieve economic growth, governmental entities (data providers) should take into
account the requirements of the data end-users (data consumers) [82]. This should include the specific formats
that are most convenient for the widest spectrum of consumers. W3C recommends the use of established open
standards and tools, such as XML and RDF, as publishing formats19. A feasible solution would then be to require
data providers to publish their data in machine processable and non-proprietary formats through the open government
initiatives in which they partake [71]. Thus the portal’s ‘success’ would not only be evaluated on the amount
of data published, but also on the usability of this data.
Data Ambiguity - While of course any machine-readable data format, such as CSV, is preferred over non-
readable ones, such as PDF, more expressive data formats are generally preferred, simply because they are more
descriptive of the actual data they represent. This decreases the risks of ambiguity and misinterpretations [46].
Consider the example of the concept of a year. While a calendar year would be the most common in our everyday
lives, some financial agencies within the public sector might use a financial year to describe their data [38]. This
leads to difficulties when attempting to find relationships between two datasets due to this difference in temporal
representation. Semantic ambiguity therefore would require extra efforts in order to link and understand the data
in question [12]. Similar to [13] we can thus conclude that although data is available in a machine readable format,
such data is not really useful unless it is easily understandable; maybe by requiring just minimal background
knowledge on the subject.
A simple enough solution for this issue is to publish data with descriptive titles, or otherwise provide a key to
code names, if the latter are used [53]. This would help data consumers to clearly and easily understand what the
data is about, and if it is actually useful for them. The use of RDF as a data format is also encouraged as it is a
highly descriptive data format.
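The calendar-year versus financial-year example can be made concrete with a short Python sketch that replaces ambiguous year labels with explicit date intervals, so that two datasets can be compared on their actual temporal overlap. The 1 July start of the financial year is an assumption chosen for illustration; the function names are likewise hypothetical.

```python
from datetime import date, timedelta


def calendar_year_period(year):
    """Explicit interval for a calendar year."""
    return date(year, 1, 1), date(year, 12, 31)


def financial_year_period(start_year, start_month=7):
    """Explicit interval for a financial year beginning in start_year.

    start_month=7 (1 July) is an assumed convention, not a universal one.
    """
    start = date(start_year, start_month, 1)
    # The period ends the day before the same month begins one year later.
    end = date(start_year + 1, start_month, 1) - timedelta(days=1)
    return start, end


def overlap_days(period_a, period_b):
    """Number of days shared by two inclusive (start, end) intervals."""
    start = max(period_a[0], period_b[0])
    end = min(period_a[1], period_b[1])
    return max(0, (end - start).days + 1)
```

Under these assumptions, a dataset labelled with financial year 2013 shares only the first half of 2014 with a dataset labelled with calendar year 2014, which is precisely the kind of mismatch that a bare "year" column hides.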
Data Discoverability - Publishing data and making it accessible qualifies as ‘open data’, however open data
also needs to be discoverable. The discoverability of open data is bound to the quality of the metadata describing
the data itself, which is not always complete or accurate [12, 35, 46, 59]. In addition, other factors lead to difficulties
in finding useful data quickly [38]. For instance, some portals support only simple search functions which return
not only relevant data, but also related policies and documents such as research papers [2]. This may
result in the user being overloaded with information [84] and having to go through all the results to potentially
identify the relevant datasets. Moreover, most portals only allow users to simply download the available data,
with no possibility of exploring it directly through the portal (for example through visualisation). These issues
are particularly evident when the data consumers do not know the responsibilities of the government entity in
question or the data structures that they implement, making it even harder to locate the relevant data they need.
The fact that most datasets are spread over a number of decentralised data sources further aggravates
the problem [12, 16, 83].
A number of efforts in the literature focus on metrics which assess metadata quality. The authors of [59], for
example, tackle the problem of metadata quality by applying five quality metrics, namely: completeness, weighted
completeness, accuracy, richness of information, and accessibility, to three public government data repositories.
This evaluation is carried out with the aim of measuring the metadata’s efficiency, identifying low-quality metadata
records, and also understanding the reasons behind the origin of the low quality. Evaluated metadata is then
assigned a quality score which enables the uniform comparison of the metadata quality across different repositories
or catalogues. Evaluated metadata can consequently be improved in order to achieve better searchability, and
subsequently better discoverability.
19 http://www.w3.org/TR/gov-data/#formats
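To give an intuition for two of the metrics named above, the following Python sketch computes a plain and a weighted completeness score for a single metadata record. The formulas are simple assumptions made here (the share of non-empty fields, and the share of field weight that is filled); they are not the definitions given in [59], and the field names and weights in any call are up to the user.

```python
def _is_filled(value):
    """A field counts as filled unless it is missing or empty."""
    return value not in (None, "", [], {})


def completeness(record, fields):
    """Share of the expected fields that carry a non-empty value."""
    filled = sum(1 for field in fields if _is_filled(record.get(field)))
    return filled / len(fields)


def weighted_completeness(record, weights):
    """Like completeness, but each field contributes its assumed weight."""
    total = sum(weights.values())
    filled = sum(w for field, w in weights.items() if _is_filled(record.get(field)))
    return filled / total
```

Because both scores lie between 0 and 1, records from different catalogues can be compared directly, which mirrors the uniform comparison described above.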
Data Representation - The heterogeneity of the published datasets and their representation is quite an obvious
setback for open government data initiatives. Data as varied as traffic, budget, geographical, and environmental
data, etc., is published onto portals in a non-standardised manner, meaning that there exists a large heterogeneity
in terms of semantics, standards, and most importantly in this case: schema. This leads to interoperability issues
and challenges to aggregate existing metadata in a way that would be useful for data consumers [8, 24, 44, 46].
Additionally, such heterogeneous data would potentially even need to be mapped to a global schema. A
further aspect to this issue is versioning. An ideal representation of a dataset would also capture how it evolves
over time.
A number of efforts in the literature approach this challenge by proposing a generic schema. For example,
in [44], the authors propose a minimal schema that is compatible with the predominant data catalogue vocabulary
and software. The schema supports the description of datasets as well as documents and applications, and most im-
portantly includes a list of resources containing pointers to the actual data, documents, or applications. In contrast,
the authors of [43] propose a standardised interchange format which enables machine-readable representations of
data catalogues. Thus, for catalogues differing widely in scope, terminology, structure, and metadata fields, this
contribution acts as an interoperability format. With regards to versioning, a solution to the issue is the use of
Named Graphs [10], where the metadata represents the temporal validity of the annotated RDF data. However,
this solution is only available with the use of RDF.
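The idea of attaching temporal validity to a named graph can be sketched without any RDF library by modelling statements as quads, i.e. triples plus a graph name, and keeping the validity interval as metadata about each graph. The identifiers (ex:city, ex:g2013, ex:g2014) and the values are made-up placeholders, and this plain-Python structure is only an assumption-laden stand-in for a real RDF store.

```python
from datetime import date

# Each statement is (subject, predicate, object, graph name).
# All identifiers and values are illustrative placeholders.
quads = [
    ("ex:city", "ex:population", "100", "ex:g2013"),
    ("ex:city", "ex:population", "105", "ex:g2014"),
]

# Metadata about each named graph: the period in which its triples are valid.
graph_validity = {
    "ex:g2013": (date(2013, 1, 1), date(2013, 12, 31)),
    "ex:g2014": (date(2014, 1, 1), date(2014, 12, 31)),
}


def triples_valid_on(day):
    """Return the triples whose named graph is valid on the given day."""
    result = []
    for subject, predicate, obj, graph in quads:
        start, end = graph_validity[graph]
        if start <= day <= end:
            result.append((subject, predicate, obj))
    return result
```

Each version of the dataset lives in its own graph, so older versions remain queryable while the validity metadata selects the one that applies at a given time.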
Overlapping Scope - Provenance, whilst not a challenge in itself, is also an issue. Provenance refers to details
about the origins of data, or, in other words, who created or generated the data. The issue with provenance occurs
when there is the assumption that data strictly travels in a vertical direction, for example from local, to regional,
national, European and international level. There are numerous parallel entities which collect data, and then pass
it on to another relevant entity. For example, budget datasets from a city can be published on the city’s portal, but
also transferred to the entity taking care of cities within a specific region. This results in an overlapping scope,
where data may have duplicates, but also new or modified data [44]. Hence, provenance does not only regard the
source of the data, but also how the data was modified or manipulated during the publishing process.
Here again, named graphs can be a solution to provenance issues, as different provenance metadata can be
attached to datasets with varying provenance [68]. With a somewhat different approach, the authors of [43] propose
a standard interchange format which enables federated search over catalogues or portals with overlapping scope,
providing a way around this problem. Using a more concrete approach, the W3C Provenance Incubator group20,
on the other hand, strives to provide a roadmap in the area of provenance for Semantic Web technologies.
20 http://www.w3.org/2005/Incubator/prov/wiki/W3C_Provenance_Incubator_Group_Wiki
Public Participation - A very relevant challenge to achieving the full potential of published datasets in portals
is their use, or lack thereof. The increasing number of open data initiatives, where government entities are opening
up their data, ideally would result in increased transparency, participation, and innovation [59]. Yet, as the authors
of [11, 19, 21, 22, 47, 79, 81] point out, the full potential of consumer participation and collaboration for achieving
innovation in government services has yet to be reached. Participation, as defined by [67], means the extent to
which stakeholders can participate in the governance of an open government data portal, such as suggesting what
data to publish, or rating datasets or features on the portal itself. Collaboration, an extension to participation, refers
to features on a portal that enable cooperation and collaboration amongst different stakeholders.
Public access to government data also remains challenging due to the heterogeneous and dispersed nature of
the data. The lack of consumers exploiting existing open data portals indicates that there is the need to understand
what factors influence participation in open data, and the requirement to engage stakeholders in participating and
collaborating. If the projected consumers of the data do not use it, then the objective of open government initiatives
is futile. For a portal to be successful, consumers (including citizens, end users and beneficiaries) must be made
aware of the published data, and its relevance and usefulness [51]. Considered to be a core pillar of democratic
society, the collaboration between a government and its citizens has the potential to influence open data consumption,
policy making, service delivery, and also political opinions and decisions [74]. This interaction would allow the
government to provide more citizen-centred services and data.
In literature such as [79] the authors attempt to identify what influences the participation of stakeholders in
consuming open data, with the aim of mitigating the barriers they face. Furthermore, the authors of [11] establish
strategies to ensure that open data initiatives reach the desired participation rate. Similarly, in [70], the authors
tackle the question of what kind of services governmental entities should provide in order to increase stakeholder
participation. In contrast, the authors of [5] focus on issues that smaller communities face when attempting to con-
sume open data. The authors analyse these issues with the aim of enhancing public participation with the purpose
of creating local data infrastructures. In [69] the authors attempt to give structure to unstructured documents (such
as PDF) and store them in repositories compliant with open government data principles, with the aim of providing
stakeholders with analysis functionality and unrestricted data access.
6 Publishing and Consuming Open Government Data
The act of publishing data is the very basis of open government data initiatives. Government and public entities
are sharing data on the Internet at an astonishing pace. Yet, there is a lack of agreed-upon standards for data
publishing [16], and as discussed in detail in Section 5.3, there are many challenges to be overcome in order for
the published data to be exploited to its full potential. While not all challenges are directly related to publishing
issues, tackling these issues at the root could prevent subsequent issues related to data consumption. For example,
if data is published in a machine-readable format with good metadata descriptions, then usability issues will most
probably be avoided when it is consumed.
The publishing of data enables it to be available for use by the public, in an attempt to achieve the main aim of
open government data initiatives; namely to use, reuse and distribute the published data. This is only achievable
through the consumption of the data by stakeholders. Data consumption is possible through a number of means.
The most direct example is to obtain a copy of the actual published data, generally with the aim of using it in a
specific use-case. Certain portals might also provide exploration tools, where a data consumer can simply look
through the published data. Other tools, such as analysis tools, enable a consumer to actually identify potential
patterns in the published data. Usually analysis tools also provide for visualisations, which aid data consumers
to view the data in a pictorial manner. An even more hands-on way of consuming the data is to create mashups,
where different datasets are merged in order to create new knowledge using existing data.
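A mashup in the sense described above can be as simple as joining two published CSV datasets on a shared key. The Python sketch below uses only the standard library; the column names used in any call (for example a district key) are hypothetical, and the function names are introduced here purely for illustration.

```python
import csv
import io


def read_rows(csv_text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def mashup(left_rows, right_rows, key):
    """Inner join: keep rows whose key value appears in both datasets.

    If the right dataset repeats a key value, the last row wins.
    """
    index = {row[key]: row for row in right_rows}
    return [
        {**row, **index[row[key]]}
        for row in left_rows
        if row[key] in index
    ]
```

Note that such a join only works when both datasets use the same identifiers for the shared key, which ties back to the representation and ambiguity challenges discussed in Section 5.3.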
6.1 Publishing Data
In this section we provide a classification of different data publishing approaches, and proceed to discuss guide-
lines and best practices for publishing data in any data publishing effort.
6.1.1 Data Publishing Approach Classification
There are countless methods for publishing data. Following the contribution within [33], we here classify
open government data publishing initiatives along two dimensions:
1. The technological approach - followed by the data publisher in the actual act of publishing data, i.e. making
the data available on the Web. Publishing initiatives are classified within the first approach depending on the
variation of technologies implemented for publishing the data. These include:
a) The format of the published data (proprietary, machine readable, descriptive);
b) The access method (RESTful APIs, custom APIs, search interfaces);
c) The use of linked open data principles (HTTP, URIs, RDF);
d) The level of linkage to different datasets (LOD cloud 21).
As is evident, the above reflect most of the existing guidelines for publishing data, especially the Five Star
Scheme for Linked Open Data.
2. The organisational approach - followed by the data provider, i.e. the manner in which the data is provided
to the data consumers. The second dimension for open government data publishing initiatives focuses on the
provision of data, rather than the actual act of publishing. The authors of [33] identify two different methods
of providing linked open data, the epitome of an open government initiative, each with their own advantages
and disadvantages.
a) Direct Data Provision - Direct data provision involves a portal aggregating all processed and value-added
data provided by a public entity. In this case, the data publisher is not necessarily the same as the data
21 http://lod-cloud.net/
21
provider (public entity). In the case that the latter are 2 different entities, the maintainability is limited
unless an effective data synchronisation process is in place. For example, if the original data from the
public entity changes over time, this change must be reflected in the data published on the data portal,
otherwise the data provided here will be obsolete [16]. An advantage of having direct data provision,
however, is the consumers’ direct access to data through a single entry point.
b) Indirect Data Provision - Data Catalogues are a good example of indirect data provision, where the data
cannot be directly accessed through the catalogue. Catalogues contain links (metadata) to the actual data
provided by the public entity. Therefore, in this case, the data provider is usually also the data publisher. To
access data, a consumer has to search for the relevant data through the catalogue, then follow the provided
links to the public entity that provides the actual data. In contrast to direct data provision, indirect data
provision has the advantage of being up to date and unique, since the actual data is published by the
data producer itself. On the other hand, any processing or value-adding has to be performed by the data
consumer, as it cannot be provided by the data catalogue.
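The synchronisation requirement of direct data provision can be illustrated with a minimal sketch. It assumes, purely for illustration, that the portal can fetch the current source file and compare it with its own copy; the function names and sample contents are hypothetical and are not taken from [16] or [33].

```python
import hashlib


def fingerprint(payload: bytes) -> str:
    """Return a SHA-256 digest identifying one version of a dataset file."""
    return hashlib.sha256(payload).hexdigest()


def is_stale(source_payload: bytes, portal_payload: bytes) -> bool:
    """True when the portal copy no longer matches the copy held at the source."""
    return fingerprint(source_payload) != fingerprint(portal_payload)


# Hypothetical contents of the same dataset at the public entity and on the portal.
source_copy = b"station,pm10\nA,21\nB,19\n"
portal_copy = b"station,pm10\nA,21\nB,17\n"

if is_stale(source_copy, portal_copy):
    print("Portal copy is out of date and should be refreshed")
```

With indirect data provision this check is unnecessary, since the catalogue only stores a link to the single copy held by the data producer.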
6.1.2 Publishing Guidelines
In order to tackle the issues mentioned in Section 5.3, and other publishing-related problems, a number
of publications in the literature, such as [26, 38, 70], propose guidelines for publishing data on the Web. The basis
of most of these guidelines is the Eight Open Government Data Principles:
1. Complete - All available public data that is not subject to privacy, security or privilege limitations is made
available.
2. Primary - Data is made available as it is available at the source, and not aggregated or modified.
3. Timely - Data is made available to the public as soon as possible after the actual data is created, in order to
preserve the value of the data.
4. Accessible - Data is made available to all consumers possible, and with no limitations on its use.
5. Machine Processable - Data is published in a structured manner, to allow automated processing.
6. Non-Discriminatory - Data is available for all to use, without requiring any registration.
7. Non-Proprietary - Data is published in a format which is not controlled exclusively by a single entity.
8. Licence-Free - Other than allowing for reasonable privacy, security and privilege restrictions, data is not
subject to any limitations on its use due to copyright, patent, trademark or trade secret regulations.
The above principles provide a roadmap for the data publisher and help result in good open government data
with the best potential for being consumed by the stakeholders. Further to these principles, the Five Star Scheme
for Linked Open Data, listed below, provides a more technical guide towards publishing linked open data, the
epitome of open government data initiatives:
1. Available on the Web in any format but with an open licence (Open Data);
2. Available as machine-readable structured data (e.g. Microsoft Excel table instead of image scan of a table);
3. Available as machine-readable structured data in a non-proprietary format (e.g. CSV instead of Microsoft
Excel);
4. All of the above as well as using open standards from W3C (RDF and SPARQL) to identify things;
5. All of the above as well as linking the published data to other existing data to provide context.
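Since each star presupposes all the previous ones, the scheme can be read as a cumulative checklist. The following is a minimal sketch of that reading; the `Publication` class and its field names are hypothetical and only serve to make the five conditions explicit.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    """Hypothetical description of how one dataset is published."""
    on_web_open_licence: bool      # star 1
    machine_readable: bool         # star 2
    non_proprietary_format: bool   # star 3
    uses_w3c_standards: bool       # star 4 (RDF, SPARQL)
    linked_to_other_data: bool     # star 5


def star_rating(p: Publication) -> int:
    """Count stars cumulatively: a star is earned only if all lower stars are."""
    checks = [
        p.on_web_open_licence,
        p.machine_readable,
        p.non_proprietary_format,
        p.uses_w3c_standards,
        p.linked_to_other_data,
    ]
    stars = 0
    for satisfied in checks:
        if not satisfied:
            break
        stars += 1
    return stars


excel_table = Publication(True, True, False, False, False)
csv_table = Publication(True, True, True, False, False)
print(star_rating(excel_table), star_rating(csv_table))
```

Under this reading, a dataset published without an open licence receives no stars, regardless of its format.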
In order to provide official guidelines, the W3C eGov Interest Group has also developed the following set
of steps for publishing open government data22, which emphasise standards and methodologies to encourage the
publishing of government data, with the aim of enabling easier use by the public:
1. Identify - The use of permanent, patterned and/or discoverable URI/URLs enables processes and people to
find and consume the data more easily.
2. Document - Documentation helps the data to be more understandable and less ambiguous, as well as enabling
easier data discovery. The use of formats such as XML/RDF would be self-documenting.
3. Link - Linked data contains links to other data and documentation, providing context.
4. Preserve - The use of versioning of datasets enables data consumers to cite and link to present and past
versions, where new and upgraded datasets can refer back to original datasets. Versioning also allows the
documentation of changes between versions.
5. Expose interfaces - To make it easier for published data to be discovered and explored, published data should
be both human-readable and machine-readable. Preferably, data should be published separate from the inter-
face, and external parties should have direct access to raw data. This enables them to build their own interfaces
if needed.
6. Create standard names/URIs for all government objects - The use of a unique identifier for each object
is as important as having information about the object itself. This aids in discoverability, improves metadata,
and ensures authenticity.
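The Identify and Preserve steps can be sketched together as a patterned, versioned URI scheme. The base address `https://data.example.gov` and the path layout below are assumptions made for illustration; the W3C document does not prescribe a particular pattern.

```python
from urllib.parse import quote

# Hypothetical base address of a government data portal.
BASE = "https://data.example.gov"


def dataset_uri(agency: str, name: str, version: int) -> str:
    """Build a predictable URI of the form /dataset/<agency>/<name>/v<version>."""
    return f"{BASE}/dataset/{quote(agency, safe='')}/{quote(name, safe='')}/v{version}"


def previous_version_uri(agency: str, name: str, version: int) -> str:
    """Return the URI of the preceding version, so that new releases can link back."""
    if version <= 1:
        raise ValueError("version 1 has no predecessor")
    return dataset_uri(agency, name, version - 1)


print(dataset_uri("transport", "bus stops", 2))
print(previous_version_uri("transport", "bus stops", 2))
```

Because the pattern is fixed, consumers can cite a specific version and derive the address of earlier ones without consulting the portal.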
Along with the above, the W3C eGov Interest Group also discusses the importance of choosing what data
to publish, the right format to publish it in, and the restrictions on its use. Data which is to be shared with the
public should be published in compliance with applicable laws and regulations, and only after addressing issues
of security and privacy. Such data is usually already available in other formats, and may already have been shared
with the public in other ways. The best format to publish this data is in its raw form serialised as XML and RDF,
to allow for easy manipulation. The use of established open standards is also recommended. Finally, the published
data should have clear documentation on any legal or regulatory restrictions on the use of that data.
The authors of [38] present some recommendations for data publishing and analysis based on a survey on
the sustainability related datasets published by the Australian government, with the aim of identifying underlying
22 http://www.w3.org/TR/gov-data/
opportunities and issues. While not entirely reflecting the above-mentioned guidelines, the proposed recommen-
dations complement the essential aspects. The authors tackle commonalities amongst data published by different
public entities, the ideal formats for publishing data as linked data, its discoverability, and its re-usability.
Similarly, the authors of [61] identify common issues and challenges to the accessibility and reusability aspects
of public sector information. They point out that such obstacles can be of legal, institutional, technical or cognitive
nature. They proceed by providing common solutions that can be implemented to overcome these issues.
In [70], the authors propose a maturity model for open data, with the aim of assessing the commitment and
capabilities of public agencies in pursuing the principles and practices of open data. The authors extend the dis-
cussed guidelines and principles by considering other aspects towards publishing data, including an Establishment
and Legal Perspective, a Technological Perspective, and finally a Citizen and Entrepreneurial Perspective.
Another maturity model is defined in [41]. Here the authors aim to identify the essential contextual
aspects which affect the way data is published by public entities on their portals. These aspects are then
organised into an online transparency for accountability maturity model, whose purpose is to assess the
level of advancement of a governing region. In other words, researchers who need to assess an entity should start
by analysing the context using the proposed maturity model, and then proceed to define the assessment model
depending on the identified maturity level.
6.2 Consuming Data
The provision of data not only enables stakeholders (whether individuals, businesses, NGOs, or otherwise) to
scrutinise the published data, but also stimulates them to create, deliver, and use new services that are
coupled with the published data [19]. Services can be as simple as offering exploration of the published datasets,
but may also include visualisation and data discovery services such as data mining and comparative analysis. The
latter enable stakeholders to explore the data and identify patterns. Furthermore, if the published data is linked with
other data on the Web, the services can be enhanced with mashups, and further increase the knowledge that can
potentially be discovered through the available data. This opportunity then potentially proceeds to an improvement
in e-government services provision, increasing work opportunities and finally contributing to economic growth
[11,33].
Unfortunately, few open government data portals provide consumption functionalities other than simple data
downloads [2]. In an attempt to enhance the consumption experience, the authors of [27] explore the challenges
and issues related to the integration and analysis of open data. Amongst other challenges, the authors identify:
- The lack of standard procedures for querying government portals;
- The low quality of metadata;
- The low reliability and incompleteness of public datasets; and
- The heterogeneity of formats used to publish open data.
They proceed to propose a linked open data approach to modelling, merging and analysing specific data;
namely spatio-temporal and statistical data.
The authors of [29] tackle the question of how open data can encourage the creation of sustainable value. They
discuss that new methods of generating value can be brought about by the sharing and reuse of open data. The
authors proceed to propose a model describing how various processes within an open data system can generate
sustainable value, based on a number of contextual factors that provide stakeholders with the motivation, the
opportunity, and the ability to create it.
6.3 Data Quality
As discussed in Section 3, data quality has no agreed-upon definition, and apart from being cross-disciplinary,
it is also subjective [53]. Also, the publishing of data on portals does not guarantee that it is of good or high
quality [15, 59]. For these reasons, we hereby do not define how published data can be of good quality, but we
discuss the different aspects which influence the quality of the data, whether positively or negatively.
The authors of [52] propose a set of metrics to identify metadata quality, based on parameters used for human
reviewing. The authors of [59] build upon these metrics, adapting them for assessing the quality of the actual data,
rather than the metadata. Similarly, in [35,40], the authors discuss a number of quality dimensions, as found in the
majority of related literature. We here establish the following criteria which are considered by most efforts in the
literature for calculating data quality.
Usability - This is the most "generic" quality criterion. By usability we mean how easily the published
data can be used. It is the most generic because whether the published data is usable depends on the other quality
dimensions. For example, it is directly related to the degree to which the data is accessible, open, interoperable, complete,
and discoverable [38,45]. The more the published data is usable, the more potential data consumers are encouraged
to reuse and exploit the data.
Accuracy - By accuracy we mean the extent to which a data/metadata record correctly describes the respective
information [35, 46, 59]. With respect to metadata, this quality dimension directly affects the discoverability of
datasets, as good quality metadata enables the dataset to be easily discovered by data consumers.
Completeness - This quality dimension deals with the number of completed fields in a data/metadata record
[52, 59, 70]. Thus, a record is considered complete only when the record contains all the information required to
have the ideal representation of the described data. The completeness of the metadata, like accuracy, also directly
affects the discoverability of datasets.
Consistency - The consistency of record fields depends on whether they follow a consistent syntactical format,
without contradiction or discrepancy within the entire catalogue of metadata [35,43]. Apart from the syntactical
format, a field is considered to be consistent if the respective values are selected from a fixed set of options. One
example of inconsistency is two records using “U.S” and “United States” interchangeably. Another
example is the representation of dates, where the day, month and year follow an arbitrary order.
Timeliness - By this quality dimension we mean the extent to which the data or metadata is up to date. As
pointed out in Section 6.1, the organisational approach affects the timeliness of the published data, which depends
on whether the data is directly or indirectly provided by the data provider.
Accessibility - As identified by the authors of [52], the accessibility quality dimension has two measures. The
cognitive accessibility defines how easy it is for a data consumer to understand the published information. Several
aspects of the data affect the cognitive accessibility, such as the ambiguity of the data, discussed in Section 5.3.
The second measure is the psychological or logical accessibility, which can be defined as the ease with which
the relevant dataset is discovered through a data catalogue or repository. As discussed in Section 5.3, this quality
dimension is affected by the format in which the data is published, the search tool used, and the discoverability of
the dataset [43].
Openness - The openness of a dataset directly influences the use, reuse, and redistribution of data. Tim
Berners-Lee’s Five Star Scheme for Linked Open Data23 (Figure 6) can be seen as a mix of the accessibility
and usability quality dimensions. As the authors of [35] point out, open data can be technically defined to be open
if it is available as a complete set in an open, machine readable format, at a reasonable price which is not more
than the cost of reproduction.
Fig. 6: Five Star Scheme for Linked Open Data (Source: 5stardata.info)
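Two of the criteria above, completeness and consistency, lend themselves to simple automated checks. The sketch below is only one possible reading and is not the metric set of [52] or [59]: the required fields, the ISO date pattern, and the country vocabulary are all hypothetical choices.

```python
import re

# Hypothetical metadata schema and controlled vocabulary.
REQUIRED_FIELDS = ["title", "description", "publisher", "issued", "country"]
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
COUNTRY_CODES = {"US", "DE", "MT"}


def completeness(record: dict) -> float:
    """Fraction of required fields that hold a non-empty value."""
    filled = sum(1 for f in REQUIRED_FIELDS if str(record.get(f) or "").strip())
    return filled / len(REQUIRED_FIELDS)


def is_consistent(record: dict) -> bool:
    """Date follows one fixed syntax and country comes from a fixed set of options."""
    date_ok = bool(ISO_DATE.match(str(record.get("issued") or "")))
    country_ok = record.get("country") in COUNTRY_CODES
    return date_ok and country_ok


record = {
    "title": "Air quality 2014",
    "description": "",
    "publisher": "City of Bonn",
    "issued": "03/01/2014",  # ambiguous day/month order
    "country": "DE",
}
print(completeness(record), is_consistent(record))
```

The sample record is missing its description and uses a date whose day and month cannot be told apart, so it is neither complete nor consistent under these assumptions.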
The authors of [35] identify two types of strategies for improving data quality, namely data-driven and process-driven.
The first involves directly modifying the values of data, such as correcting invalid data values or normalising
data. The second involves the redesign of the data creation and modification processes in order to identify and
correct the cause of quality issues, such as implementing a data validation step in the data acquisition process.
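The difference between the two strategies can be made concrete with a small sketch. The lookup table, the allowed values, and the function names are hypothetical; they illustrate the distinction drawn in [35] rather than reproduce any tool proposed there.

```python
from typing import List

# Hypothetical mapping of known variants to one canonical value.
CANONICAL_COUNTRY = {
    "u.s": "United States",
    "u.s.": "United States",
    "usa": "United States",
    "united states": "United States",
}
ALLOWED_COUNTRIES = {"United States", "Germany"}


def normalise_country(value: str) -> str:
    """Data-driven strategy: repair a stored value after the fact."""
    cleaned = value.strip()
    return CANONICAL_COUNTRY.get(cleaned.lower(), cleaned)


def validate_on_acquisition(record: dict) -> List[str]:
    """Process-driven strategy: report problems before the record is accepted."""
    problems = []
    if not str(record.get("title") or "").strip():
        problems.append("missing title")
    if record.get("country") not in ALLOWED_COUNTRIES:
        problems.append("country not in controlled vocabulary")
    return problems


print(normalise_country(" U.S "))
print(validate_on_acquisition({"title": "", "country": "U.S"}))
```

The first function treats the symptom in existing data, whereas the second prevents the same inconsistency from entering the catalogue in the first place.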
Efforts in publications such as [14,35,59] take a number of quality dimensions and implement them, with the
aim of assessing the quality of published data. The authors of [14] evaluate and assess the datasets’ quality in such
23 5stardata.info
a way that consumers can then identify the ideal quality for the intended use, attaching the results of the evaluation
to the actual dataset graph. In [35], the authors focus on the quality of catalogue records within initiatives in the
Czech Republic. They proceed to propose some techniques and tools to improve the quality of the data catalogue
records. Similarly, the authors of [59] propose quality assessment metrics and implement them in three public
government data repositories.
6.4 Challenges
There are a number of issues and challenges which hinder governments from jumping on the open data bandwagon
and from making data truly open. Here we briefly discuss relevant issues, including those pointed out by the
authors of [12, 18, 61, 76, 84]. These challenges range across organisational, economic and financial, policy and
legal, and cultural barriers.
6.4.1 Factors which discourage entities from joining an open government data initiative
Awareness - The concept of open data, while not new, might seem daunting to people unfamiliar with
the term and what it involves [61, 76]. Public entities in the past would have only been concerned with delivering
reports formatted to given templates. Recent requests to provide data in its raw format might not be understood
clearly [12]. For this reason, the value and potential use of raw open data needs to be highlighted [83].
Motivation - The provision of raw data can be considered to be extra work without any purpose, especially to
public entities such as those described above [76]. The value of the data generated during day-to-day administra-
tion needs to be pointed out. The reuse of open datasets can be a great motivator in portraying the unexpected use
of the generated data, and can also help the data producers in understanding the true value of the data they create
and publish [55].
Capacity - The use of open data should be targeted towards nobody in particular. Having said that, it should
be available for the use, reuse, and distribution of all, whether machines or humans. Unfortunately, many entities
are not so open-minded about the application of open data, and rather focus on the simple publishing of data
rather than ensuring that it is of good quality in this aspect. Furthermore, public entities might focus on publishing
data with no value, rather than other, more relevant, data [76,84]. There is the urgent need for the application of
standards and large-scale training in order to overcome these issues.
Budget Provision - Open data being a relatively new concept, there might not be any local budget allocation for open
government data efforts [76]. Since the processes required for publishing data are "extra" tasks, requiring
effort, resources, and time, a specific budget needs to be allocated for this purpose.
Otherwise there is the risk that open government initiatives are not given the priority they deserve, especially if
public entities do not grasp the true value of open data.
Technical Support - Most of the existing government data portals were not envisaged for large-scale open data
publishing and consumption. Thus, these public entities now require technical support to update their websites or
portals to enable their published data to achieve its highest reuse potential [12,18,61, 76, 84].
Institutionalisation - Being a relatively new initiative, open data tasks are usually assigned to employees
whose job was already pre-defined, with no institutional structure or public entity dedicated solely to this task [18,
61,76, 84]. This issue results in no regular monitoring of the open data initiative performance. The establishment
of open government initiative policies would help in this challenge by clearly defining required responsibilities.
6.4.2 Issues hindering data from being truly open
Conflicting Regulations - Whilst there is a lack of open government data policies, many open government data
initiatives still belong to existing legal frameworks concerning freedom of information, reuse of public sector
information, and the exchange of data between public entities. It is often unclear how such initiatives and
frameworks interact, which results in uncertainty about the possible use of the relevant data. This issue does not only
concern data consumers, but also data producers who end up being sceptical of fully opening up their institutions’
data, even if it is covered by a clear legal framework [61].
Privacy and Data Protection - There is a considerable conflict between open data and the aims of trans-
parency and accountability, and data protection and the right to privacy [49, 61, 83, 84]. Even though data is
anonymised before publishing, the merging of different datasets can still possibly result in the discovery of data of
a personal nature [80]. For example, if garbage collecting routes are published, along with the personnel timetable,
a data consumer would be able to identify the location of a particular employee. This issue requires more research
in order to come up with guidelines that can provide a solution to this conflict. A plausible approach
would be to employ access control mechanisms which regulate data access, although this restricts the openness
level of such data.
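The garbage collection example can be sketched as a join between two datasets that are harmless on their own. All records, identifiers, and field names below are invented for illustration.

```python
# Hypothetical published dataset 1: collection routes (no personal data).
routes = [
    {"route": "R1", "day": "Mon", "start": "06:00", "district": "North"},
    {"route": "R2", "day": "Mon", "start": "06:00", "district": "South"},
]

# Hypothetical published dataset 2: personnel timetable (no location data).
timetable = [
    {"employee": "E-17", "day": "Mon", "route": "R2"},
    {"employee": "E-04", "day": "Mon", "route": "R1"},
]


def locate(employee: str, day: str) -> list:
    """Join both datasets on (route, day) to infer where an employee will be."""
    assigned = {
        (t["route"], t["day"])
        for t in timetable
        if t["employee"] == employee and t["day"] == day
    }
    return [
        (r["district"], r["start"])
        for r in routes
        if (r["route"], r["day"]) in assigned
    ]


print(locate("E-17", "Mon"))
```

Neither dataset links a person to a place, yet the shared route identifier is enough to do so once the two are merged.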
Copyright and Licensing - The licensing of published data is one of the Eight Open Government Data Prin-
ciples. The first aspect of this issue is the incompatibility of licences [61]. As discussed in Section 6.1.2, data
providers should make an effort to publish their data in an open format, allowing the free and unrestricted
use, reuse and distribution of data. Since there are no agreed-upon standards, this can result in a number
of incompatible open licences. While they all, in different grades, allow the reuse of data, they might contain
restrictions which prevent data with different licences from being merged for a specific use. The definition of
clear data policies is a means to provide a solution to this challenge. The second aspect of this issue is copyright
inconsistencies that arise from unclear dataset ownership resulting from data sharing, for example between public
entities [12,18, 84]. This hinders data from being published.
Competition - While open data can be considered as unfair competition for private entities, public entities
might consider the commercial appropriation of public open data unfair [18,61]. In the first case, consider compa-
nies who invested in creating their own data stores (e.g. database of streets and locations for navigation purposes).
If the same data they created is made public through government open data initiatives, these companies will obviously
deem it to be unfair competition, since new competitors can obtain the freely available open data without having
to invest anything. Thus, management mechanisms need to be applied in order
to ensure that private companies do not suffer financial consequences due to the opening up of such data. On the other
hand, public entities might be reluctant to publish their data openly because they do not want data belonging to the public
(and paid for by taxes) to be used for commercial gain. A possible approach for the latter issue is to provide the data
for a nominal fee. Yet, this limits the openness of the data in question.
Liability - This issue is limited to data providers. Public entities fear being held liable for damage caused
by the use of the provided data, due to it being stale, incorrect, or wrongly interpreted [18,61]. To cater for this
fear, many public entities either do not publish their data or otherwise impose restrictions on its use, resulting in
data which is not truly open. In the worst case, due to fears of data being used against the publishing entity, such
data might not even be collected/generated any longer [84]. A possible solution for these issues is to enable social
interaction with regards to the data in question. A community of stakeholders within the data platform where the
data is published can aid data consumers to better interpret and exploit the published data.
Considering the above risks or negative impacts, it is vital to find a trade-off for open government initiatives.
One must keep in mind the numerous benefits associated with open data, but also cater and prepare for any risks,
challenges and issues.
6.5 Publishing Tools and Standards
While there exist a huge number of government data portals that enable data producers to publish their data, there
are not many tools aiding data publishers in this task. Yet, efforts are currently being focused on providing portals
and other open government data initiatives which allow stakeholders to publish (and consume) datasets without
requiring background knowledge on the open data life-cycle. An example of such efforts is the LinDA project24.
In this project a stakeholder is able to upload data in any format, which is then converted to RDF to enable easy
linking with other open datasets.
The authors of [25] propose a technical framework for data sharing between data providers and consumers,
based on an analysis of a number of data platforms. They aim to identify, from the relevant literature, the required
functionality for data sharing, considering challenges such as different published formats, data ambiguity, and
privacy issues.
In [49] the authors present two case studies involving two different public sector entities, with the aim of
demonstrating the use of pre-commitment to resolve conflicts during a data request procedure. Pre-commitment
involves applying restrictions on the type and content of the data that is available for request, ensuring the data
conforms to the legal requirements (e.g. removing privacy sensitive data), and deciding on whether to open the
data publicly or restrict its access to specific user groups.
24 http://linda-project.eu/
The authors of [2] propose a second generation platform, which offers not only the basic functionality of a government
data portal, but also additional functionality (based on Web 2.0 technologies) aiming to stimulate and aid
value generation from open government data. This additional functionality includes the capability of performing a
number of processing techniques, information and knowledge exchange, and collaboration between stakeholders.
The authors of [30] introduce the European Open Government Data Initiative, which is a free, open-source,
cloud-based collection of datasets that public entities can exploit. In this case, public data can be uploaded and
stored into the Microsoft Cloud through the Windows Azure Platform and environment. This tool is aimed at ex-
perts, and allows developers to use a variety of programming languages. This initiative strives to keep in line with
the open government data principles and thus enables data to be openly published in a re-usable format, enables
stakeholders to develop new applications based on the published data, allows developers to use the free and customisable
source code, and has the aim of enhancing transparency through increased visibility of a government’s
services.
In contrast to the above, the authors of [43] propose a standardised interchange format, the dcat vocabulary, for
machine-readable representations of government data catalogues, with the aim of bringing all published datasets
into the Web of linked data, resulting in higher interoperability. The use of this interchange format results in a
number of advantages:
1. The embedding of machine-readable metadata in Web pages increases discoverability;
2. The decentralised publishing by individual agencies could be aggregated into national or supra-national (e.g.
EU-wide) catalogues;
3. Catalogues with overlapping scope (e.g. Bonn, Germany and EU) can be searched in a federated manner;
4. One-click download and installation of data packages is available for application developers;
5. Priority is given towards archiving and digital preservation of valuable government datasets through the use
of manifest files with accurate metadata; and
6. Software tools and applications, such as improved search and data visualisation interfaces, can be built to
work with multiple, or even across, catalogues.
The dcat vocabulary has since been proposed as a W3C recommendation by the Government Linked Data
Working Group25.
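To give an impression of what such a machine-readable catalogue entry looks like, the following sketch serialises one dataset description as JSON-LD using terms from the dcat and Dublin Core namespaces. The dataset, its identifiers, and the `data.example.gov` addresses are hypothetical; only the vocabulary terms themselves come from the published specification.

```python
import json

# One hypothetical catalogue entry described with dcat and Dublin Core terms.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "https://data.example.gov/dataset/air-quality",
    "@type": "dcat:Dataset",
    "dct:title": "Air quality measurements",
    "dct:publisher": {"@id": "https://data.example.gov/agency/environment"},
    "dcat:keyword": ["air", "environment"],
    "dcat:distribution": [
        {
            "@type": "dcat:Distribution",
            "dcat:downloadURL": {"@id": "https://data.example.gov/files/air-quality.csv"},
            "dcat:mediaType": "text/csv",
        }
    ],
}

# The serialised form could be embedded in a Web page or harvested by an aggregator.
serialised = json.dumps(dataset, indent=2)
print(serialised)
```

Because every agency describes its datasets with the same terms, an aggregating catalogue can merge such entries without agency-specific parsing.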
7 Impact on stakeholders
Open government data initiatives are based on transparency, citizen participation, and collaboration for strengthening
democracy [3, 11, 19, 47, 51, 79]. Through these three pillars, the publishing of government datasets not only
has the potential of improving accountability and decreasing corruption, but it also affects all the involved stakeholders
in a number of ways. While there is an obvious gap in the literature with regards to frameworks which assess
25 http://www.w3.org/TR/vocab-dcat/
Fig. 7: Relationship Between Different Impacts of Open Government Data Initiatives
the impact achieved through open government data initiatives, a number of authors discuss the different impacts
that can be obtained through such initiatives. The authors of [39] depict the different levels of impact that can be
achieved by an open government data initiative. We adapt these levels in Figure 7 and portray, in context, how
each impact builds upon or supports the other impacts. While each impact does not strictly require the previous
one, each impact supports the next one to achieving a higher level of impact on the relevant stakeholders.
As shown in Figure 7, the most direct impact is access to information. Once data is published (made open to a
given degree), this impact is immediately effective, since it provides the means for data to be reused. Of course, the
data’s reuse is conditional on how the data is published (its level of openness), and the consumer’s willingness to
participate in such an effort. Through providing access to relevant information, an open government data initiative
can be more transparent.
Transparency, the second level of impact for publishing government data, can result in a considerable increase
in social control by citizens by enabling them to scrutinise the data. Subsequently, if provided with the relevant
means, they can also provide relevant feedback to the data provider, and monitor policies and government ini-
tiatives [47, 65, 78]. Consequently, stakeholders gain more responsibilities as they are able to interact with the
government and other public entities more actively than in traditional governmental structures. For example, fol-
lowing the publishing of budget data, stakeholders such as citizens, NGOs and even other private entities can
provide feedback on budget priorities and specific transactions26. Therefore, by easing social control, open gov-
ernment data initiatives allow citizens to further exercise their duty and right of participation. Moreover, it helps
citizens establish a trusting relationship with the government, which is able to prove legitimacy of the actions
taken.
26 http://www.participatorybudgeting.org/about-participatory-budgeting/where-has-it-worked/
The increased transparency resulting from publishing data will also impact public administrations in that there
will be enhanced accountability within public sectors. The authors of [9] define accountability as the disclosure of
data that provides stakeholders with the information required for assessing the propriety and effectiveness of the
government’s conduct, while the authors of [39] identify accountability as having a dimension of answerability.
They separate the latter into two components, namely information and justification. The first implies that there
should be an entity that is obliged to provide information to which the stakeholders should have access. Justifica-
tion, on the other hand, is more challenging to achieve since it implies that the data-providing entity should justify
their actions to the citizens. Yet, as the authors of [40] point out, even if the published data is usable and adheres
to good quality standards, the simple provision of data does not guarantee that the public entity or government is
immediately enhancing transparency and/or accountability.
Through the long term interaction with an open government data platform, open data promotes not just trans-
parency and accountability, but also democracy [51]. As mentioned in the example for the budget data, stakehold-
ers can be enabled to provide feedback on the published data. Such feedback loops will not only inform the public
entity of the public opinion, but also can improve service delivery through the repeated querying of the open data
by all stakeholders, including citizens and government agencies. For example, the analysis of published budget
data would enable the shift from a centralised government to a citizen-centric governance model.
While datasets are usually published in their raw form, and thus have little value on their own, public entities
can rely on other stakeholders, such as the private sector, community groups, and citizens, to innovate upon
the published data and strive to achieve the utmost potential of open government data initiatives [19, 51, 79]. The
benefits are many, and include exploiting user participation (crowdsourcing) in order to enhance data quality through
feedback [53]. Yet, active participation is not so easily achieved. While open data initiatives form the basis for
citizen participation and collaboration, there is no guarantee that there is actually any resulting participation or
collaboration [2, 11, 19,71]. Moreover, as the authors of [50] and [51] point out, there is the need to bridge the
gap between data providers and consumers by using data intermediaries. Thus, those who can make sense of the
published data should interact with the software developers in such a way that the latter can develop innovative
applications or services based on the published data. Even though this informal type of collaboration is facilitated
by the existing technologies, it is not yet fully endorsed by public and governmental entities.
7.1 Challenges
Instigating data reuse through citizen participation is essential, as it promotes the innovative potential
of developers and other stakeholders. This is, however, easier said than done. A number of barriers hinder public
participation, most of which relate to the cultural domain.
The authors of [71] point out the need for an action plan for stimulating the consumption of open datasets
among both the original data producers and the other consumers. User participation usually follows a "90-9-1
rule" (see footnote 27), where:
- 90% of users are lurkers who follow by reading or observing but do not actively contribute;
- 9% of users contribute from time to time, but other priorities dominate their time;
- 1% of users participate a lot and account for most contributions.
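As a rough illustration of what the 90-9-1 rule implies for a portal, the following sketch splits a user base into the three groups. The function name and the figure of 10,000 registered users are hypothetical and chosen only for illustration; the rule itself is a rule of thumb, not a measured distribution for any particular initiative.

```python
def participation_breakdown(total_users: int) -> dict[str, int]:
    """Split a user base according to the 90-9-1 rule of thumb.

    Integer arithmetic is used so that the three groups always
    add up to the total number of users.
    """
    heavy = total_users // 100           # about 1%: account for most contributions
    occasional = total_users * 9 // 100  # about 9%: contribute from time to time
    lurkers = total_users - heavy - occasional  # remainder (about 90%): read only
    return {"lurkers": lurkers, "occasional": occasional, "heavy": heavy}


if __name__ == "__main__":
    # Hypothetical portal with 10,000 registered users.
    print(participation_breakdown(10_000))
```

Under these assumptions, a portal would need a large audience before it could expect even a modest number of regular contributors.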
‘Lurking’ tends to have a negative connotation; however, lurking is also valuable in a democratic society where
an informed citizen can take effective decisions [19]. In open government data initiatives, the aim is to achieve the
highest possible number of active users, keeping in mind that collaboration is not done for its own sake,
but to enable all stakeholders to participate in efficient and effective decisions.
Anonymity can be seen as an advantage in online participation. It allows anyone to speak freely
about their opinions and about any agendas they might be interested in, without fear of being persecuted
for them. This makes it easier for stakeholders to participate in efforts such as decision-making. Yet, anonymity
also has its downside, as it allows participants to contribute undesirable and useless information, and makes
participants more likely to insult or verbally attack others whilst hiding behind their anonymity [19]. Furthermore,
a single user can use multiple online identities to manipulate the discussion in progress.
The participation of third parties in processes such as policy-making or decision-taking not only potentially
increases citizen satisfaction, but also increases the potential for more innovative solutions or approaches
to problems. The authors of [42] term this participation open government collaboration, which involves the
collaboration of different entities during the implementation, monitoring, and evaluation of policies. Entities such
as unions and political party associations have traditionally been included in the process of policy-making. Yet,
these entities do not represent all members of society equally. By allowing all stakeholders to participate through
eParticipation, a new collaboration approach based on many-to-many communication enables all individuals to
participate in shaping the democracy they live in.
Although the benefits of open data outweigh the effort required, there appears to be a lack of public
participation in open government data initiatives. In [79] the authors identify that the lack of research on the factors
influencing external stakeholders’ decision to participate and consume open data might contribute to this problem.
The authors of [11], on the other hand, point out that governmental agencies do not have effective strategies to
encourage participation from external stakeholders. Such public entities must come to the realisation that
successful open data initiatives are based on the actual usage of the data rather than simply the creation of an open data
portal. In [5], the authors carry out a case study with the aim of identifying how community data can be leveraged
through public libraries. Amongst the authors’ conclusions, they point out that stakeholders (i) not only need more
data, but need it to be meaningful, (ii) need the identification of best practices for using the data, and (iii) request
the collaboration of different stakeholder communities.
27 www.nngroup.com/articles/participation-inequality/
7.2 Motivating the Use of Open Data
On the premise that the role of government agencies in open data initiatives is not only to publish the data, public
agencies are starting to focus their efforts on motivating external stakeholders to use the published data. While
there is no agreed-upon method to achieve public participation, there are a number of popular methods. Challenge
competitions are a commonly used approach [21], where the competition involves developing the best application,
or finding an innovative use, based on the published data. Usually the winners are awarded a prize or recognition
for their efforts. A disadvantage of such competitions is that most participants are usually novices rather than
professionals [11]. This is somewhat reflected in the submitted entries, which tend to be amateurish.
Moreover, the entries do not usually contribute to the development of sustainable services [21]. Professionals are
usually deterred from participating in such competitions due to the minimal (if any) prize money. In any case, it is
evident that the governmental entity does not have any direct control over the output of the competition,
and there is no assurance of quality. For these reasons, challenge competitions are more suitable for raising
awareness about the open data initiative, and introducing stakeholders to public participation. Another approach
towards encouraging participation is the Call for Collaboration, where companies are invited to submit proposals to
create particular services. As opposed to challenge competitions, the governmental entity now has a say as to what
will be developed as the output of the call, as well as the possibility to require participants to meet specific
requirements.
A number of publications in the literature attempt to identify the best method to achieve public participation.
In [51, 79], the authors state their intention to research best practices for increasing the consumption
of open data. The authors of [32] research the use of social media platforms in eParticipation and propose a
two-phased approach for backing participatory decision-making, along with an architecture which supports its
implementation. This approach is based on the integration of government and social data and attempts firstly to
help the government identify public opinion and predict public reactions, and secondly to enable citizens and
stakeholders to contribute to the decision-making process. With a similar aim of identifying what motivates
stakeholders to participate and collaborate, the authors of [11] identify a set of considerations for motivating
stakeholders to innovate upon the published datasets.
8 Conclusion
In this paper we give an overview of the open government data initiatives surveyed in our systematic research. The
aim behind this research is to answer a set of questions, mainly concerning open government data initiatives and
their impact on stakeholders, existing approaches for publishing and consuming open government data, existing
guidelines, and challenges (see Table 3) for the discussed approaches. We identify corruption to be the major
problem which triggered open government data initiatives, and we point out the various motivations for opening
government data. One major motivation is transparency, which, however, should not be an end in itself. It should
rather be a means to enhance an open government initiative. This perspective will discourage governments from
publishing their data for the sake of it, and encourage them to strive to provide useful data which stakeholders
can use, reuse and distribute, and ideally even innovate upon.
| Nature of Challenge | Challenge | Possible Solution |
|---|---|---|
| Technical | Formats | Using a machine-processable, non-proprietary format |
| Technical | Ambiguity | Using a descriptive format; Adding documentation/metadata |
| Technical | Discoverability | Using good quality metadata; More advanced search tools on portals |
| Technical | Representation | Defining and using standardised representation; Using named graphs for versioning |
| Technical | Capacity | Applying standards; Large-scale training |
| Policy/Legal | Copyright/Licensing | Defining standard data policies |
| Policy/Legal | Conflicting Regulations | Defining open government data initiative policies and legal frameworks |
| Policy/Legal | Privacy/Data Protection | Defining privacy regulations; Implementing access control mechanisms (this limits the openness of the data) |
| Policy/Legal | Liability | Social interaction; Raising awareness; Defining legal frameworks |
| Economic/Financial | Budget Provision | Providing budget specifically for open data initiatives |
| Organisational | Institutionalisation | Re-organising the current organisational structure; Defining open government initiative policies |
| Organisational | Overlapping Scope | Using provenance metadata |
| Organisational | Technical Support | Providing support to public entities with the execution of an open data initiative |
| Cultural | Motivation | Raising awareness on the reuse of open data and its benefits |
| Cultural | Awareness | Highlighting the value and potential of open data |
| Cultural | Public Participation | Raising awareness; Providing incentives |
| Cultural | Competition | Providing specific data at a nominal fee (this limits the openness of the data) |
Table 3: Overview of Challenges in Open Government Data Initiatives
Based on existing open data life-cycles and on existing open data initiatives, we define the open government
data life-cycle, which depicts the processes required during the lifetime of open government data, together with
their ideal order. The definition of this life-cycle is not meant to be an exhaustive description of the
processes; rather, we propose it as a guideline for stakeholders to follow during their participation in an open
government data initiative.
One of our main contributions is the discussion about open government data initiatives. We first discuss
different assessment frameworks for evaluating various aspects of open government initiatives. We then provide
a summary of open government initiative evaluations found in our primary studies. The various publications
covered evaluate different aspects of the initiatives, such as the features provided, the openness level of the available
data, and the impact on relevant stakeholders. Many of them also evaluate the current status for specific
administrative regions. Based on the results of these evaluations, we proceed to point out challenges and issues which hinder
open government initiatives from reaching their full potential, and we also suggest possible solutions.
In this paper we focus on the publishing and consumption processes of open government data, which are
the most essential processes within the life-cycle. We classify different publishing and consumption approaches,
and identify different data quality aspects which influence or are influenced by the approaches undertaken for
consuming or publishing the data. Based on the literature covered in the survey, the Eight Open Government
Data Principles, and the Five Star Scheme for Linked Open Data, we extract and integrate various guidelines for
publishing open government data. Adhering to these guidelines will improve the end usability of the data (for
consumption), and the resulting success of the initiative in question. Unfortunately, while some solutions exist,
there are still a number of factors which deter public entities from jumping on the open data bandwagon in
the first place, as well as other issues which hinder data from being truly open. Besides, even though efforts are
being targeted towards producing publishing tools to aid data publishers in their task, there are no fixed standards
to follow.
To conclude, we revisit the research questions posed in Section 2.1 and summarise the discussions in this paper
with the following observations:
- What are existing approaches for publishing or consuming open government data, and how can they be classified?
  Open government data initiatives vary in nature, and the implemented approaches reflect this heterogeneity.
  However, the most common approaches include data portals, data catalogues, and services.
- What are the supported technical aspects, features and functions in existing approaches?
  The aim behind most open government data initiatives is to publish data in order to make it available for
  reuse. The most common feature is therefore the availability of the data itself. This basic feature is then
  complemented through other technical aspects, together with features and functions, such as multilinguality,
  different data formats, data accessibility, data content, and visualisation tools.
- Are there any defined guidelines for the publishing or consumption of open government data?
  While a number of different guidelines are defined in literature, there are no agreed-upon standards for the
  publishing or consumption of open government data. Yet, with the integrated overview of guidelines
  we propose, we aim to increase the likelihood that an open government data initiative succeeds.
- What are existing challenges with publishing or consuming open government data?
  We identified and explored a number of challenges, including technical, policy and legal, economic and
  financial, organisational, and cultural barriers.
- What are possible impacts of open government initiatives on relevant stakeholders?
  Transparency was identified to be one main aim of opening government data; however, it is not the only
  impact. There are varying impacts of open government data initiatives, including the direct impact of access
  to information that results in more informed citizens, as well as an increase in accountability and a greater
  opportunity for citizens to actively participate in governance processes.
References
1. Alexopoulos, C., Spiliotopoulou, L., Charalabidis, Y.: Open data movement in greece: A case study on open government
data sources. In: Proceedings of the 17th Panhellenic Conference on Informatics. pp. 279–286. PCI ’13, ACM, New York,
NY, USA (2013), http://doi.acm.org/10.1145/2491845.2491876
2. Alexopoulos, C., Zuiderwijk, A., Charalabidis, Y., Loukis, E., Janssen, M.: Designing a second generation of open data
platforms: Integrating open data and social media. In: Electronic Government - 13th IFIP WG 8.5 International
Conference, EGOV 2014, Dublin, Ireland, September 1-3, 2014. Proceedings. pp. 230–241 (2014),
http://dx.doi.org/10.1007/978-3-662-44426-9_19
3. Arcelus, J.: Framework for useful transparency websites for citizens. In: Proceedings of the 6th International Conference
on Theory and Practice of Electronic Governance. pp. 83–86. ICEGOV ’12, ACM, New York, NY, USA (2012),
http://doi.acm.org/10.1145/2463728.2463749
4. Bakıcı, T., Almirall, E., Wareham, J.: A smart city initiative: the case of barcelona. Journal of the Knowledge Economy
4(2), 135–148 (2013), http://dx.doi.org/10.1007/s13132-012-0084-9
5. Bertot, J.C., Butler, B.S., Travis, D.: Local big data: the role of libraries in building community data infrastructures. In:
15th Annual International Conference on Digital Government Research, dg.o ’14, Aguascalientes, Mexico, June 18-21,
2014. pp. 17–23 (2014), http://doi.acm.org/10.1145/2612733.2612762
6. Bizer, C., Heath, T., Berners-Lee, T.: Linked data - the story so far. Int. J. Semantic Web Inf. Syst. 5(3), 1–22 (2009)
7. Bogdanović-Dinić, S., Veljković, N., Stoimenov, L.: How open are public government data? an assessment of seven open
data portals. In: Rodríguez-Bolívar, M.P. (ed.) Measuring E-government Efficiency, Public Administration and Information
Technology, vol. 5, pp. 25–44. Springer New York (2014), http://dx.doi.org/10.1007/978-1-4614-9982-4_3
8. Böhm, C., Freitag, M., Heise, A., Lehmann, C., Mascher, A., Naumann, F., Ercegovac, V., Hernandez, M., Haase, P.,
Schmidt, M.: Govwild: Integrating open government data for transparency. In: Proceedings of the 21st International
Conference Companion on World Wide Web. pp. 321–324. WWW ’12 Companion, ACM, New York, NY, USA (2012),
http://doi.acm.org/10.1145/2187980.2188039
9. Bovens, M.: Analysing and assessing accountability: A conceptual framework. European Law Journal 13(4), 447–468
(2007), http://dx.doi.org/10.1111/j.1468-0386.2007.00378.x
10. Carroll, J.J., Bizer, C., Hayes, P., Stickler, P.: Named graphs, provenance and trust. In: Proceedings of the 14th International
Conference on World Wide Web. pp. 613–622. WWW ’05, ACM, New York, NY, USA (2005),
http://doi.acm.org/10.1145/1060745.1060835
11. Chan, C.M.: From open data to open innovation strategies: Creating e-services using open government data. 2014 47th
Hawaii International Conference on System Sciences 0, 1890–1899 (2013)
12. Conradie, P., Choenni, S.: Exploring process barriers to release public sector information in local government. In:
Proceedings of the 6th International Conference on Theory and Practice of Electronic Governance. pp. 5–13. ICEGOV ’12,
ACM, New York, NY, USA (2012), http://doi.acm.org/10.1145/2463728.2463731
13. Davies, T., Frank, M.: ’there’s no such thing as raw data’: Exploring the socio-technical life of a government dataset.
In: Proceedings of the 5th Annual ACM Web Science Conference. pp. 75–78. WebSci ’13, ACM, New York, NY, USA
(2013), http://doi.acm.org/10.1145/2464464.2464472
14. Debattista, J., Lange, C., Auer, S.: Representing dataset quality metadata using multi-dimensional views. In: Proceedings
of the 10th International Conference on Semantic Systems. pp. 92–99. SEM ’14, ACM, New York, NY, USA (2014),
http://doi.acm.org/10.1145/2660517.2660525
15. DiFranzo, D., Graves, A., Erickson, J., Ding, L., Michaelis, J., Lebo, T., Patton, E., Williams, G., Li, X., Zheng, J., Flores,
J., McGuinness, D., Hendler, J.: The web is my back-end: Creating mashups with linked open government data. In: Wood,
D. (ed.) Linking Government Data, pp. 205–219. Springer New York (2011),
http://dx.doi.org/10.1007/978-1-4614-1767-5_10
16. Dos Santos Brito, K., Silva Costa, M., Cardoso Garcia, V., Romero de Lemos Meira, S.: Experiences integrating
heterogeneous government open data sources to deliver services and promote transparency in brazil. In: Computer Software and
Applications Conference (COMPSAC), 2014 IEEE 38th Annual. pp. 606–607 (July 2014)
17. Dyba, T., Dingsoyr, T., Hanssen, G.K.: Applying systematic reviews to diverse study types: An experience report. In:
Proceedings of the First International Symposium on Empirical Software Engineering and Measurement. pp. 225–234.
ESEM ’07, IEEE Computer Society, Washington, DC, USA (2007), http://dx.doi.org/10.1109/ESEM.2007.21
18. Eckartz, S., Hofman, W., Van Veenstra, A.: A decision model for data sharing. In: Janssen, M., Scholl, H., Wimmer, M.,
Bannister, F. (eds.) Electronic Government, Lecture Notes in Computer Science, vol. 8653, pp. 253–264. Springer Berlin
Heidelberg (2014), http://dx.doi.org/10.1007/978-3-662-44426-9_21
19. Edelmann, N., Höchtl, J., Sachs, M.: Collaboration for open innovation processes in public administrations. In:
Charalabidis, Y., Koussouris, S. (eds.) Empowering Open and Collaborative Governance, pp. 21–37. Springer (2012),
http://dblp.uni-trier.de/db/books/daglib/0028914.html#EdelmannHS12
20. Egger-Peitler, I., Polzer, T.: Open data: European ambitions and local efforts. experiences from austria. In:
Gascó-Hernández, M. (ed.) Open Government, Public Administration and Information Technology, vol. 4, pp. 137–154. Springer
New York (2014), http://dx.doi.org/10.1007/978-1-4614-9563-5_9
21. Foulonneau, M., Martin, S., Turki, S.: How open data are turned into services? In: Snene, M., Leonard, M. (eds.)
Exploring Services Science, Lecture Notes in Business Information Processing, vol. 169, pp. 31–39. Springer International
Publishing (2014), http://dx.doi.org/10.1007/978-3-319-04810-9_3
22. Fuentes-Enriquez, R., Rojas-Romero, Y.: Developing accountability, transparency and government efficiency through
mobile apps: The case of mexico. In: Proceedings of the 7th International Conference on Theory and Practice of Electronic
Governance. pp. 313–316. ICEGOV ’13, ACM, New York, NY, USA (2013),
http://doi.acm.org/10.1145/2591888.2591944
23. González, J.C., Garcia, J., Cortés, F., Carpy, D.: Government 2.0: A conceptual framework and a case study using mexican
data for assessing the evolution towards open governments. In: Proceedings of the 15th Annual International Conference
on Digital Government Research. pp. 124–136. dg.o ’14, ACM, New York, NY, USA (2014),
http://doi.acm.org/10.1145/2612733.2612742
24. Hendler, J., Holm, J., Musialek, C., Thomas, G.: Us government linked open data: Semantic.data.gov. IEEE Intelligent
Systems 27(3), 25–31 (2012)
25. Hofman, W., Rajagopal, M.: A Technical Framework for Data Sharing. Journal of theoretical and applied electronic
commerce research 9, 45–58 (09 2014),
http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0718-18762014000300005&nrm=iso
26. Höchtl, J., Reichstädter, P.: Linked open data - a means for public sector information management. In: Andersen, K.N.,
Francesconi, E., Grönlund, Å., van Engers, T.M. (eds.) EGOVIS. Lecture Notes in Computer Science, vol. 6866, pp. 330–343.
Springer (2011), http://dblp.uni-trier.de/db/conf/egov/egovis2011.html#HochtlR11
27. Janev, V., Mijović, V., Paunović, D., Milošević, U.: Modeling, fusion and exploration of regional statistics and indicators
with linked data tools. In: Kő, A., Francesconi, E. (eds.) Electronic Government and the Information Systems Perspective,
Lecture Notes in Computer Science, vol. 8650, pp. 208–221. Springer International Publishing (2014),
http://dx.doi.org/10.1007/978-3-319-10178-1_17
28. Jetzek, T., Avital, M., Bjørn-Andersen, N.: Generating sustainable value from open data in a sharing society. In: Bergvall-
Kåreborn, B., Nielsen, P. (eds.) Creating Value for All Through IT, IFIP Advances in Information and Communication
Technology, vol. 429, pp. 62–82. Springer Berlin Heidelberg (2014), http://dx.doi.org/10.1007/978-3-662-43459-8_5
29. Jetzek, T., Avital, M., Bjørn-Andersen, N.: Generating sustainable value from open data in a sharing society. In:
Bergvall-Kåreborn, B., Nielsen, P. (eds.) Creating Value for All Through IT, IFIP Advances in Information and Communication
Technology, vol. 429, pp. 62–82. Springer Berlin Heidelberg (2014), http://dx.doi.org/10.1007/978-3-662-43459-8_5
30. Jiříček, Z., Di Massimo, F.: Microsoft open government data initiative (ogdi), eye on earth case study. In: Hřebíček,
J., Schimak, G., Denzer, R. (eds.) Environmental Software Systems. Frameworks of eEnvironment, IFIP Advances in
Information and Communication Technology, vol. 359, pp. 26–32. Springer Berlin Heidelberg (2011),
http://dx.doi.org/10.1007/978-3-642-22285-6_3
31. Juran, J.M.: Juran’s Quality Handbook. Mcgraw-Hill (Tx), 4th edn. (1974)
32. Kalampokis, E., Hausenblas, M., Tarabanis, K.A.: Combining social and government open data for participatory
decision-making. In: Tambouris, E., Macintosh, A., de Bruijn, H. (eds.) ePart. Lecture Notes in Computer Science,
vol. 6847, pp. 36–47. Springer (2011), http://dblp.uni-trier.de/db/conf/epart/epart2011.html#KalampokisHT11
33. Kalampokis, E., Tambouris, E., Tarabanis, K.: A classification scheme for open government data: Towards linking
decentralised data. Int. J. Web Eng. Technol. 6(3), 266–285 (Jun 2011), http://dx.doi.org/10.1504/IJWET.2011.040725
34. Kitchenham, B.: Procedures for performing systematic reviews. Tech. rep., Department of Computer Science, Keele
University (2004)
35. Kučera, J., Chlapek, D., Nečaský, M.: Open government data catalogs: Current approaches and quality perspective. In:
Technology-Enabled Innovation for Democracy, Government and Governance, Lecture Notes in Computer Science, vol.
8061, pp. 152–166. Springer Berlin Heidelberg (2013), http://dx.doi.org/10.1007/978-3-642-40160-2_13
36. Layne, K., Lee, J.: Developing fully functional e-government: A four stage model. Government Information
Quarterly 18(2), 122–136 (2001), http://www.sciencedirect.com/science/article/pii/S0740624X01000661
37. Lin, C., Yang, H.C.: Data quality assessment on taiwan’s open data sites. In: Wang, L.L., June, J., Lee, C.H., Okuhara, K.,
Yang, H.C. (eds.) Multidisciplinary Social Networks Research, Communications in Computer and Information Science,
vol. 473, pp. 325–333. Springer Berlin Heidelberg (2014), http://dx.doi.org/10.1007/978-3-662-45071-0_26
38. Liu, Q., Bai, Q., Ding, L., Pho, H., Chen, Y., Kloppers, C., McGuinness, D., Lemon, D., de Souza, P., Fitch, P., Fox, P.:
Linking australian government data for sustainability science - a case study. In: Wood, D. (ed.) Linking Government Data,
pp. 181–204. Springer New York (2011), http://dx.doi.org/10.1007/978-1-4614-1767-5_9
39. López-Ayllón, S., Arellano Gault, D.: Estudio en materia de transparencia de otros sujetos obligados por la Ley Federal
de Transparencia y Acceso a la Información Pública Gubernamental. Centro de Investigación y Docencia Económicas:
Instituto Federal de Acceso a la Información: UNAM. Instituto de Investigaciones Jurídicas (2008)
40. Lourenço, R.P.: Open government portals assessment: A transparency for accountability perspective. In: Wimmer, M.,
Janssen, M., Scholl, H.J. (eds.) EGOV. Lecture Notes in Computer Science, vol. 8074, pp. 62–74. Springer (2013),
http://dblp.uni-trier.de/db/conf/egov/egov2013.html#Lourenco13
41. Lourenço, R., Serra, L.: An online transparency for accountability maturity model. In: Janssen, M., Scholl, H., Wimmer,
M., Bannister, F. (eds.) Electronic Government, Lecture Notes in Computer Science, vol. 8653, pp. 35–46. Springer Berlin
Heidelberg (2014), http://dx.doi.org/10.1007/978-3-662-44426-9_3
42. von Lucke, J., Große, K.: Open government collaboration. In: Gascó-Hernández, M. (ed.) Open Government, Public
Administration and Information Technology, vol. 4, pp. 189–204. Springer New York (2014),
http://dx.doi.org/10.1007/978-1-4614-9563-5_12
43. Maali, F., Cyganiak, R., Peristeras, V.: Enabling interoperability of government data catalogues. In: Wimmer, M.,
Chappelet, J.L., Janssen, M., Scholl, H.J. (eds.) EGOV. pp. 339–350. Lecture Notes in Computer Science, Springer (2010)
44. Marienfeld, F., Schieferdecker, I., Lapi, E., Tcholtchev, N.: Metadata aggregation at govdata.de: An experience report. In:
Proceedings of the 9th International Symposium on Open Collaboration. pp. 21:1–21:5. WikiSym ’13, ACM, New York,
NY, USA (2013), http://doi.acm.org/10.1145/2491055.2491077
45. Martin, S., Foulonneau, M., Turki, S., Ihadjadene, M.: Open data: Barriers, risks, and opportunities. In: European
Conference on eGovernment, Como, Italy, June 13-14 (2013)
46. Martin, S., Foulonneau, M., Turki, S.: 1-5 stars: Metadata on the openness level of open data sets in europe. In: Garoufallou,
E., Greenberg, J. (eds.) MTSR. Communications in Computer and Information Science, vol. 390, pp. 234–245. Springer
(2013), http://dblp.uni-trier.de/db/conf/mtsr/mtsr2013.html#MartinFT13
47. Matheus, R., Ribeiro, M.M., Vaz, J.C., de Souza, C.A.: Anti-corruption online monitoring systems in brazil. In:
Proceedings of the 6th International Conference on Theory and Practice of Electronic Governance. pp. 419–425. ICEGOV ’12,
ACM, New York, NY, USA (2012), http://doi.acm.org/10.1145/2463728.2463809
48. Matheus, R., Ribeiro, M.M., Vaz, J.C.: New perspectives for electronic government in brazil: The adoption of open
government data in national and subnational governments of brazil. In: Proceedings of the 6th International
Conference on Theory and Practice of Electronic Governance. pp. 22–29. ICEGOV ’12, ACM, New York, NY, USA (2012),
http://doi.acm.org/10.1145/2463728.2463734
49. Meijer, R., Conradie, P., Choenni, S.: Reconciling Contradictions of Open Data Regarding Transparency, Privacy,
Security and Trust. Journal of theoretical and applied electronic commerce research 9, 32–44 (09 2014),
http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0718-18762014000300004&nrm=iso
50. Mercado-Lara, E., Gil-Garcia, J.R.: Open government and data intermediaries: The case of aiddata. In: Proceedings of the
15th Annual International Conference on Digital Government Research. pp. 335–336. dg.o ’14, ACM, New York, NY,
USA (2014), http://doi.acm.org/10.1145/2612733.2612789
51. Mutuku, L.N., Colaco, J.: Increasing kenyan open data consumption: A design thinking approach. In: Proceedings of the
6th International Conference on Theory and Practice of Electronic Governance. pp. 18–21. ICEGOV ’12, ACM, New
York, NY, USA (2012), http://doi.acm.org/10.1145/2463728.2463733
52. Ochoa, X., Duval, E.: Quality metrics for learning object metadata. In: Pearson, E., Bohman, P. (eds.) Proceedings of World
Conference on Educational Multimedia, Hypermedia and Telecommunications 2006. pp. 1004–1011. AACE, Chesapeake,
VA (June 2006), http://www.editlib.org/p/23127
53. O’Hara, K.: Enhancing the quality of open data. In: Floridi, L., Illari, P. (eds.) The Philosophy of Information Quality,
Synthese Library, vol. 358, pp. 201–215. Springer International Publishing (2014),
http://dx.doi.org/10.1007/978-3-319-07121-3_11
54. Palmirani, M., Martoni, M., Girardi, D.: Open government data beyond transparency. In: Kő, A., Francesconi, E. (eds.)
Electronic Government and the Information Systems Perspective, Lecture Notes in Computer Science, vol. 8650, pp.
275–291. Springer International Publishing (2014), http://dx.doi.org/10.1007/978-3-319-10178-1_22
55. Parycek, P., Hochtl, J., Ginner, M.: Open Government Data Implementation Evaluation. Journal of theoretical and
applied electronic commerce research 9, 80–99 (05 2014),
http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0718-18762014000200007&nrm=iso
56. Petychakis, M., Vasileiou, O., Georgis, C., Mouzakitis, S., Psarras, J.: A State-of-the-Art Analysis of the Current Public
Data Landscape from a Functional, Semantic and Technical Perspective. Journal of theoretical and applied electronic
commerce research 9, 34–47 (05 2014),
http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0718-18762014000200004&nrm=iso
57. Pipino, L.L., Lee, Y.W., Wang, R.Y.: Data quality assessment. Commun. ACM 45(4), 211–218 (Apr 2002),
http://doi.acm.org/10.1145/505248.506010
58. Prieto, L.M., Rodríguez, A.C., Pimiento, J.: Implementation framework for open data in colombia. In: Proceedings of
the 6th International Conference on Theory and Practice of Electronic Governance. pp. 14–17. ICEGOV ’12, ACM, New
York, NY, USA (2012), http://doi.acm.org/10.1145/2463728.2463732
59. Reiche, K.J., Höfig, E.: Implementation of metadata quality metrics and application on public government data. In:
COMPSAC Workshops. pp. 236–241 (2013)
60. Rojas, L., Bermúdez, G., Lovelle, J.: Open data and big data: A perspective from colombia. In: Uden, L., Fuenzaliza Oshee,
D., Ting, I.H., Liberona, D. (eds.) Knowledge Management in Organizations, Lecture Notes in Business Information
Processing, vol. 185, pp. 35–41. Springer International Publishing (2014), http://dx.doi.org/10.1007/978-3-319-08618-7_4
61. Dulong de Rosnay, M., Janssen, K.: Legal and Institutional Challenges for Opening Data across Public Sectors: Towards
Common Policy Solutions. Journal of theoretical and applied electronic commerce research 9, 1–14 (09 2014),
http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0718-18762014000300002&nrm=iso
62. Sanabria, P., Pliscoff, C., Gomes, R.: E-government practices in south american countries: Echoing a global trend or really
improving governance? the experiences of colombia, chile, and brazil. In: Gascó-Hernández, M. (ed.) Open Government,
Public Administration and Information Technology, vol. 4, pp. 17–36. Springer New York (2014),
http://dx.doi.org/10.1007/978-1-4614-9563-5_2
63. Sandoval-Almazan, R., Gil-Garcia, J.R., Luna-Reyes, L.F., Luna, D.E., Rojas-Romero, Y.: Open government 2.0: Citizen
empowerment through open data, web and mobile apps. In: Proceedings of the 6th International Conference on Theory
and Practice of Electronic Governance. pp. 30–33. ICEGOV ’12, ACM, New York, NY, USA (2012),
http://doi.acm.org/10.1145/2463728.2463735
64. Sandoval-Almazan, R., Gil-Garcia, J.: Towards an evaluation model for open government: A preliminary proposal. In:
Janssen, M., Scholl, H., Wimmer, M., Bannister, F. (eds.) Electronic Government, Lecture Notes in Computer Science, vol.
8653, pp. 47–58. Springer Berlin Heidelberg (2014), http://dx.doi.org/10.1007/978-3-662-44426-9_4
41
65. dos Santos Brito, K., dos Santos Neto, M., da Silva Costa, M.A., Garcia, V.C., de Lemos Meira, S.R.: Using parliamen-
tary brazilian open data to improve transparency and public participation in brazil. In: Proceedings of the 15th Annual
International Conference on Digital Government Research. pp. 171–177. dg.o ’14, ACM, New York, NY, USA (2014),
http://doi.acm.org/10.1145/2612733.2612769
66. dos Santos Brito, K., da Silva Costa, M.A., Garcia, V.C., de Lemos Meira, S.R.: Brazilian government open data: Imple-
mentation, challenges, and potential opportunities. In: Proceedings of the 15th Annual International Conference on Digital
Government Research. pp. 11–16. dg.o ’14, ACM, New York, NY, USA (2014), http://doi.acm.org/10.1145/
2612733.2612770
67. Sayogo, D., Pardo, T., Cook, M.: A framework for benchmarking open government data efforts. In: System Sciences
(HICSS), 2014 47th Hawaii International Conference on. pp. 1896–1905 (Jan 2014)
68. Shadbolt, N., O’Hara, K., Salvadores, M., Alani, H.: egovernment. In: John Domingue, Dieter Fensel & James Hendler
(eds.), Handbook of Semantic Web Technologies, pp. 840–900. Springer-Verlag (2011), http://eprints.soton.
ac.uk/271711/, dOI 10.1007/978-3-540-92913-0_20 Chapter: 20
69. Sheffer Correa, A., Correa, P., Silva, D., Soares Correa da Silva, F.: Really opened government data: A collaborative
transparency at sight. In: Big Data (BigData Congress), 2014 IEEE International Congress on. pp. 806–807 (June 2014)
70. Solar, M., Concha, G., Meijueiro, L.: A model to assess open government data in public agencies. In: Scholl, H.J., Janssen,
M., Wimmer, M., Moe, C.E., Flak, L.S. (eds.) EGOV. Lecture Notes in Computer Science, vol. 7443, pp. 210–221.
Springer (2012), http://dblp.uni-trier.de/db/conf/egov/egov2012.html#SolarCM12
71. Solar, M., Meijueiro, L., Daniels, F.: A guide to implement open data in public agencies. In: Wimmer, M., Janssen, M.,
Scholl, H.J. (eds.) EGOV. Lecture Notes in Computer Science, vol. 8074, pp. 75–86. Springer (2013), http://dblp.
uni-trier.de/db/conf/egov/egov2013.html#SolarMD13
72. Styrin, E., Dmitrieva, N., Zhulin, A.: Openness evaluation framework for public agencies. In: Proceedings of the 7th
International Conference on Theory and Practice of Electronic Governance. pp. 370–371. ICEGOV ’13, ACM, New York,
NY, USA (2013), http://doi.acm.org/10.1145/2591888.2591964
73. Vasa, M., Tamilselvam, S.: Building apps with open data in india: An experience. In: Proceedings of the 1st International
Workshop on Inclusive Web Programming - Programming on the Web with Open Data for Societal Applications. pp. 1–7.
IWP 2014, ACM, New York, NY, USA (2014), http://doi.acm.org/10.1145/2593761.2593763
74. Veljkovi´
c, N., Bogdanovi´
c-Dini´
c, S., Stoimenov, L.: Web 2.0 as a technological driver of democratic, transparent, and
participatory government. In: Reddick, C.G., Aikins, S.K. (eds.) Web 2.0 Technologies and Democratic Governance,
Public Administration and Information Technology, vol. 1, pp. 137–151. Springer New York (2012), http://dx.doi.
org/10.1007/978-1- 4614-1448- 3_9
75. Veljkovi´
c, N., Bogdanovi´
c-Dini´
c, S., Stoimenov, L.: Benchmarking open government: An open data perspective. Govern-
ment Information Quarterly 31(2), 278 – 290 (2014), http://www.sciencedirect.com/science/article/
pii/S0740624X14000434
76. Verma, N., Gupta, M.P.: Open government data: Beyond policy & portal, a study in indian context. In: Proceedings of the
7th International Conference on Theory and Practice of Electronic Governance. pp. 338–341. ICEGOV ’13, ACM, New
York, NY, USA (2013), http://doi.acm.org/10.1145/2591888.2591949
77. van der Waal, S., W˛ecel, K., Ermilov, I., Janev, V., Miloševi´
c, U., Wainwright, M.: Lifting open data portals to the data
web. In: Auer, S., Bryl, V., Tramp, S. (eds.) Linked Open Data – Creating Knowledge Out of Interlinked Data, pp. 175–
42
195. Lecture Notes in Computer Science, Springer International Publishing (2014), http://dx.doi.org/10.1007/
978-3- 319-09846- 3_9
78. Yang, T.M., Lo, J., Wang, H.J., Shiang, J.: Open data development and value-added government information: Case stud-
ies of taiwan e-government. In: Proceedings of the 7th International Conference on Theory and Practice of Electronic
Governance. pp. 238–241. ICEGOV ’13, ACM, New York, NY, USA (2013), http://doi.acm.org/10.1145/
2591888.2591932
79. Yang, Z., Kankanhalli, A.: Innovation in government services: The case of open data. In: Grand Successes and Failures
in IT. Public and Private Sectors - IFIP WG 8.6 International Working Conference on Transfer and Diffusion of IT, TDIT
2013, Bangalore, India, June 27-29, 2013. Proceedings. pp. 644–651 (2013), http://dx.doi.org/10.1007/978-
3-642- 38862-0_47
80. Zuiderwijk, A., Gascó, M., Parycek, P., Janssen, M.: Special issue on transparency and open data policies: Guest editors’
introduction. J. Theor. Appl. Electron. Commer. Res. 9(3), i–ix (Sep 2014), http://dl.acm.org/citation.cfm?
id=2661036.2661037
81. Zuiderwijk, A., Helbig, N., Gil-García, J.R.A., Janssen, M.: Special Issue on Innovation through Open Data: Guest Edi-
tors’ Introduction. Journal of theoretical and applied electronic commerce research 9, i – xiii (05 2014), http://www.
scielo.cl/scielo.php?script=sci_arttext&pid=S0718-18762014000200001&nrm=iso
82. Zuiderwijk, A., Janssen, M.: A coordination theory perspective to improve the use of open data in policy-making. In:
Wimmer, M., Janssen, M., Scholl, H.J. (eds.) EGOV. Lecture Notes in Computer Science, vol. 8074, pp. 38–49. Springer
(2013), http://dblp.uni-trier.de/db/conf/egov/egov2013.html#ZuiderwijkJ13
83. Zuiderwijk, A., Janssen, M.: Barriers and development directions for the publication and usage of open data: A socio-
technical view. In: Gascó-Hernández, M. (ed.) Open Government, Public Administration and Information Technology,
vol. 4, pp. 115–135. Springer New York (2014), http://dx.doi.org/10.1007/978-1-4614-9563-5_8
84. Zuiderwijk, A., Janssen, M.: The negative effects of open government data - investigating the dark side of open data. In:
Proceedings of the 15th Annual International Conference on Digital Government Research. pp. 147–152. dg.o ’14, ACM,
New York, NY, USA (2014), http://doi.acm.org/10.1145/2612733.2612761
43