Conference PaperPDF Available

An Evaluation of the Challenges of Multilingualism in Data Warehouse Development

Authors:

Abstract and Figures

In this paper we discuss Business Intelligence and define what is meant by support for Multilingualism in a Business Intelligence reporting context. We identify support for Multilingualism as a challenging issue which has implications for data warehouse design and reporting performance. Data warehouses are a core component of most Business Intelligence systems and the star schema is the approach most widely used to develop data warehouses and dimensional Data Marts. We discuss the way in which Multilingualism can be supported in the Star Schema and identify that current approaches have serious limitations which include data redundancy and data manipulation, performance and maintenance issues. We propose a new approach to enable the optimal application of multilingualism in Business Intelligence. The proposed approach was found to produce satisfactory results when used in a proof-of-concept environment. Future work will include testing the approach in an enterprise environment.
Content may be subject to copyright.
An Evaluation of the Challenges of Multilingualism in Data
Warehouse Development
Nedim Dedić and Clare Stanier
Faculty of Computing, Engineering and Sciences, Staffordshire University,
Beaconside, Stafford, Staffordshire, ST18 0AD, U.K.
Keywords: Business Intelligence, Data Warehousing, Multilingualism, Star Schema, Semantic Web.
Abstract: In this paper we discuss Business Intelligence and define what is meant by support for Multilingualism in a
Business Intelligence reporting context. We identify support for Multilingualism as a challenging issue which
has implications for data warehouse design and reporting performance. Data warehouses are a core component
of most Business Intelligence systems and the star schema is the approach most widely used to develop data
warehouses and dimensional Data Marts. We discuss the way in which Multilingualism can be supported in
the Star Schema and identify that current approaches have serious limitations which include data redundancy
and data manipulation, performance and maintenance issues. We propose a new approach to enable the
optimal application of multilingualism in Business Intelligence. The proposed approach was found to produce
satisfactory results when used in a proof-of-concept environment. Future work will include testing the
approach in an enterprise environment.
1 INTRODUCTION
Users today expect to access relevant information in
the semantic web in their own language (Gracia et al,
2011). This is also the case when accessing analytical
data relevant for decision making using Business
Intelligence (BI) applications, such as reports or
dashboards. In this paper, we discuss the application
of BI in an international context focusing the issue of
Multilingualism (ML). Multilingualism is defined
further in section two but can be understood as data
manipulation and reporting in more than one
language. Understanding the issues presented by ML
requires discussion of the Data Warehouse (DW),
which is a core component of BI (Olszak and Ziemba,
2007). We also discuss data marts and focus on the
star schema as the most accepted form of dimensional
modelling. We evaluate current solutions and
approaches to the implementation of ML using the
star schema in BI environment and propose an
improved approach.
The rest of this paper is structured as follows:
section 2 defines BI and ML and the requirement to
support ML in a BI context; section 3 discusses ML
in a Data Warehouse environment, outlining DW
concepts and design approaches, the role of the star
schema and the issues presented when implementing
ML in a star schema; section 4 presents the
conclusions and evaluation, proposing a revised
approach to supporting ML.
2 PROBLEM CONTEXT:
BUSINESS INTELLIGENCE
AND MULTILINGUALISM
To survive in today’s business a company has to
continuously improve productivity and efficiency,
while management and executives have to make
decisions almost immediately to ensure
competitiveness (Huff, 2013). Information is used to
enable improved decision making and efficiency
(Yrjö-Koskinen, 2013; Hannula and Pirttimäki,
2003). This process is supported by activities,
processes and applications which are collectively
known as Business Intelligence. Business
Intelligence was initially used to describe activities
and tools associated with the reporting and analysis
of data stored in data warehouses (Kimball et al,
2008). Dekkers et al., (2007) define BI as a
continuous activity of gathering, processing and
analysing data. BI helps companies to out-think the
196
Dedi´
c, N. and Stanier, C.
An Evaluation of the Challenges of Multilingualism in Data Warehouse Development.
In Proceedings of the 18th International Conference on Enterprise Information Systems (ICEIS 2016) - Volume 1, pages 196-206
ISBN: 978-989-758-187-8
Copyright c
2016 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
competition through better understanding of the
customer base (Brannon, 2010) and can provide
competitive advantage (Marchand and Raymond,
2008), and support for strategic decision-making
(Popovič et al., 2010). From the early days of
computing, computer technology and software has
been associated with development in the English
language (Hensch, 2005) and Business Intelligence is
no exception.
BI is a fast evolving field (Obeidat et al, 2015;
Chaudhuri et al., 2011) and although traditional BI
focussed on activities such as data warehousing and
reporting, the new generation of BI has an additional
focus on data exploration and visualisation
(Anadiotis, 2013; Obeidat et al, 2015), which
increases the demand for ML. The business and legal
context of BI is also evolving. With emerging markets
and expanding international cooperation, especially
in the case of European Union, there is a requirement
to support BI in languages other than English. Users
today expect to access information in the semantic
web in their own language (Gracia et al, 2011). Based
on the online profiles of the biggest European
companies (Forbes, 2015), most of these companies
are international in their nature. Business users expect
to be able to use software and applications, including
Business Intelligence, in their own language for the
purpose of better productivity (Hau and Aparício,
2008). Thus, when expanding their business to new
countries, companies need to extend and adopt their
BI infrastructure to support optimal use of local
languages. In a European context, the requirement
for multilingualism has legal underpinnings in some
countries. Most European countries have laws on the
official use of their respective languages in public
communication (Italian Law No. 482, 1999;
Constitution of France, 1958; Constitution of Croatia,
1990; Federation Constitution, 1994; Spanish
Constitution, 1978). For international companies,
this means a requirement to support the use of several
languages to obey local laws. Where there is a need
to support multiple languages, there is an imperative
to enable transfer and processing of textual
accessibilities for localization purposes (Vazquez,
2013). This also applies in a Business Intelligence
environment.
Multilingualism is complex phenomenon which
can be seen from different perspectives and has many
definitions (Cenoz, 2013). The European
Commission (2008, p.6.) defines it as “the ability of
societies, institutions, groups and individuals to
engage, on a regular basis, with more than one
language in their day-to-day lives”. In the Business
Intelligence DW and presentation context, we define
multilingualism as the ability to store descriptive
information and to use this information in a semantic
layer in more than one language. The changing
attitudes of business users, the importance of
emerging and international markets and ever-growing
local data warehousing communities are additional
issues that justify the application of multilingualism
in Business Intelligence. Multilingualism, however,
presents challenges for design and reporting in data
warehouses.
3 MUTILINGUALISM IN A DATA
WAREHOUSE ENVIRONMENT
3.1 Data Warehouse Concepts
Data warehouses are seen as core in the development
of BI systems (Olszak and Ziemba, 2007). A widely
accepted definition of the Data Warehouse is
provided by Inmon (2005) who defined a Data
Warehouse as a collection of integrated databases
designed to support the DSS (decision support
system) function. In this definition, the data
warehouse from the architectural point should be
almost the same as the source system and may also
have data marts, or aggregated tables that are used for
reporting and querying purposes. Linstedt et al (2010)
propose very similar concept known as the Data Vault
approach. Differentiation is only in the context of
modelling and storing information inside the Data
Warehouse. In the Data Vault approach data is loaded
from the source system in its original format (Linstedt
et. al, 2010). The Data Vault approach is built around
Hubs, Links and Satellites (Jovanović et al., 2014).
Hubs represent source system business keys in a
master table, links are associations between hubs with
validity period, and satellites point to the links
containing attributes of transaction with validity
period (Orlov, 2014). As the structure of the data is
highly normalized (4NF+), this approach for the
implementation of a data warehouse is not adequate
for direct reporting and requires additional
dimensional data marts to enable reporting or
querying (Orlov, 2014).
Kimball et al (2008) presented an alternative view
of the data warehouse, arguing that a data warehouse
should be seen as a collection of data marts, which are
used for querying and reporting and connected using
conformed dimensions. In this approach, there is no
requirement to replicate all the data from the source
system, just the data needed by the business.
An Evaluation of the Challenges of Multilingualism in Data Warehouse Development
197
3.2 The Star Schema
Much of the literature on the development of data
warehouses, and particularly the seminal works by
Inmon and Kimball, dates from the end of the 20th
century/the first decade of the current century. There
has been comparatively little recent work on data
warehouse design and schema development although
there is a significant literature on data warehouse
development and optimisation (Cravero and
Sepulveda, 2015; Dokeroglu et al., 2014; Sano, 2014;
Graefe et al., 2013). Inmon and Kimball are the two
seminal figures in this field and although their data
warehouse design philosophies, as discussed in
section 3.1, are opposed, both propose dimensional
modelling and the use of data marts (star or snowflake
schema) for reporting (Orlov, 2014). The Data Vault
approach introduced by Linstedt (2010) also proposes
the use of data marts (in the form of star or snowflake)
for reporting. Inmon (1995), Kimball (2008) and
Linstedt (2010) all recommend the use of the star
schema as the most appropriate design strategy for the
development of data marts. Additionally, a survey
paper by Sen and Sinha (2005) examined the
approaches used by 15 data warehouse vendors and
found that 12 of the 15 vendors supported the star
schema (alone or in combination with others star
schema based approaches).
A star schema is a collection of dimension tables
and one or more fact tables (Cios, Pedrycz, Winiarski
et al, 2007). Dimensions have a key field and one
additional field for every attribute (Kimball et al,
2008; Jensen et al, 2010). The fact table is a central
table that contains transactional information and
foreign keys to dimensional tables, while dimensional
tables contain only master data (Kimball et al, 2008;
Jensen et al, 2010; Cios, Pedrycz, Winiarski et al,
2007). In visual model representation, the dimension
model resembles a star, thus the name (Jensen, 2010).
The main benefits of the star schema design are ease
of understanding and a reduction in the number of
joins needed to retrieve the data (Cios, Pedrycz,
Winiarski et al, 2007).
In dimension tables, the primary key is used to
identify the dimensional value, while hierarchy is
defined through attributes. Dimension tables do not
confirm to the relational model strategy of
normalisation and may contain redundancy (Jensen et
al, 2010). The Fact table, on the other hand, holds the
foreign key to dimensional table values and as there
is no redundancy it could be considered to be in 3NF
(Jensen et al, 2010). All foreign keys to the
dimensional tables build together the primary key for
the fact table.
There are other schemas used for the purpose of
dimensional modelling, such as snowflake or galaxy.
Cios, Pedrycz, Winiarski et al (2007) consider the
snowflake and the galaxy schemas as the variations
of the star schema, while Inmon (1995), Kimball
(2008), Linstedt (2010), Corr and Stagnittno (2014)
and Jensen et al (2010) consider it as a separate
dimensional modelling philosophy and not as a
variation of the star schema.
The star schema is the dimensional modelling
concept most used by the industry (Sen and Sinha,
2005). It is recommended as the most appropriate
design strategy for development of data marts and is
considered as a general dimensional modelling
approach in the data warehouse (Inmon, 1995;
Kimball, 2008; Linstedt, 2010). Thus, in this paper
we focus on the issues of multilingualism exclusively
within star schema. Although the accepted design
approach for DW development, and regarded as a
good fit for business requirements (Purba, 1999), star
schemas present issues when handling multilingual
systems.
3.3 Multilingualism Issues in Data
Warehouse Design
Conventional Business Intelligence uses a process
known as ETL [Extract-Transform- Load] (Kimball
et al, 2008; Inmon, 2005) in which data is extracted
from data source applications, transformed and
loaded into a data storage medium, typically a data
warehouse or data mart and then analysed to enable
meaningful reporting usually via web or desktop
applications. However, this process is most effective
in a single language environment. Language,
including multilingualism, is a difficult issue in
software localization (Collins, 2002) especially in a
multidimensional context; and the expansion of BI
systems to enable reporting in different languages is
not trivial. In the context of multilingual websites,
several factors have been identified when presenting
to different range of audience in different countries
Table 1: Simple Product dimension.
Key Description Code Category Subcategory From_Date To_Date
123 Apples FA Fruits vegetables Fruits 01.01.2014 31012014
124 Beer DB Drinks Alcoholic 01.01.2014 31012014
ICEIS 2016 - 18th International Conference on Enterprise Information Systems
198
(Hillier 2003). The most important are cultural
context and accessibility of applications in local
language. Creating and maintaining a web
environment in a multilingual perspective creates
special challenges, both cultural and technical (Huang
and Tilley, 2001). Besides the standard issues that
arise during translation process from one to other
language (such as meaning of words, terminology,
phrases, text direction, date formats, etc.) we face
additional technical issues when translating texts in
computer-based environment (Hillier, 2003).
These problems may range from different
application environments to different implementation
standards. To optimally apply multilingualism to
existing Business Intelligence environment it is
necessary to identify the issues in a BI environment.
The next section examines three possible
workarounds based on amendments to the star
schema: including additional attributes, extending the
primary key and providing additional dimension
tables. All these solutions, as discussed below
introduce problems such as extreme data
redundancies leading to performance issues, and
implementation and maintenance difficulties.
3.3.1 Adding Additional Attributes for New
Languages to Dimension
One approach, derived from Kimball’s method for
delivering country-specific calendars (Kimball and
Ross, 2011), recommends that where there are new
values for the dimension tables in star schema, we
simply add new attributes to the dimension. This
approach is also proposed by Imhoff et al (2003) as a
solution for simultaneous bilingual reporting. Imhoff
et al (2003) state that if we need to provide the ability
to report in two or more languages within the same
query, we need to publish the data with multiple
languages in the same row. When implementing
dimensions using this approach, attributes should be
descriptive, added in the form of textual labels that
consist full words, without missing values, discreetly
valued and quality assured (Kimball et al 2008). This
is illustrated by the simple Product dimension shown
Table 1. As we see, attributes in the Product
dimension (Description, Code, Category and
Subcategory) are textual fields and at this point, the
star schema would be sufficient for the most
companies to implement functional and optimal data
marts. The limitation of this approach in a
multilingual environment would be extremely large
dimension tables. For example, if there are ten
descriptive attributes for “product” dimension, in the
case of the five languages, there would be an
additional forty columns.
To demonstrate the problem, we convert the
product dimension table (Table 1) to a conceptual
view (Table 2a). The sample Product dimension,
based on Table 2a, which additionally includes
German, Italian and Bosnian language besides
English would look like Table 2b.
Table 2a: Conceptual view of the Product dimension.
Key (Primary Key)
Description
Code
Category
Subcategory
From_Date
To_Date
Table 2b: Product dimension in English, German, Italian
and Bosnian language.
Key (Primary Key)
Description
Code
Category
Subcategory
From_Date
To_Date
Description_DE
Code_DE
Category_DE
Subcategory_DE
Description_IT
Code_IT
Category_IT
Subcategory_IT
Description_BA
Code_BA
Category_BA
Subcategory_BA
As we see, new attribute columns for German, Italian
and Bosnian language are added for every possible
textual description. They have suffix _DE, _IT and
_BA (Table 2b).This simplified example does not
fully convey the scale of the problem. In
implementation practice, the Product dimension
might contain more than 20 of textual attributes and
the problem would be replicated for all dimensions.
A real-world example of a Product dimension would
include descriptive attributes to be used as reporting
aggregates, for example, description, category,
subcategory, assortment, assortment area, buying
department, brand, brand origin, country,
international categorization, product level, season
information, product state, class and type.
As they require large amounts of maintenance
An Evaluation of the Challenges of Multilingualism in Data Warehouse Development
199
time and CPU (Poolet, 2008), large and wide
dimension tables can be problematic, especially for
rapidly changing dimensions such as a Customer
dimension (Ponniah, 2004). Rapidly changing
dimensions are those dimensions where attribute or
hierarchical values change frequently (Boakye,
2012). As an example, consider a Customer
dimension with several million rows of data intended
to be used in five languages. In this example, the
Customer dimension has descriptive attributes such
as “buying category”, “buying frequency” and
“monetary value” in all five languages. These
categories are intended to be updated on a daily basis.
This and similar scenarios could lead to system
overhead on a daily basis. In addition, wide
dimension tables require duplicate storage for
descriptive attributes and make ETL transformation
complex as the language-based columns must be
taken into account. More complex SQL/MDX
statements are required with different language-based
columns to change the language of data previews at
the semantic level (reports, queries or dashboards).
Moreover, queries that return data sets must be re-
executed in the required language. There are other
external, but related problems caused by using this
approach. For example, updating or changing some
descriptive hierarchical attributes that are used as
basis for tables containing aggregated data. Suppose
a specific group of products change their category
from non-alcoholic drinks to energy drinks, affecting
also subcategories. It is necessary to update the
dimension table to change the descriptive records for
every language and also to re-aggregate the data in
tables holding aggregated data. In this scenario, we
would have to delete all data in tables that hold
aggregate data according to category and re-
aggregate. The process of re-aggregation could take
several days if we have billions of records in fact
tables, which is not unusual; Wallmart.com sells more
than 4.000.000 and Amazon.com more than
350.000.000 different products (Scrapehero.com,
2015). The situation is more critical with wide
dimension tables that represent rapidly changing
dimensions. The overhead would also increase as the
company needs to store more languages.
3.3.2 Extending a Primary Key to Include
Language Identifier
The second approach, discussed by Imhoff (2003),
proposes extending a primary key to include a
language identifier. As we can see from Table 3, the
limitation in this case is duplication of the records
with every new language. With five languages for the
product dimension, which for example holds one
million of data, there would be five million records.
Larger dimension tables slow the process of query
execution and make it harder to manage updates
according to the slowly changing dimension rules.
Slowly changing dimensions are dimensions whose
attribute or hierarchical values change over time, but
unlike rapidly changing dimensions, values are
changed unpredictably and less frequently (Kimball
et al, 2008). This approach to multilingualism is
problematic also for rapidly changing dimensions
(Ponniah, 2004), and as with the additional attributes
approach, makes heavy increased demands in terms
of maintenance time and CPU (Poolet, 2008). From a
memory perspective this approach is less efficient
than that discussed in 3.3.1 as it doubles storage
requirements with every additional language. This
approach also suffers from the semantic
layerproblems previously discussed: to change the
language of data preview on semantic layer (reports,
dashboards), query statements that return data sets
must be re-executed. This approach, unlike the
additional attributes approach, does not lead to more
complex ETL transformations and SQL/MDX
statements but the same problems exist with regard to
rapidly changing dimensions and changing the
structure of externally aggregated tables. For
companies using several languages and holding
millions of records in their dimensions, re-executing
queries and re-aggregating data according to a
specific language can be time, memory and CPU
demanding.
3.3.3 Schema and Multiple Dimensional
Tables Solution
A third approach discussed by Kimball (2001),
Imhoff et al (2003) and Corr and Stagnittno (2014),
proposes implementing one fact table and multiple
dimensional tables – depending on the number of
languages required. Different languages are saved in
different database schema and/or in different tables.
For example, for five different languages, we would
have five “product” dimension tables - one for every
language. For the same example, if there are one
hundred initial dimensions in data warehouse, five
hundred dimension tables would be required to satisfy
the MLrequirements for five languages. This
approach has numerous limitations. As additional
tables and possibly additional schemas are needed in
the data warehouse, this approach makes ETL
processes more complex as the language-based tables
must be planned for. It requires additional
transformations to every table for every additional
ICEIS 2016 - 18th International Conference on Enterprise Information Systems
200
Table 3: Product dimension with extended primary key.
Key Lang Description Code Category Subcategory From Date To Date
123 EN Apples FA Fruits vegetables Fruits 01.01.2014 31.01.2014
124 EN Beer DB Drinks Alcoholic 01.01.2014 31.01.2014
123 DE Äpfel FA Obst und Gemüse Obst 01.01.2014 31.01.2014
124 DE Bier DB Getränke Alcoholisch 01.01.2014 31.01.2014
123 IT Mele FA Frutta e Verdura Frutta 01.01.2014 31.01.2014
124 IT Birra DB Beve Alcolico 01.01.2014 31.01.2014
123 SI Jabloka FA Sadje in Zelenjava Sadje 01.01.2014 31.01.2014
124 SI Pivo DB Pijače Alkoholna 01.01.2014 31.01.2014
123 BA Jabuka FA Voće i povrće Voće 01.01.2014 31.01.2014
124 BA Pivo DB Pića Alkoholna 01.01.2014 31.01.2014
language. The data to be used for aggregation and
reporting is doubled and so is the metadata for tables
and schemas. This approach requires more complex
SQL/MDX statements than two previous approaches,
and changing the language of data preview at the
semantic level requires the query to be re-executed.
Changing any descriptive data in dimensions requires
re-aggregation of relevant tables holding aggregated
data, which can be critical considering the ETL and
SQL/MDX complexity of this approach. If one part
of the business (country), for example, changes the ID
for a specific dimension value, this could lead to
consistency problems. Having different IDs for the
same data category in different languages disables
consolidated reporting for that aspect at the enterprise
level. Other subtle problems that might arise when
using this approach as discussed by Kimball (2001)
include: the possibility of translating two distinct
attributes as the same word in a new language causing
ETL and reporting problems; reports cannot preserve
sort orders easily across different languages. To
overcome these problems, this concept requires
additional programming or the application of
additional or surrogate keys as actual keys in fact
table.
3.3.4 Discussion
Besides data redundancy, sub-optimal memory usage
and complex management, the common limitation of
the all three approaches is the fact that the business
users need to re-execute the query or report to change
the interface language. As BI-based reports and
queries do complex calculations besides querying
huge amounts of data from the data warehouse, this is
problematic and introduces performance issues.
Further, in the case of translation corrections for
master data, corrections have to go through the whole
ETL process. For example, to change an attribute
description for a specific language, it is necessary to
change the value in the source system, then wait for
the execution of the process for the respective master
data. Other options, such as normalizing star schema
dimensions for the purpose of multilingualism could
lead to the Star schema developing into a snowflake
schema. A snowflake schema is not an optimal design
solution for data marts with large amounts of data.
Issues include: the execution of the main query needs
to consider all joins between tables in the same
dimension; multi-level dimension tables inside one
dimension have to be joined during query execution;
the structure of the snowflake is complex and
includes large number of database tables per
dimension; harder implementation and maintenance
of ETL processes; greater complexity in
reorganization than the star schema; changes in the
semantic model could lead to the extensive
reorganization which would use additional resources.
3.4 Vendor Specific Approaches: SAP
Extended Star Schema
We reviewed the data warehouse and BI software
market and found that the biggest vendors, such as
Oracle, IBM and Microsoft, support one or more of
three approaches explained above. However, one of
the biggest vendors in Business Intelligence, SAP
proposes a specific solution for ML, using the concept
of an extended star schema, which also includes
language as part of the key. In this approach the
dimensions and the fact table are linked to one
another using abstract identification numbers
(dimension IDs), which are contained in the key part
of the respective database table (SAP, 2015).
Dimensions are not represented as one table with
redundant data as in classical star schema. Values
from the tables that hold information about a specific
dimension attribute text or value are mapped to an
abstract dimension key.
This is an implementation driven approach which
is only supported by SAP BW. This means it cannot
be seen as a general design solution as it is a vendor
An Evaluation of the Challenges of Multilingualism in Data Warehouse Development
201
specific proprietary solution which relies on complex
joins to retrieve content for reporting purposes.
4 PROPOSED SOLUTION AND
FUTURE WORK
4.1 Conclusions
We have reviewed the problems presented by ML in
a BI context and discussed the limitations of current
solutions to the design challenge of supporting ML in
the star schema. Existing solutions propose design
workarounds based on extensions of the star schema
but lack theoretical underpinnings and introduce data
redundancy and data manipulation and performance
and maintenance issues. The challenge of ML in a
data warehouse environment is to develop an
approach that can support ML without introducing
inefficiencies into the star schema.
4.2 Proposed Solution
Imhoff et al (2003) propose that language-based
values should be generated during the delivery
process. Corr and Stagnittno (2014) suggest a similar
approach and propose attribute name translation
handled by the BI tool semantic layer. The main
challenge to providing support for ML in the context
of the Star Schema is that attribute and hierarchy
descriptions are saved inside the dimensional tables.
We therefore propose an alternative approach that
would regard the Star Schema as a higher level entity
and save textual descriptions from attributes and
hierarchies elsewhere as language files.
Figure 1 shows the concept of proposed approach
based on a three-layered BI architecture idea. We
describe more detailed technical functionality of the
proposed solution in Figure 2. Figure 1 shows that the
master and transactional data are extracted from
source layer applications into a staging area. From the
staging area data can be extracted directly into
reporting Data Marts, following the philosophy of
Kimball (dashed arrows), or to Data Warehouse and
then to reporting Data Marts, following the Data
Vault and Inmon philosophies (full line arrows).
Before storing data into reporting Data Marts,
attributes and hierarchical descriptions are extracted,
together with their IDs, to language files elsewhere on
the server. As we extract attributes and hierarchical
descriptions and their IDs to separate language files,
we store only integer values to dimensional tables.
Descriptions of attributes and hierarchies would be
associated with relevant IDs from the dimensional
tables during report or query execution (on the fly),
depending on the default language or language
selected. This concept has numerous benefits
compared with conventional methods of enabling
multilingualism in Business Intelligence
environment. Working only with integers makes
SQL/MDX queries faster, reduces table size and
storage requirements (Heflin and Pan, 2010; Smith,
2012).
Initially, the end user sees an Initial Filter Page
(ivp.file) that enables filtering of the content from
data marts to be presented in the Reporting Page
(rep.file). Initial Filter Page uses a default language
set-up via Global Variables and Language Content
Pages (steps a, b, c and d). The application sends an
SQL/MDX request defined using Initial Filter Page
to the data mart (step 1). The first results set is than
sent to Check and Define Page (cdp.file), which takes
it and defines attribute and hierarchical numerical
values from dimensions as language variables (step
2). The second result with variabilized values is then
sent to Reporting Page (rep.file) (step 3). Reporting
Page communicate with Global Variables and
Language Content Page using header (steps 4 and b),
thus it reads the relevant configuration settings for the
language before taking the second result set. After
taking the second result set, Reporting Page reads
language values for attributes and hierarchies from
Language Content files, assigns them and displays
final Result Set in the form of business readable
report (steps 5 and c).
Using this approach, there is no need to re-execute
the SQL/MDX query to change the preview language
– we need only to read new language values from the
Language Content files. As there is no need for new
query execution, switching between languages would
be easy and fast.
The manipulation of the language content would
be easier for business users, as this could be supported
by additional tools or modulesthat enable
manipulation of textual descriptions saved in textual
files. In this context, we consider the idea from
Arefin, Marimoto and Yasmin (2011) that efficient
Content Management System (CMS), or better
Multilingual CMS (MCMS), would help us to
overcome the technical limitations of multilingual
content in semantic web. In the proposed solution,
language files are part of the Warehousing layer, but
content is manageable by MCSM.
Adding new languages for semantic web interface
would not be dependent on the languages that exist in
the source system. If MCSM is implemented so that
it has a backend and a frontend for example, we could
ICEIS 2016 - 18th International Conference on Enterprise Information Systems
202
Figure 1: The concept underlying the proposed approach based on three-layered BI architecture.
define new languages through the MCSM backend.
This would allow the business user to change
descriptions for languages as required.
As it would be possible for business users to
change descriptive content directly and by
themselves, the ETL process required to perform
language changes would be simplified or in some
cases eliminated. In conventional solutions, the whole
ETL process may be performed just to change a small
descriptive value for the specified master data. This is
unnecessary in our proposed solution although it
would be highly recommended that values should be
changed in source systems as well to reflect
corrections made in language files. Otherwise, if there
is a future need to load whole master data for specific
dimension, it would overwrite the corrections made.
We recommend using language files only for smaller
language corrections and executing standard ETL
process when dealing with three or more corrections
at the same time.
This approach simplifies the manipulation of
textual changes which would be performed outside
the values stored in dimensions. This is especially
important with dimensions which are typically large
such as Article, Product or Customer as there would
be no need to re-aggregate existing data according to
textual descriptions of attributes or hierarchies. This
would allow a more optimal use of database memory
as there would be no duplicated data descriptions in
dimensions.
Our proposed approach was implemented in a
proof-of-concept (PoC) artefact. We had one fact
table for Sales data, one dimension table for Product
Categories data, and one language file holding
Product Categories data.The fact table had Product
Category ID as primary key and Price and Amount as
measures. Product Category dimension had Product
Category ID used for relation with fact table and a
Description field holding descriptions of categories.
The language file had the same descriptive data as the
Product Category dimension. The PoC was tested
with 107.768 records in the fact table, 97 records in
Product Category dimension table and 97 records for
Product categories as language file. For testing
purposes, we wanted to show total price per category
in our report. In our first PoC method, we multiplied
Amount and Price from fact table per Product
Category ID and assigned descriptions from language
file during report execution - to present the names
(descriptions) of categories shown. Using a second
method we made the same multiplication, however,
we used the names (descriptions) of categories from
Product Category dimension and assigned them result
set using SQL JOIN.
Results from the PoC showed enviable
improvement in performance when using language
An Evaluation of the Challenges of Multilingualism in Data Warehouse Development
203
Figure 2: Detailed description of technical functionality of the proposed approach.
files. The average execution time for a report that
sums the product sales for 97 different categories
using language files method was seven times faster
than using conventional methods of implementing
star schema. We executed PoC report 228 times: 139
times for method based on language files and 89 times
for other method. Using the language files approach
we could perform smaller descriptive language
changes quickly (simultaneously in data mart and
language files). The approach was also less expensive
in terms of memory. As this approach can be
implemented in the form of add-on to existing
monolingual DW structure, it would be optimal way
to enable multilingualism in existing BI environment.
4.3 Future Work
The findings from the PoC artefact were encouraging
and provide the basis for our future work which will
test our solution in a real world environment, as part
of a MCSM. Future challenges include integrating
multilingualism into the Business Intelligence
framework and addressing migration issues as
businesses move from single language to multilingual
business intelligence systems.
REFERENCES
Anadiotis, G., 2013. Agile business intelligence: reshaping
the landscape. , p.3.
Arefin, M.S., Morimoto, Y. & Yasmin, A., 2011.
Multilingual Content Management in Web
Environment. In 2011 International Conference on
Information Science and Applications. pp. 1–9.
Boakye, E.A., 2012. From Design and Build to
Implementation and Validation. In C. McKinney, ed.
Implementing Business Intelligence in Your Healthcare
Organization. Chicago: HIMSS, pp. 73–86.
Brannon, N., 2010. Business Intelligence and E-Discovery.
Intellectual Property & Technology Law Journal,
22(7), pp.1–5.
Cenoz, J., 2013. Defining Multilingualism. Annual Review
of Applied Linguistics, 33, pp.3–18.
Chaudhuri, S., Dayal, U. & Narasayya, V., 2011. An
overview of business intelligence technology.
Communications of the ACM, 55(8), p. 88–98 .
Cios, K.J. et al., 2007. Data Mining: A Knowledge
Discovery Approach 1st ed., New York, USA: Springer
Science+Business Media, LLC.
Collins, R.W., 2002. Software localization for internet
software: Issues and methods. IEEE Software.
Corr, L. & Stagnittno, J., 2014. Agile Data Warehouse
Design: Collaborative Dimensional Modeling, from
Whiteboard to Star Schema, Leeds: DecisionOne Press.
Cravero, A. & Sepúlveda, S., 2015. Using GORE in Data
Warehouse: A Systematic Mapping Study . Latin
America Transactions, IEEE (Revista IEEE America
Latina) , 13(5), pp.1654 – 1660.
Croatian Government, 1990. The Constitution of the
Republic of Croatia. , p.Article 12.1. .
Dekkers, J., Versendaal, J. & Batenburg, R., 2007.
Organising for Business Intelligence: A framework for
ICEIS 2016 - 18th International Conference on Enterprise Information Systems
204
aligning the use and development of information. In
BLED 2007 Proceedings. Bled, pp. 625 – 636.
Dokeroglu, T., Sert, S.A. & Cinar, M.S., 2014.
Evolutionary Multiobjective Query Workload
Optimization of Cloud Data Warehouses . The
Scientific World, 2014(14), pp.1 – 16.
European Comission, 2008. Final Report: Commission of
the European Communities High Level Group on
Multilingualism, Luxembourg.
Forbes, 2015. The World’s Biggest Public Companies.
Forbes.com. Available at: http://www.forbes.com/
global2000/list/ (Accessed November 24, 2015).
Gouvernement de la République française, 1958.
Constitution of France. , p.Article 2.1. .
Government of Federation of Bosnia and Herzegovina,
1995. Federation Constitution. , p.Article 6.1.
Gracia, J. et al., 2012. Challenges for the multilingual Web
of Data. Web Semantics: Science, Services and Agents
on the World Wide Web, 11, pp.63–71.
Graefe, G. et al., 2013. Elasticity in cloud databases and
their query processing. International Journal of Data
Warehousing and Mining, 9(2), p.1.
Hannula, M. & Pirttimäki, V., 2003. Business intelligence
empirical study on the top 50 Finnish companies. Journal
of American Academy of Business, 2, pp.593–599.
Hau, E. & Aparício, M., 2008. Software
Internationalization and Localization in Web Based
ERP. In SIGDOC ’08 Proceedings of the 26th annual
ACM international conference on Design of
communication. New York: ACM , pp. 175 – 180.
Heflin, J. & Pan, Z., 2010. Semantic Integration: The
Hawkeye Approach. In Semantic Computing. Haboken,
New Jersey, US: IEEE Press; John Wiley & Sons, pp.
199–227.
Hensch, K., 2005. IBM History of Far Eastern Languages
in Computing, Part 1: Requirements and Initial
Phonetic Product Solutions in the 1960s. IEEE Annals
of the History of Computing, 27(1), pp.17–26.
Available at: http://ieeexplore.ieee.org/lpdocs/epic03/
wrapper.htm?arnumber=1401743.
Hillier, M., 2003. The role of cultural context in
multilingual website usability. Electronic Commerce
Research and Applications, 2(1), pp.2–14. Available at:
www.elsevier.com.
Huang, S. & Tilley, S., 2001. ssues of content and structure
for a multilingual web site. In Proceedings of the 19th
annual international conference on computer
documentation. ACM, pp. 103–110.
Huff, A., 2013. Big Data II: Business Intelligence - Fleets
Analyze Information from several sources to improve
overall performance. Commercial Carrier Journal,
(April), p.53.
Imhoff, C., Galemmo, N. & Geiger, J.G., 2003. Mastering
Data Warehouse Design: Relational and Dimensional
Techniques, Indianapolis: Wiley Publishing, Inc.
Inmon, B.W., 2005. Building the Data Warehouse 4th ed.,
Indianapolis: John Wiley & Sons.
Inmon, B.W., 1995. Building the Data Warehouse 1st ed.,
New York: Wiley.
Jensen, C.S., Pedersen, T.B. & Thomsen, C., 2010.
Multidimensional Databases and Data Warehousing
1st ed., Morgan & Claypool Publishers.
Jovanovic, V., Subotic, D. & Mrdalj, S., 2014. Data
Modeling Styles in Data Warehousing. In 37th
International Convention on Information and
Communication Technology, Electronics and
Microelectronics (MIPRO). Opatija, Croatia: MIPRO,
pp. 1458 – 1463.
Kimball, R., 2001. Design Tip #24: Designing Dimensional
In A Multinational Data Warehouse. Kimball Group.
Available at: http://www.kimballgroup.com/2001/
06/ design-tip-24-designing-dimensional-in-a-multina
tional-data-warehouse/ Accessed December 1, 2015).
Kimball, R. et al., 2008. The Data Warehouse Lifecycle
Toolkit 2nd ed., Indianapolis: John Wiley & Sons.
Kimball, R. & Ross, M., 2011. The Data Warehouse
Toolkit: The Complete Guide to Dimensional Modeling
2nd ed., John Wiley & Sons.
Kingdom of Spain, 1978. Spanish Constitution. , p. Article
3.1.
Law No. 482, 1999. Norme in materia di tutela delle
minoranze linguistiche storiche. Gazzetta Ufficiale, 297
(Article 1.1).
Linstedt, D., Graziano, K. & Hultgren, H., 2010. The
Business of Data Vault Modeling 2nd ., Lulu.com.
Marchand, M. & Raymond, L., 2008. Researching
performance measurement systems: An information
systems perspective. International Journal of Operations
& Production Management, 28(7), pp.663 – 686.
Obeidat, M. et al., 2015. Business Intelligence Technology,
Applications, and Trends. International Management
Review, 11(2), pp.47–56.
Olszak, C. & Ziemba, E., 2007. Approach to Building and
Implementing Business Intelligence Systems.
Interdisciplinary Journal of Information, Knowledge &
Management, 2, pp.135–148.
Orlov, V., 2014. Data Warehouse Architecture: Inmon CIF,
Kimball Dimensional or Linstedt Data Vault? WMP
Blog. Available at: http://blog.westmonroepartners
.com/data-warehouse-architecture-inmon-cif-kimball-
dimensional-or-linstedt-data-vault/ (Accessed February
20, 2015).
Ponniah, P., 2004. Data Warehousing Fundamentals: A
Comprehensive Guide for IT Professionals, New Yor:
John Wiley & Sons.
Poolet, M.A., 2008. Data Warehousing: Rapidly Changing
Monster Dimensions. SQL Server Pro. Available at:
http:// sqlmag.com/database-administration/data-ware
housing-rapidly-changing-monster-dimensions (Acces
sed November 24, 2015).
Popovič, A., Turk, T. & Jaklič, J., 2010. Conceptual Model
of Business Value of Business Intelligence Systems.
Management: Journal of Contemporary Management,
15(1), pp.5–29.
Purba, S., 1999. Handbook of Data Management 3rd ed.,
Boca Raton, FL, US: Auerbach, CRC Press.
Sano, M. Di, 2014. Business Intelligence as a Service: a
new approach to manage business processes in the
An Evaluation of the Challenges of Multilingualism in Data Warehouse Development
205
Cloud. In 2014 IEEE 23rd International WETICE
Conference. Parma, pp. 155–160.
SAP AG, 2015. SAP Library - XML: BW - Data
Warehousing - Modeling. Available at: http://
help.sap.com/saphelp_snc70/helpdata/en/4c/89dc37c7f
2d67ae10000009b38f889/frameset.htm (Accessed
March 2, 2015).
Scrapehero, 2015. How many products does Walmart.com
sell in comparison to Amazon.com? Scrapehero.com.
Available at: http://learn.scrapehero.com/how-many-
products-does-walmart-com-sell-vs-amazon-com/
[Accessed November 24, 2015].
Sen, A. & Sinha, A.P., 2005. A comparison of data
warehousing methodologies. Communications of the
ACM, 48(3), pp.79–84.
Smith, P., 2012. Professional Website Performance:
Optimizing the Front-End and Back-End, Indianapolis,
USA: John Wiley & Sons.
Vázquez, S.R., 2013. Localizing Accessibility of Text
Alternatives for Visual Content in Multilingual
Websites. In ACM SIGACCESS Accessibility and
Computing. ACM, pp. 34–37.
Yrjö-Koskinen, P., 1973. Johto – Tiedontarve –
Tiedonhankinta. Miten informaatiopalvelu voi palvel la
yrityksen johtoa? Publications of Insinöörijärjestö,
No.77 - 73.
ICEIS 2016 - 18th International Conference on Enterprise Information Systems
206
... Data warehouses are very often modelled by a star schema (Dedic and Stanier, 2016). This schema characterised by a table of facts of very large (Giga or Terra Octets) linked by foreign keys to a set of tables of smaller dimension thus forming a star schema ( Figure 1). ...
... Data warehouses are very often modelled by a star schema (Dedic and Stanier, 2016). This schema characterised by a central facts table, linked by a set of dimension tables via their foreign keys. ...
Article
Full-text available
Data warehouses are dedicated to analysis and decision-making applications. They are often schematised as star relational models or variants for on-line analysis. Typically, the analysis process is conducted via online analytical processing (OLAP) type queries. These queries are usually complex, characterised by multiple selections operations, joins, grouping and aggregations on large tables. Which require a lot of calculation time and thus a very high response time. The performance of these queries depends directly on the use of the secondary memory. Indeed, each input-output on disk requiring up to ten milliseconds. In order to reduce and minimise the cost of executing these queries, the data warehouse administrator must make a good physical design during the physical design and tuning phase by optimising access to the secondary memory. We focus on bitmap join indexes that share the same resource, that is, the selection attributes extracted from the business intelligence queries. To optimise star join queries.
... Data warehouses are very often modelled by a star schema (Dedic and Stanier, 2016). This schema characterised by a table of facts of very large (Giga or Terra Octets) linked by foreign keys to a set of tables of smaller dimension thus forming a star schema ( Figure 1). ...
... Data warehouses are very often modelled by a star schema (Dedic and Stanier, 2016). This schema characterised by a central facts table, linked by a set of dimension tables via their foreign keys. ...
Article
Full-text available
Data warehouses are dedicated to analysis and decision-making applications. They are often schematised as star relational models or variants for on-line analysis. Typically, the analysis process is conducted via online analytical processing (OLAP) type queries. These queries are usually complex, characterised by multiple selections operations, joins, grouping and aggregations on large tables. Which require a lot of calculation time and thus a very high response time. The performance of these queries depends directly on the use of the secondary memory. Indeed, each input-output on disk requiring up to ten milliseconds. In order to reduce and minimise the cost of executing these queries, the data warehouse administrator must make a good physical design during the physical design and tuning phase by optimising access to the secondary memory. We focus on bitmap join indexes that share the same resource, that is, the selection attributes extracted from the business intelligence queries. To optimise star join queries.
... O BI ajuda as empresas a pensar melhor sobre a concorrência por meio de um melhor entendimento da base de clientes (Brannon, 2010), o que pode levar à criação de um relacionamento mais próximo e mais forte com os clientes e ao aumento da receita (Alexander, 2014). Além disso, desempenha um papel crítico para os negócios em termos de desenvolvimento organizacional, fornecendo vantagem competitiva, no contexto de alcançar assimetria positiva de informações (Thamir & Poulis, 2015), e contribui para otimizar processos e recursos de negócios, maximizar lucros e melhorar proativamente, bem como para tomada de decisão estratégica (Dedić & Stanier, 2016). ...
Article
Full-text available
Objective: Identify and characterize the methodologies used for the experimental development of intelligent applications aligned with strategic planning.Methodology: A systematic mapping was carried out to characterize the research in the area, considering the last ten years.Originality: No scientific studies were found with the same research object of this article, to identify and characterize the methodologies for the experimental development of intelligent applications aligned with strategic planning, which increases the importance of the results presented here.Main results: As a result, no studies were found that presented any complete approach to discipline strategic alignment and experimentation, providing clear compliance with strategic objectives and an experimental phase in the validation of results. However, some trials of parts of these characteristics could be mapped, such as experimentation found in 28,57% of the studies. Among the countries, China, the United States and Brazil led the ranking of publications on the subject. As for the medium of publication, Journal was the most used option for publication. In addition, the "IEEE International Conference on Advanced Communications, Control and Computing Technologies" and the journal "Expert Systems with Applications" stood out as major publishers.Theoretical Contributions: This research presents results relevant to academia and entrepreneurs, providing evidence that there is a gap in research on a formal method of BI and Data Mining applications experimental and strategy-driven development. In addition, this work is presented as a source of consultation to the existing method standards for the development of intelligent applications, as well as being replicable and extended by the applied systematization. Finally, there is a focus on research that proposes methods of creating experimental applications validated experimentally and aligned with strategy.
... A star schema logic was used to build the data model [58]. The star schema separates business process data into facts that contain the measurable, quantitative information about a business, along with the dimensions that are descriptive attributes related to the factual data. ...
Article
Full-text available
Managers are required to make fast, reliable, and fact-based decisions to encompass the dynamicity of modern business environments. Data visualization and reporting are thus crucial activities to ensure a systematic organizational intelligence especially for technological companies operating in a fast-moving context. As such, this paper presents case-study research for the definition of a business intelligence model and related Key Performance Indicators (KPIs) to support risk-related decision making. The study firstly comprises a literature review on approaches for governance management, which confirm a disconnection between theory and practice. It then progresses to mapping the main business areas and suggesting exemplary KPIs to fill this gap. Finally, it documents the design and usage of a BI dashboard, as emerged via a validation with four managers. This early application shows the advantages of BI for both business operators and governance managers.
... A data warehouse is a system for reporting and data analysis and is a core component of business intelligence according to Dedi et al. (2016). It can be defined as a collection integrated databases designed to support a decision support system (DSS) 37 . ...
Thesis
Industry 4.0 creates a change in maintenance. Due to the rise of Cyber-Physical Production Systems (CPPS) and the availability of sensor data, maintenance was changed from descriptive to prescriptive maintenance. The Internet of Things (IoT), Data Science and Artifical Intelligence (AI) all play a vital role in the development of manufacturing technology. Predictive and prescriptive maintenance is expected to grow by approximately 39% to a total of, 10.96$B by 2022. Smart Manufacturing Leadership Coalition (SMLC) has also predicted that the following targets can be achieved by data driven analytics in smart manufacturing (1), 30% reduction in capital intensity, (2) up to 40% reduction in product cycle times, and (3) overarching positive impact across energy and productivity. Lueth K. et al. (2016) stated in their report that 79% of all decision makers of Original Equipment Manufacturers will see predictive and prescriptive maintenance as one of the most important applications in the next 1-3 years. In the area of prescriptive analytics the goal is to find the best course of action for a given problem, by using techniques like recommendation engines and neural networks for solving a problem. Those techniques can then be converted for use in maintenance. A rising demand for prescriptive maintenance, which offers decision support can be anticipated, while currently predictive maintenance mostly consists of inappropriate maintenance strategies and conditions. According to Cheng et al. (2018) and R. Ranjan (2014) state of the art decision-making processes combine different data sources with data science methods to either improve the system intelligence or establish an automated big data pipeline Cheng et al. (2018) and R. Ranjan (2014). The concept Knowledge Based Maintenance (KBM)9101112 is a key enabler for digital transformation to prescriptive maintenance. As stated by Ansari, Glawar, et al. (2019), the PriMa model and its four-step methodology have been introduced and an applied as part of a proof-of-concept study, however while the paper specifies the methodology and approach in detail, it does not go into detail on how to achieve problem 1 (P1) the data input into the data warehouse, problem 2 (P2) how to build aggregator functions and most importantly, how to handle the feedback loop between the Knowledge-Base and the Decision Support Dashboard problem 3 (P3). This works aims to design an automated PriMa model, specifically focusing on the knowledge pipeline from the textual data from maintenance reports to the recommendation of a solution for the problem identified in the report. These questions have been answered by looking into the requirements given by ODonovan et al. (2015) for the data ingest process and proposing an own requirement list for a data warehouse solution (P1). In the next step three machine learning (ML) algorithms, namely Hamilton Monte-Carlo (HMC), Random Forest (RF) and Neural Networks (NN) reasoning have been generated and minimum working examples provided. Their outputs later on have been aggregated by a weighted hybrid function (P2). For the knowledge pipeline a Natural Language Processing (NLP) algorithm was applied which uses maintenance reports and extracts nouns and verbs. Those than can be matched against an ontology by using case-based reasoning (CBR) with the help of SPARQL. To sum up, the present thesis contributes on design and technical realization of the knowledge pipeline in the context of maintenance by analyzing technical requirements and developing a proof of concept demonstrator.
Conference Paper
Full-text available
Fast evolving business environment requires excellent innovation performance for the success and growth of organizations. Above all, creativity is a critical component in the competition. Analytical thinking and innovation, active learning and learning strategies and creativity are the main skills in continuous coevolution of businesses. All these skills are essential during the early phases of innovation. These skills give us a guidance in the pathway to growth – entrench the competitive advantages. A deep study reveals answers to early activities - pre-inventive actions open the way of innovation initiatives to growth. It is important really immerse us to explanations for origins of innovation.Ambitious growth objectives cannot be achieved without effective, dynamic and systematic innovation. The early activities – pre-inventive actions - are designed to be a valuable innovation assistance in the step-by-step evolutionary pathway to growth. This is a new value stream innovation paradigm. Challenges and insights emerge during the pathway.Data management is policies and procedures to effectively manage the information lifecycle needs. It means data collection, classification, validation, storing, protection, and processing during lifecycle from creation to deletion - the practice of collecting, keeping, and using data securely, efficiently, and cost-effectively. This is a foundation for digitalization – mining for insights in company’s data.The amount of usable data in business environment and as big data is exponentially increasing. Technology opportunities are available to be used in data- analytics and in managing by data for the purpose of business co-evolution. The understanding of human factors inside self- organized teams during collective interpretation and imagination of future opportunities is becoming most important feature.The goal of this article is to identify and analyze futures thinking in innovation capability. The article attempts to develop evolutionary framework for data and event analytics in content management of fuzzy front end of innovation. An essential team performance is collective interpretation and imagination. That can be used in prospective innovation.This research is partly constructive, conceptual and analytical because it introduces data analytics progression framework in fuzzy front end of innovation.Data for this concept creation has been collected over several years on continuous flow of qualitative analysis of scientific articles and used methodologies in operative industrial environment. This can be seen as a type of applied science. Experiences and many results gathered during this analysis work can be used in sustainable development. These can be benefited in development work on data- based creativity, interpretation, innovation and foresight. A generic perception of this analysis work is that successful fuzzy front end of innovation needs structured data- analysis methodology and dynamic team performance to execute collective interpretation and imagination to serve as basis for common value creation in systemic and evolutionary innovation.
Chapter
According to the Global Burden of Disease Study, in the past 20 years, cardiovascular disease (CVD) has consistently ranked as the leading cause of human death and loss of healthy life. Regular and systematic physical exercise is a crucial treatment strategy for CVDs, and many evidence-based medical studies support its safety and effectiveness. Therefore, most CVD patients are encouraged to take long-term physical exercise. Although there are many exercise guidelines for patients with CVD, most of them are at population level rather than individualized, and thus overlook individual differences such as disease severity, drug intake, and common risk factors. In addition, when prescribing exercise for individuals, knowing how to combine different exercise training guidelines properly is a major challenge in the clinical practice of traditional exercise medicine. Data-driven exercise medicine, based on objective data and computer technology, improves the efficiency and safety of exercise intervention and offers an excellent solution to these issues by fully considering the individual differences of patients and dynamically generating optimally customized exercise prescription. This article takes CVDs as an example to review the concept, data integration, and modeling methods of data-driven exercise medicine, discussing specific application cases and summarizing challenges and future development trends.
Chapter
The chapter presents a comparative analysis of the implementation of multilingual/bilingual education in Kazakhstan and the Netherlands. The study explores the origin and role of multilingualism, for socio-economic development, relevant regional and international practices in multilingual education, the growth of trilingual instruction, English language training and the use of digital technologies (distant, e-learning programs) in the Eurasian region, Major common features and differences of implementing multilingual/bilingual teaching between the two case studies and the main findings from research trips are identified. A theoretical analysis of scholarly approaches, as well as practical and methodological implications, are made on the example of implementing multilingual instruction at the Kazakhstani agricultural higher educational institutions, particularly Kazakh National Agrarian University and Kazakh S. Seifullin Agritechnical University and Dutch counterparts - Wageningen University, and the Hague University of Applied Sciences.
Chapter
In this study, innovative forecasting techniques and data sources from Big Data are used for the study of Hotel Overnight Stays for Spain, from January 2018 to June 2019. The unstoppable development of the Tourism sector with the application of Big Data technologies, allow to make efficient decisions by economic agents. In this work, the use of the data collected from the Google Data Mining tools allows to obtain knowledge about Hotel Tourism Demand in Spain. The analysis carried out meets the four basic principles of Big Data analysis: volume, velocity, variety and veracity. In this setting, the methodology used corresponds to ARDL models, and ECM models being developed Granger-Causality extended to seasonality. The first one explains easily when economic agents will make their decisions; while the second one allows forecasting for short-term and long-term. This fact means that tourist offers and demands can be perfectly adjusted at every moment of the year. As a criterion for the selection of models, the innovative Matrix U1 Theil is proposed, this allows to quantify how much a model is better than another in terms of forecasting.
Article
Full-text available
Enterprises are considering substantial investment in Business Intelligence (BI) theories and technologies to maintain their competitive advantages. BI allows massive diverse data collected from virus sources to be transformed into useful information, allowing more effective and efficient production. This paper briefly and broadly explores the business intelligence technology, applications and trends while provides a few stimulating and innovate theories and practices. The authors also explore several contemporary studies related to the future of BI and surrounding fields.
Article
Full-text available
With advances in the business intelligence area, there is an increasing interest for the introduction of business intelligence systems into organizations. Although the opinion about business intelligence and its creation of business value is generally accepted, economic justification of investments into business intelligence systems is not always clear. Measuring the business value of business intelligence in practice is often not carried out due to the lack of measurement methods and resources. Even though the perceived benefits from business intelligence systems, in terms of better information quality or achievement of information quality improvement goals, are far from being neglected, these are only indirect business benefits or the business value of such systems. The true business value of business intelligence systems hides in improved business processes and thus in improved business performance. The aim of the paper is to propose a conceptual model to assess business value of business intelligence systems that was developed on extensive literature review, in-depth interviews, and case study analysis for researching business intelligence systems' absorbability capabilities or key factors facilitating usage of quality information provided by such systems respectively.
Article
Full-text available
The Goal-Oriented Requirements Engineering, known as GORE, is the first step in the development process of systems Data Warehouses (DW). It is based on the identification of the goal of the business that wants to analyze the executive, which enables developers to obtain the information requirements of DW through a set of transformations from the goal modeling in specifications model. This paper provides an overview of how some development proposals DW used GORE to elicit, negotiate, specify and validate information requirements. The methodology is the systematic mapping studies.
Conference Paper
Business intelligence (BI) processes have increased significantly in recent years and now play a fundamental role in many organizations, in order to support strategic decisions. Data warehouse is the core technical solution to design BI platforms. An effective BI infrastructure that leverages the power of a data warehouse can deliver value by helping companies enhancing their customer experience, reducing the gap between some strategic goals and their realization in an IT business process. In this paper, the author explains the realization of a BI platform solution (called BIOP) hosted on a custom cloud system, exploiting both the advantages of a flexible and on-demand solution.
Chapter
The objectives of this chapter are to (1) learn what exactly data mining is and examine its features; (2) compare data mining with online analytical processing (OLAP) and understand the relationships and differences; (3) notice the place of data mining in a data warehouse environment; (4) carefully go through the important data mining techniques and understand how each works; and (5) study a few data mining applications in different industries and perceive the application of the technology to one's environment.