Conference PaperPDF Available

An Evaluation of the Challenges of Multilingualism in Data Warehouse Development

April 2016

April 2016

DOI:10.5220/0005858401960206

Conference: 18th International Conference on Enterprise Information Systems - ICEIS 2016
At: Rome, Italy
Volume: 1

Authors:

Nedim Dedić

Miba AG

Clare Stanier

Staffordshire University

In this paper we discuss Business Intelligence and define what is meant by support for Multilingualism in a Business Intelligence reporting context. We identify support for Multilingualism as a challenging issue which has implications for data warehouse design and reporting performance. Data warehouses are a core component of most Business Intelligence systems and the star schema is the approach most widely used to develop data warehouses and dimensional Data Marts. We discuss the way in which Multilingualism can be supported in the Star Schema and identify that current approaches have serious limitations which include data redundancy and data manipulation, performance and maintenance issues. We propose a new approach to enable the optimal application of multilingualism in Business Intelligence. The proposed approach was found to produce satisfactory results when used in a proof-of-concept environment. Future work will include testing the approach in an enterprise environment.

: Product dimension with extended primary key.

…

Figures - uploaded by Nedim Dedić

Content may be subject to copyright.

Content uploaded by Nedim Dedić

Content may be subject to copyright.

An Evaluation of the Challenges of Multilingualism in Data

Warehouse Development

Nedim Dedić and Clare Stanier

Faculty of Computing, Engineering and Sciences, Staffordshire University,

Beaconside, Stafford, Staffordshire, ST18 0AD, U.K.

Keywords: Business Intelligence, Data Warehousing, Multilingualism, Star Schema, Semantic Web.

Abstract: In this paper we discuss Business Intelligence and define what is meant by support for Multilingualism in a

Business Intelligence reporting context. We identify support for Multilingualism as a challenging issue which

has implications for data warehouse design and reporting performance. Data warehouses are a core component

of most Business Intelligence systems and the star schema is the approach most widely used to develop data

warehouses and dimensional Data Marts. We discuss the way in which Multilingualism can be supported in

the Star Schema and identify that current approaches have serious limitations which include data redundancy

and data manipulation, performance and maintenance issues. We propose a new approach to enable the

optimal application of multilingualism in Business Intelligence. The proposed approach was found to produce

satisfactory results when used in a proof-of-concept environment. Future work will include testing the

approach in an enterprise environment.

1 INTRODUCTION

Users today expect to access relevant information in

the semantic web in their own language (Gracia et al,

2011). This is also the case when accessing analytical

data relevant for decision making using Business

Intelligence (BI) applications, such as reports or

dashboards. In this paper, we discuss the application

of BI in an international context focusing the issue of

Multilingualism (ML). Multilingualism is defined

further in section two but can be understood as data

manipulation and reporting in more than one

language. Understanding the issues presented by ML

requires discussion of the Data Warehouse (DW),

which is a core component of BI (Olszak and Ziemba,

2007). We also discuss data marts and focus on the

star schema as the most accepted form of dimensional

modelling. We evaluate current solutions and

approaches to the implementation of ML using the

star schema in BI environment and propose an

improved approach.

The rest of this paper is structured as follows:

section 2 defines BI and ML and the requirement to

support ML in a BI context; section 3 discusses ML

in a Data Warehouse environment, outlining DW

concepts and design approaches, the role of the star

schema and the issues presented when implementing

ML in a star schema; section 4 presents the

conclusions and evaluation, proposing a revised

approach to supporting ML.

2 PROBLEM CONTEXT:

BUSINESS INTELLIGENCE

AND MULTILINGUALISM

To survive in today’s business a company has to

continuously improve productivity and efficiency,

while management and executives have to make

decisions almost immediately to ensure

competitiveness (Huff, 2013). Information is used to

enable improved decision making and efficiency

(Yrjö-Koskinen, 2013; Hannula and Pirttimäki,

2003). This process is supported by activities,

processes and applications which are collectively

known as Business Intelligence. Business

Intelligence was initially used to describe activities

and tools associated with the reporting and analysis

of data stored in data warehouses (Kimball et al,

2008). Dekkers et al., (2007) define BI as a

continuous activity of gathering, processing and

analysing data. BI helps companies to out-think the

196

Dedi´

c, N. and Stanier, C.

An Evaluation of the Challenges of Multilingualism in Data Warehouse Development.

In Proceedings of the 18th International Conference on Enterprise Information Systems (ICEIS 2016) - Volume 1, pages 196-206

ISBN: 978-989-758-187-8

competition through better understanding of the

customer base (Brannon, 2010) and can provide

competitive advantage (Marchand and Raymond,

2008), and support for strategic decision-making

(Popovič et al., 2010). From the early days of

computing, computer technology and software has

been associated with development in the English

language (Hensch, 2005) and Business Intelligence is

no exception.

BI is a fast evolving field (Obeidat et al, 2015;

Chaudhuri et al., 2011) and although traditional BI

focussed on activities such as data warehousing and

reporting, the new generation of BI has an additional

focus on data exploration and visualisation

(Anadiotis, 2013; Obeidat et al, 2015), which

increases the demand for ML. The business and legal

context of BI is also evolving. With emerging markets

and expanding international cooperation, especially

in the case of European Union, there is a requirement

to support BI in languages other than English. Users

today expect to access information in the semantic

web in their own language (Gracia et al, 2011). Based

on the online profiles of the biggest European

companies (Forbes, 2015), most of these companies

are international in their nature. Business users expect

to be able to use software and applications, including

Business Intelligence, in their own language for the

purpose of better productivity (Hau and Aparício,

2008). Thus, when expanding their business to new

countries, companies need to extend and adopt their

BI infrastructure to support optimal use of local

languages. In a European context, the requirement

for multilingualism has legal underpinnings in some

countries. Most European countries have laws on the

official use of their respective languages in public

communication (Italian Law No. 482, 1999;

Constitution of France, 1958; Constitution of Croatia,

1990; Federation Constitution, 1994; Spanish

Constitution, 1978). For international companies,

this means a requirement to support the use of several

languages to obey local laws. Where there is a need

to support multiple languages, there is an imperative

to enable transfer and processing of textual

accessibilities for localization purposes (Vazquez,

2013). This also applies in a Business Intelligence

environment.

Multilingualism is complex phenomenon which

can be seen from different perspectives and has many

definitions (Cenoz, 2013). The European

Commission (2008, p.6.) defines it as “the ability of

societies, institutions, groups and individuals to

engage, on a regular basis, with more than one

language in their day-to-day lives”. In the Business

Intelligence DW and presentation context, we define

multilingualism as the ability to store descriptive

information and to use this information in a semantic

layer in more than one language. The changing

attitudes of business users, the importance of

emerging and international markets and ever-growing

local data warehousing communities are additional

issues that justify the application of multilingualism

in Business Intelligence. Multilingualism, however,

presents challenges for design and reporting in data

warehouses.

3 MUTILINGUALISM IN A DATA

WAREHOUSE ENVIRONMENT

3.1 Data Warehouse Concepts

Data warehouses are seen as core in the development

of BI systems (Olszak and Ziemba, 2007). A widely

accepted definition of the Data Warehouse is

provided by Inmon (2005) who defined a Data

Warehouse as a collection of integrated databases

designed to support the DSS (decision support

system) function. In this definition, the data

warehouse from the architectural point should be

almost the same as the source system and may also

have data marts, or aggregated tables that are used for

reporting and querying purposes. Linstedt et al (2010)

propose very similar concept known as the Data Vault

approach. Differentiation is only in the context of

modelling and storing information inside the Data

Warehouse. In the Data Vault approach data is loaded

from the source system in its original format (Linstedt

et. al, 2010). The Data Vault approach is built around

Hubs, Links and Satellites (Jovanović et al., 2014).

Hubs represent source system business keys in a

master table, links are associations between hubs with

validity period, and satellites point to the links

containing attributes of transaction with validity

period (Orlov, 2014). As the structure of the data is

highly normalized (4NF+), this approach for the

implementation of a data warehouse is not adequate

for direct reporting and requires additional

dimensional data marts to enable reporting or

querying (Orlov, 2014).

Kimball et al (2008) presented an alternative view

of the data warehouse, arguing that a data warehouse

should be seen as a collection of data marts, which are

used for querying and reporting and connected using

conformed dimensions. In this approach, there is no

requirement to replicate all the data from the source

system, just the data needed by the business.

An Evaluation of the Challenges of Multilingualism in Data Warehouse Development

197

3.2 The Star Schema

Much of the literature on the development of data

warehouses, and particularly the seminal works by

Inmon and Kimball, dates from the end of the 20th

century/the first decade of the current century. There

has been comparatively little recent work on data

warehouse design and schema development although

there is a significant literature on data warehouse

development and optimisation (Cravero and

Sepulveda, 2015; Dokeroglu et al., 2014; Sano, 2014;

Graefe et al., 2013). Inmon and Kimball are the two

seminal figures in this field and although their data

warehouse design philosophies, as discussed in

section 3.1, are opposed, both propose dimensional

modelling and the use of data marts (star or snowflake

schema) for reporting (Orlov, 2014). The Data Vault

approach introduced by Linstedt (2010) also proposes

the use of data marts (in the form of star or snowflake)

for reporting. Inmon (1995), Kimball (2008) and

Linstedt (2010) all recommend the use of the star

schema as the most appropriate design strategy for the

development of data marts. Additionally, a survey

paper by Sen and Sinha (2005) examined the

approaches used by 15 data warehouse vendors and

found that 12 of the 15 vendors supported the star

schema (alone or in combination with others star

schema based approaches).

A star schema is a collection of dimension tables

and one or more fact tables (Cios, Pedrycz, Winiarski

et al, 2007). Dimensions have a key field and one

additional field for every attribute (Kimball et al,

2008; Jensen et al, 2010). The fact table is a central

table that contains transactional information and

foreign keys to dimensional tables, while dimensional

tables contain only master data (Kimball et al, 2008;

Jensen et al, 2010; Cios, Pedrycz, Winiarski et al,

2007). In visual model representation, the dimension

model resembles a star, thus the name (Jensen, 2010).

The main benefits of the star schema design are ease

of understanding and a reduction in the number of

joins needed to retrieve the data (Cios, Pedrycz,

Winiarski et al, 2007).

In dimension tables, the primary key is used to

identify the dimensional value, while hierarchy is

defined through attributes. Dimension tables do not

confirm to the relational model strategy of

normalisation and may contain redundancy (Jensen et

al, 2010). The Fact table, on the other hand, holds the

foreign key to dimensional table values and as there

is no redundancy it could be considered to be in 3NF

(Jensen et al, 2010). All foreign keys to the

dimensional tables build together the primary key for

the fact table.

There are other schemas used for the purpose of

dimensional modelling, such as snowflake or galaxy.

Cios, Pedrycz, Winiarski et al (2007) consider the

snowflake and the galaxy schemas as the variations

of the star schema, while Inmon (1995), Kimball

(2008), Linstedt (2010), Corr and Stagnittno (2014)

and Jensen et al (2010) consider it as a separate

dimensional modelling philosophy and not as a

variation of the star schema.

The star schema is the dimensional modelling

concept most used by the industry (Sen and Sinha,

2005). It is recommended as the most appropriate

design strategy for development of data marts and is

considered as a general dimensional modelling

approach in the data warehouse (Inmon, 1995;

Kimball, 2008; Linstedt, 2010). Thus, in this paper

we focus on the issues of multilingualism exclusively

within star schema. Although the accepted design

approach for DW development, and regarded as a

good fit for business requirements (Purba, 1999), star

schemas present issues when handling multilingual

systems.

3.3 Multilingualism Issues in Data

Warehouse Design

Conventional Business Intelligence uses a process

known as ETL [Extract-Transform- Load] (Kimball

et al, 2008; Inmon, 2005) in which data is extracted

from data source applications, transformed and

loaded into a data storage medium, typically a data

warehouse or data mart and then analysed to enable

meaningful reporting usually via web or desktop

applications. However, this process is most effective

in a single language environment. Language,

including multilingualism, is a difficult issue in

software localization (Collins, 2002) especially in a

multidimensional context; and the expansion of BI

systems to enable reporting in different languages is

not trivial. In the context of multilingual websites,

several factors have been identified when presenting

to different range of audience in different countries

Table 1: Simple Product dimension.

Key Description Code Category Subcategory From_Date To_Date

123 Apples FA Fruits vegetables Fruits 01.01.2014 31012014

124 Beer DB Drinks Alcoholic 01.01.2014 31012014

ICEIS 2016 - 18th International Conference on Enterprise Information Systems

198

(Hillier 2003). The most important are cultural

context and accessibility of applications in local

language. Creating and maintaining a web

environment in a multilingual perspective creates

special challenges, both cultural and technical (Huang

and Tilley, 2001). Besides the standard issues that

arise during translation process from one to other

language (such as meaning of words, terminology,

phrases, text direction, date formats, etc.) we face

additional technical issues when translating texts in

computer-based environment (Hillier, 2003).

These problems may range from different

application environments to different implementation

standards. To optimally apply multilingualism to

existing Business Intelligence environment it is

necessary to identify the issues in a BI environment.

The next section examines three possible

workarounds based on amendments to the star

schema: including additional attributes, extending the

primary key and providing additional dimension

tables. All these solutions, as discussed below

introduce problems such as extreme data

redundancies leading to performance issues, and

implementation and maintenance difficulties.

3.3.1 Adding Additional Attributes for New

Languages to Dimension

One approach, derived from Kimball’s method for

delivering country-specific calendars (Kimball and

Ross, 2011), recommends that where there are new

values for the dimension tables in star schema, we

simply add new attributes to the dimension. This

approach is also proposed by Imhoff et al (2003) as a

solution for simultaneous bilingual reporting. Imhoff

et al (2003) state that if we need to provide the ability

to report in two or more languages within the same

query, we need to publish the data with multiple

languages in the same row. When implementing

dimensions using this approach, attributes should be

descriptive, added in the form of textual labels that

consist full words, without missing values, discreetly

valued and quality assured (Kimball et al 2008). This

is illustrated by the simple Product dimension shown

Table 1. As we see, attributes in the Product

dimension (Description, Code, Category and

Subcategory) are textual fields and at this point, the

star schema would be sufficient for the most

companies to implement functional and optimal data

marts. The limitation of this approach in a

multilingual environment would be extremely large

dimension tables. For example, if there are ten

descriptive attributes for “product” dimension, in the

case of the five languages, there would be an

additional forty columns.

To demonstrate the problem, we convert the

product dimension table (Table 1) to a conceptual

view (Table 2a). The sample Product dimension,

based on Table 2a, which additionally includes

German, Italian and Bosnian language besides

English would look like Table 2b.

Table 2a: Conceptual view of the Product dimension.

Key (Primary Key)

Description

Code

Category

Subcategory

From_Date

To_Date

Description_DE

Code_DE

Category_DE

Subcategory_DE

Description_IT

Code_IT

Category_IT

Subcategory_IT

Description_BA

Code_BA

Category_BA

Subcategory_BA

As we see, new attribute columns for German, Italian

and Bosnian language are added for every possible

textual description. They have suffix _DE, _IT and

_BA (Table 2b).This simplified example does not

fully convey the scale of the problem. In

implementation practice, the Product dimension

might contain more than 20 of textual attributes and

the problem would be replicated for all dimensions.

A real-world example of a Product dimension would

include descriptive attributes to be used as reporting

aggregates, for example, description, category,

subcategory, assortment, assortment area, buying

department, brand, brand origin, country,

international categorization, product level, season

information, product state, class and type.

As they require large amounts of maintenance

An Evaluation of the Challenges of Multilingualism in Data Warehouse Development

199

time and CPU (Poolet, 2008), large and wide

dimension tables can be problematic, especially for

rapidly changing dimensions such as a Customer

dimension (Ponniah, 2004). Rapidly changing

dimensions are those dimensions where attribute or

hierarchical values change frequently (Boakye,

2012). As an example, consider a Customer

dimension with several million rows of data intended

to be used in five languages. In this example, the

Customer dimension has descriptive attributes such

as “buying category”, “buying frequency” and

“monetary value” in all five languages. These

categories are intended to be updated on a daily basis.

This and similar scenarios could lead to system

overhead on a daily basis. In addition, wide

dimension tables require duplicate storage for

descriptive attributes and make ETL transformation

complex as the language-based columns must be

taken into account. More complex SQL/MDX

statements are required with different language-based

columns to change the language of data previews at

the semantic level (reports, queries or dashboards).

Moreover, queries that return data sets must be re-

executed in the required language. There are other

external, but related problems caused by using this

approach. For example, updating or changing some

descriptive hierarchical attributes that are used as

basis for tables containing aggregated data. Suppose

a specific group of products change their category

from non-alcoholic drinks to energy drinks, affecting

also subcategories. It is necessary to update the

dimension table to change the descriptive records for

every language and also to re-aggregate the data in

tables holding aggregated data. In this scenario, we

would have to delete all data in tables that hold

aggregate data according to category and re-

aggregate. The process of re-aggregation could take

several days if we have billions of records in fact

tables, which is not unusual; Wallmart.com sells more

than 4.000.000 and Amazon.com more than

350.000.000 different products (Scrapehero.com,

2015). The situation is more critical with wide

dimension tables that represent rapidly changing

dimensions. The overhead would also increase as the

company needs to store more languages.

3.3.2 Extending a Primary Key to Include

Language Identifier

The second approach, discussed by Imhoff (2003),

proposes extending a primary key to include a

language identifier. As we can see from Table 3, the

limitation in this case is duplication of the records

with every new language. With five languages for the

product dimension, which for example holds one

million of data, there would be five million records.

Larger dimension tables slow the process of query

execution and make it harder to manage updates

according to the slowly changing dimension rules.

Slowly changing dimensions are dimensions whose

attribute or hierarchical values change over time, but

unlike rapidly changing dimensions, values are

changed unpredictably and less frequently (Kimball

et al, 2008). This approach to multilingualism is

problematic also for rapidly changing dimensions

(Ponniah, 2004), and as with the additional attributes

approach, makes heavy increased demands in terms

of maintenance time and CPU (Poolet, 2008). From a

memory perspective this approach is less efficient

than that discussed in 3.3.1 as it doubles storage

requirements with every additional language. This

approach also suffers from the semantic

layerproblems previously discussed: to change the

language of data preview on semantic layer (reports,

dashboards), query statements that return data sets

must be re-executed. This approach, unlike the

additional attributes approach, does not lead to more

complex ETL transformations and SQL/MDX

statements but the same problems exist with regard to

rapidly changing dimensions and changing the

structure of externally aggregated tables. For

companies using several languages and holding

millions of records in their dimensions, re-executing

queries and re-aggregating data according to a

specific language can be time, memory and CPU

demanding.

3.3.3 Schema and Multiple Dimensional

Tables Solution

A third approach discussed by Kimball (2001),

Imhoff et al (2003) and Corr and Stagnittno (2014),

proposes implementing one fact table and multiple

dimensional tables – depending on the number of

languages required. Different languages are saved in

different database schema and/or in different tables.

For example, for five different languages, we would

have five “product” dimension tables - one for every

language. For the same example, if there are one

hundred initial dimensions in data warehouse, five

hundred dimension tables would be required to satisfy

the MLrequirements for five languages. This

approach has numerous limitations. As additional

tables and possibly additional schemas are needed in

the data warehouse, this approach makes ETL

processes more complex as the language-based tables

must be planned for. It requires additional

transformations to every table for every additional

ICEIS 2016 - 18th International Conference on Enterprise Information Systems

200

Table 3: Product dimension with extended primary key.

Key Lang Description Code Category Subcategory From Date To Date

123 EN Apples FA Fruits vegetables Fruits 01.01.2014 31.01.2014

124 EN Beer DB Drinks Alcoholic 01.01.2014 31.01.2014

123 DE Äpfel FA Obst und Gemüse Obst 01.01.2014 31.01.2014

124 DE Bier DB Getränke Alcoholisch 01.01.2014 31.01.2014

123 IT Mele FA Frutta e Verdura Frutta 01.01.2014 31.01.2014

124 IT Birra DB Beve Alcolico 01.01.2014 31.01.2014

123 SI Jabloka FA Sadje in Zelenjava Sadje 01.01.2014 31.01.2014

124 SI Pivo DB Pijače Alkoholna 01.01.2014 31.01.2014

123 BA Jabuka FA Voće i povrće Voće 01.01.2014 31.01.2014

124 BA Pivo DB Pića Alkoholna 01.01.2014 31.01.2014

language. The data to be used for aggregation and

reporting is doubled and so is the metadata for tables

and schemas. This approach requires more complex

SQL/MDX statements than two previous approaches,

and changing the language of data preview at the

semantic level requires the query to be re-executed.

Changing any descriptive data in dimensions requires

re-aggregation of relevant tables holding aggregated

data, which can be critical considering the ETL and

SQL/MDX complexity of this approach. If one part

of the business (country), for example, changes the ID

for a specific dimension value, this could lead to

consistency problems. Having different IDs for the

same data category in different languages disables

consolidated reporting for that aspect at the enterprise

level. Other subtle problems that might arise when

using this approach as discussed by Kimball (2001)

include: the possibility of translating two distinct

attributes as the same word in a new language causing

ETL and reporting problems; reports cannot preserve

sort orders easily across different languages. To

overcome these problems, this concept requires

additional programming or the application of

additional or surrogate keys as actual keys in fact

table.

3.3.4 Discussion

Besides data redundancy, sub-optimal memory usage

and complex management, the common limitation of

the all three approaches is the fact that the business

users need to re-execute the query or report to change

the interface language. As BI-based reports and

queries do complex calculations besides querying

huge amounts of data from the data warehouse, this is

problematic and introduces performance issues.

Further, in the case of translation corrections for

master data, corrections have to go through the whole

ETL process. For example, to change an attribute

description for a specific language, it is necessary to

change the value in the source system, then wait for

the execution of the process for the respective master

data. Other options, such as normalizing star schema

dimensions for the purpose of multilingualism could

lead to the Star schema developing into a snowflake

schema. A snowflake schema is not an optimal design

solution for data marts with large amounts of data.

Issues include: the execution of the main query needs

to consider all joins between tables in the same

dimension; multi-level dimension tables inside one

dimension have to be joined during query execution;

the structure of the snowflake is complex and

includes large number of database tables per

dimension; harder implementation and maintenance

of ETL processes; greater complexity in

reorganization than the star schema; changes in the

semantic model could lead to the extensive

reorganization which would use additional resources.

3.4 Vendor Specific Approaches: SAP

Extended Star Schema

We reviewed the data warehouse and BI software

market and found that the biggest vendors, such as

Oracle, IBM and Microsoft, support one or more of

three approaches explained above. However, one of

the biggest vendors in Business Intelligence, SAP

proposes a specific solution for ML, using the concept

of an extended star schema, which also includes

language as part of the key. In this approach the

dimensions and the fact table are linked to one

another using abstract identification numbers

(dimension IDs), which are contained in the key part

of the respective database table (SAP, 2015).

Dimensions are not represented as one table with

redundant data as in classical star schema. Values

from the tables that hold information about a specific

dimension attribute text or value are mapped to an

abstract dimension key.

This is an implementation driven approach which

is only supported by SAP BW. This means it cannot

be seen as a general design solution as it is a vendor

An Evaluation of the Challenges of Multilingualism in Data Warehouse Development

201

specific proprietary solution which relies on complex

joins to retrieve content for reporting purposes.

4 PROPOSED SOLUTION AND

FUTURE WORK

4.1 Conclusions

We have reviewed the problems presented by ML in

a BI context and discussed the limitations of current

solutions to the design challenge of supporting ML in

the star schema. Existing solutions propose design

workarounds based on extensions of the star schema

but lack theoretical underpinnings and introduce data

redundancy and data manipulation and performance

and maintenance issues. The challenge of ML in a

data warehouse environment is to develop an

approach that can support ML without introducing

inefficiencies into the star schema.

4.2 Proposed Solution

Imhoff et al (2003) propose that language-based

values should be generated during the delivery

process. Corr and Stagnittno (2014) suggest a similar

approach and propose attribute name translation

handled by the BI tool semantic layer. The main

challenge to providing support for ML in the context

of the Star Schema is that attribute and hierarchy

descriptions are saved inside the dimensional tables.

We therefore propose an alternative approach that

would regard the Star Schema as a higher level entity

and save textual descriptions from attributes and

hierarchies elsewhere as language files.

Figure 1 shows the concept of proposed approach

based on a three-layered BI architecture idea. We

describe more detailed technical functionality of the

proposed solution in Figure 2. Figure 1 shows that the

master and transactional data are extracted from

source layer applications into a staging area. From the

staging area data can be extracted directly into

reporting Data Marts, following the philosophy of

Kimball (dashed arrows), or to Data Warehouse and

then to reporting Data Marts, following the Data

Vault and Inmon philosophies (full line arrows).

Before storing data into reporting Data Marts,

attributes and hierarchical descriptions are extracted,

together with their IDs, to language files elsewhere on

the server. As we extract attributes and hierarchical

descriptions and their IDs to separate language files,

we store only integer values to dimensional tables.

Descriptions of attributes and hierarchies would be

associated with relevant IDs from the dimensional

tables during report or query execution (on the fly),

depending on the default language or language

selected. This concept has numerous benefits

compared with conventional methods of enabling

multilingualism in Business Intelligence

environment. Working only with integers makes

SQL/MDX queries faster, reduces table size and

storage requirements (Heflin and Pan, 2010; Smith,

2012).

Initially, the end user sees an Initial Filter Page

(ivp.file) that enables filtering of the content from

data marts to be presented in the Reporting Page

(rep.file). Initial Filter Page uses a default language

set-up via Global Variables and Language Content

Pages (steps a, b, c and d). The application sends an

SQL/MDX request defined using Initial Filter Page

to the data mart (step 1). The first results set is than

sent to Check and Define Page (cdp.file), which takes

it and defines attribute and hierarchical numerical

values from dimensions as language variables (step

2). The second result with variabilized values is then

sent to Reporting Page (rep.file) (step 3). Reporting

Page communicate with Global Variables and

Language Content Page using header (steps 4 and b),

thus it reads the relevant configuration settings for the

language before taking the second result set. After

taking the second result set, Reporting Page reads

language values for attributes and hierarchies from

Language Content files, assigns them and displays

final Result Set in the form of business readable

report (steps 5 and c).

Using this approach, there is no need to re-execute

the SQL/MDX query to change the preview language

– we need only to read new language values from the

Language Content files. As there is no need for new

query execution, switching between languages would

be easy and fast.

The manipulation of the language content would

be easier for business users, as this could be supported

by additional tools or modulesthat enable

manipulation of textual descriptions saved in textual

files. In this context, we consider the idea from

Arefin, Marimoto and Yasmin (2011) that efficient

Content Management System (CMS), or better

Multilingual CMS (MCMS), would help us to

overcome the technical limitations of multilingual

content in semantic web. In the proposed solution,

language files are part of the Warehousing layer, but

content is manageable by MCSM.

Adding new languages for semantic web interface

would not be dependent on the languages that exist in

the source system. If MCSM is implemented so that

it has a backend and a frontend for example, we could

ICEIS 2016 - 18th International Conference on Enterprise Information Systems

202

Figure 1: The concept underlying the proposed approach based on three-layered BI architecture.

define new languages through the MCSM backend.

This would allow the business user to change

descriptions for languages as required.

As it would be possible for business users to

change descriptive content directly and by

themselves, the ETL process required to perform

language changes would be simplified or in some

cases eliminated. In conventional solutions, the whole

ETL process may be performed just to change a small

descriptive value for the specified master data. This is

unnecessary in our proposed solution although it

would be highly recommended that values should be

changed in source systems as well to reflect

corrections made in language files. Otherwise, if there

is a future need to load whole master data for specific

dimension, it would overwrite the corrections made.

We recommend using language files only for smaller

language corrections and executing standard ETL

process when dealing with three or more corrections

at the same time.

This approach simplifies the manipulation of

textual changes which would be performed outside

the values stored in dimensions. This is especially

important with dimensions which are typically large

such as Article, Product or Customer as there would

be no need to re-aggregate existing data according to

textual descriptions of attributes or hierarchies. This

would allow a more optimal use of database memory

as there would be no duplicated data descriptions in

dimensions.

Our proposed approach was implemented in a

proof-of-concept (PoC) artefact. We had one fact

table for Sales data, one dimension table for Product

Categories data, and one language file holding

Product Categories data.The fact table had Product

Category ID as primary key and Price and Amount as

measures. Product Category dimension had Product

Category ID used for relation with fact table and a

Description field holding descriptions of categories.

The language file had the same descriptive data as the

Product Category dimension. The PoC was tested

with 107.768 records in the fact table, 97 records in

Product Category dimension table and 97 records for

Product categories as language file. For testing

purposes, we wanted to show total price per category

in our report. In our first PoC method, we multiplied

Amount and Price from fact table per Product

Category ID and assigned descriptions from language

file during report execution - to present the names

(descriptions) of categories shown. Using a second

method we made the same multiplication, however,

we used the names (descriptions) of categories from

Product Category dimension and assigned them result

set using SQL JOIN.

Results from the PoC showed enviable

improvement in performance when using language

An Evaluation of the Challenges of Multilingualism in Data Warehouse Development

203

Figure 2: Detailed description of technical functionality of the proposed approach.

files. The average execution time for a report that

sums the product sales for 97 different categories

using language files method was seven times faster

than using conventional methods of implementing

star schema. We executed PoC report 228 times: 139

times for method based on language files and 89 times

for other method. Using the language files approach

we could perform smaller descriptive language

changes quickly (simultaneously in data mart and

language files). The approach was also less expensive

in terms of memory. As this approach can be

implemented in the form of add-on to existing

monolingual DW structure, it would be optimal way

to enable multilingualism in existing BI environment.

4.3 Future Work

The findings from the PoC artefact were encouraging

and provide the basis for our future work which will

test our solution in a real world environment, as part

of a MCSM. Future challenges include integrating

multilingualism into the Business Intelligence

framework and addressing migration issues as

businesses move from single language to multilingual

business intelligence systems.

REFERENCES

Anadiotis, G., 2013. Agile business intelligence: reshaping

the landscape. , p.3.

Arefin, M.S., Morimoto, Y. & Yasmin, A., 2011.

Multilingual Content Management in Web

Environment. In 2011 International Conference on

Information Science and Applications. pp. 1–9.

Boakye, E.A., 2012. From Design and Build to

Implementation and Validation. In C. McKinney, ed.

Implementing Business Intelligence in Your Healthcare

Organization. Chicago: HIMSS, pp. 73–86.

Brannon, N., 2010. Business Intelligence and E-Discovery.

Intellectual Property & Technology Law Journal,

22(7), pp.1–5.

Cenoz, J., 2013. Defining Multilingualism. Annual Review

of Applied Linguistics, 33, pp.3–18.

Chaudhuri, S., Dayal, U. & Narasayya, V., 2011. An

overview of business intelligence technology.

Communications of the ACM, 55(8), p. 88–98 .

Cios, K.J. et al., 2007. Data Mining: A Knowledge

Discovery Approach 1st ed., New York, USA: Springer

Science+Business Media, LLC.

Collins, R.W., 2002. Software localization for internet

software: Issues and methods. IEEE Software.

Corr, L. & Stagnittno, J., 2014. Agile Data Warehouse

Design: Collaborative Dimensional Modeling, from

Whiteboard to Star Schema, Leeds: DecisionOne Press.

Cravero, A. & Sepúlveda, S., 2015. Using GORE in Data

Warehouse: A Systematic Mapping Study . Latin

America Transactions, IEEE (Revista IEEE America

Latina) , 13(5), pp.1654 – 1660.

Croatian Government, 1990. The Constitution of the

Republic of Croatia. , p.Article 12.1. .

Dekkers, J., Versendaal, J. & Batenburg, R., 2007.

Organising for Business Intelligence: A framework for

ICEIS 2016 - 18th International Conference on Enterprise Information Systems

204

aligning the use and development of information. In

BLED 2007 Proceedings. Bled, pp. 625 – 636.

Dokeroglu, T., Sert, S.A. & Cinar, M.S., 2014.

Evolutionary Multiobjective Query Workload

Optimization of Cloud Data Warehouses . The

Scientific World, 2014(14), pp.1 – 16.

European Comission, 2008. Final Report: Commission of

the European Communities High Level Group on

Multilingualism, Luxembourg.

Forbes, 2015. The World’s Biggest Public Companies.

Forbes.com. Available at: http://www.forbes.com/

global2000/list/ (Accessed November 24, 2015).

Gouvernement de la République française, 1958.

Constitution of France. , p.Article 2.1. .

Government of Federation of Bosnia and Herzegovina,

1995. Federation Constitution. , p.Article 6.1.

Gracia, J. et al., 2012. Challenges for the multilingual Web

of Data. Web Semantics: Science, Services and Agents

on the World Wide Web, 11, pp.63–71.

Graefe, G. et al., 2013. Elasticity in cloud databases and

their query processing. International Journal of Data

Warehousing and Mining, 9(2), p.1.

Hannula, M. & Pirttimäki, V., 2003. Business intelligence

empirical study on the top 50 Finnish companies. Journal

of American Academy of Business, 2, pp.593–599.

Hau, E. & Aparício, M., 2008. Software

Internationalization and Localization in Web Based

ERP. In SIGDOC ’08 Proceedings of the 26th annual

ACM international conference on Design of

communication. New York: ACM , pp. 175 – 180.

Heflin, J. & Pan, Z., 2010. Semantic Integration: The

Hawkeye Approach. In Semantic Computing. Haboken,

New Jersey, US: IEEE Press; John Wiley & Sons, pp.

199–227.

Hensch, K., 2005. IBM History of Far Eastern Languages

in Computing, Part 1: Requirements and Initial

Phonetic Product Solutions in the 1960s. IEEE Annals

of the History of Computing, 27(1), pp.17–26.

Available at: http://ieeexplore.ieee.org/lpdocs/epic03/

wrapper.htm?arnumber=1401743.

Hillier, M., 2003. The role of cultural context in

multilingual website usability. Electronic Commerce

Research and Applications, 2(1), pp.2–14. Available at:

www.elsevier.com.

Huang, S. & Tilley, S., 2001. ssues of content and structure

for a multilingual web site. In Proceedings of the 19th

annual international conference on computer

documentation. ACM, pp. 103–110.

Huff, A., 2013. Big Data II: Business Intelligence - Fleets

Analyze Information from several sources to improve

overall performance. Commercial Carrier Journal,

(April), p.53.

Imhoff, C., Galemmo, N. & Geiger, J.G., 2003. Mastering

Data Warehouse Design: Relational and Dimensional

Techniques, Indianapolis: Wiley Publishing, Inc.

Inmon, B.W., 2005. Building the Data Warehouse 4th ed.,

Indianapolis: John Wiley & Sons.

Inmon, B.W., 1995. Building the Data Warehouse 1st ed.,

New York: Wiley.

Jensen, C.S., Pedersen, T.B. & Thomsen, C., 2010.

Multidimensional Databases and Data Warehousing

1st ed., Morgan & Claypool Publishers.

Jovanovic, V., Subotic, D. & Mrdalj, S., 2014. Data

Modeling Styles in Data Warehousing. In 37th

International Convention on Information and

Communication Technology, Electronics and

Microelectronics (MIPRO). Opatija, Croatia: MIPRO,

pp. 1458 – 1463.

Kimball, R., 2001. Design Tip #24: Designing Dimensional

In A Multinational Data Warehouse. Kimball Group.

Available at: http://www.kimballgroup.com/2001/

06/ design-tip-24-designing-dimensional-in-a-multina

tional-data-warehouse/ Accessed December 1, 2015).

Kimball, R. et al., 2008. The Data Warehouse Lifecycle

Toolkit 2nd ed., Indianapolis: John Wiley & Sons.

Kimball, R. & Ross, M., 2011. The Data Warehouse

Toolkit: The Complete Guide to Dimensional Modeling

2nd ed., John Wiley & Sons.

Kingdom of Spain, 1978. Spanish Constitution. , p. Article

3.1.

Law No. 482, 1999. Norme in materia di tutela delle

minoranze linguistiche storiche. Gazzetta Ufficiale, 297

(Article 1.1).

Linstedt, D., Graziano, K. & Hultgren, H., 2010. The

Business of Data Vault Modeling 2nd ., Lulu.com.

Marchand, M. & Raymond, L., 2008. Researching

performance measurement systems: An information

systems perspective. International Journal of Operations

& Production Management, 28(7), pp.663 – 686.

Obeidat, M. et al., 2015. Business Intelligence Technology,

Applications, and Trends. International Management

Review, 11(2), pp.47–56.

Olszak, C. & Ziemba, E., 2007. Approach to Building and

Implementing Business Intelligence Systems.

Interdisciplinary Journal of Information, Knowledge &

Management, 2, pp.135–148.

Orlov, V., 2014. Data Warehouse Architecture: Inmon CIF,

Kimball Dimensional or Linstedt Data Vault? WMP

Blog. Available at: http://blog.westmonroepartners

.com/data-warehouse-architecture-inmon-cif-kimball-

dimensional-or-linstedt-data-vault/ (Accessed February

20, 2015).

Ponniah, P., 2004. Data Warehousing Fundamentals: A

Comprehensive Guide for IT Professionals, New Yor:

John Wiley & Sons.

Poolet, M.A., 2008. Data Warehousing: Rapidly Changing

Monster Dimensions. SQL Server Pro. Available at:

http:// sqlmag.com/database-administration/data-ware

housing-rapidly-changing-monster-dimensions (Acces

sed November 24, 2015).

Popovič, A., Turk, T. & Jaklič, J., 2010. Conceptual Model

of Business Value of Business Intelligence Systems.

Management: Journal of Contemporary Management,

15(1), pp.5–29.

Purba, S., 1999. Handbook of Data Management 3rd ed.,

Boca Raton, FL, US: Auerbach, CRC Press.

Sano, M. Di, 2014. Business Intelligence as a Service : a

new approach to manage business processes in the

An Evaluation of the Challenges of Multilingualism in Data Warehouse Development

205

Cloud. In 2014 IEEE 23rd International WETICE

Conference. Parma, pp. 155–160.

SAP AG, 2015. SAP Library - XML: BW - Data

Warehousing - Modeling. Available at: http://

help.sap.com/saphelp_snc70/helpdata/en/4c/89dc37c7f

2d67ae10000009b38f889/frameset.htm (Accessed

March 2, 2015).

Scrapehero, 2015. How many products does Walmart.com

sell in comparison to Amazon.com? Scrapehero.com.

Available at: http://learn.scrapehero.com/how-many-

products-does-walmart-com-sell-vs-amazon-com/

[Accessed November 24, 2015].

Sen, A. & Sinha, A.P., 2005. A comparison of data

warehousing methodologies. Communications of the

ACM, 48(3), pp.79–84.

Smith, P., 2012. Professional Website Performance:

Optimizing the Front-End and Back-End, Indianapolis,

USA: John Wiley & Sons.

Vázquez, S.R., 2013. Localizing Accessibility of Text

Alternatives for Visual Content in Multilingual

Websites. In ACM SIGACCESS Accessibility and

Computing. ACM, pp. 34–37.

Yrjö-Koskinen, P., 1973. Johto – Tiedontarve –

Tiedonhankinta. Miten informaatiopalvelu voi palvel la

yrityksen johtoa? Publications of Insinöörijärjestö,

No.77 - 73.

ICEIS 2016 - 18th International Conference on Enterprise Information Systems

206

Efficient of bitmap join indexes for optimising star join queries in relational data warehouses , Int. J. Computational Intelligence Studies, Vol. 9, No. 3, pp.220–233. 2020

Article

Full-text available

Sep 2020

Data warehouses are dedicated to analysis and decision-making applications. They are often schematised as star relational models or variants for on-line analysis. Typically, the analysis process is conducted via online analytical processing (OLAP) type queries. These queries are usually complex, characterised by multiple selections operations, joins, grouping and aggregations on large tables. Which require a lot of calculation time and thus a very high response time. The performance of these queries depends directly on the use of the secondary memory. Indeed, each input-output on disk requiring up to ten milliseconds. In order to reduce and minimise the cost of executing these queries, the data warehouse administrator must make a good physical design during the physical design and tuning phase by optimising access to the secondary memory. We focus on bitmap join indexes that share the same resource, that is, the selection attributes extracted from the business intelligence queries. To optimise star join queries.

Efficient of bitmap join indexes for optimising star join queries in relational data warehouses , Int. J. Computational Intelligence Studies, Vol. 9, No. 3, pp.220–233. 2020

Article

Full-text available

Sep 2020

Quão experimentais e estratégicas são as aplicações de Business Intelligence (BI) e Data Mining? ; HOW EXPERIMENTAL AND STRATEGIC ARE BUSINESS INTELLIGENCE (BI) AND DATA MINING APPLICATIONS?

Article

Full-text available

May 2022

Objective: Identify and characterize the methodologies used for the experimental development of intelligent applications aligned with strategic planning.Methodology: A systematic mapping was carried out to characterize the research in the area, considering the last ten years.Originality: No scientific studies were found with the same research object of this article, to identify and characterize the methodologies for the experimental development of intelligent applications aligned with strategic planning, which increases the importance of the results presented here.Main results: As a result, no studies were found that presented any complete approach to discipline strategic alignment and experimentation, providing clear compliance with strategic objectives and an experimental phase in the validation of results. However, some trials of parts of these characteristics could be mapped, such as experimentation found in 28,57% of the studies. Among the countries, China, the United States and Brazil led the ranking of publications on the subject. As for the medium of publication, Journal was the most used option for publication. In addition, the "IEEE International Conference on Advanced Communications, Control and Computing Technologies" and the journal "Expert Systems with Applications" stood out as major publishers.Theoretical Contributions: This research presents results relevant to academia and entrepreneurs, providing evidence that there is a gap in research on a formal method of BI and Data Mining applications experimental and strategy-driven development. In addition, this work is presented as a source of consultation to the existing method standards for the development of intelligent applications, as well as being replicable and extended by the applied systematization. Finally, there is a focus on research that proposes methods of creating experimental applications validated experimentally and aligned with strategy.

Business Intelligence for IT Governance of a Technology Company

Article

Full-text available

Dec 2021

Managers are required to make fast, reliable, and fact-based decisions to encompass the dynamicity of modern business environments. Data visualization and reporting are thus crucial activities to ensure a systematic organizational intelligence especially for technological companies operating in a fast-moving context. As such, this paper presents case-study research for the definition of a business intelligence model and related Key Performance Indicators (KPIs) to support risk-related decision making. The study firstly comprises a literature review on approaches for governance management, which confirm a disconnection between theory and practice. It then progresses to mapping the main business areas and suggesting exemplary KPIs to fill this gap. Finally, it documents the design and usage of a BI dashboard, as emerged via a validation with four managers. This early application shows the advantages of BI for both business operators and governance managers.

Design and Development of Automatic Recommendation Generation Module of Prescriptive Maintenance Model (AutoPriMa)

Thesis

Jul 2019

Linus Kohl

Industry 4.0 creates a change in maintenance. Due to the rise of Cyber-Physical Production Systems (CPPS) and the availability of sensor data, maintenance was changed from descriptive to prescriptive maintenance. The Internet of Things (IoT), Data Science and Artifical Intelligence (AI) all play a vital role in the development of manufacturing technology. Predictive and prescriptive maintenance is expected to grow by approximately 39% to a total of, 10.96$B by 2022. Smart Manufacturing Leadership Coalition (SMLC) has also predicted that the following targets can be achieved by data driven analytics in smart manufacturing (1), 30% reduction in capital intensity, (2) up to 40% reduction in product cycle times, and (3) overarching positive impact across energy and productivity. Lueth K. et al. (2016) stated in their report that 79% of all decision makers of Original Equipment Manufacturers will see predictive and prescriptive maintenance as one of the most important applications in the next 1-3 years. In the area of prescriptive analytics the goal is to find the best course of action for a given problem, by using techniques like recommendation engines and neural networks for solving a problem. Those techniques can then be converted for use in maintenance. A rising demand for prescriptive maintenance, which offers decision support can be anticipated, while currently predictive maintenance mostly consists of inappropriate maintenance strategies and conditions. According to Cheng et al. (2018) and R. Ranjan (2014) state of the art decision-making processes combine different data sources with data science methods to either improve the system intelligence or establish an automated big data pipeline Cheng et al. (2018) and R. Ranjan (2014). The concept Knowledge Based Maintenance (KBM)9101112 is a key enabler for digital transformation to prescriptive maintenance. As stated by Ansari, Glawar, et al. (2019), the PriMa model and its four-step methodology have been introduced and an applied as part of a proof-of-concept study, however while the paper specifies the methodology and approach in detail, it does not go into detail on how to achieve problem 1 (P1) the data input into the data warehouse, problem 2 (P2) how to build aggregator functions and most importantly, how to handle the feedback loop between the Knowledge-Base and the Decision Support Dashboard problem 3 (P3). This works aims to design an automated PriMa model, specifically focusing on the knowledge pipeline from the textual data from maintenance reports to the recommendation of a solution for the problem identified in the report. These questions have been answered by looking into the requirements given by ODonovan et al. (2015) for the data ingest process and proposing an own requirement list for a data warehouse solution (P1). In the next step three machine learning (ML) algorithms, namely Hamilton Monte-Carlo (HMC), Random Forest (RF) and Neural Networks (NN) reasoning have been generated and minimum working examples provided. Their outputs later on have been aggregated by a weighted hybrid function (P2). For the knowledge pipeline a Natural Language Processing (NLP) algorithm was applied which uses maintenance reports and extracts nouns and verbs. Those than can be matched against an ontology by using case-based reasoning (CBR) with the help of SPARQL. To sum up, the present thesis contributes on design and technical realization of the knowledge pipeline in the context of maintenance by analyzing technical requirements and developing a proof of concept demonstrator.

Data-Based Creativity in Systemic Evolutionary Innovation

Conference Paper

Full-text available

Jan 2023

Fast evolving business environment requires excellent innovation performance for the success and growth of organizations. Above all, creativity is a critical component in the competition. Analytical thinking and innovation, active learning and learning strategies and creativity are the main skills in continuous coevolution of businesses. All these skills are essential during the early phases of innovation. These skills give us a guidance in the pathway to growth – entrench the competitive advantages. A deep study reveals answers to early activities - pre-inventive actions open the way of innovation initiatives to growth. It is important really immerse us to explanations for origins of innovation.Ambitious growth objectives cannot be achieved without effective, dynamic and systematic innovation. The early activities – pre-inventive actions - are designed to be a valuable innovation assistance in the step-by-step evolutionary pathway to growth. This is a new value stream innovation paradigm. Challenges and insights emerge during the pathway.Data management is policies and procedures to effectively manage the information lifecycle needs. It means data collection, classification, validation, storing, protection, and processing during lifecycle from creation to deletion - the practice of collecting, keeping, and using data securely, efficiently, and cost-effectively. This is a foundation for digitalization – mining for insights in company’s data.The amount of usable data in business environment and as big data is exponentially increasing. Technology opportunities are available to be used in data- analytics and in managing by data for the purpose of business co-evolution. The understanding of human factors inside self- organized teams during collective interpretation and imagination of future opportunities is becoming most important feature.The goal of this article is to identify and analyze futures thinking in innovation capability. The article attempts to develop evolutionary framework for data and event analytics in content management of fuzzy front end of innovation. An essential team performance is collective interpretation and imagination. That can be used in prospective innovation.This research is partly constructive, conceptual and analytical because it introduces data analytics progression framework in fuzzy front end of innovation.Data for this concept creation has been collected over several years on continuous flow of qualitative analysis of scientific articles and used methodologies in operative industrial environment. This can be seen as a type of applied science. Experiences and many results gathered during this analysis work can be used in sustainable development. These can be benefited in development work on data- based creativity, interpretation, innovation and foresight. A generic perception of this analysis work is that successful fuzzy front end of innovation needs structured data- analysis methodology and dynamic team performance to execute collective interpretation and imagination to serve as basis for common value creation in systemic and evolutionary innovation.

Data-Driven Exercise Medicine for Cardiovascular Disease

Chapter

Apr 2022

According to the Global Burden of Disease Study, in the past 20 years, cardiovascular disease (CVD) has consistently ranked as the leading cause of human death and loss of healthy life. Regular and systematic physical exercise is a crucial treatment strategy for CVDs, and many evidence-based medical studies support its safety and effectiveness. Therefore, most CVD patients are encouraged to take long-term physical exercise. Although there are many exercise guidelines for patients with CVD, most of them are at population level rather than individualized, and thus overlook individual differences such as disease severity, drug intake, and common risk factors. In addition, when prescribing exercise for individuals, knowing how to combine different exercise training guidelines properly is a major challenge in the clinical practice of traditional exercise medicine. Data-driven exercise medicine, based on objective data and computer technology, improves the efficiency and safety of exercise intervention and offers an excellent solution to these issues by fully considering the individual differences of patients and dynamically generating optimally customized exercise prescription. This article takes CVDs as an example to review the concept, data integration, and modeling methods of data-driven exercise medicine, discussing specific application cases and summarizing challenges and future development trends.

DWH-DIM: A Blockchain Based Decentralized Integrity Verification Model for Data Warehouses

Conference Paper

Dec 2021

Implementation of Multilingual Education for Sustainable Development in the Netherlands and Kazakhstan

Chapter

Jan 2022

The chapter presents a comparative analysis of the implementation of multilingual/bilingual education in Kazakhstan and the Netherlands. The study explores the origin and role of multilingualism, for socio-economic development, relevant regional and international practices in multilingual education, the growth of trilingual instruction, English language training and the use of digital technologies (distant, e-learning programs) in the Eurasian region, Major common features and differences of implementing multilingual/bilingual teaching between the two case studies and the main findings from research trips are identified. A theoretical analysis of scholarly approaches, as well as practical and methodological implications, are made on the example of implementing multilingual instruction at the Kazakhstani agricultural higher educational institutions, particularly Kazakh National Agrarian University and Kazakh S. Seifullin Agritechnical University and Dutch counterparts - Wageningen University, and the Hague University of Applied Sciences.

Big Data: Forecasting and Control for Tourism Demand

Chapter

Nov 2020

Miguel Ángel Ruiz Reina

In this study, innovative forecasting techniques and data sources from Big Data are used for the study of Hotel Overnight Stays for Spain, from January 2018 to June 2019. The unstoppable development of the Tourism sector with the application of Big Data technologies, allow to make efficient decisions by economic agents. In this work, the use of the data collected from the Google Data Mining tools allows to obtain knowledge about Hotel Tourism Demand in Spain. The analysis carried out meets the four basic principles of Big Data analysis: volume, velocity, variety and veracity. In this setting, the methodology used corresponds to ARDL models, and ECM models being developed Granger-Causality extended to seasonality. The first one explains easily when economic agents will make their decisions; while the second one allows forecasting for short-term and long-term. This fact means that tourist offers and demands can be perfectly adjusted at every moment of the year. As a criterion for the selection of models, the innovative Matrix U1 Theil is proposed, this allows to quantify how much a model is better than another in terms of forecasting.

Business Intelligence Technology, Applications, and Trends

Article

Full-text available

Feb 2015

Enterprises are considering substantial investment in Business Intelligence (BI) theories and technologies to maintain their competitive advantages. BI allows massive diverse data collected from virus sources to be transformed into useful information, allowing more effective and efficient production. This paper briefly and broadly explores the business intelligence technology, applications and trends while provides a few stimulating and innovate theories and practices. The authors also explore several contemporary studies related to the future of BI and surrounding fields.

Conceptual model of business value of business intelligence systems

Article

Full-text available

Jan 2010

With advances in the business intelligence area, there is an increasing interest for the introduction of business intelligence systems into organizations. Although the opinion about business intelligence and its creation of business value is generally accepted, economic justification of investments into business intelligence systems is not always clear. Measuring the business value of business intelligence in practice is often not carried out due to the lack of measurement methods and resources. Even though the perceived benefits from business intelligence systems, in terms of better information quality or achievement of information quality improvement goals, are far from being neglected, these are only indirect business benefits or the business value of such systems. The true business value of business intelligence systems hides in improved business processes and thus in improved business performance. The aim of the paper is to propose a conceptual model to assess business value of business intelligence systems that was developed on extensive literature review, in-depth interviews, and case study analysis for researching business intelligence systems' absorbability capabilities or key factors facilitating usage of quality information provided by such systems respectively.

Using GORE in data warehouse: A systematic mapping study

Article

Full-text available

May 2015
Lat Am Trans IEEE

The Goal-Oriented Requirements Engineering, known as GORE, is the first step in the development process of systems Data Warehouses (DW). It is based on the identification of the goal of the business that wants to analyze the executive, which enables developers to obtain the information requirements of DW through a set of transformations from the goal modeling in specifications model. This paper provides an overview of how some development proposals DW used GORE to elicit, negotiate, specify and validate information requirements. The methodology is the systematic mapping studies.

Data Warehousing Fundamentals: A Comprehensive Guide for IT Professionals

Book

Apr 2002

Paulraj Ponniah

Handbook of Data Management

Book

Jul 2019

Sanjiv Purba

Issues of content and structure for a multilingual web site

Conference Paper

Jan 2001

Business intelligence empirical study on the top 50 Finnish companies

Article

Jan 2003

From Design and Build to Implementation and Validation

Chapter

Jan 2012

Eric Adjei Boakye

Business Intelligence as a Service: A New Approach to Manage Business Processes in the Cloud

Conference Paper

Jun 2014

Marco Di Sano

Business intelligence (BI) processes have increased significantly in recent years and now play a fundamental role in many organizations, in order to support strategic decisions. Data warehouse is the core technical solution to design BI platforms. An effective BI infrastructure that leverages the power of a data warehouse can deliver value by helping companies enhancing their customer experience, reducing the gap between some strategic goals and their realization in an IT business process. In this paper, the author explains the realization of a BI platform solution (called BIOP) hosted on a custom cloud system, exploiting both the advantages of a flexible and on-demand solution.

Data Mining Basics

Chapter

Apr 2002

Paulraj Ponniah

The objectives of this chapter are to (1) learn what exactly data mining is and examine its features; (2) compare data mining with online analytical processing (OLAP) and understand the relationships and differences; (3) notice the place of data mining in a data warehouse environment; (4) carefully go through the important data mining techniques and understand how each works; and (5) study a few data mining applications in different industries and perceive the application of the technology to one's environment.

An Evaluation of the Challenges of Multilingualism in Data Warehouse Development

Abstract and Figures

Recommended publications

Intelligent Iraqi Health System (IIHS) using Online Analytical Process (OLAP) model

Business Intelligence Strategy for Data Warehouse in Andalusian Health Service

Data Warehousing & Business Intelligence ROI

Building an Effective Data Warehousing for Financial Sector