Article

Strategic Prototyping for Developing Big Data Systems


Abstract

Conventional horizontal evolutionary prototyping for small-data system development is inadequate and too expensive for identifying, analyzing, and mitigating risks in big data system development. RASP (Risk-Based, Architecture-Centric Strategic Prototyping) is a model for cost-effective, systematic risk management in agile big data system development. It uses prototyping strategically and only in areas that architecture analysis can't sufficiently address. Developers use less costly vertical evolutionary prototypes instead of blindly building full-scale prototypes. An embedded multiple-case study of nine big data projects at a global outsourcing firm validated RASP. A decision flowchart and guidelines distilled from lessons learned can help architects decide whether, when, and how to do strategic prototyping. This article is part of a special issue on Software Engineering for Big Data Systems.


... In projects developing business systems with advanced technologies, such as Big Data analytics or machine learning, developers often cannot elicit sufficient requirements for the systems being developed from users. In such cases, a minimum viable product (MVP) is considered a first step to prototype a system by implementing the bare minimum of functions [1]. After that, users use the MVP in their business and validate whether the business can be improved by the system using the new technologies. ...
... In system development projects where we develop an MVP for a business, three processes are considered in [1]: "Value discovery," "MVP development," and "Technical and business evaluation." By contrast, in AI service system development projects, it is said that a business goal is divided into AI goals, and each AI goal is divided into AI engine goals [15]. ...
... Figure 3 shows the goal tree represented in ArchiMate. Process defined in [1]: Value discovery, i.e., identify a business goal and a new business service. Corresponding process in an AI service system development project: Value discovery, i.e., identify an AI service for the business service and its goal. ...
Article
Full-text available
In this work, we propose a model of a project to develop an artificial intelligence (AI) service system used in an office environment. Our model is based on enterprise architecture (EA) approach and consists of business layer elements, application layer elements, and motivation extensions, so that project participants from both business and IT divisions can have the same understanding of the project. By applying the proposed model to the project analysis results, we show that we can derive actionable insights for project risk management.
... However, how to engineer value from big data poses many new challenges. (To engineer means that a set of procedures can be applied with predictable results [6][9].) Organizations face challenges due to: (1) the technical complexity arising from the 4V (Volume, Variety, Velocity, Veracity) characteristics of big data [16]; (2) the organizational agility required for rapid delivery of value [5]; and (3) the rapid technology proliferation and evolution [12]. ...
... Value engineering includes two phases: Value Discovery and Value Realization. We have augmented the original Eco-ARCH method [10] for big data value discovery with: 1) "Priming" techniques [19] for futuring scenario generation, 2) a Big Data Architecture Scenario (BDAS) template for big data modeling, 3) a Big Data-Data Flow Diagram (BD-DFD) for process modeling, and 4) strategic prototyping [12] to meet the requirements stated above. These augmentation techniques were each validated independently before being integrated into Eco-ARCH. ...
... Step 6 -Risk Analysis and Strategic Prototyping: In this step, risks are analyzed using a combination of architecture analysis and strategic prototyping [12] to achieve the value-based objectives. Using architecture analysis, risk scenarios are developed to describe challenges to the system from multiple quality attribute perspectives and threats to the triple bottom line. ...
... Extracting "value" from big data is a daunting task. Organizations face challenges due to: (1) the technical complexity arising from the 4V (Volume, Variety, Velocity, Veracity) characteristics of big data [13]; (2) the organizational agility required for rapid delivery of value [10]; and (3) the rapid pace of technology proliferation and evolution [9]. Big data adoption is surrounded by a high level of risk and uncertainty regarding costs, schedules, and benefits. ...
... The rapid rate of technology proliferation and evolution in the big data area also creates problems for value assessment. We have proposed an architecture-centric approach, combined with strategic prototyping, to address this concern [9]. ...
... Value engineering includes two phases: Value Discovery and Value Realization. We have augmented the original Eco-ARCH method for big data value discovery with: 1) "Priming" techniques [16] for futuring scenario generation, 2) a Big Data Architecture Scenario (BDAS) template for big data modeling, 3) a Big Data-Data Flow Diagram (BD-DFD) for process modeling, and 4) strategic prototyping [9] to meet the requirements stated above. These augmentation techniques were each validated independently before being integrated into Eco-ARCH. ...
Conference Paper
This article articulates the requirements for an effective big data value engineering method. It then presents a value discovery method, called Eco-ARCH (Eco-ARCHitecture), tightly integrated with the BDD (Big Data Design) method for addressing these requirements, filling a methodological void. Eco-ARCH promotes a fundamental shift in design thinking for big data system design -- from "bounded rationality" for problem solving to "expandable rationality" for design for innovation. The Eco-ARCH approach is most suitable for big data value engineering when system boundaries are fluid, requirements are ill-defined, many stakeholders are unknown and design goals are not provided, where no architecture pre-exists, where system behavior is non-deterministic and continuously evolving, and where co-creation with consumers and prosumers is essential to achieving innovation goals. The method was augmented and empirically validated in collaboration with an IT service company in the energy industry to generate a new business model that we call "eBay in the Grid".
... Big data contains a lot of data and different types of data [11,12]. The software should be used to complete the processing of relevant datasets within the specified period, analyze the basis of the decision-making process, and consider the value of a large amount of information. ...
Article
Full-text available
The development of information technology is changing every walk of life. People's health problems are becoming more prominent, and reform of college sports training has come under discussion. Sport no longer relies on individual competition alone but on the combined strength of science and technology. The fierce competition for Olympic gold medals in modern competitive sports is largely a competition of the scientific and technological strength of the participating countries. China has also conducted extensive research on sports informatization and achieved some results. Surveys show that provincial-level sports teams have established sports training information management systems, which have proven effective, and the latest scientific and technological achievements are being combined with sport to build university sports information systems. The purpose of this paper is to analyze a university sports physical training information system based on big data and mobile terminals, study big data embedded systems, improve the effect of sports skills training, and meet society's demand for highly skilled sports talent. The paper uses the literature method, an experimental investigation method, and experiments with a big data spectral clustering algorithm to study the advantages of big data, and applies the value of big data and an embedded system model to study the university sports physical fitness training information system based on big data mobile terminals. The results show that 40.6% of college students spend more time in physical exercise when a big data embedded system is applied to college sports training, and that such systems are of great significance for arranging sports training methods to improve students' training performance.
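As an illustration of the spectral-clustering step mentioned above, the following is a minimal sketch using scikit-learn; the training-record features, their scaling, and the number of clusters are assumptions for illustration, not the paper's actual data or parameters.

# Illustrative spectral clustering of student training records (placeholder
# features; not the paper's dataset or parameter choices).
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.preprocessing import StandardScaler

# Hypothetical features per student: weekly exercise hours, average training
# heart rate, fitness test score.
records = np.array([
    [2.0, 120, 61], [2.5, 118, 64], [3.0, 122, 66],   # lower-activity group
    [7.5, 140, 82], [8.0, 138, 85], [7.0, 142, 80],   # higher-activity group
])

# Standardize the features, then let spectral clustering split the records
# into two groups based on the similarity graph it builds internally.
scaled = StandardScaler().fit_transform(records)
labels = SpectralClustering(n_clusters=2, random_state=0).fit_predict(scaled)
print("cluster labels:", labels)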
... Throwaway prototyping is a prototyping method used to demonstrate capabilities or simulate interfaces, but the prototype is not used as the final version. Prototyping can decrease project risks [15,16]. Throwaway prototyping can be done by planning and pre-analyzing requirements. ...
Article
Full-text available
To improve the e-government services released by the Communication and Information Technology Office of Samarinda City, each system needs to be able to interoperate even when developed by different developers. Interoperation can be achieved by using one data source, namely an API (Application Programming Interface) for general data objects such as announcements. Given this condition, an API built using microservices can support further enhancement even if it is developed by developers who use different programming languages. The results show that the microservice API can be used to interoperate by relaying data between e-government services and can be developed using more than one programming language and code base. Further development of this API can be done by adding more data objects, using AWS Cognito for authorization management, adding AWS Elasticsearch to load and filter data, and showing data objects in real time on the front end.
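To make the interoperation idea concrete, here is a minimal stand-in for such an announcements microservice endpoint, written with Flask; the route name, fields, and data are illustrative assumptions, not the actual e-government API.

# Minimal stand-in for an announcements microservice endpoint (Flask assumed;
# the route, fields, and data are illustrative, not the actual API).
from flask import Flask, jsonify

app = Flask(__name__)

# Placeholder data; in the described architecture this would be the shared
# announcement store that every e-government service reads from.
ANNOUNCEMENTS = [
    {"id": 1, "title": "Road maintenance notice", "published": "2019-04-01"},
    {"id": 2, "title": "Public hearing schedule", "published": "2019-04-05"},
]

@app.route("/api/announcements", methods=["GET"])
def list_announcements():
    # Any consuming service, whatever its implementation language, reads JSON.
    return jsonify(ANNOUNCEMENTS)

if __name__ == "__main__":
    app.run(port=5000)

Because the contract is plain HTTP and JSON, a second service written in another language can consume the same endpoint, which is the interoperability property the abstract describes.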
... Note that some studies cited different versions of the same reference, for example the citations of Beck (1999, 2000), demonstrating the relevance of each version to the composition of the results. ... (Chen et al., 2016): RAD (Rapid Application Development); a cost-efficiency and risk-management model for agile big data systems development. ...
Article
Full-text available
Although risk management influences project outcomes, failures are known to occur whose main cause is the incorrect use of the practices of this knowledge area. It is also known that the agile approach to project management and its influence on project outcomes has been widely researched, attracting interest from many quarters in the project field. However, the relationship between these two areas is a phenomenon that is not yet consolidated in the specialized project-management literature. The aim of this study was to examine the evolution of the relationship between risk management and the agile approach, and thereby to identify the main risks and the characteristics of risk-management models in projects managed with agile approaches. This article presents an exploratory analysis based on a bibliometric study followed by a systematic literature review. Most of the 1023 articles evaluated quantitatively through the bibliometric analysis are at an early stage, having been published only in conferences and congresses, while the 16 articles evaluated qualitatively in the systematic literature review mainly aimed to propose a model for risk management in projects managed with agile approaches. Finally, many of the risks identified in the systematic review are associated with the people factor, which is one of the core values of the agile approach, contrasting with the benefits and advantages recognized in the literature for this approach.
... The scales of the big data dimension are divided according to the data flow process and include three types: the supporting methods and technologies of big data, the big data flow process, and the big data platform (Chen et al., 2016). Among them, the methods and technologies also revolve around supporting the data flow process, for example data processing technology, data analysis technology, and data management technology. ...
... However, big data is having a huge impact on today's global economy, and it has also caused setbacks to traditionally-developed accounting models. With the continuous emergence of relevant data, accounting market big data (AMBD) is bound to have an irreversible impact on current accounting methods [2,3]. Therefore, the accounting model should be reformed to meet the current requirements. ...
Article
Current accounting methods for small and medium-sized enterprises (SMEs) have long running times and low user satisfaction. Therefore, a method for selecting accounting models for SMEs based on accounting market big data (AMBD) is proposed in this paper. First, indicators such as the current ratio, quick ratio, asset-liability ratio, and accounts receivable turnover rate, drawn from a company's solvency, operating capacity, profitability, and growth capacity, are selected to set up an AMBD constraint system. Then, the principal component analysis method is used to classify the constraints of the AMBD. Finally, by combining particle swarm optimization with ant colony optimization, the optimal accounting model is obtained through iteration. Experimental results show that the proposed method has high efficiency and user satisfaction, and achieves a high coefficient of rationality. Furthermore, the method incorporates the constraints found in the AMBD and meets the selection requirements of the SME accounting model.
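As a rough illustration of the principal-component step in this approach, the following scikit-learn sketch groups correlated financial indicators into a few principal factors; the indicator values are placeholders, and the subsequent particle swarm / ant colony optimization stage is not shown.

# Sketch of the PCA step for classifying AMBD indicator constraints
# (placeholder data; the PSO/ACO model-selection stage is omitted).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical indicator matrix: rows are SMEs, columns are current ratio,
# quick ratio, asset-liability ratio, and receivables turnover rate.
indicators = np.array([
    [1.8, 1.2, 0.55, 6.1],
    [2.4, 1.7, 0.40, 8.3],
    [0.9, 0.6, 0.75, 3.2],
    [1.5, 1.0, 0.60, 5.0],
])

# Standardize, then keep the two components that explain most of the variance,
# reducing the correlated constraints to a small set of principal factors.
scaled = StandardScaler().fit_transform(indicators)
pca = PCA(n_components=2)
factors = pca.fit_transform(scaled)
print("explained variance ratio:", pca.explained_variance_ratio_)
print("principal factors per SME:")
print(factors)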
... Evolutionary prototyping is used when requirements are certain. The system is developed with full functionality [21]. ...
Conference Paper
Full-text available
Requirement validation is the most vital phase of the Requirements Engineering (RE) process, as it provides clear, complete, and consistent requirements to the software development team. Several requirement validation approaches have been proposed, such as review checklists, prototyping, and model-based validation. Among these existing techniques, prototyping, being a model-based approach, is the most widely utilized. However, functional implementation and iteration often make traditional manual prototyping inefficient, time-consuming, and expensive. To bridge this gap, a meta-model for the automatic generation of prototypes is proposed in this paper. The prototype is developed directly from the elicited requirements for the purpose of efficient requirement validation. This approach provides thorough knowledge of a system, foresight of its reactions, and design analysis, steering towards the validation of necessary, verifiable, traceable, and reusable requirements. In addition, after performing requirement validation using the proposed approach, rework on design and implementation is avoided.
... (1) The Industry (I) dimension refers to the major areas of industry, including electric power equipment, aerospace equipment, new materials, advanced rail transportation equipment [23][24][25], etc. Combined with the other two dimensions, each industrial domain in this dimension can form an I-BD reference framework for that area. Table 1 presents a current-situation analysis of I-BD based on the SWOT method, covering Strengths (S) [6][7][8][9] and Weaknesses (W) [10][11][12][13][14]. 1) "World-class universities and world-class disciplines" have successively established big data research centers. ...
Article
Full-text available
In today’s rapidly developing information age, all kinds of data have exploded. Industrial Big Data Applications (I-BD) are generated by the continuous infiltration of big data into industry. Scholars have done a lot of research on the methods, technologies, and architectures of industrial big data from the perspective of data flow. However, there are relatively few studies on the reference model, reference architecture, and implementation path for industrial big data from the perspectives of detailed application scenarios, a common service platform, and specific implementation. In this paper, firstly, the current situation of I-BD is analyzed. Secondly, a general reference model for I-BD is proposed, which consists of an Industry (I) dimension, an Application scenario (A) dimension, and a common service Platform (P) dimension. Further, the overall planning of application scenarios for I-BD based on the industrial value chain is studied. Next, a reference architecture of a common service platform for I-BD based on multi-party co-construction is proposed. Finally, the implementation path of the common service platform for I-BD is given. This work can be used as a reference for industry and government to design, set up, and carry out I-BD applications.
... The hardware arrangement for the stages of designing with the Prototype Model is shown in Figure 1. The stages of the Prototype Model are as follows: listen to customers, collecting the requirements of the system to be designed based on the problems that occur [22]; design and build the prototype, tailored to the system requirements defined in the previous stage [23]; and trial, testing the designed system to produce a final product that suits those requirements [24]. ...
Conference Paper
Full-text available
Existing systems for raising awareness of river overflow that causes flooding cannot work automatically and in real time to warn about river surface levels that could lead to flooding. As a result, residents who live around the river do not know the situation when the river overflows. There are several alternatives for detecting river overflow, one of which is a microcontroller with an ultrasonic sensor. This research aims to design a prototype decision support system based on an ultrasonic sensor for flood detection that automatically detects river overflow. The software development model used is the Prototype Model. The result of this research is a prototype based on a microcontroller and an ultrasonic sensor that can be used as a flood detection decision support system: it measures the height of the river surface and detects defined levels of flood potential with a low error rate.
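The following MicroPython-style sketch illustrates the kind of measurement loop such a prototype performs; the pin numbers, sensor mounting height, warning threshold, and polling interval are assumptions for illustration, not the authors' firmware.

# Illustrative MicroPython-style river-level check with an HC-SR04-type
# ultrasonic sensor (pins, heights, and thresholds are assumed values).
import machine
import time

TRIG = machine.Pin(5, machine.Pin.OUT)
ECHO = machine.Pin(18, machine.Pin.IN)
SENSOR_HEIGHT_CM = 300      # sensor mounted 3 m above the river bed (assumed)
WARNING_LEVEL_CM = 200      # water level that triggers a warning (assumed)

def water_level_cm():
    # Send a 10 microsecond trigger pulse, then time the echo to get the
    # distance from the sensor down to the water surface.
    TRIG.off()
    time.sleep_us(2)
    TRIG.on()
    time.sleep_us(10)
    TRIG.off()
    echo_us = machine.time_pulse_us(ECHO, 1, 30000)   # 30 ms timeout
    distance_cm = (echo_us * 0.0343) / 2              # speed of sound in air
    return SENSOR_HEIGHT_CM - distance_cm

while True:
    level = water_level_cm()
    if level >= WARNING_LEVEL_CM:
        print("FLOOD WARNING: water level is", level, "cm")
    time.sleep(5)                                      # poll every 5 seconds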
... Several works have shown that rapid prototyping and portability are crucial factors when building big data systems [24,33,59]. We confirmed these observations when developing our Rheem applications. ...
Article
Solving business problems increasingly requires going beyond the limits of a single data processing platform (platform for short), such as Hadoop or a DBMS. As a result, organizations typically perform tedious and costly tasks to juggle their code and data across different platforms. Addressing this pain and achieving automatic cross-platform data processing is quite challenging: finding the most efficient platform for a given task requires quite good expertise for all the available platforms. We present Rheem, a general-purpose cross-platform data processing system that decouples applications from the underlying platforms. It not only determines the best platform to run an incoming task, but also splits the task into subtasks and assigns each subtask to a specific platform to minimize the overall cost (e.g., runtime or monetary cost). It features (i) an interface to easily compose data analytic tasks; (ii) a novel cost-based optimizer able to find the most efficient platform in almost all cases; and (iii) an executor to efficiently orchestrate tasks over different platforms. As a result, it allows users to focus on the business logic of their applications rather than on the mechanics of how to compose and execute them. Using different real-world applications with Rheem, we demonstrate how cross-platform data processing can accelerate performance by more than one order of magnitude compared to single-platform data processing.
... Researchers make related arguments when comparing traditional system development with big data system development. For example, according to Chen et al. (2016b), some risks are the difficulty of selecting big data technologies, the complex integration of legacy systems with new systems and, because it is a new field, the fact that practitioners have little or no knowledge of it. In another study (Chen et al., 2016a), the same authors note that development of systems handling smaller-scale data is traditionally based on relational databases or data warehouses, and they stress the importance of an architecture design process. ...
Thesis
Full-text available
According to a study from 2015, the amount of data currently generated in organizations has led to increased investment in infrastructure development and data analytics. However, the application software development side is underestimated. In order to enable the development of end-user applications utilizing big data, the software engineering field presents a solid set of directives for assessing different application domains, development processes, and requirements engineering. It is fundamental to investigate what efforts in big data systems development have been employed in order to provide both researchers and practitioners with information that enables further research activities. This study aims at surveying existing research on big data software engineering in order to identify the approaches employed and the development strategies, and to characterize current research. A systematic mapping study was performed based on a set of 8 research questions. In total, 305 studies, dated from 2011 to 2016, were evaluated. We proposed a novel protocol, retrieved a set of primary studies, identified a list of approaches, and analyzed the current state of building big data software systems, identifying trends and gaps where new research efforts can be invested. The results of this systematic mapping can support researchers and practitioners in development choices and future research.
... First, a set of control papers was selected to provide input into the search for primary studies [25][36][9]. According to Kitchenham and Charters [43], control papers aim at helping in the definition of the search string, providing input for adjusting the search string until the search retrieves the control papers. ...
Conference Paper
Full-text available
[Context] Data is being collected at an unprecedented scale. Data sets are becoming so large and complex that traditionally engineered systems may be inadequate to deal with them. While software engineering comprises a large set of approaches to support engineering robust software systems, there is no comprehensive overview of approaches that have been proposed and/or applied in the context of engineering big data systems. [Goal] This study aims at surveying existing research on big data software engineering to unveil and characterize the development approaches and major contributions. [Method] We conducted a systematic mapping study, identifying 52 related research papers, dated from 2011 to 2016. We classified and analyzed the identified approaches, their objectives, application domains, development lifecycle phase, and type of contribution. [Results] As a result, we outline the current state of the art and gaps on employing software engineering approaches to develop big data systems. For instance, we observed that the major challenges are in the area of software architecture and that more experimentation is needed to assess the classified approaches. [Conclusion] The results of this systematic mapping provide an overview on existing approaches to support building big data systems and helps to steer future research based on the identified gaps.
... In the wireless communication area, big data research is very important. Historical data communication activity can be recorded and analyzed by evaluating the communication activities of human beings and determining points of interest [4][5][6][7]. Online individual commodity recommendation will then become effective, which is good for the development of online communication companies [8,9]. ...
Article
Full-text available
Big data research is difficult because of its complex structure, vast data storage, and unpredictable change. Social network communication involves a significant amount of incomputable data created by wireless devices across the world. Such data can be used to analyze human activities, seek certain patterns in communication data, and predict emergencies. However, most of these data are the product of human activity, which is what we use to study our activities. Therefore, recording the distribution of effective nodes and investigating the topological structure of communication are particularly important in big data communication. This study establishes a big data communication simulation environment by searching for small data and calculating the influence of small data nodes. The experiment shows that 1% of small data can connect 75% of communication nodes and 20% of small data can transmit 80% of data packets.
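The following sketch illustrates the kind of coverage measurement behind the "1% of small data can connect 75% of communication nodes" observation: it selects a small fraction of high-degree nodes in a synthetic graph and measures how much of the network lies within one hop of them. The topology, selection rule, and coverage definition are assumptions, not the paper's simulation.

# Illustrative coverage check (assumed graph and selection rule, not the
# paper's simulation): take the top 1% highest-degree nodes and measure the
# share of the network within one hop of them.
import networkx as nx

G = nx.barabasi_albert_graph(n=10000, m=3, seed=42)   # placeholder topology

top_fraction = 0.01
k = max(1, int(top_fraction * G.number_of_nodes()))
hubs = sorted(G.nodes, key=G.degree, reverse=True)[:k]

covered = set(hubs)
for h in hubs:
    covered.update(G.neighbors(h))

share = len(covered) / G.number_of_nodes()
print(f"top {top_fraction:.0%} of nodes reach {share:.0%} of the network")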
... A pipeline-based architecture for heterogeneous execution environments in Big Data systems has been presented by Wu and co-authors [21]. Finally, a model for cost-effective, systematic risk management in agile Big Data system development called Risk-Based Architecture-Centric Strategic Prototyping (RASP) model [22] has been developed to help software architects of BD systems deal with strategic prototyping to manage risks involved in developing such systems. ...
Article
Full-text available
Software engineering has evolved over the last 50 years, initially as a response to the so-called software crisis (the problems that organizations had producing quality software systems on time and on budget) of the 1960s and 1970s. Software engineering (SE) has been defined as “the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software; that is, the application of engineering to software”. Software engineering has developed a number of approaches to areas such as software requirements, software design, software testing, and software maintenance. Software development processes such as the waterfall model, incremental development, and the spiral model have been successfully applied to produce high-quality software on time and under budget. More recently, agile software development has gained popularity as an alternative to the more traditional development methods for development of complex systems. Within the last decade or so, advances in technologies such as mobile computing, social networks, cloud computing, and the Internet of things have given rise to massive datasets which have been given the name Big Data (BD). Big Data has been defined as data with 3Vs—high volume, velocity, and variety. Big Data contains so much data that low probability events are captured in the data. These events can be discovered using analytics methods and turned into actionable intelligence which can be used by businesses to gain a competitive advantage. Unfortunately, the very scale of BD often renders inadequate SQL-based relational database systems which have formed the backbone of data intensive systems for the last 30 years, requiring new NoSQL technologies to be effective. In this paper, we will explore how well-established SE technology can be adapted to support successful development of BD projects, as well as how BD techniques can be used to increase the utility of SE processes and techniques. Thus, BD and SE may mutually support and enrich each other.
... To validate these needs and the promises of such middleware, we try to solve a real research problem of constructing an accurate social network. As this happens to be a big data problem, we followed the suggestions for strategic prototyping in [3]. ...
Conference Paper
Motivation: Software engineering for High Performance Computing (HPC) environments in general [1] and for big data in particular [5] faces a set of unique challenges, including the high complexity of middleware and of computing environments. Tools that make it easier for scientists to utilize HPC are, therefore, of paramount importance. We provide an experience report of using one such highly effective middleware, pbdR [9], which allows the scientist to use the R programming language without, at least nominally, having to master many layers of HPC infrastructure, such as OpenMPI [4] and ScaLAPACK [2]. Objective: To evaluate the extent to which middleware helps improve scientist productivity, we use pbdR to solve a real problem that we, as scientists, are investigating. Our big data comes from the commits on GitHub and other project hosting sites, and we are trying to cluster developers based on the text of these commit messages. Context: We need to be able to identify the developer for every commit and to identify the commits of a single developer. Developer identifiers in the commits, such as login, email, and name, are often spelled in multiple ways, since that information may come from different version control systems (Git, Mercurial, SVN, ...) and may depend on which computer is used (what is specified in .git/config of the home folder). Method: We train a Doc2Vec [7] model where existing credentials are used as a document identifier and then use the resulting 200-dimensional vectors for the 2.3M identifiers to cluster these identifiers so that each cluster represents a specific individual. The distance matrix occupies 32TB and, therefore, is a good target for HPC in general and pbdR in particular. pbdR allows data to be distributed over computing nodes and even has implemented K-means and mixture-model clustering techniques in the package pmclust. Results: We used strategic prototyping [3] to evaluate the capabilities of pbdR and discovered that a) the use of middleware required extensive understanding of its inner workings, thus negating many of the expected benefits; b) the implemented algorithms were not suitable for the particular combination of n, p, and k (sample size, data dimension, and the number of clusters); c) the development environment based on batch jobs increases development time substantially. Conclusions: In addition to the findings from Basili et al., we find that the quality of the implementation of HPC infrastructure and its development environment has a tremendous effect on development productivity.
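For readers unfamiliar with the identity-resolution step described above, here is a small single-machine sketch of the same idea using gensim and scikit-learn rather than the authors' pbdR pipeline; the credential and commit data are invented placeholders, and all parameters other than the 200-dimensional vectors mentioned in the abstract are assumptions.

# Single-machine sketch (not the authors' pbdR code): embed commit messages
# with credentials as document tags, then cluster the credential vectors so
# that each cluster approximates one individual.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.cluster import KMeans

# Hypothetical (credential, commit message) pairs.
records = [
    ("Jane Doe <jane@example.com>", "fix null pointer in commit parser"),
    ("jdoe", "parser: handle empty commit message"),
    ("Bob <bob@example.org>", "add continuous integration pipeline"),
    ("bob", "ci: cache build dependencies"),
]

# Each credential is the document tag, as in the abstract.
docs = [TaggedDocument(words=msg.split(), tags=[cred]) for cred, msg in records]

# 200-dimensional document vectors, matching the dimensionality given above.
model = Doc2Vec(vector_size=200, min_count=1, epochs=40)
model.build_vocab(docs)
model.train(docs, total_examples=model.corpus_count, epochs=model.epochs)

# Cluster the credential vectors; at scale this becomes the 2.3M-identifier
# problem that motivated the distributed pbdR/pmclust setup.
creds = list(model.dv.index_to_key)
vectors = [model.dv[c] for c in creds]
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
for cred, label in zip(creds, labels):
    print(label, cred)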
... These concerns demand strategies to increase or decrease the number of computing resources dynamically. As a result, big data architecture design becomes critical and poses significant risks when done inadequately [24,25]. ...
Conference Paper
Full-text available
The 4D trajectory management program will require a major shift in infrastructure and operational management processes to deliver accurate and reliable information to the trajectory management team. As a result, air traffic management operations will demand Network Enabled Operations (NEO) concepts such as the System Wide Information Management (SWIM) framework to ensure that decisions are made with the correct information at the right time. SWIM provides standards, infrastructure, and governance practices to allow information exchange through interoperable services. Consequently, SWIM must provide methods to (a) integrate a large variety of data; (b) filter information in a way that only the relevant pieces are retained to explain the results; (c) enable National Airspace System (NAS) operators, pilots, controllers, and traffic flow specialists to extract value from air traffic systems in real time; and (d) seek and explore complex and evolving relationships in the data. However, SWIM still lacks support for big data analytics and for aggregating computing resources on demand. As the main contribution of this paper, we describe the challenges and new focuses of SWIM research. Likewise, we present an architecture to enable big data analytics services in SWIM. The proposed architecture relies on big data processing frameworks to handle data acquisition and data filtering in near-real time, taking into account users' objectives, and to guarantee that the data go through all the given stages of the life cycle of ATM applications, avoiding the silos that may arise in each data analysis stage.
... New forms of digital technology necessitate appropriate innovation management approaches addressing the challenges in shipping strategy, operations, and technology management. Big data in shipping (embedded in algorithms and management information systems) can be inimitable, hard to substitute, and valuable to shipping companies and their business endeavors (Chen et al., 2016; Clark and Vargo, 2014; McAfee and Brynjolfsson, 2012). ...
Conference Paper
Full-text available
In this paper, we present a multi-faceted introduction to Shipping 4.0 in order to contribute to framing the technology and managerial challenges on a vigorous, evolving conceptual and methodological basis. Primarily, we provide a review, critical annotation, and adaptation of extant research on Internet of Things (IoT) and Big Data Analytics (BDA) technology and management, as relevant for cyber shipping, and elaborate the application of Industry 4.0 principles in the shipping domain (aka Shipping 4.0); we focus on exemplifying the digital innovation choices and capabilities of shipping companies, as necessitated by the contemporary changes in market equilibrium patterns and the new maritime technology evolution trajectories. We elaborate the new elements making up an IoT- and BDA-enabled digital service innovation capability, comprising digital shipping strategy, cyber-physical innovation resourcing, network organization and culture, infrastructure control, and data technology management. We explain the shipping "IoT/BDA technology stack" and a cyber shipping business model configuration taxonomy, while aspiring to bridge the Shipping 4.0 popular discussion with contemporary, relevant, and interesting research foci of both the Information Systems (IS) and Maritime Economics and Policy research communities.
... An architecture spike is similar, but is driven by a quality attribute question, typically regarding performance but it could, in theory, address any system quality [6]. In an architecture spike, a prototype (throwaway or not) is commonly created to address the quality attribute question [23]. It may be developed on the main branch, but successful architecture spikes are typically developed on a separate branch and (if successful) merged with the main branch. ...
Article
This article contributes an architecture-centric methodology, called AABA (Architecture-centric Agile Big data Analytics), to address the technical, organizational, and rapid technology change challenges of both big data system development and agile delivery of big data analytics for Web-based Systems (WBS). As the first of its kind, AABA fills a methodological void by adopting an architecture-centric approach, advancing and integrating software architecture analysis and design, big data modeling and agile practices. This article describes how AABA was developed, evolved and validated simultaneously in 10 empirical WBS case studies through three CPR (Collaborative Practice Research) cycles. In addition, this article presents an 11th case study illustrating the processes, methods and techniques/tools in AABA for cost-effectively achieving business goals and architecture agility in a large scale WBS. All 11 case studies showed that architecture-centric design, development, and operation is key to taming technical complexity and achieving agility necessary for successful WBS big data analytics development. Our contribution is novel and important. The use of reference architectures, a design concepts catalog and architectural spikes in AABA are advancements to architecture design methods. In addition, our architecture-centric approach to DevOps was critical for achieving strategic control over continuous big data value delivery for WBS.
Chapter
In this research, we considered projects to develop systems that use AI technologies, including machine learning techniques, for office environments. In many AI system development projects, both developers and users need to be involved in order to reach a consensus on discussion items before starting a project. To facilitate this, we propose a method of assessing an AI system development project by using an assurance case based on quality sub-characteristics of functionality to derive project success factors. Keywords: Assurance Case; System Development Project; Project Success Factors; Work Items; Machine Learning Technology.
Article
Traditionally, the quality of a software or system architecture has been evaluated in the early stages of the development process using architecture quality evaluation methods. Emergent approaches like Industry 4.0 require continuous monitoring of both run‐time and development‐time quality properties, in contrast to traditional systems where quality is evaluated at specific milestones using techniques such as project reviews. Considering the dynamics and minimum down‐time imposed by the industrial production domain, it must also be ensured that Industry 4.0 system evaluations are continuously performed with high confidence and with as much automation as possible, using simulations, for instance. In this regard, there is a need to develop new methods for continuously monitoring and evaluating the quality properties of software‐based systems for Industry 4.0, which must be supported by automated quality evaluation techniques. In this research we analyze traditional architecture evaluation methods and Industry 4.0 scenarios, and propose an approach based on Digital Twins and simulations to continuously evaluate runtime quality aspects of the architecture and systems of industrial production plants. The evaluation is based on the instantiation of our approach for a concrete demand of an automation plant in the automotive domain.
Chapter
Full-text available
Although big data and related technologies have received a lot of attention in recent years, many projects are less successful than anticipated. One of the most crucial steps in the planning of a system is the modeling of the underlying architecture. However, as of now, no standardized approach exists that facilitates the modeling of big data system architectures (BDSA). In this research, a systematic approach is presented that lays a foundation towards a standard for the modeling of BDSA. Further, a prototype is introduced that automates the creation of those models, reducing the required effort and simultaneously increasing maintainability.
Chapter
In the era of big data, intelligent sports venues have practical significance for providing personalized service to users and building a platform for stadium management. This article proposes a new parallel big data promotion algorithm based on the latest achievements in big data analysis. The proposed algorithm calculates the optimal value by using the observed variables Y, the hidden variable data Z, the joint distribution P(Y, Z | θ), and the conditional distribution P(Z | Y, θ). The experimental results show that the proposed algorithm has higher accuracy in big data analysis and can better serve intelligent sports venues.
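The observed variables, hidden variables, joint distribution, and conditional distribution named above are the ingredients of the standard expectation-maximization (EM) scheme; the iteration below is a conventional reading of that setup, not the paper's exact update rule, which is not given here.

% EM-style iteration assumed from the quantities named in the abstract.
\begin{align}
  Q(\theta \mid \theta^{(t)})
    &= \mathbb{E}_{Z \sim P(Z \mid Y, \theta^{(t)})}\!\left[\log P(Y, Z \mid \theta)\right], \\
  \theta^{(t+1)}
    &= \arg\max_{\theta} \, Q(\theta \mid \theta^{(t)}).
\end{align}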
Article
Full-text available
Machine-learning-based software defect prediction (SDP) methods are receiving great attention from researchers in intelligent software engineering. Most existing SDP methods are performed under a within-project setting. However, there usually is little to no within-project training data to learn an available supervised prediction model for a new SDP task. Therefore, cross-project defect prediction (CPDP), which uses labeled data of source projects to learn a defect predictor for a target project, was proposed as a practical SDP solution. In real CPDP tasks, the class imbalance problem is ubiquitous and has a great impact on the performance of CPDP models. Unlike previous studies that focus on subsampling and individual methods, this study investigated 15 imbalanced learning methods for CPDP tasks, especially for assessing the effectiveness of imbalanced ensemble learning (IEL) methods. We evaluated the 15 methods by extensive experiments on 31 open-source projects derived from five datasets. Through analyzing a total of 37,504 results, we found that, in most cases, the IEL methods that combine under-sampling and bagging approaches are more effective than the other investigated methods.
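As a concrete example of one such under-sampling-plus-bagging configuration, the following sketch uses imbalanced-learn's BalancedBaggingClassifier in a cross-project setup; the project data are random placeholders, and the estimator and parameters are assumptions rather than the study's configuration.

# Sketch of an under-sampling + bagging ensemble for cross-project defect
# prediction (placeholder data; not the study's datasets or settings).
import numpy as np
from imblearn.ensemble import BalancedBaggingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical cross-project setup: train on a labeled source project and
# evaluate on a different target project, both imbalanced (~10% defective).
X_source = rng.normal(size=(500, 20))
y_source = (rng.random(500) < 0.1).astype(int)
X_target = rng.normal(size=(200, 20))
y_target = (rng.random(200) < 0.1).astype(int)

# Each bootstrap sample is under-sampled so the base learners see balanced
# classes; the ensemble then aggregates their votes.
clf = BalancedBaggingClassifier(n_estimators=50, random_state=0)
clf.fit(X_source, y_source)

scores = clf.predict_proba(X_target)[:, 1]
print("AUC on target project:", roc_auc_score(y_target, scores))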
Article
Full-text available
This paper investigates previous literature that focuses on three elements: risk assessment, big data, and cloud. We use a systematic literature mapping method to search for journals and proceedings. The systematic literature mapping process is utilized to obtain a properly screened and focused body of literature. With the help of inclusion and exclusion criteria, the search is further narrowed. Classification helps us group the literature into categories. At the end of the mapping, gaps can be seen; these gaps are where our focus should be in analysing the risk of big data in a cloud computing environment. Thus, a framework for assessing the risk of security, privacy, and trust associated with big data and a cloud computing environment is highly needed.
Article
Full-text available
In the era of big data, intelligent sports venues have practical significance for providing personalized service to users and building a platform for stadium management. This article proposes a new parallel big data promotion algorithm based on the latest achievements in big data analysis. The proposed algorithm calculates the optimal value by using the observed variables Y, the hidden variable data Z, the joint distribution P(Y, Z | θ), and the conditional distribution P(Z | Y, θ). The experimental results show that the proposed algorithm has higher accuracy in big data analysis and can better serve intelligent sports venues.
Article
Full-text available
A software development project is a difficult task. Especially for software designed to comply with regulations that are constantly being introduced or changed, it is almost impossible to make just one change during the development process. Even when it is possible, developers may face a bulk of work to fix the design to meet the specified needs. This iterative work also takes additional time and potentially leads to failing to meet the original schedule and budget. Faced with such inevitable changes, it is essential for developers to carefully consider and use an appropriate method that will help them carry out software project development. This research aims to examine the implementation of a software development method called evolutionary prototyping for developing software that must comply with regulation. It investigates the development of the Land Management Information System (a pseudonym), initiated by the Australian government for use by farmers to meet regulatory demands arising from the Soil and Land Conservation Act. By doing so, it seeks to provide an understanding of the efficacy of evolutionary prototyping in helping developers address frequently changing requirements and iterative work while staying within schedule. The findings also offer useful practical insights for other developers who seek to build similar regulatory compliance software.
Conference Paper
The failure rate of big data application projects is higher than 50% and even exceeds that of large-scale IT projects. Ambiguous objectives and lack of changeability, extensibility, and intercommunication ability are major risk factors of big data applications. Critical quality defects in the development process should be identified and remedied in a timely manner to reduce this high risk. Iterative big data development quality management and control measures can promptly correct project development defects and mitigate the risks. This paper, based on the spiral software development model, proposes a critical quality measurement (CQM) model and develops a risk management and control (RMC) procedure. The big data development process combines the RMC procedure with the CQM model so that development quality defects can be identified in time. Promptly identifying quality defects prevents development risks from spreading and effectively reduces the losses from the high failure risk of big data projects.
Conference Paper
Context: Big data has become the new buzzword in the information and communication technology industry. Researchers and major corporations are looking into big data applications to extract the maximum value from the data available to them. However, developing and maintaining stable and scalable big data applications is still a distant milestone. Objective: To look at existing research on how software engineering concepts, namely the phases of the software development project life cycle (SDPLC), can help build better big data application projects. Method: A literature survey was performed. A manual search covered papers returned by search engines resulting in approximately 2,000 papers being searched and 170 papers selected for review. Results: The search results helped in identifying data rich application projects that have the potential to utilize big data successfully. The review helped in exploring SDPLC phases in the context of big data applications and performing a gap analysis of the phases that have yet to see detailed research efforts but deserve attention.
Article
Full-text available
Agile teams strive to balance short-term feature development with longer-term quality concerns. These evolutionary approaches often hit a "complexity wall" from the cumulative effects of unplanned changes, resulting in unreliable, poorly performing software. So, the agile community is refocusing on approaches to address architectural concerns. Researchers analyzed quality attribute concerns from 15 years of Architecture Trade-Off Analysis Method data, gathered from 31 projects. Modifiability was the dominant concern across all project types; performance, availability, and interoperability also received considerable attention. For IT projects, a relatively new quality, deployability, emerged as a key concern. The study results provide insights for agile teams allocating architecture-related tasks to iterations. For example, teams can use these results to create checklists for release planning or retrospectives to help assess whether to address a given quality to support future needs. This article is part of a special issue on Software Architecture.
Article
Full-text available
In the past decade, researchers have devised many methods to support and codify architecture design. However, what hampers such methods' adoption is that these methods employ abstract concepts such as views, tactics, and patterns, whereas practicing software architects choose technical design primitives from the services offered in commercial frameworks. A holistic and more realistic approach to architecture design addresses this disconnect. This approach uses and systematically links both top-down concepts, such as patterns and tactics, and implementation artifacts, such as frameworks, which are bottom-up concepts. The Web extra at http://youtu.be/kygFOV8TqEw is a video in which Humberto Cervantes from Autonomous Metropolitan University interviews Josué Martínez Buenrrostro, a software architect at Quarksoft in Mexico City, about the design process discussed in the article "A Principled Way to Use Frameworks in Architecture Design".
Conference Paper
Full-text available
This paper outlines our experiences with making architectural tradeoffs between performance, availability, security, and usability, in light of stringent cost and time-to-market constraints, in an industrial web-conferencing system. We highlight the difficulties in anticipating future architectural requirements and tradeoffs and the value of using agility and experiments as a tool for mitigating architectural risks in situations when up front pen-and-paper analysis is simply impossible.
Book
http://www.amazon.com/Designing-Software-Architectures-Practical-Engineering/dp/0134390784 Designing Software Architectures will teach you how to design any software architecture in a systematic, predictable, repeatable, and cost-effective way. This book introduces a practical methodology for architecture design that any professional software engineer can use, provides structured methods supported by reusable chunks of design knowledge, and includes rich case studies that demonstrate how to use the methods. Using realistic examples, you’ll master the powerful new version of the proven Attribute-Driven Design (ADD) 3.0 method and will learn how to use it to address key drivers, including quality attributes, such as modifiability, usability, and availability, along with functional requirements and architectural concerns. Drawing on their extensive experience, Humberto Cervantes and Rick Kazman guide you through crafting practical designs that support the full software life cycle, from requirements to maintenance and evolution. You’ll learn how to successfully integrate design in your organizational context, and how to design systems that will be built with agile methods. Comprehensive coverage includes:
- Understanding what architecture design involves, and where it fits in the full software development life cycle
- Mastering core design concepts, principles, and processes
- Understanding how to perform the steps of the ADD method
- Scaling design and analysis up or down, including design for pre-sale processes or lightweight architecture reviews
- Recognizing and optimizing critical relationships between analysis and design
- Utilizing proven, reusable design primitives and adapting them to specific problems and contexts
- Solving design problems in new domains, such as cloud, mobile, or big data
Article
As environmental sustainability issues have come to the societal and governmental forefront, a new breed of Green Information Systems (IS) — Ultra-Large-Scale (ULS) Green IS — is emerging. A ULS Green IS is an open socio-technical ecosystem that differs from traditional IS in scale, complexity and urgency. Design issues found in ULS systems, System of Systems, Edge-dominant, Metropolis systems and Green IS converge and multiply in the ULS Green IS context. This paper presents a design framework and an architecture analysis method, ECO-ARCH, to address the design of such systems. Through an action research study on architecting for Demand Response systems in the Smart Grid, this article illuminates the system characteristics of ULS Green IS and endorses a fundamental shift in design thinking for its design — from “bounded rationality” for problem solving to “expandable rationality” for design for the unknown and for innovation. ECO-ARCH advances existing software architecture analysis methods by complementing expandable rationality design thinking with proven engineering techniques in a dual-level macroscopic-microscopic analysis. This tackles the unique architecting problems of ULS Green IS where many stakeholders are unknown and design goals are not provided, where no architecture pre-exists, where system behavior is non-deterministic and continuously evolving, and where co-creation with consumers and prosumers is essential to achieving triple bottom line goals.
Article
As the rates of business and technological changes accelerate, misalignments between business and IT architectures are inevitable. Existing alignment models, while important for raising awareness of alignment issues, have provided little in the way of guidance for actually correcting misalignment and thus achieving alignment. This paper introduces the BITAM (Business IT Alignment Method) which is a process that describes a set of twelve steps for managing, detecting and correcting misalignment. The methodology is an integration of two hitherto distinct analysis areas: business analysis and architecture analysis. The BITAM is illustrated via a case study conducted with a Fortune 100 company.
Article
The literature proposes many contingencies regarding the conditions under which the use of prototyping will lead to successful system development. Using an industry survey, this exploratory study empirically investigates the effect of certain contingencies on system success. Overall, the results indicate that five variables, when combined with prototyping, affect system success (as indicated by user satisfaction): innovativeness of the project, impact of the system on the organization, user participation, number of users, and developer experience with prototyping. These results provide some insight into the proper uses of prototyping to improve system success. The results also indicate that several of the current contingencies, if followed, do not ensure high levels of system success.
Article
This article presents a new approach to the management of evolutionary prototyping projects. The prototyping approach to systems development emphasizes learning and facilitates meaningful communication between systems developers and users. These benefits are important for rapid creation of flexible, usable information resources that are well-tuned to present and future business needs. The main unsolved problem in prototyping is the difficulty in controlling such projects. This problem severely limits the range of practical projects in which prototyping can be used. The new approach suggested in this article uses an explicit risk mitigation model and management process that energizes and enhances the value of prototyping in technology delivery. An action research effort validates this risk analysis approach as one that focuses management attention on consequences and priorities inherent in a prototyping situation. This approach enables appropriate risk resolution strategies to be placed in effect before the prototyping process breaks down. It facilitates consensus building through collaborative decision making and is consistent with a high degree of user involvement.
Risk Themes Discovered through Architecture Evaluations
  • L Bass