Article

Architecting ML-enabled systems: Challenges, best practices, and design decisions


... For instance, the literature suggests that requirements engineering (RE) can help to address problems related to managing customer expectations and to better align requirements with data [33,31,1]. Furthermore, designing and developing ML-enabled systems is complex and can be eased by a good software architecture and effective design decisions [26]. Indeed, many challenges of ML-enabled systems can be addressed from a software architecture perspective [19,25,26], and software architecture is also a cornerstone for effective production deployment of such systems [37]. ...
... They categorize these challenges into the following areas: Requirements Engineering; Architecture, Design, and Implementation; Model Deployment; Data Engineering; Quality Assurance; Process; and Organization. Nazir et al. [26] combined a systematic literature review with expert interviews to further explore and categorize challenges and best practices focused on the architecture of ML-enabled systems. ...
Preprint
Full-text available
Context: Machine learning (ML)-enabled systems are being increasingly adopted by companies aiming to enhance their products and operational processes. Objective: This paper aims to deliver a comprehensive overview of the current status quo of engineering ML-enabled systems and lay the foundation to steer practically relevant and problem-driven academic research. Method: We conducted an international survey to collect insights from practitioners on the current practices and problems in engineering ML-enabled systems. We received 188 complete responses from 25 countries. We conducted quantitative statistical analyses on contemporary practices using bootstrapping with confidence intervals and qualitative analyses on the reported problems using open and axial coding procedures. Results: Our survey results reinforce and extend existing empirical evidence on engineering ML-enabled systems, providing additional insights into typical ML-enabled systems project contexts, the perceived relevance and complexity of ML life cycle phases, and current practices related to problem understanding, model deployment, and model monitoring. Furthermore, the qualitative analysis provides a detailed map of the problems practitioners face within each ML life cycle phase and the problems causing overall project failure. Conclusions: The results contribute to a better understanding of the status quo and problems in practical environments. We advocate for the further adaptation and dissemination of software engineering practices to enhance the engineering of ML-enabled systems.
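The survey's quantitative analysis relies on bootstrapping with confidence intervals. As a minimal, illustrative sketch of that technique (not the authors' analysis code), the following Python snippet computes a percentile bootstrap confidence interval for a practice-adoption proportion; the synthetic responses and the 95% level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical survey data: 1 = respondent reports using the practice, 0 = does not.
responses = rng.integers(0, 2, size=188)  # 188 complete responses, as in the survey

def bootstrap_ci(sample, stat=np.mean, n_boot=10_000, alpha=0.05):
    """Percentile bootstrap confidence interval for a statistic of a 1-D sample."""
    boot_stats = np.array([
        stat(rng.choice(sample, size=len(sample), replace=True))
        for _ in range(n_boot)
    ])
    lower, upper = np.quantile(boot_stats, [alpha / 2, 1 - alpha / 2])
    return stat(sample), (lower, upper)

point, (lo, hi) = bootstrap_ci(responses)
print(f"adoption rate = {point:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```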
... Since the system architecture and the requirements / system specifications evolve together in what is known as the twin-peaks model [19], we propose to overcome the specification challenges by decomposing the design task early on into different architecture views that describe the relevant concerns of the system and by establishing correspondences between the architecture views. Several recent studies support the recommendation to relate challenges in the development of ML-enabled systems to architecture decisions: in a combined literature survey and interview study, Nazir et al. provide an overview of the role of architectural design decisions for ML-enabled systems [61]. There, data is highlighted as one major design challenge. ...
... For example, pre-processing and preparation of data require architecture design decisions regarding the data pipeline and data management for ML-enabled system development [17]. The authors also state that one interviewee (I5) highlighted that "Data accuracy and completion of data are crucial when training the models of ML systems" [61], which corresponds to the findings in this study. Bhat et al. highlight the necessity of relating RE challenges to architecture design decisions for large software engineering projects such as ML system development [11]. ...
Article
Full-text available
The development and operation of critical software that contains machine learning (ML) models require diligence and established processes. In particular, the training data used during the development of ML models have a major influence on the later behaviour of the system. Runtime monitors are used to provide guarantees for that behaviour; for example, they check that the data seen at runtime is compatible with the data used to train the model. As a first step towards identifying challenges when specifying requirements for training data and runtime monitors, we conducted and thematically analysed ten interviews with practitioners who develop ML models for critical applications in the automotive industry. We identified 17 themes describing the challenges and classified them into six challenge groups. In a second step, we found interconnections between the challenge themes through an additional semantic analysis of the interviews. We explored how the identified challenge themes and their interconnections can be mapped to different architecture views. This step involved identifying relevant architecture views, such as data, context, hardware, AI model, and functional safety views, that can address the identified challenges. The article presents a list of the identified underlying challenges, the identified relations between the challenges, and a mapping to architecture views. The intention of this work is to highlight once more that requirements specifications and system architecture are interlinked, even for AI-specific specification challenges such as specifying requirements for training data and runtime monitoring.
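The abstract describes runtime monitors that check whether the data seen at runtime is compatible with the training data. A minimal sketch of that idea, under the assumption of simple per-feature range and deviation checks (not the monitors used by the interviewed practitioners), could look as follows; the threshold and the synthetic data are illustrative.

```python
import numpy as np

class TrainingDataMonitor:
    """Flags runtime inputs that fall outside the envelope of the training data."""

    def __init__(self, X_train, k=3.0):
        # Record per-feature statistics of the training data.
        self.mean = X_train.mean(axis=0)
        self.std = X_train.std(axis=0) + 1e-12
        self.min = X_train.min(axis=0)
        self.max = X_train.max(axis=0)
        self.k = k  # allowed deviation in standard deviations (assumed threshold)

    def check(self, x):
        """Return a list of human-readable violations for one runtime sample."""
        violations = []
        z = np.abs((x - self.mean) / self.std)
        for i, zi in enumerate(z):
            if x[i] < self.min[i] or x[i] > self.max[i]:
                violations.append(f"feature {i}: value {x[i]:.3f} outside training range")
            elif zi > self.k:
                violations.append(f"feature {i}: {zi:.1f} sigma from training mean")
        return violations

# Hypothetical usage: training data and one runtime sample.
X_train = np.random.default_rng(0).normal(size=(1000, 4))
monitor = TrainingDataMonitor(X_train)
print(monitor.check(np.array([0.1, 0.2, 7.5, -0.3])))  # feature 2 should be flagged
```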
... This is the case especially for software engineers and data scientists, who often share responsibilities for handling data and deploying models [8]. For example, designing an appropriate architecture for these systems is not trivial, as the team must evaluate factors such as model performance degradation, uncertainty management, and proper integration between the model and other system components [9]. ...
Conference Paper
Full-text available
In recent years, Machine Learning (ML) components have been increasingly integrated into the core systems of organizations. Engineering such systems presents various challenges from both a theoretical and practical perspective. One of the key challenges is the effective interaction between actors with different backgrounds who need to work closely together, such as software engineers and data scientists. This paper presents an exploratory case study that aims to understand the current interaction and collaboration dynamics between these two roles in ML projects. We conducted semi-structured interviews with four practitioners with experience in software engineering and data science of a large ML-enabled system project and analyzed the data using reflexive thematic analysis. Our findings reveal several challenges that can hinder collaboration between software engineers and data scientists, including differences in technical expertise, unclear definitions of each role's duties, and the lack of documents that support the specification of the ML-enabled system. We also indicate potential solutions to address these challenges, such as fostering a collaborative culture, encouraging team communication, and producing concise system documentation. This study contributes to understanding the complex dynamics between software engineers and data scientists in ML projects and provides insights for improving collaboration and communication in this context. We encourage future studies investigating this interaction in other projects.
Article
Full-text available
Machine Learning (ML) experiment management tools support ML practitioners and software engineers when building intelligent software systems. By managing large numbers of ML experiments comprising many different ML assets, they not only facilitate engineering ML models and ML-enabled systems, but also managing their evolution—for instance, tracing system behavior to concrete experiments when the model performance drifts. However, while ML experiment management tools have become increasingly popular, little is known about their effectiveness in practice, as well as their actual benefits and challenges. We present a mixed-methods empirical study of experiment management tools and the support they provide to users. First, our survey of 81 ML practitioners sought to determine the benefits and challenges of ML experiment management and of the existing tool landscape. Second, a controlled experiment with 15 student developers investigated the effectiveness of ML experiment management tools. We learned that 70% of our survey respondents perform ML experiments using specialized tools, while out of those who do not use such tools, 52% are unaware of experiment management tools or of their benefits. The controlled experiment showed that experiment management tools offer valuable support to users to systematically track and retrieve ML assets. Using ML experiment management tools reduced error rates and increased completion rates. By presenting a user’s perspective on experiment management tools, and the first controlled experiment in this area, we hope that our results foster the adoption of these tools in practice, as well as they direct tool builders and researchers to improve the tool landscape overall.
Chapter
In recent years, Machine Learning (ML) components have been increasingly integrated into the core systems of organizations. Engineering such systems presents various challenges from both a theoretical and practical perspective. One of the key challenges is the effective interaction between actors with different backgrounds who need to work closely together, such as software engineers and data scientists. This paper presents an exploratory case study that aims to understand the current interaction and collaboration dynamics between these two roles in ML projects. We conducted semi-structured interviews with four practitioners with experience in software engineering and data science of a large ML-enabled system project and analyzed the data using reflexive thematic analysis. Our findings reveal several challenges that can hinder collaboration between software engineers and data scientists, including differences in technical expertise, unclear definitions of each role’s duties, and the lack of documents that support the specification of the ML-enabled system. We also indicate potential solutions to address these challenges, such as fostering a collaborative culture, encouraging team communication, and producing concise system documentation. This study contributes to understanding the complex dynamics between software engineers and data scientists in ML projects and provides insights for improving collaboration and communication in this context. We encourage future studies investigating this interaction in other projects.
Article
Full-text available
Nowadays, the volume of heterogeneous multimedia evidence presented for digital forensic analysis has significantly increased, thus requiring the application of big data technologies, cloud-based forensics services, and Machine Learning (ML) techniques. In the digital forensics domain, ML algorithms have been applied to cybercrime investigations such as child abuse investigations, malware classification, and image forensics. This paper addresses these issues and deals with the forensic analysis of digital images and videos. In particular, this work proposes a multimedia classification tool with a parallel software architecture for fast inspection, which is easy to use (intended for officers during a search), requires limited hardware resources, and is built on open-source software to limit costs. Moreover, this tool must be able to quickly inspect multiple devices at a time. When positives are found on a device, the device is seized for deeper analysis later in the lab; it is not seized otherwise, reducing the inconvenience for the suspect as well as the time required for the next analysis phase. As a case study, we focus on the identification of child pornography images. Experimental results show that the proposed architecture guarantees high recall, fast processing, and high performance in real scenarios.
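To illustrate the kind of parallel, on-site triage workflow the abstract describes, here is a hedged sketch that fans image classification out over worker processes and flags a device for seizure when positives are found. The classifier stub, file-name heuristic, and mount point are invented placeholders, not the tool's actual model or interface.

```python
from multiprocessing import Pool
from pathlib import Path

def classify_image(path):
    """Placeholder classifier: returns True if the image is flagged as positive.
    A real tool would run an ML model here; this stub only illustrates the workflow."""
    return "suspect" in path.stem.lower()

def inspect_device(mount_point, workers=4):
    """Scan a mounted device in parallel; a non-empty result suggests seizing it."""
    root = Path(mount_point)
    if not root.exists():
        return []
    images = [p for p in root.rglob("*") if p.suffix.lower() in {".jpg", ".png"}]
    with Pool(processes=workers) as pool:
        flags = pool.map(classify_image, images)
    return [p for p, flagged in zip(images, flags) if flagged]

if __name__ == "__main__":
    print(inspect_device("/mnt/evidence"))  # hypothetical mount point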
Conference Paper
Full-text available
The increasing usage of machine learning (ML), coupled with the software architectural challenges of the modern era, has resulted in two broad research areas: i) software architecture for ML-based systems, which focuses on developing architectural techniques for better building ML-based software systems, and ii) ML for software architectures, which focuses on developing ML techniques to better architect traditional software systems. In this work, we focus on the former side of the spectrum, with the goal of highlighting the architecting practices that currently exist for ML-based software systems. We identify four key areas of software architecture that need the attention of both ML and software practitioners to better define a standard set of practices for architecting ML-based software systems. We ground these areas in our experience architecting an ML-based software system for solving queuing challenges in one of the largest museums in Italy.
Article
Full-text available
Microservice architecture (MSA) is a software architectural style that divides a single large application and its services into dozens of supporting microservices. With the increasing popularity of MSA, its security issues have received a lot of attention. In this paper, we propose an algorithm for mining causality and identifying the root cause of anomalies. Our algorithm consists of two parts: invocation-chain anomaly analysis based on robust principal component analysis (RPCA) and a single-indicator anomaly detection algorithm. The single-indicator anomaly detection algorithm combines the Isolation Forest (IF) algorithm, the One-Class Support Vector Machine (SVM) algorithm, the Local Outlier Factor (LOF) algorithm, and the 3σ principle. For general anomalies and network latency anomalies occurring during MSA operation, we formulate different latency-anomaly detection strategies. We selected one batch of sample data and three batches of test data from the 2020 International AIOps Challenge to debug our algorithm. According to the scoring criteria of the competition organizers, our algorithm achieves an average score of 0.8304 (the maximum score is 1) across the four batches of data. The proposed algorithm achieves higher accuracy in anomaly detection than some traditional machine learning algorithms.
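The single-indicator detector described above combines Isolation Forest, One-Class SVM, LOF, and the 3σ principle. The sketch below is not the authors' algorithm; it only illustrates majority voting across these four detectors using scikit-learn on a synthetic latency series, with an assumed voting threshold.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(1)
# Synthetic per-service latency indicator with a few injected anomalies.
latency = rng.normal(loc=100, scale=5, size=500)
latency[[50, 300, 420]] = [180, 220, 5]
X = latency.reshape(-1, 1)

# Each detector votes: -1 = anomaly, 1 = normal (scikit-learn convention).
votes = np.zeros(len(X), dtype=int)
votes += (IsolationForest(random_state=0).fit_predict(X) == -1)
votes += (OneClassSVM(nu=0.02).fit_predict(X) == -1)
votes += (LocalOutlierFactor(n_neighbors=20).fit_predict(X) == -1)

# 3-sigma rule as a fourth vote.
mu, sigma = latency.mean(), latency.std()
votes += (np.abs(latency - mu) > 3 * sigma)

# Flag points that a majority of the four detectors consider anomalous.
anomalies = np.where(votes >= 3)[0]
print("anomalous indices:", anomalies)
```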
Conference Paper
Full-text available
Background. The increasing reliance on applications with machine learning (ML) components calls for mature engineering techniques that ensure these are built in a robust and future-proof manner. Aim. We aim to empirically determine the state of the art in how teams develop, deploy and maintain software with ML components. Method. We mined both academic and grey literature and identified 29 engineering best practices for ML applications. We conducted a survey among 313 practitioners to determine the degree of adoption for these practices and to validate their perceived effects. Using the survey responses, we quantified practice adoption, differentiated along demographic characteristics, such as geography or team size. We also tested correlations and investigated linear and non-linear relationships between practices and their perceived effect using various statistical models. Results. Our findings indicate, for example, that larger teams tend to adopt more practices, and that traditional software engineering practices tend to have lower adoption than ML specific practices. Also, the statistical models can accurately predict perceived effects such as agility, software quality and traceability, from the degree of adoption for specific sets of practices. Combining practice adoption rates with practice importance, as revealed by statistical models, we identify practices that are important but have low adoption, as well as practices that are widely adopted but are less important for the effects we studied. Conclusion. Overall, our survey and the analysis of responses received provide a quantitative basis for assessment and step-wise improvement of practice adoption by ML teams.
Article
Full-text available
In the last decade, deep learning techniques reached human-level performance in several specific tasks, such as image recognition, object detection, and adaptive control. For this reason, deep learning is being seriously considered by industry to address difficult perceptual and control problems in several safety-critical applications (e.g., autonomous driving, robotics, and space missions). However, at the moment, deep learning software poses a number of issues related to safety, security, and predictability, which prevent its usage in safety-critical systems. This work proposes a visionary software architecture that allows embracing deep learning while guaranteeing safety, security, and predictability by design. To achieve this goal, the architecture integrates multiple and diverse technologies, such as hypervisors, run-time monitoring, redundancy with diversity, predictive fault detection, fault recovery, and predictable resource management. Open challenges that stem from the proposed architecture are finally discussed.
Article
Full-text available
Adding an ability for a system to learn inherently adds non-determinism into the system. Given the rising popularity of incorporating machine learning into systems, we wondered how the addition alters software development practices. We performed a mixture of qualitative and quantitative studies with 14 interviewees and 342 survey respondents from 26 countries across four continents to elicit significant differences between the development of machine learning systems and the development of non-machine-learning systems. Our study uncovers significant differences in various aspects of software engineering (e.g., requirements, design, testing, and process) and work features (e.g., skill variety, problem solving and task identity). Based on our findings, we highlight future research directions and provide recommendations for practitioners.
Conference Paper
Full-text available
Recent introduction of data mining methods has led to a paradigm shift in the way we can analyze space data. This paper demonstrates that Artificial Intelligence (AI), and especially the field of Knowledge Representation and Reasoning (KRR), could also be successfully employed at the start of the space mission life cycle via an Expert System (ES) used as a Design Engineering Assistant (DEA). An ES is an AI-based agent used to solve complex problems in particular fields. There are many examples of ES being successfully implemented in the aeronautical, agricultural, legal or medical fields. Applied to space mission design, in particular in the context of concurrent engineering sessions, an ES could serve as a knowledge engine and support the generation of the initial design inputs, provide easy and quick access to previous design decisions, or push to explore new design options. Integrated into the user design environment, the DEA could become an active assistant following the design iterations and flagging model inconsistencies. Today, for space mission design, experts apply methods of concurrent engineering and Model-Based System Engineering, relying both on their implicit knowledge (i.e., past experiences, network) and on available explicit knowledge (i.e., past reports, publications, data sheets). The former knowledge type still represents the most significant amount of data, mostly unstructured, non-digital or digital data of various legacy formats. Searching for information through this data is highly time-consuming. A solution is to convert this data into structured data to be stored in a Knowledge Graph (KG) that can be traversed by an inference engine to provide reasoning and deductions on its nodes. Knowledge is extracted from the KG via a User Interface (UI) and a query engine providing reliable and relevant knowledge summaries to the human experts. The DEA project aims to enhance the productivity of experts by providing them with new insights into a large amount of data accumulated in the field of space mission design. Natural Language Processing (NLP), Machine Learning (ML), Knowledge Management (KM) and Human-Machine Interaction (HMI) methods are leveraged to develop the DEA. Building the knowledge base manually is subjective, time-consuming, laborious and error bound. This is why the knowledge base generation and population rely on Ontology Learning (OL) methods. This OL approach follows a modified model of the Ontology Layer Cake. This paper describes the approach and the parameters of the qualitative trade-off used to select the software to be adopted in the architecture of the ES. The study also displays the first results of the multi-word extraction and highlights the importance of Word Sense Disambiguation for the identification of synonyms in context. This paper includes the detailed software architecture of both front and back-ends, as well as the tool requirements. Both architectures and requirements were refined after a set of interviews with experts from the European Space Agency. The paper finally presents the preliminary strategy to quantify and mitigate uncertainties within the ES.
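As a toy illustration of the knowledge-graph traversal such a Design Engineering Assistant relies on (not the project's actual ontology or query engine), the following sketch stores invented design triples and walks the graph to collect documents relevant to a mission entity.

```python
# A toy knowledge graph as a set of (subject, predicate, object) triples.
# The entities and relations are invented for illustration only.
triples = {
    ("MissionX", "hasSubsystem", "PowerSubsystem"),
    ("PowerSubsystem", "usesComponent", "SolarArrayA"),
    ("SolarArrayA", "documentedIn", "DesignReport_2017.pdf"),
    ("MissionX", "hasSubsystem", "ThermalSubsystem"),
}

def neighbors(node):
    """All (predicate, object) pairs reachable from a node in one hop."""
    return [(p, o) for s, p, o in triples if s == node]

def reachable_documents(start):
    """Traverse the graph from a design entity and collect referenced documents."""
    seen, stack, docs = set(), [start], []
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for predicate, obj in neighbors(node):
            if predicate == "documentedIn":
                docs.append(obj)
            else:
                stack.append(obj)
    return docs

print(reachable_documents("MissionX"))  # ['DesignReport_2017.pdf']
```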
Article
Full-text available
Future cyber–physical systems will host a large number of coexisting distributed applications on hardware platforms with thousands to millions of networked components communicating over open networks. These applications and networks are subject to continuous change. The current separation of design process and operation in the field will be superseded by a life-long design process of adaptation, infield integration, and update. Continuous change and evolution, application interference, environment dynamics and uncertainty lead to complex effects which must be controlled to serve a growing set of platform and application needs. Self-adaptation based on self-awareness and self-configuration has been proposed as a basis for such a continuous in-field process. Research is needed to develop automated in-field design methods and tools with the required safety, availability, and security guarantees. The paper shows two complementary use cases of self-awareness in architectures, methods, and tools for cyber–physical systems. The first use case focuses on safety and availability guarantees in self-aware vehicle platforms. It combines contracting mechanisms, tool based self-analysis and self-configuration. A software architecture and a runtime environment executing these tools and mechanisms autonomously are presented including aspects of self-protection against failures and security threats. The second use case addresses variability and long term evolution in networked MPSoC integrating hardware and software mechanisms of surveillance, monitoring, and continuous adaptation. The approach resembles the logistics and operation principles of manufacturing plants which gave rise to the metaphoric term of an Information Processing Factory that relies on incremental changes and feedback control. Both use cases are investigated by larger research groups. Despite their different approaches, both use cases face similar design and design automation challenges which will be summarized in the end. We will argue that seemingly unrelated research challenges, such as in machine learning and security, could also profit from the methods and superior modeling capabilities of self-aware systems.
Article
Full-text available
Machine learning offers a fantastically powerful toolkit for building useful complex prediction systems quickly. This paper argues it is dangerous to think of these quick wins as coming for free. Using the software engineering framework of technical debt, we find it is common to incur massive ongoing maintenance costs in real-world ML systems. We explore several ML-specific risk factors to account for in system design. These include boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, configuration issues, changes in the external world, and a variety of system-level anti-patterns.
Conference Paper
Full-text available
Pattern recognition and machine learning research work often contains experimental results on real-world data, which corroborates hypotheses and provides a canvas for the development and comparison of new ideas. Results, in this context, are typically summarized as a set of tables and figures, allowing the comparison of various methods and highlighting the advantages of the proposed ideas. Unfortunately, result reproducibility is often an overlooked feature of original research publications, competitions, or benchmark evaluations. The main reason for this gap is the complexity of developing the software associated with these reports. Software frameworks are difficult to install, maintain, and distribute, while scientific experiments often consist of many steps and parameters that are difficult to report. The rising complexity of research challenges makes it even more difficult to reproduce experiments and results. In this paper, we emphasize that a reproducible research work should be repeatable, shareable, extensible, and stable, and discuss important lessons we learned in creating, distributing, and maintaining software and data for reproducible research in pattern recognition and machine learning. We focus on a specific use-case of face recognition and describe in detail how we can make the recognition experiments reproducible in practice.
Conference Paper
Full-text available
Creating and maintaining a platform for reliably producing and deploying machine learning models requires careful orchestration of many components---a learner for generating models based on training data, modules for analyzing and validating both data as well as models, and finally infrastructure for serving models in production. This becomes particularly challenging when data changes over time and fresh models need to be produced continuously. Unfortunately, such orchestration is often done ad hoc using glue code and custom scripts developed by individual teams for specific use cases, leading to duplicated effort and fragile systems with high technical debt. We present TensorFlow Extended (TFX), a TensorFlow-based general-purpose machine learning platform implemented at Google. By integrating the aforementioned components into one platform, we were able to standardize the components, simplify the platform configuration, and reduce the time to production from the order of months to weeks, while providing platform stability that minimizes disruptions. We present the case study of one deployment of TFX in the Google Play app store, where the machine learning models are refreshed continuously as new data arrive. Deploying TFX led to reduced custom code, faster experiment cycles, and a 2% increase in app installs resulting from improved data and model analysis.
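A central idea in TFX-style platforms is validating a freshly trained model before pushing it to serving. The snippet below is deliberately library-agnostic rather than the TFX API: a hedged sketch of such a validate-then-push gate, with invented metric names, thresholds, and a list standing in for a model registry.

```python
def validate_model(candidate_metrics, production_metrics, min_gain=0.0):
    """Promote a freshly trained model only if it does not regress the serving model.
    Metric names and the promotion rule are illustrative assumptions."""
    return candidate_metrics["auc"] >= production_metrics["auc"] + min_gain

def push_if_better(candidate, production, registry):
    if validate_model(candidate["metrics"], production["metrics"]):
        registry.append(candidate)          # stand-in for pushing to a model registry
        return "pushed"
    return "rejected"

registry = []
production = {"name": "model_v7", "metrics": {"auc": 0.91}}
candidate = {"name": "model_v8", "metrics": {"auc": 0.93}}
print(push_if_better(candidate, production, registry))  # pushed
```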
Article
Full-text available
The rapid development of computing resources has enhanced the performance of computers and reduced their costs. This availability of low-cost, powerful computers, combined with the popularity of the Internet and high-speed networks, has led computing environments to evolve from distributed to grid environments. A grid is a kind of distributed system that supports the sharing and coordinated use of geographically dispersed, multi-owner resources, independently of their physical form and location, in dynamic virtual organizations that share the common goal of solving large-scale applications. Thus any type of failure can happen at any point in time, and jobs running in a grid environment may fail. Fault tolerance is therefore an important and challenging concern in grid computing, as the reliability of individual grid resources cannot be guaranteed. To make computational grids more effective, a consistent fault-tolerant system is required. To meet user expectations in terms of performance and efficiency, the grid system needs an SOA Fault Management Framework for the distribution of tasks with fault tolerance. A Fault Management Framework attempts to improve the response time of users' submitted applications by ensuring maximal utilization of available resources. The main aim is to prevent, if possible, the situation where some processors are overloaded with a set of tasks while others are lightly loaded or even idle.
Conference Paper
Full-text available
Collective Intelligence Systems (CIS), such as social networking services, wikis, and media sharing platforms, access and harness the collective knowledge of connected people by providing a web-based environment to share, distribute, and retrieve topic-specific information in an efficient way. In order to design well-tailored CIS, software architects need a complete understanding about (1) architectural principles that all kinds of CIS have in common, and (2) system variants in the field. Thus to provide consolidated systematic knowledge of architectural commonalities and variations in the CIS domain, we present in this work five major pattern variants of CIS. We investigated a number of CIS in the field with focus on a detailed survey of existing variants among key architecture-significant principles based on previously identified basic concepts, principles, and characteristics of software architectures that all CIS have in common. The variants are identified along two dimensions with respect to the relationship between the key elements of artifacts and actor records across and within two layers of the system.
Article
Full-text available
TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with particularly strong support for training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model in contrast to existing systems, and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.
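As a small illustration of the dataflow-graph model described above (using the current public TensorFlow API rather than the code from the paper), tf.function traces Python code into a graph that TensorFlow can then optimize and place across devices; the layer shapes here are arbitrary.

```python
import tensorflow as tf

@tf.function  # traces the Python function into a TensorFlow dataflow graph
def dense_layer(x, w, b):
    # Graph nodes: matmul, add, relu; TensorFlow decides device placement.
    return tf.nn.relu(tf.matmul(x, w) + b)

x = tf.random.normal([4, 3])
w = tf.random.normal([3, 2])
b = tf.zeros([2])

y = dense_layer(x, w, b)                               # executes the traced graph
graph = dense_layer.get_concrete_function(x, w, b).graph
print(y.shape, "ops in traced graph:", len(graph.get_operations()))
```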
Article
Full-text available
Design patterns document a field's systematic knowledge derived from experiences. Despite the vast body of work in the field of multi-agent systems (MAS), design patterns for MAS are not popular among software practitioners. As MAS have features that are widely considered as key to engineering complex distributed applications, it is important to provide a clear overview of existing patterns to make this knowledge accessible. To that end, we performed a systematic literature review (SLR) covering the main publication venues of the field since 1998, resulting in 206 patterns. The study shows that (1) there is a lack of a standard template for documenting design patterns for MAS, which hampers the use of patterns by practitioners, (2) associations between patterns are poorly described, which results in a lack of overview of the pattern space, (3) patterns for MAS have been used for a variety of application domains, which underpins their high potential for practitioners, and (4) classifications of design patterns for MAS are bound to specific pattern catalogs; a more holistic view on the pattern space is missing. From our study, we outline a number of guidelines that are important for future work on design patterns for MAS and their adoption in practice.
Conference Paper
Full-text available
Our belief is that industrial problems require holistic solutions if AI-based control is to become an everyday reality. We have designed and implemented a software architecture which supports heterogeneous AI subsystems and which is now controlling a working plant. The support architecture is associated with (1) a design methodology and (2) an underlying conceptual model of an integrated system. In the latter, a reactive system is seen as consisting of layers composed of interacting elements called 'basic control tasks'. Designing a control system to this model requires the analyst to consider the plant, its environment and its components as a whole.
Chapter
Full-text available
Collective Intelligence Systems (CIS), such as wikis, social networks, and content-sharing platforms, are an integral part of today’s collective knowledge creation and sharing processes. CIS are complex adaptive systems, which realize environment-mediated coordination, in particular with stigmergic mechanisms. The behavior of CIS is emergent, as high-level, system-wide behavior is influenced by low-level rules. These rules are encapsulated by the CIS infrastructure that comprises in its center an actor-created artifact network that stores the shared content. In this chapter, we provide an introduction to the CIS domain, CIS architectural principles and processes. Further, we reflect on the role of CIS as multi-agent system (MAS) environments and conclude with an outlook on research challenges for CIS architectures.
Conference Paper
Full-text available
Context: The study selection process is critical to improve the reliability of secondary studies. Goal: To evaluate the selection strategies commonly employed in secondary studies in software engineering. Method: Building on these strategies, a study selection process was formulated and evaluated in a systematic review. Results: The selection process used a more inclusive strategy than the one typically used in secondary studies, which led to additional relevant articles. Conclusions: The results indicate that a good-enough sample could be obtained by following a less inclusive but more efficient strategy, if the articles identified as relevant for the study are a representative sample of the population and there is homogeneity in the results and quality of the articles.
Chapter
The term grounded theory refers to a set of methods for conducting the research process and the product of this process, the resulting theoretical analysis of an empirical problem. The name grounded theory mirrors its fundamental premise that researchers can and should develop theory from rigorous analyses of empirical data. As a specific methodological approach, grounded theory refers to a set of systematic guidelines for data gathering, coding, synthesizing, categorizing, and integrating concepts to generate middle‐range theory. Grounded theory methods are distinctive in that data collection and analysis proceed simultaneously and each informs the other. From the beginning of the research process, the researcher analyzes the data and identifies analytic leads and tentative categories to develop through further data collection. A grounded theory of a studied topic starts with concrete data and ends with rendering them in an explanatory theory.
Chapter
Federated learning is designed for multiple mobile devices to collaboratively train an artificial intelligence model while preserving data privacy. Instead of collecting the raw training data from mobile devices to the central server, federated learning coordinates a group of devices to train a shared model in a distributed manner with their local data. However, prior to effectively deploying federated learning on resource-constrained mobile devices in large scale, different factors including the convergence rate, energy efficiency and model accuracy should be well studied. Thus, a flexible simulation framework that can be used to investigate a wide range of problems related to federated learning is urgently required. In this paper, we propose FLSim, a framework for efficiently building simulators for federated learning. Unlike ad hoc simulators, FLSim is envisioned as an open repository of building blocks for creating simulators. To this end, FLSim consists of a set of software components organized in a well-structured software architecture that provides the foundation for maximizing flexibility and extensibility. With FLSim, creating a simulator generally involves only putting the selected components together, thus allowing users to focus on the problems being studied. We describe the design of the framework in detail and use a few use cases to demonstrate the ease with which various simulators can be constructed with FLSim.
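Independently of FLSim's own building blocks (whose API is not shown here), the core loop such simulators study is federated averaging. The following self-contained NumPy sketch simulates a few devices training a linear model locally and a server averaging their weights; the task, learning rate, number of rounds, and device split are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression task split across simulated devices (non-IID split omitted).
true_w = np.array([2.0, -1.0])
devices = []
for _ in range(5):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    devices.append((X, y))

def local_update(w, X, y, lr=0.05, epochs=5):
    """Each device runs a few steps of gradient descent on its local data."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

w_global = np.zeros(2)
for round_ in range(20):                       # communication rounds
    local_weights = [local_update(w_global, X, y) for X, y in devices]
    w_global = np.mean(local_weights, axis=0)  # federated averaging on the server
print("estimated weights:", w_global)          # should approach [2.0, -1.0]
```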
Chapter
The adoption of machine learning (ML) components in software systems raises new engineering challenges. In particular, the inherent uncertainty regarding functional suitability and the operation environment makes architecture evaluation and trade-off analysis difficult. We propose a software architecture evaluation method called Modeling Uncertainty During Design (MUDD) that explicitly models the uncertainty associated with ML components and evaluates how it propagates through a system. The method supports reasoning over how architectural patterns can mitigate uncertainty and enables comparison of different architectures focused on the interplay between ML and classical software components. While our approach is domain-agnostic and suitable for any system where uncertainty plays a central role, we demonstrate it using as an example a perception system for autonomous driving.
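While MUDD itself is not reproduced here, the underlying idea of propagating an ML component's uncertainty to a downstream decision can be sketched with simple Monte Carlo sampling; the perception noise model and braking threshold below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

def perception_component(n_samples):
    """Hypothetical ML component: estimated distance to an obstacle (metres),
    modelled here as a noisy measurement instead of a real network."""
    true_distance = 12.0
    return true_distance + rng.normal(scale=1.5, size=n_samples)

def planner(distance, braking_threshold=10.0):
    """Classical downstream component: brake if the obstacle appears too close."""
    return distance < braking_threshold

# Propagate the component's uncertainty through the system by Monte Carlo sampling.
samples = perception_component(100_000)
p_brake = planner(samples).mean()
print(f"probability of (possibly spurious) braking: {p_brake:.3f}")
```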
Article
Advances in machine learning (ML) algorithms and increasing availability of computational power have resulted in huge investments in systems that aspire to exploit artificial intelligence (AI), in particular ML. AI-enabled systems, software-reliant systems that include data and components that implement algorithms mimicking learning and problem solving, have inherently different characteristics than software systems alone. However, the development and sustainment of such systems also have many parallels with building, deploying, and sustaining software systems. A common observation is that although software systems are deterministic and you can build and test to a specification, AI-enabled systems, in particular those that include ML components, are generally probabilistic. Systems with ML components can have a high margin of error due to the uncertainty that often follows predictive algorithms. The margin of error can be related to the inability to predict the result in advance or to the fact that the same result cannot be reproduced. This characteristic makes AI-enabled systems hard to test and verify. Consequently, it is easy to assume that what we know about designing and reasoning about software systems does not immediately apply in AI engineering. AI-enabled systems are software systems. The sneaky part about engineering AI systems is they are "just like" conventional software systems we can design and reason about until they're not.
Chapter
The work considers improving the quality of constructing large-scale diagnostic models in technical diagnostics systems by developing a software architecture for high-performance computing in the form of a web service using cloud-based machine learning technologies. The obtained results are brought to practical realization as tools of an automated technical diagnostics system for cutting tools with high-dimensional diagnostic parameters. A method has been developed for building information models of cutting tool states based on indirect measurements, using test pulse effects on the cutting system in the form of impact loads and recording the system responses, from which information models are built in the form of multidimensional transition functions. The methods of forming test pulse loads of the cutting system by successive insertion of the cutting tool into the workpiece with different cutting depths, with variable feed, and with variable cutting duration are considered. The computational experiment demonstrates the advantages of information models in the form of multidimensional transition functions for modeling nonlinear dynamic systems in problems of diagnosing the states of cutting tools. It has been established that multiclass cutting tool state recognition can serve as an effective technology for automated technical diagnostics systems.
Article
Convolutional neural networks (CNNs) represent deep learning architectures that are currently used in a wide range of applications, including computer vision, speech recognition, time series analysis in finance, and many others. At the same time, CNNs are very demanding in terms of the hardware and time cost of a computing system, which considerably restricts their practical use, e.g., in embedded systems, real-time systems, and mobile volatile devices. The goal of this paper is to reduce the resources required to build and operate CNNs. To achieve this goal, a CNN architecture based on Residue Number System (RNS) and the new Chinese Remainder Theorem with fractions is proposed. The new architecture gives an efficient solution to the main problem of RNSs associated with restoring the number from its residues, which determines the main contribution to the CNN structure. In accordance with the results of hardware simulation on Kintex7 xc7k70tfbg484-2 FPGA, the use of RNS in the convolutional layer of a neural network reduces hardware cost by 32.6% compared to the traditional approach based on the binary number system. In addition, the use of the proposed hardware-software architecture reduces the average image recognition time by 37.06% compared to the software implementation.
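The paper's contribution is a new Chinese Remainder Theorem with fractions, which is not reproduced here. The sketch below only illustrates the standard residue-number-system idea it builds on: encode an integer as residues, operate on the residue channels independently, and reconstruct with the classical CRT. The moduli are chosen arbitrarily, and pow(x, -1, m) requires Python 3.8+.

```python
from math import prod

MODULI = (7, 11, 13, 17)          # pairwise-coprime moduli (illustrative choice)
M = prod(MODULI)                  # dynamic range of the RNS

def to_rns(x):
    """Encode an integer as its residues modulo each modulus."""
    return tuple(x % m for m in MODULI)

def rns_mul(a, b):
    """Multiplication is carried out independently (in parallel) per residue channel."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))

def from_rns(residues):
    """Classical Chinese Remainder Theorem reconstruction."""
    x = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)   # pow(Mi, -1, m): modular inverse
        x %= M
    return x

a, b = 123, 77
print(from_rns(rns_mul(to_rns(a), to_rns(b))), "==", a * b)   # 9471 == 9471
```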
Article
Deep Learning (DL) has had an immense success in the recent past, leading to state-of-the-art results in various domains, such as image recognition and natural language processing. One of the reasons for this success is the increasing size of DL models and the proliferation of vast amounts of training data being available. To keep on improving the performance of DL, increasing the scalability of DL systems is necessary. In this survey, we perform a broad and thorough investigation on challenges, techniques and tools for scalable DL on distributed infrastructures. This incorporates infrastructures for DL, methods for parallel DL training, multi-tenant resource scheduling, and the management of training and model data. Further, we analyze and compare 11 current open-source DL frameworks and tools and investigate which of the techniques are commonly implemented in practice. Finally, we highlight future research trends in DL systems that deserve further research.
Conference Paper
In large software engineering projects, designing software systems is a collaborative decision-making process where a group of architects and developers make design decisions on how to address design concerns by discussing alternative design solutions. For the decision-making process, involving appropriate individuals requires objectivity and awareness about their expertise. In this paper, we propose a novel expert recommendation system that identifies individuals who could be involved in tackling new design concerns in software engineering projects. The approach behind the proposed system addresses challenges such as identifying architectural skills, quantifying architectural expertise of architects and developers, and finally matching and recommending individuals with suitable expertise to discuss new design concerns. To validate our approach, a quantitative evaluation of the recommendation system was performed using design decisions from four software engineering projects. The evaluation not only indicates that individuals with architectural expertise can be identified for design decision making but also provides quantitative evidence for the existence of personal experience bias during the decision-making process.
Article
Context Several research efforts have been targeted to support architecture centric development and evolution of software for robotic systems for the last two decades. Objective We aimed to systematically identify and classify the existing solutions, research progress and directions that influence architecture-driven modeling, development and evolution of robotic software. Research Method We have used Systematic Mapping Study (SMS) method for identifying and analyzing 56 peer-reviewed papers. Our review has (i) taxonomically classified the existing research and (ii) systematically mapped the solutions, frameworks, notations and evaluation methods to highlight the role of software architecture in robotic systems. Results and Conclusions We have identified eight themes that support architectural solutions to enable (i) operations, (ii) evolution and (iii) development specific activities of robotic software. The research in this area has progressed from object-oriented to component-based and now to service-driven robotics representing different architectural models that emerged overtime. An emerging solution is cloud robotics that exploits the foundations of service-driven architectures to support an interconnected web of robots. The results of this SMS facilitate knowledge transfer – benefiting researchers and practitioners – focused on exploiting software architecture to model, develop and evolve robotic systems.
Article
Security is a critical part of information systems and must be integrated into every aspect of the system. It requires a lot of expertise to design and implement secure systems due to the broad coverage of security issues and threats. A good system design is based on sound software engineering principles which leverages proven best practices in the form of standard guidelines and design patterns. A design pattern represents a reusable solution to a recurring problem in a specific context. The current security design pattern landscape contains several patterns, pattern catalogs and pattern classification schemes. To apply appropriate patterns for a specific problem context, a deeper understanding of this domain is essential. A survey of patterns and their classification schemes will aid in understanding pattern coverage and identifying gaps. In this paper, the authors have presented a detailed exploratory study of the security design pattern landscape. Based on their study, the authors have identified shortcomings and presented future research directions.
Conference Paper
Machine learning techniques have proved effective in recommender systems and other applications, yet teams working to deploy them lack many of the advantages that those in more established software disciplines today take for granted. The well-known Agile methodology advances projects in a chain of rapid development cycles, with subsequent steps often informed by production experiments. Support for such workflow in machine learning applications remains primitive. The platform developed at if(we) embodies a specific machine learning approach and a rigorous data architecture constraint, so allowing teams to work in rapid iterative cycles. We require models to consume data from a time-ordered event history, and we focus on facilitating creative feature engineering. We make it practical for data scientists to use the same model code in development and in production deployment, and make it practical for them to collaborate on complex models. We deliver real-time recommendations at scale, returning top results from among 10,000,000 candidates with sub-second response times and incorporating new updates in just a few seconds. Using the approach and architecture described here, our team can routinely go from ideas for new models to production-validated results within two weeks.
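The architectural constraint described above, models consuming a time-ordered event history, lets training and serving share the same feature code. A hedged sketch of such point-in-time ("as of") feature computation is shown below; the event schema and the features are invented, not the platform's actual interface.

```python
from datetime import datetime, timedelta

# Hypothetical time-ordered event history (the same log is replayed in training
# and consumed live in production, so feature code can be shared).
events = [
    {"user": "u1", "type": "login",   "ts": datetime(2024, 1, 1, 9, 0)},
    {"user": "u1", "type": "message", "ts": datetime(2024, 1, 2, 10, 0)},
    {"user": "u1", "type": "message", "ts": datetime(2024, 1, 8, 11, 0)},
]

def features_as_of(user, as_of, history, window=timedelta(days=7)):
    """Compute features using only events known strictly before `as_of`."""
    past = [e for e in history if e["user"] == user and e["ts"] < as_of]
    recent = [e for e in past if e["ts"] >= as_of - window]
    return {
        "total_events": len(past),
        "messages_last_7d": sum(e["type"] == "message" for e in recent),
    }

# Identical call in offline training (replayed timestamps) and online serving (now()).
print(features_as_of("u1", datetime(2024, 1, 9, 12, 0), events))
```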
Book
Empirical studies have become an integral element of software engineering research and practice. This unique text/reference includes chapters from some of the top international empirical software engineering researchers and focuses on the practical knowledge necessary for conducting, reporting and using empirical methods in software engineering. Part 1, 'Research Methods and Techniques', examines the proper use of various strategies for collecting and analysing data, and the uses for which those strategies are most appropriate. Part 2, 'Practical Foundations', provides a discussion of several important global issues that need to be considered from the very beginning of research planning. Finally, 'Knowledge Creation' offers insight on using a set of disparate studies to provide useful decision support. Topics and features: offers information across a range of techniques, methods, and qualitative and quantitative issues, providing a toolkit for the reader that is applicable across the diversity of software development contexts; presents reference material with concrete software engineering examples; provides guidance on how to design, conduct, analyse, interpret and report empirical studies, taking into account the common difficulties and challenges encountered in the field; arms researchers with the information necessary to avoid fundamental risks; tackles appropriate techniques for addressing disparate studies, ensuring the relevance of empirical software engineering and showing its practical impact; describes methods that are less often used in the field, providing less conventional but still rigorous and useful ways of collecting data; and supplies detailed information on topics (such as surveys) that often contain methodological errors. This broad-ranging, practical guide will prove an invaluable and useful reference for practising software engineers and researchers. In addition, it will be suitable for graduate students studying empirical methods in software development.
Chapter
The experiment data from the operation is input to the analysis and interpretation. After collecting experimental data in the operation phase, we want to be able to draw conclusions based on this data. To be able to draw valid conclusions, we must interpret the experiment data.
Article
In recent years, Python has gained more and more traction in the scientific community. Projects like NumPy, SciPy, and Matplotlib have created a strong foundation for scientific computing in Python, and machine learning packages like scikit-learn or packages for data analysis like Pandas are building on top of it. In this paper we present Wyrm (https://github.com/bbci/wyrm), an open source BCI toolbox in Python. Wyrm is applicable to a broad range of neuroscientific problems. It can be used as a toolbox for analysis and visualization of neurophysiological data and in real-time settings, like an online BCI application. In order to prevent software defects, Wyrm makes extensive use of unit testing. We will explain the key aspects of Wyrm's software architecture and design decisions for its data structure, and demonstrate and validate the use of our toolbox by presenting our approach to the classification tasks of two different data sets from the BCI Competition III. Furthermore, we will give a brief analysis of the data sets using our toolbox, and demonstrate how we implemented an online experiment using Wyrm.
Article
Background: Systematic literature studies have become common in software engineering, and hence it is important to understand how to conduct them efficiently and reliably. Objective: This paper presents guidelines for conducting literature reviews using a snowballing approach, and they are illustrated and evaluated by replicating a published systematic literature review. Method: The guidelines are based on the experience from conducting several systematic literature reviews and experimenting with different approaches. Results: The guidelines for using snowballing as a way to search for relevant literature was successfully applied to a systematic literature review. Conclusions: It is concluded that using snowballing, as a first search strategy, may very well be a good alternative to the use of database searches.
Article
Context Many researchers adopting systematic reviews (SRs) have also published papers discussing problems with the SR methodology and suggestions for improving it. Since guidelines for SRs in software engineering (SE) were last updated in 2007, we believe it is time to investigate whether the guidelines need to be amended in the light of recent research. Objective To identify, evaluate and synthesize research published by software engineering researchers concerning their experiences of performing SRs and their proposals for improving the SR process. Method We undertook a systematic review of papers reporting experiences of undertaking SRs and/or discussing techniques that could be used to improve the SR process. Studies were classified with respect to the stage in the SR process they addressed, whether they related to education or problems faced by novices and whether they proposed the use of textual analysis tools. Results We identified 68 papers reporting 63 unique studies published in SE conferences and journals between 2005 and mid-2012. The most common criticisms of SRs were that they take a long time, that SE digital libraries are not appropriate for broad literature searches and that assessing the quality of empirical studies of different types is difficult. Conclusion We recommend removing advice to use structured questions to construct search strings and including advice to use a quasi-gold standard based on a limited manual search to assist the construction of search stings and evaluation of the search process. Textual analysis tools are likely to be useful for inclusion/exclusion decisions and search string construction but require more stringent evaluation. SE researchers would benefit from tools to manage the SR process but existing tools need independent validation. Quality assessment of studies using a variety of empirical methods remains a major problem.
Article
Design patterns are used in software development to provide reusable and documented solutions to common design problems. Although many studies have explored various aspects of design patterns, no research summarizing the state of research related to design patterns existed up to now. This paper presents the results of a mapping study of about 120 primary studies, to provide an overview of the research efforts on Gang of Four (GoF) design patterns. The research questions of this study deal with (a) whether design pattern research can be further categorized into research subtopics, (b) which of the above subtopics are the most active ones and (c) what the reported effect of GoF patterns on software quality attributes is. The results suggest that design pattern research can be further categorized into research on GoF pattern formalization, detection and application and on the effect of GoF patterns on software quality attributes. Concerning the intensity of research activity of the abovementioned subtopics, research on pattern detection and on the effect of GoF patterns on software quality attributes appears to be the most active. Finally, the reported research to date on the effect of GoF patterns on software quality attributes is controversial: some studies identify one pattern's effect as beneficial, whereas others report the same pattern's effect as harmful.