Figure 2 - uploaded by Aleksander Fabijan
The "Xbox deals" experiment. 

The "Xbox deals" experiment. 

Source publication
Conference Paper
Full-text available
Software development companies increasingly aim to become data-driven by continuously experimenting with the products their customers use. Although familiar with the competitive edge that A/B testing delivers, they seldom succeed in evolving and adopting the methodology. In this paper, and based on an exhaustive and...

Context in source publication

Context 1
... one of the experiments, a product team at Xbox aimed to identify whether showing prices (the original price and the discount) in the weekly deals stripe, and using algorithmic rather than editorial ordering of the items in the stripe, impacts engagement and purchases. They experimented with two different variants. In Figure 2, we illustrate the experiment control (A) and both treatments (B, C). At Xbox, instrumentation is well established and a reliable pipeline for data collection exists. Metrics that measure user engagement and purchases are established and consist of a combination of different signals from the logs, aggregated per user, session, and other analysis units. In contrast to the Office Word experiment above, the Xbox team set up their experiments autonomously; however, they still required assistance with the execution and monitoring of the experiment and, at the analysis stage, with interpreting the results. The two-week experiment showed that, compared to control, treatment B decreased engagement with the stripe. Purchases, however, did not decrease. By showing prices upfront, treatment B provided a better user experience, engaging the users interested in a purchase and sparing a click for those not interested. Treatment C provided even greater benefit, increasing both engagement with the stripe and purchases made. In this experiment the team learned that (1) showing prices upfront results in a better user experience, and (2) algorithmic ordering of deals beats manual editorial ...
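The design described above — deterministic assignment of users to a control and two treatments, plus metrics aggregated per user from log signals — can be sketched in a few lines. This is a minimal illustration, not the Xbox team's actual pipeline; all names (assign_variant, aggregate_metrics, the event schema) are assumptions.

```python
import hashlib
from collections import defaultdict

# Control (A), prices upfront (B), prices + algorithmic ordering (C)
VARIANTS = ["A", "B", "C"]

def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministically hash a user into one of the variants."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]

def aggregate_metrics(log_events):
    """Aggregate raw log signals into per-user engagement and purchase counts."""
    per_user = defaultdict(lambda: {"stripe_clicks": 0, "purchases": 0})
    for event in log_events:
        counters = per_user[event["user_id"]]
        if event["type"] == "stripe_click":
            counters["stripe_clicks"] += 1
        elif event["type"] == "purchase":
            counters["purchases"] += 1
    return per_user
```

Hashing on a combined experiment-and-user key keeps assignment stable across sessions, which is what makes per-user aggregation over the two-week window meaningful.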

Similar publications

Article
Full-text available
Interspecific interactions are contingent upon organism phenotypes, and thus phenotypic evolution can modify interspecific interactions and affect ecological dynamics. Recent studies have suggested that male–male competition within a species selects for capability to reproductively interfere with a closely related species. Here, we examine the effe...

Citations

... The goal of an RCE is to simplify the initial adoption of A/B testing for software teams and accelerate their transition from running zero experiments to the Crawl and Walk phases of the Experimentation Evolution Model [12], [13] by covering the technical, organizational, and business experimentation needs of a company willing to adopt data-driven experimentation. Thus, a development team can import and reuse a pre-implemented controlled experiment in a plug-and-play manner, similar to how libraries and frameworks can be imported and used in an existing project. ...
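The plug-and-play idea might look like the following sketch: a component that ships with variant assignment and event logging built in, so a team can drop it into an existing project like a library. The class and its interface are hypothetical illustrations of the RCE concept, not the cited authors' implementation.

```python
import random

class ReusableExperiment:
    """A pre-implemented controlled experiment bundled as a reusable component."""

    def __init__(self, name: str, variants: list[str]):
        self.name = name
        self.variants = variants
        self.events = []  # built-in event log

    def assign(self, user_id: str) -> str:
        """Deterministic per-user variant assignment, seeded on experiment + user."""
        rng = random.Random(f"{self.name}:{user_id}")
        return rng.choice(self.variants)

    def log(self, user_id: str, event: str) -> None:
        """Record an event together with the user's assigned variant."""
        self.events.append(
            {"user": user_id, "variant": self.assign(user_id), "event": event}
        )
```

A team adopting it would only instantiate the component and call log() from its existing code paths, which is the "import and reuse" step the excerpt describes.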
... A software team can practice A/B testing at different levels of sophistication, ranging from occasional ad-hoc experimentation to continuous, structured experimentation at scale, and this progression is the journey a company has to take to become truly data-driven. Fabijan et al. [12] explored the phases of this journey and described it through the Experimentation Evolution Model (EEM). EEM outlines (Fig. 2) that a company typically undergoes four phases, namely Crawl, Walk, Run, and Fly, and has to evolve the technical, organizational, and business aspects of its operations. ...
... EEM outlines (Fig. 2) that a company typically undergoes four phases, namely Crawl, Walk, Run, and Fly, and has to evolve the technical, organizational, and business aspects of its operations. [12] Interestingly, the same group of scholars proposed the Experimentation Growth Model [15] in an attempt to define what a mature company does with regard to controlled experimentation. The model argues that a mature company should run experiments across most of its functionality and that all proposed software changes should be subject to experimentation. ...
Conference Paper
Full-text available
Online controlled experimentation, and more specifically A/B testing, is an effective method for assessing the impact of software changes. However, when adopting A/B testing, a development team faces various organizational and technical challenges. In this paper, we propose a new notion of reusable controlled experiments (RCE) to simplify and accelerate the adoption of A/B testing for software teams. In its essence, an RCE is a reusable software component supplied with built-in A/B testing functionality. We provide a proof-of-concept implementation of an RCE, integrate it into a mobile application in the field of educational technology, and run an experiment to validate the proposed solution. We conclude by checking the resulting integration against the six criteria categories of the Experimentation Evolution Model (EEM) to identify the maturity phase for each category. The resulting RCE is found to correspond to the experimentation evolution model's Walk maturity phase in three out of six categories, and to the Crawl phase in the other three categories.
... Halper and Stodder 2017). On the other side, authors such as Anderson (2015), Fabijan et al. (2017), Kearny et al. (2016), or Thusoo and Sarma (2017) combine two or more characteristics to craft more complex, multi-characteristic DDO understandings, thereby referring to culture, technology, abilities, data assets, processes and value-creation mechanisms. Illustratively, Fabijan et al. (2017) state that "data-driven companies acquire, process, and leverage data in order to create efficiencies, iterate on and develop new products, and navigate the competitive landscape" (p. 1). ...
... On the other side, authors such as Anderson (2015), Fabijan et al. (2017), Kearny et al. (2016), or Thusoo and Sarma (2017) combine two or more characteristics to craft more complex, multi-characteristic DDO understandings, thereby referring to culture, technology, abilities, data assets, processes and value-creation mechanisms. Illustratively, Fabijan et al. (2017) state that "data-driven companies acquire, process, and leverage data in order to create efficiencies, iterate on and develop new products, and navigate the competitive landscape" (p. 1). Likewise, Anderson (2015) combines "tools, abilities, and, most crucially, a culture that acts on data" (p. 1) to explain his DDO understanding. ...
Conference Paper
Full-text available
In today’s data-centric era, organizations increasingly aim to operate more data-driven and therefore engage in digital transformations toward becoming a data-driven organization (DDO). To govern such transformations, top managers develop digital transformation strategies (DTS) characterized by different organizational ambidexterity approaches. This study analyzes how such DTS influence the process and (intermediate) outcomes of organizations’ digital transformations toward becoming a DDO by studying two organizations undertaking such DDO transformations using the concept of organizational ambidexterity as a theoretical lens. On this empirical basis, we find that DTS characterized by different organizational ambidexterity approaches lead to different transformation processes and (intermediate) outcomes. Thereby, this study contributes to existing academic literature in the field of DDOs and DTS, as such transformation journeys toward becoming a DDO have not been studied in its entirety yet. Furthermore, our paper offers practical guidance for top managers to develop and implement a DTS suitable for their organization.
... using the search string described above. Furthermore, when analyzing our review sample, it became apparent that practitioner work, such as Patil (2011) and Anderson (2015), is frequently cited in the academic literature as well (e.g., in Fabijan et al., 2017; Hupperz et al., 2021). Therefore, we ...
... For example, three studies in our review sample present the sourcing and processing of data, combined with the goal of using data to gain competitive advantage, as key DDO characteristics. A corresponding understanding is evident in Fabijan et al. (2017) who state that "data-driven companies acquire, process, and leverage data in order to create efficiencies, iterate on and develop new products, and navigate the competitive landscape" (p. 1). Likewise, Gualo et al. (2021) put particular emphasis on the importance of the quality of the obtained data and name better service to the organization's customer as a DDO characteristic. ...
Article
Full-text available
With companies and other organizations increasingly striving to become (more) data-driven, there has been growing research interest in the notion of a data-driven organization (DDO). In existing literature, however, different understandings of such an organization emerged. The study at hand sets forth to synthesize the fragmented body of research through a review of existing DDO definitions and implicit understandings of this concept in the information systems and related literatures. Based on the review results and drawing on the established concept of the “knowing organization,” our study identifies five core dimensions of a DDO—namely, data sourcing & sensemaking, data capabilities, data-driven culture, data-driven decision-making, and data-driven value creation—which we integrate into a conceptual DDO framework. Most notably, the proposed framework suggests that—like its predecessor, the knowing organization—a DDO may draw on an outside-in view; however, it may also draw on an inside-out view, or even combine the two views, thereby setting itself apart from the knowing organization. To illustrate our conceptual DDO framework and demonstrate its usefulness, we apply this framework to three empirical examples. Theoretical and practical contributions as well as directions for future research are discussed.
... Central to this is innovation by exploring new software features or experimenting with software changes. In order to enable such innovation in practice, software companies often employ A/B testing [92,58,106,71]. A/B testing, also referred to as online controlled experimentation or continuous experimentation, is a form of hypothesis testing where two variants of a piece of software are evaluated in the field (ranging from variants with a slightly altered GUI layout to variants of software with new features). In particular, the merits of the two variants are analyzed using metrics such as click rates of website visitors, members' lifetime values (LTV) in a subscription service, and user conversions in marketing [82,161,48]. ...
... To achieve this, A/B testing is used to set up and evaluate online controlled experiments in the software system. Fabijan et al. [58], for example, perform a case study on the evolution of scaling up continuous experimentation at Microsoft, providing guidelines for other companies to conduct continuous experimentation. ...
... Application of A/B testing: 51 studies [175,123,95,102,16,15,121,171,99,148,33,70,66,52,20,174,107,63,155,150,65,163,170,27,143,2,5,135,149,7,98,147,141,173,19,26,8,114,6,122,50,97,136,125,22,124,128,159,67,3,176]
Improving efficiency of A/B testing: 20 studies [1,28,127,164,23,85,47,39,44,86,40,45,46,109,100,83,18,78,37,64]
Beyond standard A/B testing: 18 studies [166,38,82,48,77,79,112,134,72,139,126,29,118,75,49,117,30,151]
Concrete A/B testing problems: 17 studies [138,73,168,105,146,43,162,14,103,111,71,24,153,137,96,101,25]
Pitfalls and challenges of A/B testing: 13 studies [91,88,54,60,167,42,169,120,11,41,110,140,90]
Experimentation frameworks and platforms: 13 studies [144,154,106,156,108,9,131,74,21,36,179,177,152]
A/B testing at scale: 9 studies [89,160,58,165,81,157,76,57,56] ...
Preprint
In A/B testing, two variants of a piece of software are compared in the field from an end user's point of view, enabling data-driven decision making. While widely used in practice, no comprehensive study has been conducted on the state of the art in A/B testing. This paper reports the results of a systematic literature review that analyzed 141 primary studies. The results show that the main targets of A/B testing are algorithms and visual elements. Single classic A/B tests are the dominant type of test. Stakeholders have three main roles in the design of A/B tests: concept designer, experiment architect, and setup technician. The primary types of data collected during the execution of A/B tests are product/system data and user-centric data. The dominant uses of the test results are feature selection, feature rollout, and continued feature development. Stakeholders have two main roles during A/B test execution: experiment coordinator and experiment assessor. The main reported open problems are enhancement of proposed approaches and their usability. Interesting lines for future research include strengthening the adoption of statistical methods in A/B testing, improving the process of A/B testing, and enhancing the automation of A/B testing.
... Awareness could be raised by training engineers in interdisciplinary work so that it becomes easier to integrate HF experts in agile teams (as in I3). In addition, research is needed to determine how to increase the ability of agile teams to manage open questions (see I6) as well as their experimentation infrastructure (see I2) (Fagerholm et al., 2017b; Schermann et al., 2018; Fabijan et al., 2017). ...
... In particular, the need to have AV developers participate in (or even run) HF experiments (I1) requires the attention of researchers. In continuous software development, there is a trend towards data-driven decision making and experimentation (Fabijan et al., 2017; Schermann et al., 2018; Meyer, 2015; Kohavi et al., 2009; Kevic et al., 2017). ...
... Experimentation maturity models (Fabijan et al. 2017, 2018; Optimizely 2018; Wider Funnel 2018; Brooks Bell 2015) consist of the phases organizations are likely to go through on the way to being data-driven and running every change through A/B experiments: Crawl, Walk, Run, and Fly. ...
Chapter
Full-text available
Many good resources are available with motivation and explanations about online controlled experiments (Kohavi et al. 2009a, 2020; Thomke 2020; Luca and Bazerman 2020; Georgiev 2018, 2019; Kohavi and Thomke 2017; Siroker and Koomen 2013; Goward 2012; Schrage 2014; King et al. 2017; McFarland 2012; Manzi 2012; Tang et al. 2010). For organizations running online controlled experiments at scale, Gupta et al. (2019) provide an advanced set of challenges. We provide a motivating visual example of a controlled experiment that ran at Microsoft’s Bing. The team wanted to add a feature allowing advertisers to provide links to the target site. The rationale is that this will improve ads quality by giving users more information about what the advertiser’s site provides and allow users to directly navigate to the sub-category matching their intent. Visuals of the existing ads layout (Control) and the new ads layout (Treatment) with site links added are shown in Fig. 1.
... INTRODUCTION A/B testing enables companies to make trustworthy data-driven decisions at scale and has been a research area in the software industry for many years [1], [2]. Companies run A/B tests to assess ideas and to safely validate [3] what delivers value to their customers. ...
... However, in the case of B2B partnerships there is an alternative: integrators can sometimes reuse the A/B platform's UI, and there are good reasons to do this. A/B testing platform teams have been publishing research for many years on the importance of an intuitive and comprehensive user interface (UI) for running and operating an A/B testing platform [1], [2]. An A/B testing UI can range from a notebook with sample code on how to make an API call to start an A/B test, for teams just starting to run A/B tests, to a well-designed and comprehensive user experience. ...
Conference Paper
Full-text available
A/B tests are the gold standard for evaluating product changes. At Microsoft, for example, we run tens of thousands of A/B tests every year to understand how users respond to new designs, new features, bug fixes, or any other ideas we might have on what will deliver value to users. In addition to testing product changes, however, A/B testing is starting to gain momentum as a differentiating feature of platforms or products whose primary purpose may not be A/B testing. As we describe in this paper, organizations such as Azure PlayFab and Outreach have integrated experimentation platforms and offer A/B testing to their customers as one of the many features in their product portfolio. In this paper, and based on multiple case studies, we present the lessons learned from enabling A/B integrations: integrating A/B testing into software products. We enrich each of the learnings with a motivating example, share the trade-offs made along this journey, and provide recommendations for practitioners. Our learnings are most applicable for engineering teams developing experimentation platforms, integrators considering embedding A/B testing into their products, and researchers working in the A/B testing domain.
... While agile methods emphasise customer value [6], building the right product appears to be a feat that few startups achieve. Continuous experimentation (CE) is a software engineering method where product development is driven by field experiments with real users [29,8,9,41,2]. It strives to establish virtuous feedback loops between business, development, and operations [11], and reportedly improves product quality and business performance [7,8], with promising implications for startups. ...
... Continuous experimentation (CE) is a software engineering method where product development is driven by field experiments with real users [29,8,9,41,2]. It strives to establish virtuous feedback loops between business, development, and operations [11], and reportedly improves product quality and business performance [7,8], with promising implications for startups. ...
... CE approaches software product development through experiments with real users [29,8,9,41,2]. This includes collecting and analysing experimental data to test product hypotheses, gaining insights for new feature ideas to be evaluated in subsequent experiments. ...
Preprint
Full-text available
Background: Continuous experimentation (CE) has been proposed as a data-driven approach to software product development. Several challenges with this approach have been described in large organisations, but its application in smaller companies with early-stage products remains largely unexplored. Aims: The goal of this study is to understand what factors could affect the adoption of CE in early-stage software startups. Method: We present a descriptive multiple-case study of five startups in Finland which differ in their utilisation of experimentation. Results: We find that practices often mentioned as prerequisites for CE, such as iterative development and continuous integration and delivery, were used in the case companies. CE was not widely recognised or used as described in the literature. Only one company performed experiments and used experimental data systematically. Conclusions: Our study indicates that small companies may be unlikely to adopt CE unless 1) at least some company employees have prior experience with the practice, 2) the company's limited available resources are not exceeded by its adoption, and 3) the practice solves a problem currently experienced by the company, or the company perceives almost immediate benefit of adopting it. We discuss implications for advancing CE in early-stage startups and outline directions for future research on the approach.
... However, the experiments are often defined by the product owner of the company, while the software developers implement them during product development [15]. In turn, experiments can be implemented with different techniques, such as feature toggles [26] or API traffic management [13]. This results in additional communication and synchronization effort before experimentation results can be used directly during development. ...
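A feature toggle, one of the implementation techniques mentioned above, can be sketched as a percentage rollout gate that decides which variant a user sees. This is a minimal illustration under assumed names (FEATURE_TOGGLES, is_enabled, render_deals_stripe), not any cited framework's API.

```python
import zlib

# Toggle configuration: which features are on, and for what share of users.
FEATURE_TOGGLES = {"new_deals_stripe": {"enabled": True, "rollout_percent": 50}}

def is_enabled(feature: str, user_id: str) -> bool:
    """Return True if the user falls inside the feature's rollout bucket."""
    toggle = FEATURE_TOGGLES.get(feature)
    if toggle is None or not toggle["enabled"]:
        return False
    # Stable bucket in [0, 100) so a user always lands in the same group.
    bucket = zlib.crc32(f"{feature}:{user_id}".encode()) % 100
    return bucket < toggle["rollout_percent"]

def render_deals_stripe(user_id: str) -> str:
    """Serve the treatment or the control depending on the toggle."""
    return "treatment" if is_enabled("new_deals_stripe", user_id) else "control"
```

Because the bucket is derived from a stable checksum rather than a random draw per request, the same user consistently sees the same variant, which a split test requires.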
... Continuous experimentation describes the concept of continuously testing the underlying assumptions of the software product with the users based on experiments [27]. Controlled experiments split the software product into different variants and test those variants on distinct user groups [13]. Here, split-testing (or A/B testing with only two variants) allows the analysis and comparison of a single variable over time, while multivariate testing allows the comparison of multiple variables simultaneously. ...
... In addition to the adoption of cross-functional teams, sprints, and iterative development, innovation initiatives that aim to generate new and recurring revenue streams require a shift towards customer-driven innovation and lean start-up ways-of-working [21], [22], [23], [24]. In addition, companies need experimentation practices and mechanisms that help them continuously deploy, measure and evaluate what constitutes customer value [25], [26], [27], [28]. As recognized in [24], agile development methods help answer 'how' to build products and how to increase speed in development. ...