Table 3 - uploaded by Cor-Paul Bezemer
Reasons for choosing reimplementing over reusing source code. (Multi-selection allowed, hence the sum of the percentages is larger than 100%.)

Source publication
Article
Full-text available
Technical question and answer (Q&A) platforms, such as Stack Overflow, provide a platform for users to ask and answer questions about a wide variety of programming topics. These platforms accumulate a large amount of knowledge, including hundreds of thousands of lines of source code. Developers can benefit from the source code that is attached to the qu...

Context in source publication

Context 1
... majority of developers (65%) prefer reimplementing source code, due to the code modification that is required to make the code from the post work in their own project. Table 3 shows the reasons for choosing reimplementation over the reuse of source code. The top reason that makes developers prefer reimplementing source code is the code modification that is required to make the code from the post work in their own projects. ...

Citations

... Additionally, our interviews highlighted the advantages from searching with AI about unfamiliar problems. Prior to AI tools, a developer would probably ask Stack Overflow or search Google for a few possible solutions [86,106,110], but identifying the most relevant search terms may be difficult. Similarly, relying on AI tools to debug code errors was widely reported by our interviewees (P1, P3, P5, P6, P9, P11-P15, P17, P20, P21, P23, P25, P26) and surveyees (242/291 (83.2%)). ...
Preprint
Full-text available
AI assistance tools such as ChatGPT, Copilot, and Gemini have dramatically impacted the nature of software development in recent years. Numerous studies have examined the benefits that practitioners have achieved from using these tools in their work. While there is a growing body of knowledge regarding the usability aspects of leveraging AI tools, we still lack concrete details on the issues that organizations and practitioners need to consider should they want to explore increasing adoption or use of AI tools. We conducted a mixed-methods study involving interviews with 26 industry practitioners and 395 survey respondents. We found that there are several motives and challenges that impact individuals and organizations and developed a theory of AI Tool Adoption. For example, we found that creating a culture of sharing AI best practices and tips is a key motive for practitioners adopting and using AI tools. In total, we identified 2 individual motives, 4 individual challenges, 3 organizational motives, 3 organizational challenges, and 3 interleaved relationships. The 3 interleaved relationships act in a push-pull manner where motives pull practitioners to increase the use of AI tools and challenges push practitioners away from using AI tools.
... This paper aims to provide a systematic analysis of the issues, challenges, and solutions associated with multilingual development by examining relevant Stack Overflow (SO) posts. We selected SO as our primary source of information due to its significance as a platform where developers exchange information about software development and as an educational resource that influences their practices [25], [26]. To conduct our analysis, we manually inspected 586 randomly sampled and highly relevant posts, covering the entire period of SO's existence until late 2021. ...
... However, SO is currently the most accessible and widely used data source for our study. It is a well-known repository where developers post questions and receive answers, and has been frequently utilized in prior software engineering studies [25], [66], [67]. We thus assume that SO reasonably reflects the issues and challenges faced by developers, including those related to multilingual development. ...
Article
Full-text available
Developing software projects that incorporate multiple languages has been a prevalent practice for many years. However, the issues encountered by developers during the development process, the underlying challenges causing these issues, and the solutions provided to developers remain unknown. In this paper, our objective is to provide answers to these questions by conducting a study on developer discussions on Stack Overflow (SO). Through a manual analysis of 586 highly relevant posts spanning 14 years, we revealed that multilingual development is a highly and sustainably active topic on SO, with older questions becoming inactive and newer ones getting first asked (and then mostly remaining active for more than one year). From these posts, we observed a diverse array of issues (11 categories), primarily centered around interfacing and data handling across different languages. Our analysis suggests that error/exception handling issues were the most difficult to resolve among those issue categories, while security related issues were most likely to receive an accepted answer. The primary challenge faced by developers was the complexity and diversity inherent in building multilingual code and ensuring interoperability. Additionally, developers often struggled due to a lack of technical expertise on the varied features of different programming languages (e.g., threading and memory management mechanisms). In addition, properly handling message passing across languages constituted a key challenge with using implicit language interfacing. Notably, Stack Overflow emerged as a crucial source of solutions to these challenges, with the majority (73%) of the posts receiving accepted answers, most within a week (36.5% within 24 hours and 25% in the following six days). Based on our analysis results, we have formulated actionable insights and recommendations that can be utilized by researchers and developers in this field.
... Stack Overflow has been utilized in several previous studies, such as user profile analysis (4) and new developer support for programming languages (5). Other research was also conducted to investigate the use of source code in the Stack Overflow discussion as a reference source for compiling the source code of a software (6). Research on Stack Overflow is also used to analyze the trend of discussions that occur among its users (7). ...
... Currently work units can use information systems in carrying out various tasks quickly and precisely. This is because in the information system it can be reached through software that can be used, so that work units get the convenience of completing their work (9,10). Using paper as a medium for data processing and data storage is less effective and efficient in this era of information systems. ...
... Also, SO data is public and can be accessed through the Stack Exchange Data Explorer tool. In addition to being a widely used tool by technology professionals, SO is significantly used for scientific studies [11, 14-16]. We can highlight the studies from [17] and [15], which were conducted to understand the types of questions asked by the developers, and from [18,19] that observed how the SO code snippets are used. ...
... Stack Overflow data have also been used as input for many scientific works [11, 14-19]. In addition, some researchers have investigated why the questions are asked or the kind of information requested in the questions. ...
Article
Full-text available
This paper extends an initial investigation of eHealth from the developers’ perspective. In this extension, our focus is on mobile health data. Despite the significant potential of this development area, few studies try to understand the challenges faced by these professionals. This perspective is relevant to identify the most used technologies and future perspectives for research investigation. Using a KDD-based process, this work analyzed eHealth and mHealth discussions from Stack Overflow (SO) to comprehend this developers’ community. We got and processed 6082 eHealth and 1832 mHealth questions. The most discussed topics include manipulating medical images, electronic health records with the HL7 standard, and frameworks to support mobile health (mHealth) development. Concerning the challenges faced by these developers, there is a lack of understanding of the DICOM and HL7 standards, the absence of data repositories for testing, and the monitoring of health data in the background using mobile and wearable devices. Our results also indicate that discussions have grown mainly on mHealth, primarily due to monitoring health data through wearables and about how to optimize resource consumption during health-monitoring.
... Semantic clones or variants are essential to prevent inconsistencies within software systems. Wu et al. [11] searched for Java files using "stackoverflow" and manually inspected the results. The researchers found that in 31.5% of their samples, developers had to modify SO source code for compatibility with their projects. ...
Conference Paper
Semantic and cross-language code clone generation may be useful for code reuse, code comprehension, refactoring and benchmarking. OpenAI's GPT model has potential in such clone generation as GPT is used for text generation. When developers copy/paste code from Stack Overflow (SO) or within a system, there might be inconsistent changes leading to unexpected behaviours. Similarly, if someone possesses a code snippet in a particular programming language but seeks equivalent functionality in a different language, a semantic cross-language code clone generation approach could provide valuable assistance. In this study, using SemanticCloneBench as a vehicle, we evaluated how well the GPT-3 model could help generate semantic and cross-language clone variants for a given fragment. We compiled a diverse set of code fragments and assessed GPT-3's performance in generating code variants. Through extensive experimentation and analysis, where 9 judges spent 158 hours to validate, we investigate the model's ability to produce accurate and semantically correct variants. Our findings shed light on GPT-3's strengths in code generation, offering insights into the potential applications and challenges of using advanced language models in software development. Our quantitative analysis yields compelling results. In the realm of semantic clones, GPT-3 attains an impressive accuracy of 62.14% and 0.55 BLEU score, achieved through few-shot prompt engineering. Furthermore, the model shines in transcending linguistic confines, boasting an exceptional 91.25% accuracy in generating cross-language clones.
... Wu et al. [13] evaluated 289 open-source projects to assess the prevalence of Stack Overflow code snippets in these projects. Their study found that 30.5% of the evaluated projects contained code snippets from Stack Overflow with minimal to no modifications. ...
Conference Paper
Data privacy and protection are essential in today's digital landscape, with software developers playing a critical role in addressing these challenges. This paper presents a comprehensive study of the challenges and issues faced by software developers in the context of data privacy and protection. Our analysis is based on a dataset of questions posted on popular online platforms, such as Stack Overflow, Information Security Stack Exchange, and Software Engineering Stack Exchange. Our findings reveal a range of challenges, including the design and generation of privacy policies, compliance with legal frameworks, and implementation of privacy-preserving features in software systems. We also observed interest in policy-related questions and confusion between data privacy concepts and programming language access control mechanisms. Based on our findings, we provide recommendations to address these challenges and promote privacy-by-design principles in software development.
... Today, software reuse is realized and made available in various forms, such as software libraries, design patterns, and software frameworks, embodying typical software functionalities, practices, and architectures [2][3][4]. Moreover, research is advancing on the partial and small-scale reuse of existing software in other software developments through the reuse of code snippets [5,6]. Building on these achievements, as surveyed by Barros-Justo and others, modern software practices are increasingly leveraging reuse in various forms based on past assets [7]. ...
... As shown in Figure 1, valuable functions and designs are retained and used in new projects while discarding implementation details. Materials used for upcycling are not limited to open resources available on the web; they can be extracted from software projects shared exclusively within development organizations as well. Existing exploratory studies report that careless reuse of online assets from platforms such as StackOverflow and GitHub can lead to issues such as bugs and increased development costs [5,32,33]. Therefore, leveraging closed assets, which are easier to validate for reliability, becomes crucial. The materials extracted from the original project need to be processed, such as by selecting important parts and modifying materials according to their purpose rather than directly integrating them into a new project. ...
Article
Full-text available
Software upcycling, a form of software reuse, is a concept that efficiently generates novel, innovative, and value-added development projects by utilizing knowledge extracted from past projects. However, how to integrate the materials derived from these projects for upcycling remains uncertain. This study defines a systematic model for upcycling cases and develops the Sharing Upcycling Cases with Context and Evaluation for Efficient Software Development (SUCCEED) system to support the implementation of new upcycling initiatives by effectively sharing cases within the organization. To ascertain the efficacy of upcycling within our proposed model and system, we formulated three research questions and conducted two distinct experiments. Through surveys, we identified motivations and characteristics of shared upcycling-relevant development cases. Development tasks were divided into groups, those that employed the SUCCEED system and those that did not, in order to discern the enhancements brought about by upcycling. As a result of this research, we accomplished a comprehensive structuring of both technical and experiential knowledge beneficial for development, a feat previously unrealizable through conventional software reuse, and successfully realized reuse in a proactive and closed environment through construction of the wisdom of crowds for upcycling cases. Consequently, it becomes possible to systematically perform software upcycling by leveraging knowledge from existing projects for streamlining of software development.
... Other studies were performed to identify the reuse of SO answers by GitHub projects, via clone detection or keyword-based search [37], [40], [46], [57], [59]. Specifically, Yang et al. [59] applied a clone detection tool, SourcererCC, to Python projects on GitHub and answer code on SO. ...
... Wu et al. [57] searched Java files with the keyword "stackoverflow" and manually inspected retrieved files, to locate GitHub projects referencing SO posts. They observed that in 31.5% of the data, developers needed to modify source code from SO to make it work in their own projects. ...
Preprint
Full-text available
StackOverflow (SO) is a widely used question-and-answer (Q&A) website for software developers and computer scientists. GitHub is an online development platform used for storing, tracking, and collaborating on software projects. Prior work relates the information mined from both platforms to link user accounts or compare developers' activities across platforms. However, not much work is done to characterize the SO answers reused by GitHub projects. For this paper, we did an empirical study by mining the SO answers reused by Java projects available on GitHub. We created a hybrid approach of clone detection, keyword-based search, and manual inspection, to identify the answer(s) actually leveraged by developers. Based on the identified answers, we further studied topics of the discussion threads, answer characteristics (e.g., scores, ages, code lengths, and text lengths), and developers' reuse practices. We observed that most reused answers offer programs to implement specific coding tasks. Among all analyzed SO discussion threads, the reused answers often have relatively higher scores, older ages, longer code, and longer text than unused answers. In only 9% of scenarios (40/430), developers fully copied answer code for reuse. In the remaining scenarios, they reused partial code or created brand new code from scratch. Our study characterized 130 SO discussion threads referred to by Java developers in 357 GitHub projects. Our empirical findings can guide SO answerers to provide better answers, and shed light on future research related to SO and GitHub.
... Although this behavior was observed in the setting of a lab study and may not necessarily reflect how developers approach code comprehension in their daily work, it raises concerns about the potential impact of such a trend on code quality. Developers' heavy reliance on Stack Overflow, despite its known shortcomings in accuracy and currency [69], [70], underscores the need for caution before the widespread adoption of language model-based tools in code development. ...
Preprint
Full-text available
Developers often face challenges in code understanding, which is crucial for building and maintaining high-quality software systems. Code comments and documentation can provide some context for the code, but are often scarce or missing. This challenge has become even more pressing with the rise of large language model (LLM) based code generation tools. To understand unfamiliar code, most software developers rely on general-purpose search engines to search through various programming information resources, which often requires multiple iterations of query rewriting and information foraging. More recently, developers have turned to online chatbots powered by LLMs, such as ChatGPT, which can provide more customized responses but also incur more overhead as developers need to communicate a significant amount of context to the LLM via a textual interface. In this study, we provide the investigation of an LLM-based conversational UI in the IDE. We aim to understand the promises and obstacles for tools powered by LLMs that are contextually aware, in that they automatically leverage the developer's programming context to answer queries. To this end, we develop an IDE Plugin that allows users to query back-ends such as OpenAI's GPT-3.5 and GPT-4 with high-level requests, like: explaining a highlighted section of code, explaining key domain-specific terms, or providing usage examples for an API. We conduct an exploratory user study with 32 participants to understand the usefulness and effectiveness, as well as individual preferences in the usage of, this LLM-powered information support tool. The study confirms that this approach can aid code understanding more effectively than web search, but the degree of the benefit differed by participants' experience levels.