Figure 2: Mapping a series of relational tables onto an inheritance hierarchy: (a) one-to-one; (b) rolled down; (c) rolled up.

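To make the three mappings in the figure concrete, here is a minimal, hypothetical sketch (a two-class Person/Employee hierarchy is assumed for illustration; the table layouts are shown as comments):

```java
// Hypothetical two-class hierarchy to make the figure's caption concrete.
abstract class Person {          // persistent fields: id, name
    long id;
    String name;
}

class Employee extends Person {  // adds persistent field: salary
    double salary;
}

// Three ways to map this hierarchy onto relational tables:
// (a) one to one:  one table per class, rows joined on id
//       PERSON(id, name)   EMPLOYEE(id, salary)
// (b) rolled down: one table per concrete class, inherited fields duplicated
//       EMPLOYEE(id, name, salary)
// (c) rolled up:   a single table for the whole hierarchy (unused columns NULL)
//       PERSON(id, name, salary)
```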

Source publication
Conference Paper
Full-text available
The rapid growth of object-oriented development over the past twenty years has given rise to many object-oriented systems that are large, complex and hard to maintain. These systems exhibit a range of problems, effectively preventing them from satisfying the evolving requirements imposed by their customers. In our paper, we address the problem of under...

Similar publications

Conference Paper
Full-text available
In this paper, we describe temporal invariants, which are class invariants that are qualified by the operators eventually, always, never, or already. Temporal invariants can capture assertions that may not be valid initially but, as the program continues, must eventually become valid. Moreover, temporal invariants can indicate references to memory...
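The operators are easiest to see on a small example. A hypothetical sketch in Java (the LogFile class and the deferred check are illustrative, not the paper's notation): an invariant qualified by "eventually" may be false in intermediate states but must hold by the end of the object's use.

```java
// Hypothetical sketch (not the paper's notation) of a temporal
// invariant qualified by "eventually": the assertion may be false
// in intermediate states but must hold before the object is retired.
class LogFile {
    private boolean open = true;

    void close() { open = false; }

    // A classic class invariant must hold after every method; this
    // check is instead deferred to the end of the object's lifetime.
    void checkEventuallyClosed() {
        assert !open : "temporal invariant violated: file never closed";
    }
}

class TemporalDemo {
    public static void main(String[] args) {
        LogFile f = new LogFile();   // "closed" is false here...
        f.close();                   // ...and eventually becomes true
        f.checkEventuallyClosed();   // passes (run with java -ea)
    }
}
```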

Citations

... Reverse, forward, and re-engineering (extracted from [1]). ...
... The aim is to improve its performance, maintainability, or other characteristics. Reverse engineering is therefore a prerequisite for re-engineering: one must understand how these systems work before making any changes to them [1]. Fig. 1 illustrates the reverse engineering, forward engineering, and re-engineering concepts. ...
Article
Full-text available
Many organizations depend on software systems to accomplish their daily tasks, but these systems need to be maintained and evolved to cope with various changes and requirements. Before starting to maintain and evolve software systems, it is necessary to understand them. Reverse engineering plays a crucial role in comprehending various aspects of software systems by extracting different models and diagrams that represent their structure and behaviour. This article presents a systematic literature review (SLR) of the current state of research in model-driven engineering (MDE) for reverse engineering software systems. The considered articles came from five electronic databases (Scopus, IEEE Xplore, Web of Science, ACM Digital Library, and Google Scholar) and were supplemented by additional articles recommended by experts and found by manual snowballing. From 538 surveyed papers, 83 principal studies were selected, which present the main characteristics of 64 model-driven reverse engineering (MDRE) approaches. These approaches are analyzed and evaluated based on their objectives and characteristics. Research gaps and areas where more research is needed are also identified. The review thus provides comprehensive answers to several questions of wide interest to researchers and practitioners who are considering using MDRE.
... Of course, old documentation is also written using the terminology, or taxonomy, in use at the time the documentation was created, and the reader must be familiar with this taxonomy; otherwise, they will not be able to understand the documentation properly. However, even if the documentation is sophisticated and can be understood by the reader, it is nearly impossible to gain a holistic understanding of how a system works from the documentation alone [28]. (A footnote adds that individuals tend to rely on their colleagues' incomplete and inaccurate knowledge rather than reading the available documentation; the documentation, if it exists, is rather to be understood as ...) ...
Article
Full-text available
Aged information systems are commonly referred to as legacy information systems or just merely as legacy systems. Typically, these are mission-critical systems developed years ago that significantly resist evolution. Many organizations are confronted with these systems and consider them a burden because they cement their businesses and cause unreasonably high evolution and operating costs. However, they also cannot easily be discarded and replaced by modern solutions. Despite their relevance, no universally accepted concept exists that explains these systems or the mechanisms behind them. Accordingly, the properties and challenges associated with legacy systems vary significantly across different authors and their respective intentions – this needs to be revised, and this is where our research comes in. To this end, we first describe the causes of information systems’ aging, typical symptoms by which legacy systems are often recognized, and the consequences of using outdated solutions. Based on this empirical point of view, we introduce a holistic model to explain legacy systems objectively and introduce the concept of the business-technological age to distinguish between the chronological age and obsolescence of a deployed system. This approach provides practitioners and researchers the foundation for stringently explaining the hitherto fuzzy legacy phenomenon, promoting a common understanding of legacy systems, and simplifying communication through conceptualization. Practitioners can use this approach to better understand their stock application systems in terms of aging and improve their decision-making regarding evolution.
... The best developers can do is to manually retrain a model and re-release the tool with the updated model, yet this requires them to decide when to retrain and what training data to use. Such retraining-from-scratch (RFS) typically requires a large data set (making retraining slower [11,13,56]), and ignores prior models' knowledge, which goes against traditional software re-engineering practices [15]. ...
Preprint
Full-text available
Nowadays, software analytics tools using machine learning (ML) models to, for example, predict the risk of a code change are well established. However, as the goals of a project shift over time, and developers and their habits change, the performance of said models tends to degrade (drift) over time, until a model is retrained using new data. Current retraining practices typically are an afterthought (and hence costly), requiring a new model to be retrained from scratch on a large, updated data set at random points in time; also, there is no continuity between the old and new model. In this paper, we propose to use lifelong learning (LL) to continuously build and maintain ML-based software analytics tools using an incremental learner that progressively updates the old model using new data. To avoid so-called "catastrophic forgetting" of important older data points, we adopt a replay buffer of older data, which still allows us to drastically reduce the size of the overall training data set, and hence model training time. We empirically evaluate our LL approach on two industrial use cases, i.e., a brown build detector and a Just-in-Time risk prediction tool, showing how LL in practice manages to at least match traditional retraining-from-scratch performance in terms of F1-score, while using 3.3-13.7x less data at each update, thus considerably speeding up the model updating process. Considering both the computational effort of updates and the time between model updates, the LL setup needs 2-40x less computational effort than retraining-from-scratch setups.
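The replay-buffer idea in the abstract is generic enough to sketch. The following Java sketch is a hypothetical illustration, not the authors' implementation; the class names, buffer capacity, and the train() hook are all assumptions. Each update trains on the new batch plus a sample of older data, so the model is refreshed without retraining from scratch and without forgetting older data points.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of incremental updating with a replay buffer.
class ReplayBuffer<T> {
    private final List<T> buffer = new ArrayList<>();
    private final int capacity;
    private final Random rng = new Random(42);
    private long seen = 0;

    ReplayBuffer(int capacity) { this.capacity = capacity; }

    // Reservoir sampling keeps a uniform sample of everything seen so far.
    void add(T example) {
        seen++;
        if (buffer.size() < capacity) {
            buffer.add(example);
        } else {
            long j = (long) (rng.nextDouble() * seen);
            if (j < capacity) buffer.set((int) j, example);
        }
    }

    List<T> sample() { return new ArrayList<>(buffer); }
}

class LifelongLearner<T> {
    private final ReplayBuffer<T> replay = new ReplayBuffer<>(1_000);

    // Train on the new batch plus replayed old examples, then refresh
    // the buffer, so older data points are not entirely forgotten.
    void update(List<T> newBatch) {
        List<T> trainingSet = new ArrayList<>(newBatch);
        trainingSet.addAll(replay.sample());
        train(trainingSet);                 // incremental fit on the mix
        newBatch.forEach(replay::add);      // refresh the buffer
    }

    void train(List<T> data) { /* model-specific incremental fit */ }
}
```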
... Researchers report that over half of the maintenance time is spent on reading and understanding source code [1], [2], where developers pore over source code, looking for clues that help them construct a coherent mental model of a system [3], so as to make appropriate changes while ensuring its quality [4]-[6]. This is a difficult undertaking for any programming language; however, maintaining and monitoring the quality of an object-oriented system is more complex than for procedural programs [7], [8], for several reasons, such as inheritance and polymorphism [9]-[12]. Inheritance and polymorphism increase the flexibility of programs by allowing dynamic binding of messages: inheritance allows an existing behavior to be extended through an inheritance hierarchy, while polymorphism allows a task to be performed in multiple forms, with different objects responding to messages with the same name but different implementations. ...
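A minimal Java illustration of the mechanism the snippet describes (the shape classes are hypothetical): the same message, area(), is bound at run time to different implementations along the hierarchy.

```java
// Inheritance and polymorphism with dynamic binding:
// one message name, several implementations.
abstract class Shape {
    abstract double area();                                // the message
}

class Circle extends Shape {
    private final double r;
    Circle(double r) { this.r = r; }
    @Override double area() { return Math.PI * r * r; }   // one form
}

class Square extends Shape {
    private final double side;
    Square(double side) { this.side = side; }
    @Override double area() { return side * side; }       // another form
}

class Demo {
    public static void main(String[] args) {
        Shape[] shapes = { new Circle(1.0), new Square(2.0) };
        for (Shape s : shapes) {
            // The call site is identical; the implementation that runs
            // is chosen at run time from the receiver's actual class.
            System.out.println(s.area());
        }
    }
}
```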
Conference Paper
Full-text available
In object-oriented programming, classes are the primary abstraction mechanism used by and exposed to developers. Understanding classes is key for the development and evolution of object-oriented applications. The fundamental problem faced by developers is that while classes are intrinsically structured entities, in IDEs they are represented as a blob of text. The idea behind the original CLASS BLUEPRINT visualization was to represent the internal structure of classes in terms of fields, their accesses, and the method call flow. Additional information was depicted using colors. The resulting visualization proved to be an effective means to support program comprehension. However, a number of omissions rendered it only partially useful. We propose CLASS BLUEPRINT V2 (in short BLUEPRINTV2), which in addition to the information depicted by CLASS BLUEPRINT also supports the identification of dead code, methods under test, and calling relationships between class-level and instance-level methods. In addition, BLUEPRINTV2 enhances the understanding of fields by showing how fields of super/subclasses are accessed. We present the enhanced visualization and report on a first validation with 26 developers and 18 projects.
... Reverse engineering is defined as the process of analyzing a specific system to identify its components and their interconnections, and to reconstruct the system in another form or at a higher level of abstraction. This process is about understanding the system, whereas re-engineering is about restructuring/refactoring the system [14], [15]. ...
... Software has to evolve (Lehman 1996; Demeyer et al. 2002; Mens et al. 2004). Application programming interfaces (APIs) change, and their users are impacted (Robbes, Lungu, & Röthlisberger 2012). ...
... Documentation structure does not always facilitate developers' work; there is a gap between the information developers need and the structure of the documentation [22]. In 2013, Demeyer et al. proposed a guideline structure for software documentation [5]. Demeyer agreed with John et al. in their emphasis on the quality issues of software documentation [12]. ...
... A readme file is a short piece of documentation associated with most OSS. Based on the literature, software developers find it difficult to write documentation; on the other hand, they make heavy use of social media (audio or video conferencing) to communicate when developing a project, so many researchers suggest reusing developer communication history for documentation [22, 5, 8]. ...
... At first glance, the task of creating tests is straightforward: to test. Beyond that, however, tests describe the expected behavior of the production code being tested. Years ago, Demeyer et al. [6] suggested that if the tests are maintained together with the production code, their implementation is the most accurate mirror of the product specification and can be considered up-to-date documentation. Tests can thus contain a number of useful production-code metadata that can support program comprehension. ...
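As a concrete, hypothetical illustration of a test acting as documentation, a unit test can state expected behavior more precisely than prose. The JUnit 5 usage and the choice of ArrayDeque below are assumptions for illustration, not an example from the cited work:

```java
import static org.junit.jupiter.api.Assertions.*;
import org.junit.jupiter.api.Test;

// The test doubles as an executable specification of the code it exercises.
class StackSpecTest {

    @Test
    void popReturnsLastPushedElement() {
        java.util.ArrayDeque<Integer> stack = new java.util.ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        // Documents LIFO behavior: the most recently pushed element
        // is the one returned by pop().
        assertEquals(2, stack.pop());
        assertEquals(1, stack.pop());
    }

    @Test
    void popOnEmptyStackFails() {
        java.util.ArrayDeque<Integer> stack = new java.util.ArrayDeque<>();
        // Documents the error contract of this implementation.
        assertThrows(java.util.NoSuchElementException.class, stack::pop);
    }
}
```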
Preprint
Full-text available
Software testing is one of the most important Quality Assurance (QA) components. Many researchers deal with the testing process in terms of tester motivation and how tests should or should not be written. However, it is not known from these recommendations how tests are actually written in real projects. In this paper the following was investigated: (i) the denotation of the word test in different natural languages; (ii) whether the word test correlates with the presence of test cases; and (iii) which testing frameworks are mostly used. The analysis was performed on 38 GitHub open source repositories thoroughly selected from a set of 4.3M GitHub projects. We analyzed 20,340 test cases in 803 classes manually and 170k classes using an automated approach. The results show that: (i) there exists a weak correlation (r = 0.655) between the word test and the presence of test cases in a class; (ii) the proposed algorithm using static file analysis correctly detected 95% of test cases; (iii) 15% of the analyzed classes used a main() function, which represents regular Java programs that test the production code without using any third-party framework. The identification rate of such tests is very low due to implementation diversity. The results may be leveraged to more quickly identify and locate test cases in a repository, to understand practices in customized testing solutions, and to mine tests to improve program comprehension in the future.
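The static-analysis idea in the abstract can be sketched generically. The heuristics below (looking for @Test annotations and for main()-driven tests) are illustrative assumptions; the paper's actual detection algorithm is not reproduced here.

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.stream.Stream;

// Hypothetical sketch of static test-case detection in Java sources:
// flag files that use a testing-framework annotation, and separately
// flag "plain" tests driven by a main() method, as the abstract notes.
class TestCaseDetector {

    static boolean looksLikeFrameworkTest(String source) {
        return source.contains("@Test")            // JUnit / TestNG style
            || source.contains("org.junit");
    }

    static boolean looksLikeMainDrivenTest(String source, String fileName) {
        return source.contains("public static void main(")
            && fileName.toLowerCase().contains("test");
    }

    public static void main(String[] args) throws IOException {
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        try (Stream<Path> files = Files.walk(root)) {
            files.filter(p -> p.toString().endsWith(".java"))
                 .forEach(p -> {
                     try {
                         String src = Files.readString(p);
                         if (looksLikeFrameworkTest(src)) {
                             System.out.println("framework test: " + p);
                         } else if (looksLikeMainDrivenTest(src, p.getFileName().toString())) {
                             System.out.println("main()-driven test: " + p);
                         }
                     } catch (IOException e) {
                         System.err.println("skipping " + p + ": " + e.getMessage());
                     }
                 });
        }
    }
}
```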
... From the end-user point of view, such systems need to offer functionality that was entirely unforeseen when they were first conceived. From the developer point of view, they need to adapt to the new technologies that allow one to implement this functionality [8]. ...
... We cannot illustrate these two points for lack of space. ...
Chapter
Full-text available
Advanced reverse engineering tools are required to cope with the complexity of software systems and the specific requirements of numerous different tasks (re-architecting, migration, evolution). Consequently, reverse engineering tools should adapt to a wide range of situations. Yet, because they require a large infrastructure investment, being able to reuse these tools is key. Moose is a reverse engineering environment answering these requirements. While Moose started as a research project 20 years ago, it is also used in industrial projects, exposing itself to all these difficulties. In this paper we present ModMoose, the new version of Moose. ModMoose revolves around a new modular and extensible meta-model, a new toolset of generic tools (a query module, a visualization engine, ...), and an open architecture supporting the synchronization and interaction of tools per task. With ModMoose, tool developers can develop specific meta-models by reusing existing elementary concepts, and dedicated reverse engineering tools that can interact with existing ones.