Figure 4 - uploaded by Ahmed E. Hassan
Reflexion Diagram for an Operating System

Source publication
Conference Paper
Full-text available
Maintenance of evolving software systems has become the most frequently performed activity by software developers. A good understanding of the software system is needed to reduce the cost and length of this activity. Various approaches and tools have been proposed to assist in this process such as code browsers, slicing techniques, etc. These techn...

Contexts in source publication

Context 1
... the concrete architecture is compared against the proposed conceptual architecture. Figure 4 shows a reflexion diagram which highlights the differences (gaps) between the proposed and the actual extracted dependencies among the subsystems. In this case all expected dependencies existed in the software system. ...
Context 2
... this case all expected dependencies existed in the software system. There are two unexpected dependencies; these are the dashed lines in Figure 4. ...
Context 3
... the third step, the developer investigates the discovered gaps between her/his conceptual view and the concrete (as implemented) view of the system. In particular for the example shown in Figure 4, she/he needs to uncover the reasons for: ...
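
The comparison described in these contexts can be sketched as a set difference between two dependency graphs. The following is a minimal illustration, not the authors' tool: the function name `reflexion_model`, the result labels, and the subsystem names are all hypothetical, and each dependency is assumed to be a (source, target) pair. The sample data mirrors the case described above, where every expected dependency exists and two unexpected ones appear.

```python
def reflexion_model(conceptual, concrete):
    """Compare proposed (conceptual) and extracted (concrete) dependencies.

    Both arguments are iterables of (source, target) subsystem pairs.
    The labels below are illustrative names for the three possible outcomes.
    """
    conceptual = set(conceptual)
    concrete = set(concrete)
    return {
        # Expected and found in the extracted dependencies.
        "convergences": sorted(conceptual & concrete),
        # Found in the code but not expected (the dashed lines in a diagram).
        "divergences": sorted(concrete - conceptual),
        # Expected but not found in the code.
        "absences": sorted(conceptual - concrete),
    }


# Hypothetical subsystem names, chosen only for illustration.
conceptual = {
    ("Shell", "FileSystem"),
    ("FileSystem", "DeviceDrivers"),
}
concrete = {
    ("Shell", "FileSystem"),
    ("FileSystem", "DeviceDrivers"),
    ("Shell", "DeviceDrivers"),
    ("DeviceDrivers", "Shell"),
}

result = reflexion_model(conceptual, concrete)
for kind, edges in result.items():
    print(kind, edges)
```

Each entry under `divergences` or `absences` is a gap that the developer would then investigate in the third step.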

Citations

... This work did not consider erroneous changes and focuses only on design flaws based on change coupling relations. Using the comments from the version control system, sticky notes are seen to provide useful information [8], but the concept of error is not addressed there. Furthermore, the relationship between the evolution of software artifacts and the way they are affected by problems is visualized by D'Ambros et al., but that work did not consider component-based analysis using commit history [9]. ...
... We achieve this by obtaining and contrasting the concrete and conceptual architectures of the system based on available documentation and the source code of the project. We replicate techniques from previous work [3], [15], [7], where, alongside the goals and findings of each study, the authors were able to expose many unexpected and missing dependencies between subsystems in an architecture (architectural divergences). Bowman and Brewster [3] pointed out that many of the unexpected dependencies they found in Linux could not be explained by rationale and that their occurrence is due to developers' bad practices or expediency. ...
... Figure 6 shows the contrasting model for the View and Control layer in ArgoUML; here we observe 20 divergences, surpassing the number of convergences in the model. We numbered these divergences (1) through (20) to identify them throughout our work (see Table II). The model shows 2 absences that were expected from the conceptual architecture but were not found in the file dependencies from the source code; they are Reverse Engineering to Code Generation and Notation to GUI. ...
... Moreno et al. have built ARENA [46,47], a tool which combines multiple kinds of changes, i.e., changes to source code, libraries, documentation and licenses, with issues from software repositories to generate release notes. Hassan and Holt proposed an approach, named Source Sticky Notes, to better explain the static dependencies of a software system using historical modification records [24]. ...
Conference Paper
Full-text available
Commit messages can be regarded as the documentation of software changes. These messages describe the content and purposes of changes, hence are useful for program comprehension and software maintenance. However, due to the lack of time and direct motivation, commit messages sometimes are neglected by developers. To address this problem, Jiang et al. proposed an approach (we refer to it as NMT), which leverages a neural machine translation algorithm to automatically generate short commit messages from code. The reported performance of their approach is promising, however, they did not explore why their approach performs well. Thus, in this paper, we first perform an in-depth analysis of their experimental results. We find that (1) Most of the test diffs from which NMT can generate high-quality messages are similar to one or more training diffs at the token level. (2) About 16% of the commit messages in Jiang et al.’s dataset are noisy due to being automatically generated or due to them describing repetitive trivial changes. (3) The performance of NMT declines by a large amount after removing such noisy commit messages. In addition, NMT is complicated and time-consuming. Inspired by our first finding, we proposed a simpler and faster approach, named NNGen (Nearest Neighbor Generator), to generate concise commit messages using the nearest neighbor algorithm. Our experimental results show that NNGen is over 2,600 times faster than NMT, and outperforms NMT in terms of BLEU (an accuracy measure that is widely used to evaluate machine translation systems) by 21%. Finally, we also discuss some observations for the road ahead for automated commit message generation to inspire other researchers.
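
The nearest-neighbor idea in this abstract can be sketched as follows. This is a simplified illustration, not the NNGen implementation: the abstract does not state how similarity is measured, so the whitespace tokenization and bag-of-words cosine similarity used here are assumptions, and the function names and training pairs are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(diff_text):
    """Count whitespace-separated tokens in a diff (assumed tokenization)."""
    return Counter(diff_text.split())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbor_message(new_diff, training_pairs):
    """Reuse the commit message of the most similar training diff.

    training_pairs is a non-empty list of (diff_text, commit_message) tuples.
    """
    query = bag_of_words(new_diff)
    _, message = max(
        training_pairs,
        key=lambda pair: cosine_similarity(query, bag_of_words(pair[0])),
    )
    return message


# Hypothetical training data, for illustration only.
training_pairs = [
    ("- return a + b\n+ return a - b", "Fix subtraction bug"),
    ("+ import os", "Add os import"),
]

suggested = nearest_neighbor_message("+ import sys", training_pairs)
print(suggested)
```

Because no model is trained, producing a message costs one similarity computation per training diff, which fits the abstract's description of the approach as simpler and faster than NMT.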
... Such wealth of information helps researchers and software project personnel to understand and manage the development of complex projects within estimated budget and time deadline. For example, historical information can assist developers in understanding the rationale for the current structure of the software system [3]. ...
Chapter
Full-text available
A software repository contains a wealth of valuable historical information about the overall development of a software system (the project's status, progress, and evolution). Mining software repositories (MSR) is one of the most interesting and fastest-growing fields within software engineering. It focuses on extracting and analyzing the heterogeneous data available in software repositories to uncover interesting, useful, and actionable information about software systems and projects. Using well-established data mining tools and techniques, professionals, practitioners, and researchers can explore the potential of this valuable data to better understand and manage their complicated projects and to produce highly reliable software systems delivered on time and within the estimated budget. This paper is an effort to discover problems encountered during the development of software projects and the role of mining software repositories in resolving these problems. A comparative study of data mining tools and techniques for mining software repositories is presented.
... We define development knowledge of log lines as the information that is not conveyed directly in the log lines, but hidden in the development history of the code surrounding the logging statements from which these log lines are generated. Various sources of data generated during software development, such as development history [16], [17], design rationale, concerns of the source code [18], [19] and email discussions [20], [21] are widely used in program comprehension tasks. ...
Article
Full-text available
Logs are generated by output statements that developers insert into the code. By recording the system behaviour during runtime, logs play an important role in the maintenance of large software systems. The rich nature of logs has introduced a new market of log management applications (e.g., Splunk, XpoLog and Logstash) that assist in storing, querying and analyzing logs. Moreover, recent research has demonstrated the importance of logs in operating, understanding and improving software systems. Thus log maintenance is an important task for developers. However, all too often practitioners (i.e., operators and administrators) are left without any support to help them unravel the meaning and impact of specific log lines. By spending over 100 human hours manually examining all the email threads in the mailing lists of three open source systems (Hadoop, Cassandra and Zookeeper) and performing web searches on sampled logging statements, we found 15 email inquiries and 73 inquiries from web search about different log lines. We identified five types of development knowledge that practitioners often seek from logs: meaning, cause, context, impact and solution. Due to the frequency and nature of the log lines about which real customers inquire, documenting all the log lines or identifying which ones to document is not efficient. Hence in this paper we propose an on-demand approach, which associates the development knowledge present in various development repositories (e.g., code commits and issue reports) with the log lines. Our case studies show that the derived development knowledge can be used to resolve real-life inquiries about logs.
... Several studies have utilized the information in version control systems. Version control logs are investigated to explain the rationale of dependencies and software architecture [4]. Other applications include measures of expertise [6] and social network analysis [5]. ...
Chapter
Full-text available
Ownership architecture has usually been constructed by investigating the comments at the top of source files; that is, developer names are associated with source files by examining the comments manually. If such documentation can be produced automatically, it can indicate the status of the project more promptly. This research focuses on the logs in the version control system. The data within version control logs is in a regular form, so information can be retrieved quickly. The importance of developers can also be estimated from the number of files they own and how frequently they make changes. To understand the system architecture, the directory structure of the source code can be used to identify the functional components of the system. The source files in a directory implement the same functional component, and the owners of these source files can be considered a team. Using these documents, researchers can learn the ownership architecture and more about the status of the project.
... Chen et al. [34] developed a tool called CVSSearch, which uses the CVS comments to track source code fragments. Hassan and Holt [35] introduce the idea of attaching Source Sticky Notes to static dependency graphs, which assist in better understanding the software architecture. Our approach leverages time dependence between changes to identify the foundational subsystems. ...
Data
Full-text available
Up-to-date preservation of project knowledge like developer communication and design documents is essential for the successful evolution of software systems. Ideally, all knowledge should be preserved, but since projects only have limited resources, and software systems continuously grow in scope and complexity, one needs to prioritize the subsystems and development periods for which knowledge preservation is more urgent. For example, core subsystems on which the majority of other subsystems build are obviously prime candidates for preservation, yet if these subsystems change continuously, picking a development period in which to start knowledge preservation, and for which to maintain knowledge over time, becomes very hard. This paper exploits the time dependence between code changes to automatically determine for which subsystems and development periods of a software project knowledge preservation would be most valuable. A case study on two large open source projects (PostgreSQL and FreeBSD) shows that the most valuable subsystems to preserve knowledge for are large core subsystems. However, the majority of these subsystems (1) are continuously foundational, i.e., ideally for each development period knowledge should be preserved, and (2) experience substantial changes, i.e., preserving knowledge requires substantial effort.
... Hassan proposes [Hassan 2009] a technique to predict faults in a system by applying complexity metrics on the changes that are present in the repository. Source Sticky Notes [Hassan 2004] is an approach that annotates a static dependency graph of a system with information that is extracted from the history of a system, to help developers to understand the context of the changes they are applying. DynaMine [Livshits 2005] is a tool that applies data mining techniques on version archives to find common usage patterns by analyzing co-changed methods. ...
Article
Modern software is built by teams of developers that work in a collaborative environment. The goal of this kind of development is that multiple developers can work in parallel. They can alter a set of shared artifacts and inspect and integrate the source code changes of other developers. For example, bug fixes, enhancements, new features or adaptations due to changing environment might be integrated into the system release. At a technical level, a collaborative development process is supported by version control systems. Since these version control systems allow developers to work in their own branch, merging and integration have become an integral part of the development process. These systems use automatic and advanced merging techniques to help developers to merge their modifications in the development repositories. However, these techniques do not guarantee to have a functional system. While the use of branching in the development process offers numerous advantages, the activity of merging and integrating changes is hampered by the lack of comprehensive support to assist developers in these activities. For example, the integration of changes can have an unexpected impact on the design or behavior of the system, leading to the introduction of subtle bugs. Furthermore, developers are not supported when integrating changes across branches (cherry picking), when dealing with branches that have diverged, when finding the dependencies between changes, or when assessing the potential impact of changes. In this dissertation we present an approach that aims at alleviating these problems by providing developers and, more precisely, integrators with semi-automated support for assisted integration within a branch and across branches. 
We focus on helping integrators with their information needs when understanding and integrating changes by means of characterizations of changes and streams of changes (i.e., sequence of successive changes within a branch) together with their dependencies. These characterizations rely on the first-class representation of systems' histories and changes based on program entities and their relationships rather than on files and text. For this, we provide a family of meta-models (Ring, RingH, RingS and RingC) that offer us the representation of program entities, systems' histories, changes and their dependencies, along with analyses for version comparison, and change and dependency identification. Instances of these meta-models are then used by our proposed tool support to enable integrators to analyze the characterizations and changes. Torch, a visual tool, and JET, a set of tools, actually provide the information needs to assist integration within a branch and across branches by means of the characterization of changes and streams of changes respectively.
... This approach was batch-oriented where a model was defined, the mappings to the code created and an analysis tool was executed to give feedback to the user at periodic intervals. Over the years several enhancements have been suggested for this approach, including hierarchical Reflexion Modeling [19] and augmenting Reflexion Model information with information derived from CVS repositories [26]. ...
Article
Full-text available
Architecting software systems is an integral part of the software development lifecycle. However, often the implementation of the resultant software ends up diverging from the designed architecture due to factors such as time pressures on the development team during implementation/evolution, or the lack of architectural awareness on the part of (possibly new) programmers. In such circumstances, the quality requirements addressed by the as-designed architecture are likely to be unaddressed by the as-implemented system. This paper reports on in-vivo case studies of the ACTool, a tool which supports real-time Reflexion Modeling for architecture recovery and on-going consistency. It describes our experience conducting architectural recovery sessions on three deployed, commercial software systems in two companies with the tool, as a first step towards ongoing architecture consistency in these systems. Our findings provide the first in-depth characterization of real-time Reflexion-based architectural recovery in practice, highlighting the architectural recovery agendas at play, the modeling approaches employed, the mapping approaches employed and characterizing the inconsistencies encountered. Our findings also discuss the usefulness of the ACTool for these companies.