Figure 11 - uploaded by Godmar Back
Most frequent grammar in MVEL heap dump.  

Source publication
Conference Paper
Full-text available
Memory leaks are caused by software programs that prevent the reclamation of memory that is no longer in use. They can cause significant slowdowns, exhaustion of available storage space and, eventually, application crashes. Detecting memory leaks is challenging because real-world applications are built on multiple layers of software frameworks, m...

Context in source publication

Context 1
... the resulting heap dump showed the grammar in Figure 11, which contains a recursive production RegExMatch → nextASTNode RegExMatch. This mined grammar mirrors the synthetic linked list discussed in Section 4.2. ...
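To illustrate why a linked list shows up as a recursive production, here is a toy sketch. It is not the paper's grammar-mining algorithm: the `RegExMatch` class, its `next_ast_node` field, and the `count_productions` helper are hypothetical stand-ins that only count "owner type → field → target type" edges in a small object graph.

```python
from collections import Counter


class RegExMatch:
    """Hypothetical stand-in for a node that references its successor."""

    def __init__(self, next_ast_node=None):
        self.next_ast_node = next_ast_node


def count_productions(root):
    """Count (owner type, field, target type) edges reachable from root."""
    counts = Counter()
    seen = set()
    stack = [root]
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        for field, target in vars(obj).items():
            # Only follow references to other RegExMatch objects in this toy model.
            if isinstance(target, RegExMatch):
                key = (type(obj).__name__, field, type(target).__name__)
                counts[key] += 1
                stack.append(target)
    return counts


# Build a chain of four nodes: head -> n3 -> n2 -> n1 -> None.
head = None
for _ in range(4):
    head = RegExMatch(head)

productions = count_productions(head)
for (owner, field, target), n in productions.items():
    print(f"{owner} -> {field} {target}  (seen {n} times)")
```

Because every edge in the chain has the same owner type, field, and target type, the only production that appears is self-referential, which is the shape of the recursive rule quoted above.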

Similar publications

Conference Paper
Full-text available
Recently many data types arising from data mining and Web search applications can be modeled as bipartite graphs. Examples include queries and URLs in query logs, and authors and papers in scientific literature. However, one of the issues is that previous algorithms only consider the content and link information from one side of the bipartite gra...
Thesis
Full-text available
Metal-organic frameworks (MOFs) that experience stimuli-induced structural transformation could enable a whole new class of materials with remarkable properties. Photoactuating moieties in the structure could effect changes in the pore space or macroscale shape change, enabling light-driven gas separation and actuators. Here, we present a novel appro...

Citations

... In the literature, research on memory bloat focusses more on the diagnosis of bloat, especially the bloat caused by memory leaks [4,18]. One typical way for memory leak diagnosis is to investigate the memory occupations and growths of data structures, for example, Ref. [6,7,19,20]. Another is to watch the object lifetime information, such as the object staleness, for example, Ref. [9,21-23]. An object is considered stale if it is alive but has not been used for a long time. ...
Article
Full-text available
Memory bloat frequently occurs in web applications. It affects system performance and may even cause out‐of‐memory crashes. For web applications, testers often do performance testing that repeatedly runs test scripts to reveal potential memory bloats. Under that kind of testing, without guidance to determine the running order of test scripts, time may be wasted on testing with those non‐bloat‐inducing scripts. To address the problem, a test script prioritisation approach is proposed for the testing of memory bloat in Java web applications. The approach predicts which test scripts are more likely to make the underlying web application exhibit memory bloat phenomena by using a learning‐to‐rank technique. With this prediction, the execution of test scripts can be prioritised, and the revealing of memory bloat can thereby be accelerated. The experiments on a group of web applications obtained from Github and SourceForge show that the proposed prioritisation approach is effective.
... Second, experimenting with real data from our company, with the goal of using our data-centric approaches in production, revealed several insights into why it is difficult to get techniques such as Subgroup Discovery adopted: adoption requires a perfect combination of the three SD axes, along with interactivity, prior knowledge and feedback integration, visualization tools, etc. We have already started working on other applications linked to the incident management procedure, namely, crash deduplication [21] and early detection of JVM memory leaks [33]. These are two critical problems for Infologic and here again, we believe that Subgroup Discovery has a high potential w.r.t. the state of the art. ...
Preprint
Full-text available
The genuine supervision of modern IT systems brings new challenges as it requires higher standards of scalability, reliability and efficiency when analysing and monitoring big data streams. Rule-based inference engines are a key component of maintenance systems in detecting anomalies and automating their resolution. However, they remain confined to simple and general rules and cannot handle the huge amount of data, nor the large number of alerts raised by IT systems, a lesson learned from the expert systems era. Artificial Intelligence for Operation Systems (AIOps) proposes to take advantage of advanced analytics and machine learning on big data to improve and automate every step of supervision systems and aid incident management in detecting outages, identifying root causes and applying appropriate healing actions. Nevertheless, the best AIOps techniques rely on opaque models, strongly limiting their adoption. As a part of this PhD thesis, we study how Subgroup Discovery can help AIOps. This promising data mining technique offers possibilities to extract interesting hypotheses from data and understand the underlying process behind predictive models. To ensure the relevance of our propositions, this project involves both data mining researchers and practitioners from Infologic, a French software editor.
... The dominator tree from s allows one to answer several reachability under failure queries, such as "are there two edge-(or vertex-) disjoint paths from s to v?" and "is there a path from s to v avoiding a vertex x (or an edge e)?", in asymptotically optimal time. The notion of dominators has been widely used in domains like circuit testing [7], theoretical biology [5], memory profiling [44], constraint programming [46], connectivity [27], just to state some. Due to their numerous applications, dominators have been extensively studied for over four decades [4,37,41,51] and several linear-time algorithms for computing dominator trees are known [6,14,15,29]. ...
Preprint
Full-text available
In this paper we present an efficient reachability oracle under single-edge or single-vertex failures for planar directed graphs. Specifically, we show that a planar digraph $G$ can be preprocessed in $O(n\log^2{n}/\log\log{n})$ time, producing an $O(n\log{n})$-space data structure that can answer in $O(\log{n})$ time whether $u$ can reach $v$ in $G$ if the vertex $x$ (the edge $f$) is removed from $G$, for any query vertices $u,v$ and failed vertex $x$ (failed edge $f$). To the best of our knowledge, this is the first data structure for planar directed graphs with nearly optimal preprocessing time that answers all-pairs queries under any kind of failures in polylogarithmic time. We also consider 2-reachability problems, where we are given a planar digraph $G$ and we wish to determine if there are two vertex-disjoint (edge-disjoint) paths from $u$ to $v$, for query vertices $u,v$. In this setting we provide a nearly optimal 2-reachability oracle, which is the existential variant of the reachability oracle under single failures, with the following bounds. We can construct in $O(n\log^{O(1)}{n})$ time an $O(n\log^{3+o(1)}{n})$-space data structure that can check in $O(\log^{2+o(1)}{n})$ time for any query vertices $u,v$ whether $v$ is 2-reachable from $u$, or otherwise find some separating vertex (edge) $x$ lying on all paths from $u$ to $v$ in $G$. To obtain our results, we follow the general recursive approach of Thorup for reachability in planar graphs [J. ACM '04] and we present new data structures which generalize dominator trees and previous data structures for strong-connectivity under failures [Georgiadis et al., SODA '17]. Our new data structures work also for general digraphs and may be of independent interest.
... The dominator tree from s allows one to answer several reachability under failure queries, such as "are there two edge-(or vertex-) disjoint paths from s to v?" and "is there a path from s to v avoiding a vertex x (or an edge e)?", in asymptotically optimal time. The notion of dominators has been widely used in domains like circuit testing [7], theoretical biology [5], memory profiling [42], constraint programming [44], connectivity [27], just to state some. Due to their numerous applications, dominators have been extensively studied for over four decades [4,35,39,48] and several linear-time algorithms for computing dominator trees are known [6,13,14,30]. ...
... Several works have been proposed in the literature that are aimed at detecting and fixing memory leaks in managed runtime environments based on virtual machines. In the context of Java applications, several techniques based on the analysis of the trend of the size of the heap memory [31]-[33], or, more specifically, on the analysis of the objects stored in the heap memory [34]-[36] have been proposed. In particular, several techniques which analyze the staleness of the heap objects have been proposed [37]-[43]. ...
Article
Full-text available
Memory leaks represent a remarkable problem for mobile app developers since a waste of memory due to bad programming practices may reduce the available memory of the device, slow down the apps, reduce their responsiveness and, in the worst cases, cause the app to crash. A common cause of memory leaks in the specific context of Android apps is the bad handling of the events tied to the Activity Lifecycle. In order to detect and characterize these memory leaks, we present FunesDroid, a tool-supported black box technique for the automatic detection of memory leaks tied to the Activity Lifecycle in Android apps. FunesDroid implements a testing approach that can find memory leaks by analyzing unnecessary heap object replications after the execution of three different sequences of Activity Lifecycle events. In the paper, we present an exploratory study that shows the capability of the proposed technique to detect memory leaks and to characterize them in terms of their size, persistence and growth trend. The study also illustrates how memory leak causes can be detected with the support of the information provided by the FunesDroid tool.
... Offline approaches that collect information about an application for later analysis, separated into approaches that (a) analyze heap dumps and other kinds of captured state [22,27-29]. Compared to online approaches, offline approaches often perform more complicated analyses based on the object reference graph, involving graph reduction, graph mining and ownership analysis. ...
Conference Paper
Modern memory monitoring tools do not only offer analyses at a single point in time, but also offer features to analyze the memory evolution over time. These features provide more detailed insights into an application's behavior, yet they also make the tools more complex and harder to use. Analyses over time are typically performed on certain time windows within which the application behaves abnormally. Such suspicious time windows first have to be detected by the users, which is a non-trivial task, especially for novice users that have no experience in memory monitoring. In this paper, we present algorithms to automatically detect suspicious time windows that exhibit (1) continuous memory growth, (2) high GC utilization, or (3) high memory churn. For each of these problems we also discuss its root causes and implications. To show the feasibility of our detection techniques, we integrated them into AntTracks, a memory monitoring tool developed by us. Throughout the paper, we present their usage on various problems and real-world applications.
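As a rough illustration of what detecting a window of "continuous memory growth" could look like, the sketch below finds maximal runs of strictly increasing heap-size samples. This is an assumption-laden toy, not the AntTracks algorithm: the `growth_windows` function, the `min_len` threshold, and the sample values are all hypothetical.

```python
def growth_windows(samples, min_len=3):
    """Return (start, end) index pairs (end inclusive) of maximal runs in which
    every sample is strictly larger than the one before it.

    Runs shorter than min_len samples are ignored.
    """
    windows = []
    start = 0
    for i in range(1, len(samples) + 1):
        # A run ends at the end of the data or when growth stops.
        if i == len(samples) or samples[i] <= samples[i - 1]:
            if i - start >= min_len:
                windows.append((start, i - 1))
            start = i
    return windows


# Hypothetical heap sizes (in MB) measured after successive garbage collections.
heap_after_gc = [10, 12, 15, 19, 11, 11, 13, 14, 18, 25]
suspicious = growth_windows(heap_after_gc)
print(suspicious)
```

A real tool would likely tolerate small dips and weigh the growth rate; the strict comparison here is only the simplest possible criterion.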
... A flow graph G = (V, E, s) is a directed graph (digraph) with a distinguished start vertex [2,8,14,15]. The dominator tree is a central tool in program optimization and code generation [11], and it has many applications in other diverse areas including constraint programming [40], circuit testing [4], biology [1,29], memory profiling [38], the analysis of diffusion networks [28], and in connectivity problems [17,18,21,22,24,31,32,33,34]. ...
Conference Paper
Full-text available
We consider practical algorithms for maintaining the dominator tree and a low-high order in directed acyclic graphs (DAGs) subject to dynamic operations. Let G be a directed graph with a distinguished start vertex s. The dominator tree D of G is a tree rooted at s, such that a vertex v is an ancestor of a vertex w if and only if all paths from s to w in G include v. The dominator tree is a central tool in program optimization and code generation, and has many applications in other diverse areas including constraint programming, circuit testing, biology, and in algorithms for graph connectivity problems. A low-high order of G is a preorder of D that certifies the correctness of D, and has further applications in connectivity and path-determination problems. We first provide a practical and carefully engineered version of a recent algorithm [ICALP 2017] for maintaining the dominator tree of a DAG through a sequence of edge deletions. The algorithm runs in O(mn) total time and O(m) space, where n is the number of vertices and m is the number of edges before any deletion. In addition, we present a new algorithm that maintains a low-high order of a DAG under edge deletions within the same bounds. Both results extend to the case of reducible graphs (a class that includes DAGs). Furthermore, we present a fully dynamic algorithm for maintaining the dominator tree of a DAG under an intermixed sequence of edge insertions and deletions. Although it does not maintain the O(mn) worst-case bound of the decremental algorithm, our experiments highlight that the fully dynamic algorithm performs very well in practice. Finally, we study the practical efficiency of all our algorithms by conducting an extensive experimental study on real-world and synthetic graphs.
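The dominator definition in this abstract can be made concrete with a small sketch. The code below uses the classic iterative set-intersection method, which is simple but far slower than the linear-time or dynamic algorithms the abstract refers to; the function names and the example graph are illustrative assumptions.

```python
def dominators(graph, start):
    """Return {v: set of vertices on every path from start to v}
    for all vertices reachable from start.

    graph maps each vertex to an iterable of its successors.
    """
    # Collect the vertices reachable from start.
    reachable, seen, stack = [], {start}, [start]
    while stack:
        v = stack.pop()
        reachable.append(v)
        for w in graph.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)

    # Predecessor lists restricted to reachable vertices.
    preds = {v: [] for v in reachable}
    for v in reachable:
        for w in graph.get(v, ()):
            preds[w].append(v)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {v: set(reachable) for v in reachable}
    dom[start] = {start}
    changed = True
    while changed:
        changed = False
        for v in reachable:
            if v == start:
                continue
            new = set.intersection(*(dom[p] for p in preds[v])) | {v}
            if new != dom[v]:
                dom[v] = new
                changed = True
    return dom


def immediate_dominators(dom, start):
    """Parent of each vertex in the dominator tree: its closest strict dominator."""
    idom = {}
    for v, ds in dom.items():
        if v != start:
            # Strict dominators form a chain; the closest has the largest set.
            idom[v] = max(ds - {v}, key=lambda u: len(dom[u]))
    return idom


# Diamond-shaped example: s reaches c via a or b, then c reaches d.
example = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"]}
dom = dominators(example, "s")
idom = immediate_dominators(dom, "s")
print(idom)
```

In the example, neither `a` nor `b` dominates `c` because each can be bypassed through the other, so the parent of `c` in the dominator tree is `s`, while `d` hangs below `c`.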
... These metrics range from simple absolute count differences between allocations and deallocations [5] to more complex ones based on the structure of the object reference graph [12,13]. (2) Offline approaches that collect information about an application for later analysis, separated into approaches that (a) analyze heap dumps as well as other kinds of captured state [15,18-20]. Compared to online approaches, offline approaches often perform more complicated analyses based on the object reference graph, involving graph reduction, graph mining and ownership analysis. ...
Conference Paper
Memory leaks are a major threat in modern software systems. They occur if objects are unintentionally kept alive longer than necessary and are often indicated by continuously growing data structures. While there are various state-of-the-art memory monitoring tools, most of them share two critical shortcomings: (1) They have no knowledge about the monitored application's data structures and (2) they support no or only rudimentary analysis of the application's data structures over time. This paper encompasses novel techniques to tackle both of these drawbacks. It presents a domain-specific language (DSL) that allows users to describe arbitrary data structures, as well as an algorithm to detect instances of these data structures in reconstructed heaps. In addition, we propose techniques and metrics to analyze and measure the evolution of data structure instances over time. This allows us to identify those instances that are most likely involved in a memory leak. These concepts have been integrated into AntTracks, a trace-based memory monitoring tool. We present our approach to detect memory leaks in several real-world applications, showing its applicability and feasibility.
... Subsequently, several linear-time algorithms were discovered [5,17,35,38]. The problem of finding dominators has been extensively studied, as it occurs in several applications, including program optimization and code generation [25], constraint programming [84], circuit testing [6], theoretical biology [3], memory profiling [78], fault-tolerant computing [7,8], connectivity and path-determination problems [40,41,47,48,54,65,69,70], and the analysis of diffusion networks [50]. ...
Thesis
Full-text available
Some of the most basic connectivity concepts in directed graphs are reachability and strong connectivity. The concept of strong connectivity naturally extends to the 2-connectivity in directed graphs. There exist two main concepts in 2-connectivity in directed graphs, namely the maximal 2-connected subgraphs, and the 2-connected components. A maximal 2-edge-connected subgraph is a maximal subgraph that remains strongly connected after the removal of any of its edges. The pairwise version of 2-connectivity is defined as follows. We say that two vertices are 2-edge-connected if the removal of any edge leaves them in the same strongly connected component. A 2-edge-connected component of a directed graph is a maximal set of vertices such that all pairs of vertices are 2-edge-connected. Similar definitions can be given for 2-vertex-connectivity. In this thesis, we presented algorithms for basic connectivity concepts in directed graphs. In particular, we presented algorithms for reachability, strong connectivity, and 2-connectivity in different settings. We begin by presenting new algorithms for computing the maximal 2-edge- and 2-vertex-connected subgraphs in directed graphs in O(m^{3/2}) time, where m and n are respectively the number of edges and vertices of the graph. That improves the previously best-known O(n^2) bound for sparse graphs. We presented an optimal data structure that can answer strong connectivity queries under edge or vertex failures. More specifically, after O(m+n) time preprocessing, we build a O(n)-size data structure such that, given a failing edge e, it can answer various strong connectivity queries in G\e, such as reporting the strongly connected components, or their number, or the size of the largest strongly connected component, and more. All of the queries are answered in asymptotically optimal time. The same bounds apply with respect to vertex failures.
We further study connectivity problems in the dynamic setting, where the goal is to maintain a solution to the problems at hand while the graph undergoes edge updates. First, we presented an algorithm that can maintain the strongly connected components and single-source reachability information under any sequence of edge deletions in a total of O(m \sqrt{n log n}) time. Our algorithm achieves a sharp improvement over the previously best-known algorithm for both of these problems that runs in a total of O(mn^{0.9+o(1)}) time. The second problem that we study in the dynamic setting is the maintenance of a dominator tree under any sequence of edge deletions. We presented an algorithm that runs in a total of O(mn log n) time. This is the first dynamic algorithm for these problems that beats the naive algorithms that simply recompute from scratch the solution after each update. Our algorithm also maintains various 2-connectivity notions. We further give a conditional lower bound that provides evidence that these running times may be tight up to subpolynomial factors. Finally, we presented algorithms for maintaining the 2-edge-connected components under any sequence of edge insertions in a total of O(mn) time. Again, this is the first algorithm to beat the trivial algorithm that recomputes from scratch the solution after every update. Also here, we give a conditional lower bound showing that shaving polynomial factors from our bound has interesting consequences.
... There are many other areas in which dominators are used, such as circuit testing [4], constraint programming [30], connectivity and path-determination problems [12], [13], [23], memory profiling [28], theoretical biology [2], and the analysis of diffusion networks [22]. ...
Conference Paper
The visualization of large graphs in interactive applications, specifically on small devices, can make it harder to understand and analyze the displayed information. We show how simple topological properties of the graph enable an efficient automatic computation of features that improve the "readability" of a large graph through a proper selection of the displayed information. The connectivity (existence of a path) is a very intuitive structural property of a network; in this paper we propose an approach to the visualization of a network based on connectivity and related concepts as effective tools for visual analysis. In particular, given a root vertex $r$ and a target vertex $t$, it is possible to check at a glance if there are some dominators, i.e., mandatory vertices that are on every path from $r$ to $t$. Furthermore, using a recent graph algorithm from Georgiadis and Tarjan, by selecting a target vertex it is possible to see two distinct paths from $r$ to $t$: the paths are vertex-disjoint if there are no dominators from $r$ to $t$; otherwise the paths have only the dominators in common. We conclude by presenting, as a relevant case study that motivated our work, how this approach improves a personalized eLearning application. In a framework for dynamic configuration of paths of learning activities for both individual and group education, we can add visual analysis capabilities for both the final user/learner and the administrator of a repository.