Scalable program recognition for knowledge-based reverse engineering
Reverse engineering is the process of reconstructing high-level design information from lower-level information such as program code. Reverse engineering and re-engineering (a reverse engineering step followed by a forward engineering step) have become pressing needs for many organisations as existing legacy code no longer meets their requirements. Program understanding plays an important role in any reverse engineering activity, since the user (typically a maintainer) must reconstruct the programmer's cognitive conceptualisation in order to understand and change the existing system. There have been some attempts in artificial intelligence (AI) to automatically "understand" a program in terms of a set of pre-defined plans. Most of these attempts have encountered problems of scalability and brittleness when applied to real-world problems such as reverse engineering.
Software engineering is different from traditional engineering disciplines. A software system is a cognitive artifact and is intimately tied to the conceptualisations of the human programmer. Hence techniques such as formal analysis, originally developed in traditional engineering disciplines, do not adapt well to software engineering. Attempts to remedy the problems faced by software engineering, such as the "software crisis", have yielded only limited results. It is necessary to acknowledge that software engineering is primarily a cognitive process, and to develop tools and techniques that address its problems from this perspective.
In this thesis, we present a human-centered software reverse engineering environment using a scalable, robust program recognition technique based on granularity. The granularity-based formalism as used in SCENT is extended with additional types of constraints and context modifiers to make program recognition efficient.
Granularity-based program recognition overcomes many limitations of the traditional approaches by keeping the human expert who uses the system "in the loop" of problem solving at all times. The agenda-based recognition method presented here is flexible enough to use various sources of information to guide the system. A prototype system called KARE (Knowledge-based Assistant for Reverse Engineering) is implemented using this approach. Three experiments were conducted with KARE on two real-world software systems. The results provide evidence that granularity mechanisms such as context modifiers, together with appropriate human intervention, help make KARE scalable. The thesis describes KARE and our experience of using it on real-world software systems, along with experimental evidence that demonstrates the usefulness of the approach.