Open Access. Powered by Scholars. Published by Universities.®

Software Engineering Commons

Articles 1 - 24 of 24

Full-Text Articles in Software Engineering

Nort: Runtime Anomaly-Based Monitoring Of Malicious Behavior For Windows, Narcisa Andrea Milea, Siau-Cheng Khoo, David Lo, Cristi Pop Dec 2011

David LO

Protecting running programs from exploits has been the focus of many host-based intrusion detection systems. To this end, various formal methods have been developed that either require manual construction of attack signatures or modelling of normal program behavior to detect exploits. In terms of the ability to discover new attacks before the infection spreads, the former approach has been found to be lacking in flexibility. Consequently, in this paper, we present an anomaly monitoring system, NORT, that verifies on-the-fly whether running programs comply with their expected normal behavior. The model of normal behavior is based on a rich set of …


Search-Based Fault Localization, Shaowei Wang, David Lo, Lingxiao Jiang, - Lucia, Hoong Chuin Lau Dec 2011

David LO

Many spectrum-based fault localization measures have been proposed in the literature. However, no single fault localization measure completely outperforms others: a measure which is more accurate in localizing some bugs in some programs is less accurate in localizing other bugs in other programs. This paper proposes to compose existing spectrum-based fault localization measures into an improved measure. We model the composition of various measures as an optimization problem and present a search-based approach to explore the space of many possible compositions and output a heuristically near optimal composite measure. We employ two search-based strategies including genetic algorithm and simulated annealing …
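As a rough illustration of what composing spectrum-based measures can look like, the sketch below combines two well-known measures, Tarantula and Ochiai, into a single suspiciousness score. The fixed weighted sum, the default weights, and the names `composite` and `rank_statements` are assumptions made for this example only; the abstract says the paper searches over many possible compositions with a genetic algorithm and simulated annealing, and that search is not reproduced here.

```python
import math


def tarantula(ef, ep, total_failed, total_passed):
    """Tarantula score: ef/ep = failed/passed tests that execute the statement."""
    fail_ratio = ef / total_failed if total_failed else 0.0
    pass_ratio = ep / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0


def ochiai(ef, ep, total_failed, total_passed):
    """Ochiai score; total_passed is unused but kept for a uniform signature."""
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom else 0.0


def composite(ef, ep, total_failed, total_passed, weights=(0.5, 0.5)):
    """Hypothetical composition: a fixed weighted sum of the two measures."""
    w_tarantula, w_ochiai = weights
    return (w_tarantula * tarantula(ef, ep, total_failed, total_passed)
            + w_ochiai * ochiai(ef, ep, total_failed, total_passed))


def rank_statements(spectra, total_failed, total_passed, weights=(0.5, 0.5)):
    """Return statement ids ordered from most to least suspicious.

    spectra maps a statement id to (ef, ep).
    """
    scores = {
        stmt: composite(ef, ep, total_failed, total_passed, weights)
        for stmt, (ef, ep) in spectra.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

A search-based approach would treat `weights` (or a richer space of compositions) as the quantity to optimize against programs with known faults, rather than fixing it by hand as done here.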


Towards More Accurate Retrieval Of Duplicate Bug Reports, Chengnian Sun, David Lo, Siau-Cheng Khoo, Jing Jiang Dec 2011

David LO

In a bug tracking system, different testers or users may submit multiple reports on the same bugs, referred to as duplicates, which may cost extra maintenance efforts in triaging and fixing bugs. In order to identify such duplicates accurately, in this paper we propose a retrieval function (REP) to measure the similarity between two bug reports. It fully utilizes the information available in a bug report including not only the similarity of textual content in summary and description fields, but also similarity of non-textual fields such as product, component, version, etc. For more accurate measurement of textual similarity, we extend …


Bug Signature Minimization And Fusion, David Lo, Hong Cheng, Xiaoyin Wang Dec 2011

David LO

Debugging is a time-consuming activity. To help in debugging, many approaches have been proposed to pinpoint the location of errors given labeled failures and correct executions. While such approaches have been shown to be accurate, at times the location alone is not sufficient in helping programmers understand why the bug happens and how to fix it. Furthermore, a single location might not be powerful enough to discriminate failures from correct executions. To address the above challenges, there have been recent studies on extracting bug signatures which are composed of multiple locations appearing together in a particular order signifying an occurrence …


Recommending People In Developers' Collaboration Network, Didi Surian, Nian Liu, David Lo, Hanghang Tong, Ee Peng Lim, Christos Faloutsos Dec 2011

David LO

Many software development efforts involve collaborations of developers across the globe. This is true for both open-source and closed-source development efforts. Developers collaborate on different projects of various types. As with any other teamwork endeavor, finding compatibility among members in a development team is helpful towards the realization of the team’s goal. Compatible members tend to share similar programming style and naming strategy, communicate well with one another, etc. However, finding the right person to work with is not an easy task. In this work, we extract information available from SourceForge.net, the largest database of open source software, and build developer …


Towards Succinctness In Mining Scenario-Based Specifications, David Lo, Shahar Maoz Dec 2011

David LO

Specification mining methods are used to extract candidate specifications from system execution traces. A major challenge for specification mining is succinctness. That is, in addition to the soundness, completeness, and scalable performance of the specification mining method, one is interested in producing a succinct result, which conveys a lot of information about the system under investigation but uses a short, machine and human-readable representation. In this paper we address the succinctness challenge in the context of scenario-based specification mining, whose target formalism is live sequence charts (LSC), an expressive extension of classical sequence diagrams. We do this by adapting three …


Mining Temporal Rules From Program Execution Traces, David Lo, Siau-Cheng Khoo, Chao Liu Nov 2011

David LO

Specification mining is a process of extracting specifications, often from program execution traces. These specifications can in turn be used to aid program understanding, monitoring and verification. There are a number of dynamic-analysis-based specification mining tools in the literature; however, none so far extracts past-time temporal expressions in the form of rules stating: "whenever a series of events occurs, previously another series of events has happened". Rules of this format are commonly found in practice and useful for various purposes. Most rule-based specification mining tools only mine future-time temporal expressions. Many past-time temporal rules like "whenever a resource is …


Efficient Mining Of Iterative Patterns For Software Specification Discovery, David Lo, Siau-Cheng Khoo, Chao Liu Nov 2011

David LO

Studies have shown that program comprehension takes up to 45% of software development costs. Such high costs are caused by the lack of documented specifications and further aggravated by the phenomenon of software evolution. There is a need for automated tools to extract specifications to aid program comprehension. In this paper, a novel technique to efficiently mine common software temporal patterns from traces is proposed. These patterns shed light on program behaviors, and are termed iterative patterns. They capture unique characteristics of software traces, typically not found in arbitrary sequences. Specifically, due to loops, interesting iterative patterns can occur multiple times …


Smartic: Specification Mining Architecture With Trace Filtering And Clustering, David Lo, Siau-Cheng Khoo Nov 2011

David LO

Improper management of software evolution, compounded by imprecise and changing requirements, along with the "short time to market" requirement, commonly leads to a lack of up-to-date specifications. This can result in software that is characterized by bugs, anomalies and even security threats. Software specification mining is a new technique to address this concern by inferring specifications automatically. In this paper, we propose a novel API specification mining architecture called SMArTIC (Specification Mining Architecture with Trace fIltering and Clustering) to improve the accuracy, robustness and scalability of specification miners. This architecture is constructed based on two hypotheses: (1) Erroneous traces should …


Mining Software Specifications, David Lo, Siau-Cheng Khoo Nov 2011

David LO

No abstract provided.


Model Checking In The Absence Of Code, Model And Properties, David Lo, Siau-Cheng Khoo Nov 2011

David LO

Model checking is a major approach in ensuring software correctness. It verifies a model converted from code against some formal properties. However, difficulties and programmers' reluctance to formalize formal properties have been some hurdles to its widespread industrial adoption. Also, with the advent of commercial off-the-shelf (COTS) components provided by third-party vendors, model checking is further challenged, as often only a binary version of the code is provided by vendors. Interestingly, the latest instrumentation tools like PIN and Valgrind have enabled execution traces to be collected dynamically from a running program. In this preliminary study, we investigate what can …


Mining Past-Time Temporal Rules: A Dynamic Analysis Approach, David Lo, Siau-Cheng Khoo, Chao Liu Nov 2011

David LO

No abstract provided.


Efficient Mining Of Recurrent Rules From A Sequence Database, David Lo, Siau-Cheng Khoo, Chao Liu Nov 2011

David LO

We study a novel problem of mining significant recurrent rules from a sequence database. Recurrent rules have the form "whenever a series of precedent events occurs, eventually a series of consequent events occurs". Recurrent rules are intuitive and characterize behaviors in many domains. An example is in the domain of software specifications, in which the rules capture a family of program properties beneficial to program verification and bug detection. Recurrent rules generalize existing work on sequential and episode rules by considering repeated occurrences of premise and consequent events within a sequence and across multiple sequences, and by removing the "window" …
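To make the rule form "whenever a series of precedent events occurs, eventually a series of consequent events occurs" concrete, here is a minimal sketch that checks one such rule against a small sequence database and reports how often it holds. It is a simplified reading for illustration, not the paper's mining algorithm: the function names, the way premise occurrences are counted (every position where the premise has just completed as a subsequence), and the confidence ratio are assumptions made for this example.

```python
def is_subsequence(pattern, events):
    """True if pattern occurs in events in order, not necessarily contiguously."""
    remaining = iter(events)
    return all(any(event == wanted for event in remaining) for wanted in pattern)


def rule_confidence(premise, consequent, sequences):
    """Fraction of premise completions that are eventually followed by the consequent.

    premise must be non-empty. A completion is a position i where seq[i] is the
    last premise event and the whole premise is a subsequence of seq[:i + 1].
    Returns None when the premise never occurs.
    """
    total = 0
    satisfied = 0
    for seq in sequences:
        for i, event in enumerate(seq):
            if event == premise[-1] and is_subsequence(premise, seq[:i + 1]):
                total += 1
                # "Eventually": the consequent may appear anywhere later,
                # with no window limiting how far away it can be.
                if is_subsequence(consequent, seq[i + 1:]):
                    satisfied += 1
    return satisfied / total if total else None


# Hypothetical traces of a lock API.
traces = [
    ["lock", "use", "unlock"],
    ["lock", "use"],
    ["lock", "unlock", "lock", "unlock"],
]
confidence = rule_confidence(["lock"], ["unlock"], traces)
```

In this toy database the rule "lock is eventually followed by unlock" holds at three of the four positions where `lock` occurs, both within one trace and across traces, which is the kind of repeated occurrence the abstract describes.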


Mining Modal Scenarios From Execution Traces, David Lo, Shahar Maoz, Siau-Cheng Khoo Nov 2011

David LO

Specification mining is a dynamic analysis process aimed at automatically inferring suggested specifications of a program from its execution traces. We describe a method, a framework, and a tool, for mining inter-object scenario-based specifications in the form of a UML2-compliant variant of Damm and Harel's Live Sequence Charts (LSC), which extends the classical partial order semantics of sequence diagrams with temporal liveness and symbolic class level lifelines, in order to generate compact and expressive specifications. Moreover, we use previous research work and tools developed for LSC to visualize, analyze, manipulate, test, and thus evaluate the scenario-based specifications we mine. Our …


Towards Better Quality Specification Miners, David Lo, Siau-Cheng Khoo Nov 2011

David LO

Software is often built without specification. Tools to automatically extract specification from software are needed and many techniques have been proposed. One type of these specifications – temporal API specification – is often specified in the form of an automaton (i.e., FSA/PFSA). There has been much work on mining software temporal specification using dynamic analysis techniques; i.e., analysis of software program traces. Unfortunately, the issues of scalability, robustness and accuracy of these techniques have not been comprehensively addressed. In this paper, we describe a framework that enables assessments of the performance of a specification miner in generating temporal specification of software …


Mining Modal Scenarios-Based Specifications From Execution Trace Of Reactive Systems, David Lo, Shahar Maoz, Siau-Cheng Khoo Nov 2011

David LO

Specification mining is a dynamic analysis process aimed at automatically inferring suggested specifications of a program from its execution traces. We describe a novel method, framework, and tool for mining inter-object scenario-based specifications in the form of a UML2-compliant variant of Damm and Harel's Live Sequence Charts (LSC). LSC extends the classical partial order semantics of sequence diagrams with temporal liveness and symbolic class level lifelines, in order to generate compact and expressive specifications. The output of our algorithm is a sound and complete set of statistically significant LSCs (i.e., satisfying given thresholds of support and confidence), mined from an …


Mining Patterns And Rules For Software Specification Discovery, David Lo, Siau-Cheng Khoo Nov 2011

David LO

Software specifications are often lacking, incomplete and outdated in the industry. Lacking and incomplete specifications cause various software engineering problems. Studies have shown that program comprehension takes up to 45% of software development costs. One of the root causes of the high cost is the lack of documented specifications. Also, outdated and incomplete specifications might potentially cause bugs and compatibility issues. In this paper, we describe novel data mining techniques to mine or reverse engineer these specifications from the pool of software engineering data. A large amount of software data is available for analysis. One form of software data is program …


Mining Specifications In Diversified Formats From Execution Traces, David Lo Nov 2011

David LO

Software evolves; this phenomenon causes an increase in maintenance effort, problems in comprehending the ever-changing code base, and difficulty in verifying software correctness. As software changes, often the documented specification is not updated. An outdated specification adds to the challenge of understanding the code base during maintenance tasks. Also, software changes might induce bugs, anomalies and even security threats. To address the above issues, we propose an array of specification mining techniques to mine software specifications in diversified formats from program execution traces. Case studies on various systems show that the extracted specifications shed light on the behaviors of systems under analysis. …


Data Mining For Software Engineering, Tao Xie, Suresh Thummalapenta, David Lo, Chao Liu Nov 2011

David LO

To improve software productivity and quality, software engineers are increasingly applying data mining algorithms to various software engineering tasks. However, mining SE data poses several challenges. The authors present various algorithms to effectively mine sequences, graphs, and text from such data.


Smartic: Towards Building An Accurate, Robust And Scalable Specification Miner, David Lo, Siau-Cheng Khoo Nov 2011

David LO

Improper management of software evolution, compounded by imprecise and changing requirements, along with the “short time to market” requirement, commonly leads to a lack of up-to-date specifications. This can result in software that is characterized by bugs, anomalies and even security threats. Software specification mining is a new technique to address this concern by inferring specifications automatically. In this paper, we propose a novel API specification mining architecture called SMArTIC (Specification Mining Architecture with Trace fIltering and Clustering) to improve the accuracy, robustness and scalability of specification miners. This architecture is constructed based on two hypotheses: (1) Erroneous traces …


Mining Scenario-Based Specifications With Value-Based Invariants, David Lo, Shahar Maoz Nov 2011

David LO

There have been a number of studies on mining candidate specifications from execution traces. Some extract specifications corresponding to value-based invariants, while others work on inferring ordering constraints. In this work, we merge our previous work on mining scenario-based specifications, extracting ordering constraints in the form of live sequence charts (LSC), a visual specification language, with Daikon, a tool for mining value-based invariants. The resulting approach strengthens the expressive power of the mined scenarios by enriching them with scenario-specific value-based invariants. The concept is illustrated using a preliminary case study on a real application.


Quark: Empirical Assessment Of Automaton-Based Specification Miners, David Lo, Siau-Cheng Khoo Nov 2011

David LO

Software is often built without specification. Tools to automatically extract specification from software are needed and many techniques have been proposed. One type of these specifications - temporal API specification - is often specified in the form of an automaton. There has been much work on reverse engineering or mining software temporal specification, using dynamic analysis techniques; i.e., analysis of software program traces. Unfortunately, the issues of scalability, robustness and accuracy of these techniques have not been comprehensively addressed. In this paper, we describe QUARK (QUality Assurance framewoRK) that enables assessments of the performance of a specification miner in generating temporal specification …


Specification Mining: A Concise Introduction, David Lo, Siau-Cheng Khoo, Chao Liu, Jiawei Han Nov 2011

David LO

No abstract provided.


Leveraging Fragmental Semantic Data To Enhance Services Discovery, Jian Wang, Jia Zhang, Patrick Hung, Zheng Li, Jianxiao Liu, Keqing He Aug 2011

Jia Zhang

No abstract provided.