Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 30 of 60

Full-Text Articles in Physical Sciences and Mathematics

Nort: Runtime Anomaly-Based Monitoring Of Malicious Behavior For Windows, Narcisa Andrea Milea, Siau-Cheng Khoo, David Lo, Cristi Pop Dec 2011

Nort: Runtime Anomaly-Based Monitoring Of Malicious Behavior For Windows, Narcisa Andrea Milea, Siau-Cheng Khoo, David Lo, Cristi Pop

David LO

Protecting running programs from exploits has been the focus of many host-based intrusion detection systems. To this end various formal methods have been developed that either require manual construction of attack signatures or modelling of normal program behavior to detect exploits. In terms of the ability to discover new attacks before the infection spreads, the former approach has been found to be lacking in flexibility. Consequently, in this paper, we present an anomaly monitoring system, NORT, that verifies on-the-fly whether running programs comply to their expected normal behavior. The model of normal behavior is based on a rich set of …


Search-Based Fault Localization, Shaowei Wang, David Lo, Lingxiao Jiang, - Lucia, Hoong Chuin Lau Dec 2011

Search-Based Fault Localization, Shaowei Wang, David Lo, Lingxiao Jiang, - Lucia, Hoong Chuin Lau

David LO

Many spectrum-based fault localization measures have been proposed in the literature. However, no single fault localization measure completely outperforms others: a measure which is more accurate in localizing some bugs in some programs is less accurate in localizing other bugs in other programs. This paper proposes to compose existing spectrum-based fault localization measures into an improved measure. We model the composition of various measures as an optimization problem and present a search-based approach to explore the space of many possible compositions and output a heuristically near optimal composite measure. We employ two search-based strategies including genetic algorithm and simulated annealing …


Towards More Accurate Retrieval Of Duplicate Bug Reports, Chengnian Sun, David Lo, Siau-Cheng Khoo, Jing Jiang Dec 2011

Towards More Accurate Retrieval Of Duplicate Bug Reports, Chengnian Sun, David Lo, Siau-Cheng Khoo, Jing Jiang

David LO

In a bug tracking system, different testers or users may submit multiple reports on the same bugs, referred to as duplicates, which may cost extra maintenance efforts in triaging and fixing bugs. In order to identify such duplicates accurately, in this paper we propose a retrieval function (REP) to measure the similarity between two bug reports. It fully utilizes the information available in a bug report including not only the similarity of textual content in summary and description fields, but also similarity of non-textual fields such as product, component, version, etc. For more accurate measurement of textual similarity, we extend …


Bug Signature Minimization And Fusion, David Lo, Hong Cheng, Xiaoyin Wang Dec 2011

Bug Signature Minimization And Fusion, David Lo, Hong Cheng, Xiaoyin Wang

David LO

Debugging is a time-consuming activity. To help in debugging, many approaches have been proposed to pinpoint the location of errors given labeled failures and correct executions. While such approaches have been shown to be accurate, at times the location alone is not sufficient in helping programmers understand why the bug happens and how to fix it. Furthermore, a single location might not be powerful enough to discriminate failures from correct executions. To address the above challenges, there have been recent studies on extracting bug signatures which are composed of multiple locations appearing together in a particular order signifying an occurrence …


Mining Top-K Large Structural Patterns In A Massive Network, Feida Zhu, Qiang Qu, David Lo, Xifeng Yan, Jiawei Han, Philip S. Yu Dec 2011

Mining Top-K Large Structural Patterns In A Massive Network, Feida Zhu, Qiang Qu, David Lo, Xifeng Yan, Jiawei Han, Philip S. Yu

David LO

With ever-growing popularity of social networks, web and bio-networks, mining large frequent patterns from a single huge network has become increasingly important. Yet the existing pattern mining methods cannot offer the efficiency desirable for large pattern discovery. We propose Spider- Mine, a novel algorithm to efficiently mine top-K largest frequent patterns from a single massive network with any user-specified probability of 1 − ϵ. Deviating from the existing edge-by-edge (i.e., incremental) pattern-growth framework, SpiderMine achieves its efficiency by unleashing the power of small patterns of a bounded diameter, which we call “spiders”. With the spider structure, our approach adopts a …


Recommending People In Developers' Collaboration Network, Didi Surian, Nian Liu, David Lo, Hanghang Tong, Ee Peng Lim, Christos Faloutsos Dec 2011

Recommending People In Developers' Collaboration Network, Didi Surian, Nian Liu, David Lo, Hanghang Tong, Ee Peng Lim, Christos Faloutsos

David LO

Many software developments involve collaborations of developers across the globe. This is true for both open-source and closed-source development efforts. Developers collaborate on different projects of various types. As with any other teamwork endeavors, finding compatibility among members in a development team is helpful towards the realization of the team’s goal. Compatible members tend to share similar programming style and naming strategy, communicate well with one another, etc. However, finding the right person to work with is not an easy task. In this work, we extract information available from Sourceforge.Net, the largest database of open source software, and build developer …


Towards Succinctness In Mining Scenario-Based Specifications, David Lo, Shahar Maoz Dec 2011

Towards Succinctness In Mining Scenario-Based Specifications, David Lo, Shahar Maoz

David LO

Specification mining methods are used to extract candidate specifications from system execution traces. A major challenge for specification mining is succinctness. That is, in addition to the soundness, completeness, and scalable performance of the specification mining method, one is interested in producing a succinct result, which conveys a lot of information about the system under investigation but uses a short, machine and human-readable representation. In this paper we address the succinctness challenge in the context of scenario-based specification mining, whose target formalism is live sequence charts (LSC), an expressive extension of classical sequence diagrams. We do this by adapting three …


Mining Temporal Rules From Program Execution Traces, David Lo, Siau-Cheng Khoo, Chao Liu Nov 2011

Mining Temporal Rules From Program Execution Traces, David Lo, Siau-Cheng Khoo, Chao Liu

David LO

Specification mining is a process of extracting specifications, often from program execution traces. These specifications can in turn be used to aid program understanding, monitoring and verification. There are a number of dynamic-analysis-based specification mining tools in the literature, however none so far extract past time temporal expressions in the form of rules stating: "whenever a series of events occurs, previously another series of events has happened". Rules of this format are commonly found in practice and useful for various purposes. Most rule-based specification mining tools only mine future-time temporal expression. Many past-time temporal rules like "whenever a resource is …


Efficient Mining Of Iterative Patterns For Software Specification Discovery, David Lo, Siau-Cheng Khoo, Chao Liu Nov 2011

Efficient Mining Of Iterative Patterns For Software Specification Discovery, David Lo, Siau-Cheng Khoo, Chao Liu

David LO

Studies have shown that program comprehension takes up to 45% of software development costs. Such high costs are caused by the lack-of documented specification and further aggravated by the phenomenon of software evolution. There is a need for automated tools to extract specifications to aid program comprehension. In this paper, a novel technique to efficiently mine common software temporal patterns from traces is proposed. These patterns shed light on program behaviors, and are termed iterative patterns. They capture unique characteristic of software traces, typically not found in arbitrary sequences. Specifically, due to loops, interesting iterative patterns can occur multiple times …


Smartic: Specification Mining Architecture With Trace Filtering And Clustering, David Lo, Siau-Cheng Khoo Nov 2011

Smartic: Specification Mining Architecture With Trace Filtering And Clustering, David Lo, Siau-Cheng Khoo

David LO

Improper management of software evolution, compounded by imprecise, and changing requirements, along with the "short time to market" requirement, commonly leads to a lack of up-to-date specifications. This can result in software that is characterized by bugs, anomalies and even security threats. Software specification mining is a new technique to address this concern by inferring specifications automatically. In this paper, we propose a novel API specification mining architecture called SMArTIC Specification Mining Architecture with Trace fIltering and Clustering) to improve the accuracy, robustness and scalability of specification miners. This architecture is constructed based on two hypotheses: (1) Erroneous traces should …


Mining Software Specifications, David Lo, Siau-Cheng Khoo Nov 2011

Mining Software Specifications, David Lo, Siau-Cheng Khoo

David LO

No abstract provided.


Model Checking In The Absence Of Code, Model And Properties, David Lo, Siau-Cheng Khoo Nov 2011

Model Checking In The Absence Of Code, Model And Properties, David Lo, Siau-Cheng Khoo

David LO

Model checking is a major approach in ensuring software correctness. It verifies a model converted from code against some formal properties. However, difficulties and programmers ’ reluctance to formalize formal properties have been some hurdles to its widespread industrial adoption. Also, with the advent of commercial off-the-shelf (COTS) components provided by third party vendors, model checking is further challenged as often only a binary version of the code is provided by vendors. Interestingly, latest instrumentation tools like PIN and Valgrind have enable execution traces to be collected dynamically from a running program. In this preliminary study, we investigate what can …


Mining Past-Time Temporal Rules: A Dynamic Analysis Approach, David Lo, Siau-Cheng Khoo, Chao Liu Nov 2011

Mining Past-Time Temporal Rules: A Dynamic Analysis Approach, David Lo, Siau-Cheng Khoo, Chao Liu

David LO

No abstract provided.


Efficient Mining Of Recurrent Rules From A Sequence Database, David Lo, Siau-Cheng Khoo, Chao Liu Nov 2011

Efficient Mining Of Recurrent Rules From A Sequence Database, David Lo, Siau-Cheng Khoo, Chao Liu

David LO

We study a novel problem of mining significant recurrent rules from a sequence database. Recurrent rules have the form "whenever a series of precedent events occurs, eventually a series of consequent events occurs". Recurrent rules are intuitive and characterize behaviors in many domains. An example is in the domain of software specifications, in which the rules capture a family of program properties beneficial to program verification and bug detection. Recurrent rules generalize existing work on sequential and episode rules by considering repeated occurrences of premise and consequent events within a sequence and across multiple sequences, and by removing the "window" …


Mining Modal Scenarios From Execution Traces, David Lo, Shahar Maoz, Siau-Cheng Khoo Nov 2011

Mining Modal Scenarios From Execution Traces, David Lo, Shahar Maoz, Siau-Cheng Khoo

David LO

Specification mining is a dynamic analysis process aimed at automatically inferring suggested specifications of a program from its execution traces. We describe a method, a framework, and a tool, for mining inter-object scenario-based specifications in the form of a UML2-compliant variant of Damm and Harel's Live Sequence Charts (LSC), which extends the classical partial order semantics of sequence diagrams with temporal liveness and symbolic class level lifelines, in order to generate compact and expressive specifications. Moreover, we use previous research work and tools developed for LSC to visualize, analyze, manipulate, test, and thus evaluate the scenario-based specifications we mine. Our …


Towards Better Quality Specification Miners, David Lo, Siau-Cheng Khoo Nov 2011

Towards Better Quality Specification Miners, David Lo, Siau-Cheng Khoo

David LO

Softwares are often built without specification. Tools to automatically extract specification from software are needed and many techniques have been proposed. One type of these specifications – temporal API specification – is often specified in the form of automaton (i.e., FSA/PFSA). There have been many work on mining software temporal specification using dynamic analysis techniques; i.e., analysis of software program traces. Unfortunately, the issues of scalability, robustness and accuracy of these techniques have not been comprehensively addressed. In this paper, we describe a framework that enables assessments of the performance of a specification miner in generating temporal specification of software …


Mining Modal Scenarios-Based Specifications From Execution Trace Of Reactive Systems, David Lo, Shahar Maoz, Siau-Cheng Khoo Nov 2011

Mining Modal Scenarios-Based Specifications From Execution Trace Of Reactive Systems, David Lo, Shahar Maoz, Siau-Cheng Khoo

David LO

Specification mining is a dynamic analysis process aimed at automatically inferring suggested specifications of a program from its execution traces. We describe a novel method, framework, and tool, for mining inter-object scenario-based specifications in the form of a UML2-compliant variant of Damm and Harels Live Sequence Charts (LSC). LSC extends the classical partial order semantics of sequence diagrams with temporal liveness and symbolic class level lifelines, in order to generate compact and expressive specifications. The output of our algorithm is a sound and complete set of statistically significant LSCs (i.e., satisfying given thresholds of support and confidence), mined from an …


Mining Patterns And Rules For Software Specification Discovery, David Lo, Siau-Cheng Khoo Nov 2011

Mining Patterns And Rules For Software Specification Discovery, David Lo, Siau-Cheng Khoo

David LO

Software specifications are often lacking, incomplete and outdated in the industry. Lack and incomplete specifications cause various software engineering problems. Studies have shown that program comprehension takes up to 45% of software development costs. One of the root causes of the high cost is the lack-of documented specification. Also, outdated and incomplete specification might potentially cause bugs and compatibility issues. In this paper, we describe novel data mining techniques to mine or reverse engineer these specifications from the pool of software engineering data. A large amount of software data is available for analysis. One form of software data is program …


Mining Specifications In Diversified Formats From Execution Traces, David Lo Nov 2011

Mining Specifications In Diversified Formats From Execution Traces, David Lo

David LO

Software evolves; this phenomenon causes increase in maintenance efforts, problem in comprehending the ever-changing code base and difficulty in verifying software correctness. As software changes, often the documented specification is not updated. Outdated specification adds challenge to the understanding of the code base during maintenance tasks. Also, software changes might induce bugs, anomalies and even security threats. To address the above issues, we propose an array of specification mining techniques to mine software specifications in diversified formats from program execution traces. Case studies on various systems show that the extracted specifications shed light on the behaviors of systems under analysis. …


Data Mining For Software Engineering, Tao Xie, Suresh Thummalapenta, David Lo, Chao Liu Nov 2011

Data Mining For Software Engineering, Tao Xie, Suresh Thummalapenta, David Lo, Chao Liu

David LO

To improve software productivity and quality, software engineers are increasingly applying data mining algorithms to various software engineering tasks. However, mining SE data poses several challenges. The authors present various algorithms to effectively mine sequences, graphs, and text from such data.


Smartic: Towards Building An Accurate, Robust And Scalable Specification Miner, David Lo, Siau-Cheng Khoo Nov 2011

Smartic: Towards Building An Accurate, Robust And Scalable Specification Miner, David Lo, Siau-Cheng Khoo

David LO

Improper management of software evolution, compounded by imprecise, and changing requirements, along with the “short time to market ” requirement, commonly leads to a lack of up-to-date specifications. This can result in software that is characterized by bugs, anomalies and even security threats. Software specification mining is a new technique to address this concern by inferring specifications automatically. In this paper, we propose a novel API specification mining architecture called SMArTIC (Specification Mining Architecture with Trace fIltering and Clustering) to improve the accuracy, robustness and scalability of specification miners. This architecture is constructed based on two hypotheses: (1) Erroneous traces …


Mining Scenario-Based Specifications With Value-Based Invariants, David Lo, Shahar Maoz Nov 2011

Mining Scenario-Based Specifications With Value-Based Invariants, David Lo, Shahar Maoz

David LO

There have been a number of studies on mining candidate specifications from execution traces. Some extract specifications corresponding to value-based invariants, while others work on inferring ordering constraints. In this work, we merge our previous work on mining scenario-based specifications, extracting ordering constraints in the form of live sequence charts (LSC), a visual specification language, with Daikon, a tool for mining value-based invariants. The resulting approach strengthens the expressive power of the mined scenarios by enriching them with scenario-specific value-based invariants. The concept is illustrated using a preliminary case study on a real application.


Quark : Empirical Assessment Of Automaton-Based Specification Miners, David Lo, Siau-Cheng Khoo Nov 2011

Quark : Empirical Assessment Of Automaton-Based Specification Miners, David Lo, Siau-Cheng Khoo

David LO

Software is often built without specification. Tools to automatically extract specification from software are needed and many techniques have been proposed. One type of these specifications - temporal API specification - is often specified in the form of automaton. There has been much work on reverse engineering or mining software temporal specification, using dynamic analysis techniques; i.e., analysis of software program traces. Unfortunately, the issues of scalability, robustness and accuracy of these techniques have not been comprehensively addressed. In this paper, we describe QUARK(QUality Assurance framewoRK) that enables assessments of the performance of a specification miner in generating temporal specification …


Specification Mining: A Concise Introduction, David Lo, Siau-Cheng Khoo, Chao Liu, Jiawei Han Nov 2011

Specification Mining: A Concise Introduction, David Lo, Siau-Cheng Khoo, Chao Liu, Jiawei Han

David LO

No abstract provided.


Terapixel Imaging Of Cosmological Simulations, Yu Feng, Rupert Croft, Tiziana Di Matteo, Nishikanta Khandai, Randy Sargent, Illah Nourbakhsh, Paul Dille, Chris Bartley, Volker Springel, Anirban Jana, Jeffrey Gardner Nov 2011

Terapixel Imaging Of Cosmological Simulations, Yu Feng, Rupert Croft, Tiziana Di Matteo, Nishikanta Khandai, Randy Sargent, Illah Nourbakhsh, Paul Dille, Chris Bartley, Volker Springel, Anirban Jana, Jeffrey Gardner

Randy Sargent

The increasing size of cosmological simulations has led to the need for new visualization techniques. We focus on smoothed particle hydrodynamic (SPH) simulations run with the GADGET code and describe methods for visually accessing the entire simulation at full resolution. The simulation snapshots are rastered and processed on supercomputers into images that are ready to be accessed through a Web interface (GigaPan). This allows any scientist with a Web browser to interactively explore simulation data sets in both spatial and temporal dimensions and data sets which in their native format can be hundreds of terabytes in size or more. We …


Tuning The Wcet Of Embedded Applications, Wankang Zhao, Prasad Kulkarni, David Whalley, Christopher Healy, Frank Mueller, Gang-Ryung Uh Sep 2011

Tuning The Wcet Of Embedded Applications, Wankang Zhao, Prasad Kulkarni, David Whalley, Christopher Healy, Frank Mueller, Gang-Ryung Uh

Gang-Ryung Uh

It is advantageous to not only calculate the WCET of an application, but to also perform transformations to reduce the WCET since an application with a lower WCET will be less likely to violate its timing constraints. In this paper we describe an environment consisting of an interactive compilation system and a timing analyzer, where a user can interactively tune the WCET of an application. After each optimization phase is applied, the timing analyzer is automatically invoked to calculate the WCET of the function being tuned. Thus, a user can easily gauge the progress of reducing the WCET. In addition, …


3d Visualization In Elementary Education Astronomy: Teaching Urban Second Graders About Sun, Earth, And The Moon, Zeynep Isik-Ercan, Beomjin Kim, Jeffrey Nowak Sep 2011

3d Visualization In Elementary Education Astronomy: Teaching Urban Second Graders About Sun, Earth, And The Moon, Zeynep Isik-Ercan, Beomjin Kim, Jeffrey Nowak

Zeynep Isik-Ercan

This research-in-progress hypothesizes that urban second graders can have an early understanding about the shape of Sun, Moon, and Earth, how day and night happens, and how Moon appears to change its shape by using three dimensional stereoscopic vision. The 3D stereoscopic vision system might be an effective way to teach subjects like astronomy that explains relationships among objects in space. Currently, Indiana state standards for science teaching do not suggest the teaching of these astronomical concepts explicitly before fourth grade. Yet, we expect our findings to indicate that students can learn these concepts earlier in their educational lives with …


Who's Who In Portfolio Management: An Overview Of Roles, Responsibilities And Practices, Aileen Koh Sep 2011

Who's Who In Portfolio Management: An Overview Of Roles, Responsibilities And Practices, Aileen Koh

Aileen Koh

Portfolio management has been acknowledged by the project management community as a tool for optimizing the organizational returns from project investments by improving the alignment of projects with strategy and ensuring resource sufficiency. It aims to optimize the outcomes from project investment across a portfolio and it is also regarded as the governance method for selection and prioritization of projects or programs. Organizations that do not align their project portfolio with organizational strategies and governance will tend to increase the risks of running projects that are low priority initiatives. As a result, there will be critical resource shortages, and investments …


Preprocessing Strategy For Effective Modulo Scheduling On Multi-Issue Digital Signal Processors, Doosan Cho, Ravi Ayyagari, Gang-Ryung Uh, Yunheung Paek Sep 2011

Preprocessing Strategy For Effective Modulo Scheduling On Multi-Issue Digital Signal Processors, Doosan Cho, Ravi Ayyagari, Gang-Ryung Uh, Yunheung Paek

Gang-Ryung Uh

To achieve high resource utilization for multi-issue Digital Signal Processors (DSPs), production compilers commonly include variants of the iterative modulo scheduling algorithm. However, excessive cyclic data dependences, which exist in communication and media processing loops, often prevent the modulo scheduler from achieving ideal loop initiation intervals. As a result, replicated functional units in multi-issue DSPs are frequently underutilized. In response to this resource underutilization problem, this paper describes a compiler preprocessing strategy that capitalizes on two techniques for effective modulo scheduling, referred to as cloning1 and cloning2. The core of the proposed techniques lies in the direct relaxation of cyclic …


Analyzing Dynamic Binary Instrumentation Overhead, Gang-Ryung Uh, Robert Cohn, Bharadwaj Yadavalli, Ramesh Peri, Ravi Ayyagari Sep 2011

Analyzing Dynamic Binary Instrumentation Overhead, Gang-Ryung Uh, Robert Cohn, Bharadwaj Yadavalli, Ramesh Peri, Ravi Ayyagari

Gang-Ryung Uh

Robust and powerful software instrumentation tools are essential for dynamic program analysis tasks such as profiling, performance evaluation, and bug detection. Dynamic binary instrumentation (DBI) is a general purpose technique that eases the development of program analysis tools by facilitating automatic low-level instrumentation. DBI-based program analysis can introduce high overhead and it is crucial for tool writers to minimize the cost. Analyzing the performance of instrumentation tools is challenging because most systems use a just-in-time compiler (JIT) to dynamically generate code. In this paper, we describe our method for analyzing the performance of instrumentation tools. The instrumented code is itself …