Tao Xie's Research Interests


Automated Software Engineering Research Group 
The Yangtse Project on Automated Software Testing in the Absence of Specification
The Mose Project on Mining Open Source Software Engineering Data

Tao Xie's research centers on two major themes: automated software testing and mining software engineering data. His research also covers software security testing and analysis, testing and analysis of aspect-oriented programs, testing and analysis of web services and applications, testing and analysis of software designs, software verification, and software evolution. The papers that Tao Xie has published in these areas are listed below. (Superseded papers are not listed.)

Current funding: NSF SoD (3 yrs), NSF CyberTrust (3 yrs), IBM Faculty Award (gift), Microsoft Research (gift), ABB Research (gift, gift)
Past funding: NSF CSR (1 yr), ARO STIR (9 months), CACC (1 yr, 1 yr), NCSU FRPD (1 yr)
Funded projects:
     Improving Software Productivity and Quality via Mining Program Source Code, funded by NSF CSR and ARO STIR
     Software Testing and Analysis for Software Evolution, funded by NSF SoD
     Testing and Verification of Security Policies, funded by NSF CyberTrust


Released Tools: Stabilizer  Jov  Jusc  Romant

Test Generation

DSD-Crasher: A hybrid analysis tool for bug finding (TOSEM 2008, ACM Transactions on Software Engineering and Methodology, 2008)
This paper presents DSD-Crasher, a bug-finding tool that follows a three-step approach to program analysis. (D) Capture the program's intended execution behavior with dynamic invariant detection; the derived invariants exclude many unwanted values from the program's input domain. (S) Statically analyze the program within the restricted input domain to explore many paths. (D) Automatically generate test cases that focus on reproducing the predictions of the static analysis; results confirmed in this way are feasible. This three-step approach yields benefits compared to past two-step combinations in the literature.
Evacon: A Framework for Integrating Evolutionary and Concolic Testing for Object-Oriented Programs (ASE 2007, Proceedings of the 22nd IEEE/ACM International Conference on Automated Software Engineering, Short Paper, 2007)
This paper presents a novel framework called Evacon that integrates evolutionary testing (used to search for desirable method sequences) and concolic testing (used to generate desirable method arguments). The experimental results show that the tests generated using our framework can achieve higher branch coverage than evolutionary testing or concolic testing alone.
UnitPlus: Assisting Developer Testing in Eclipse (ETX 2007, Proceedings of the Eclipse Technology eXchange Workshop at OOPSLA 2007, 2007)
This paper presents an Eclipse plugin for JUnit test cases, called UnitPlus, that helps developers write test code in unit test cases more efficiently. It runs in the background and recommends test-code pieces for developers to choose (and revise when needed) to put in test oracles or test inputs. The recommendation is based on static analysis of the class under test and already written unit test cases.
An Empirical Comparison of Automated Generation and Classification Techniques for Object-Oriented Unit Testing (ASE 2006, Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering, 2006)
This paper presents an empirical study that compares all four combinations of two test-generation techniques and two test-classification techniques. The results show that the techniques are complementary (i.e., detect different faults) and illustrate their respective strengths and weaknesses.
Tool-Assisted Unit-Test Generation and Selection Based on Operational Abstractions (ASE Journal 2006, Automated Software Engineering Journal, A special issue of selected papers from the ASE 2003 conference, 2006)    
This paper proposes the operational violation approach for unit-test generation and selection, a black-box approach that does not require a priori specifications. The approach dynamically generates operational abstractions from executions of the existing unit test suite; these abstractions guide test generation tools to generate tests that violate them. The approach then selects the generated tests that violate operational abstractions for inspection.
APTE: Automated Pointcut Testing for AspectJ Programs (WTAOP 2006, Proceedings of the 2nd Workshop on Testing Aspect-Oriented Programs, 2006)
This paper proposes APTE, an automated framework that tests pointcuts in AspectJ programs with the help of AJTE, an existing unit-testing framework without weaving. Our new APTE framework identifies joinpoints that satisfy a pointcut expression and a set of boundary joinpoints.
Substra: A Framework for Automatic Generation of Integration Tests (AST 2006, Proceedings of the 1st Workshop on Automation of Software Test, 2006)
This paper proposes Substra, a framework for automatic integration test generation based on call-sequence constraints inferred from dynamic executions. Constraints are inferred based on two types of information: shared subsystem states and object def-use relationship.
A Framework and Tool Supports for Generating Test Inputs of AspectJ Programs (AOSD 2006, Proceedings of the 5th International Conference on Aspect-Oriented Software Development, 2006)
This paper proposes Aspectra, a framework for generating test inputs to exercise aspectual behavior in AspectJ programs. The framework includes a wrapper mechanism that leverages existing test-generation tools to generate test inputs for the classes woven with aspects. It also defines and measures aspectual branch coverage (branch coverage within aspects) and interaction coverage to guide test generation.
Symstra: A Framework for Generating Object-Oriented Unit Tests using Symbolic Execution (TACAS 2005, Proceedings of the 11th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, 2005)
Generating object-oriented unit tests involves two tasks: generating method sequences that build relevant receiver object states and generating relevant method arguments. This paper proposes Symstra, a framework that achieves test generation tasks using symbolic execution of method sequences with symbolic arguments.
Rostra: A Framework for Detecting Redundant Object-Oriented Unit Tests (ASE 2004, Proceedings of the 19th IEEE International Conference on Automated Software Engineering, 2004)
This paper proposes Rostra, a framework for detecting redundant tests among automatically generated object-oriented tests. Removing redundant tests does not reduce the test suite quality. It also develops a test generation approach based on concrete-state exploration that generates only non-redundant tests.
Mutually Enhancing Test Generation and Specification Inference (FATES 03, Proceedings of the 3rd International Workshop on Formal Approaches to Testing of Software, 2003)
This paper proposes an approach that integrates dynamic specification inference and test generation. The approach mutually enhances the tests and specifications that are generated by iteratively applying each in a feedback loop. 

Test Selection

Applying Interface-Contract Mutation in Regression Testing of Component-Based Software (ICSM 2007, Proceedings of the 23rd International Conference on Software Maintenance, 2007) 
This paper presents an approach that applies mutation to interface contracts, which can describe the rights and obligations between component users and providers, to simulate the faults that may occur in component-based software development. The mutation adequacy score for killing the mutants of interface contracts can serve as a test adequacy criterion. We performed an experimental study on three subject systems to evaluate the proposed approach together with four other criteria.
An Empirical Comparison of Automated Generation and Classification Techniques for Object-Oriented Unit Testing (ASE 2006, Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering, 2006)
This paper presents an empirical study that compares all four combinations of two test-generation techniques and two test-classification techniques. The results show that the techniques are complementary (i.e., detect different faults) and illustrate their respective strengths and weaknesses.
Tool-Assisted Unit-Test Generation and Selection Based on Operational Abstractions (ASE Journal 2006, Automated Software Engineering Journal, A special issue of selected papers from the ASE 2003 conference, 2006)    
This paper proposes the operational violation approach for unit-test generation and selection, a black-box approach that does not require a priori specifications. The approach dynamically generates operational abstractions from executions of the existing unit test suite; these abstractions guide test generation tools to generate tests that violate them. The approach then selects the generated tests that violate operational abstractions for inspection.
APTE: Automated Pointcut Testing for AspectJ Programs (WTAOP 2006, Proceedings of the 2nd Workshop on Testing Aspect-Oriented Programs, 2006)
This paper proposes APTE, an automated framework that tests pointcuts in AspectJ programs with the help of AJTE, an existing unit-testing framework without weaving. Our new APTE framework identifies joinpoints that satisfy a pointcut expression and a set of boundary joinpoints.
Towards Regression Test Selection for Aspect-Oriented Programs (WTAOP 2006, Proceedings of the 2nd Workshop on Testing Aspect-Oriented Programs, 2006)
This paper presents a regression test selection technique for AspectJ programs. The technique is based on various types of control flow graphs that can be used to select from the original test suite test cases that execute changed code for the new version of the AspectJ program.
Automatically Identifying Special and Common Unit Tests for Object-Oriented Programs (ISSRE 2005, Proceedings of the 16th IEEE International Symposium on Software Reliability Engineering, 2005)    
This paper presents a new approach for automatically identifying special and common unit tests for a class without requiring any specification. The approach is based on statistical algebraic abstractions, program properties (in the form of algebraic specifications) dynamically inferred based on a set of predefined abstraction templates.
Rostra: A Framework for Detecting Redundant Object-Oriented Unit Tests (ASE 2004, Proceedings of the 19th IEEE International Conference on Automated Software Engineering, 2004)
This paper proposes Rostra, a framework for detecting redundant tests among automatically generated object-oriented tests. Removing redundant tests does not reduce the test suite quality. It also develops a test generation approach based on concrete-state exploration that generates only non-redundant tests.
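
The redundancy idea behind Rostra can be illustrated with a small sketch. The code below is not the Rostra tool. It assumes, for illustration only, that a test is a list of method calls on a fresh object, and that a test is redundant when every (receiver state, method, arguments) triple it exercises has already been exercised by earlier tests. The class `Counter` and the helpers `method_executions` and `select_non_redundant` are hypothetical names.

```python
# Illustrative sketch only: Counter, method_executions, and
# select_non_redundant are hypothetical names, not part of Rostra.

class Counter:
    """Tiny class under test."""
    def __init__(self):
        self.value = 0

    def inc(self):
        self.value += 1

    def reset(self):
        self.value = 0


def method_executions(test):
    """Run a test on a fresh Counter and return the set of
    (receiver state before the call, method name, arguments) triples."""
    obj = Counter()
    executions = set()
    for name, args in test:
        state_before = tuple(sorted(vars(obj).items()))
        executions.add((state_before, name, args))
        getattr(obj, name)(*args)
    return executions


def select_non_redundant(tests):
    """Keep a test only if it exercises at least one new method execution."""
    covered = set()
    kept = []
    for test in tests:
        executions = method_executions(test)
        if not executions <= covered:
            kept.append(test)
            covered |= executions
    return kept


tests = [
    [("inc", ())],
    [("inc", ()), ("inc", ())],
    [("reset", ()), ("inc", ())],
    [("inc", ()), ("reset", ()), ("inc", ())],
    [("reset", ()), ("inc", ()), ("inc", ())],  # exercises nothing new
]
kept = select_non_redundant(tests)
```

In this sketch the last test is dropped because each of its method executions starts from a receiver state and uses arguments already covered by the earlier tests.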

Test Abstraction

DSD-Crasher: A hybrid analysis tool for bug finding (TOSEM 2008, ACM Transactions on Software Engineering and Methodology, 2008)    
This paper presents DSD-Crasher, a bug-finding tool that follows a three-step approach to program analysis. (D) Capture the program's intended execution behavior with dynamic invariant detection; the derived invariants exclude many unwanted values from the program's input domain. (S) Statically analyze the program within the restricted input domain to explore many paths. (D) Automatically generate test cases that focus on reproducing the predictions of the static analysis; results confirmed in this way are feasible. This three-step approach yields benefits compared to past two-step combinations in the literature.
Automatic Extraction of Abstract-Object-State Machines from Unit-Test Executions (ICSE 2006 DE, Proceedings of the 28th International Conference on Software Engineering, Research Demonstration, 2006)    
This paper presents an automatic test abstraction tool, called Abstra, to extract high level object-state-transition information from unit test executions, without requiring a priori specifications. The recovered information can help facilitate correctness inspection, program understanding, fault isolation, and test characterization.
Substra: A Framework for Automatic Generation of Integration Tests (AST 2006, Proceedings of the 1st Workshop on Automation of Software Test, 2006)
This paper proposes Substra, a framework for automatic integration test generation based on call-sequence constraints inferred from dynamic executions. Constraints are inferred based on two types of information: shared subsystem states and object def-use relationship.
Automatic Extraction of Abstract-Object-State Machines Based on Branch Coverage (RETR 2005, Proceedings of the 1st International Workshop on Reverse Engineering To Requirements, 2005)
This paper proposes a new approach, called Brastra, for extracting object state machines from unit-test executions. Brastra abstracts an object’s concrete state to an abstract state based on the branch coverage information exercised by methods invoked on the object.
Automatic Extraction of Sliced Object State Machines for Component Interfaces (SAVCBS 2004, Proceedings of the 3rd Workshop on Specification and Verification of Component-Based Systems, 2004)
This paper presents an approach that automatically extracts sliced object state machines (OSMs) for component interfaces from the execution of generated tests. The approach slices concrete object states by each member field of the component and uses the sliced states to construct a set of sliced OSMs.
Automatic Extraction of Object-Oriented Observer Abstractions from Unit-Test Executions (ICFEM 2004, Proceedings of the 6th International Conference on Formal Engineering Methods, 2004)
This paper proposes the observer abstraction approach for automatically extracting object-state-transition information of a class from unit-test executions, without requiring a priori specifications. The approach produces the abstract state of an object based on the return values of a set of observers (public methods with non-void returns) invoked on the object.
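
The observer abstraction idea can be sketched in a few lines. The code below is a minimal illustration and not the authors' tool: it assumes a hypothetical `BoundedStack` class, treats its two side-effect-free boolean methods as the observers, and records a transition (abstract state before, method, abstract state after) for every call made by a test.

```python
# Illustrative sketch only: BoundedStack, OBSERVERS, abstract_state, and
# extract_transitions are hypothetical names chosen for this example.

class BoundedStack:
    def __init__(self, capacity=2):
        self.items = []
        self.capacity = capacity

    def push(self, x):
        if len(self.items) < self.capacity:
            self.items.append(x)

    def pop(self):
        if self.items:
            self.items.pop()

    # Observers: public methods with non-void returns and no side effects.
    def is_empty(self):
        return not self.items

    def is_full(self):
        return len(self.items) == self.capacity


OBSERVERS = ("is_empty", "is_full")


def abstract_state(obj):
    """Abstract state = the return values of all observers."""
    return tuple((name, getattr(obj, name)()) for name in OBSERVERS)


def extract_transitions(tests):
    """Collect (state before, method, state after) triples from test runs."""
    transitions = set()
    for test in tests:
        obj = BoundedStack()
        for name, args in test:
            before = abstract_state(obj)
            getattr(obj, name)(*args)
            transitions.add((before, name, abstract_state(obj)))
    return transitions


EMPTY = (("is_empty", True), ("is_full", False))
PARTIAL = (("is_empty", False), ("is_full", False))
FULL = (("is_empty", False), ("is_full", True))

tests = [[("push", (1,)), ("push", (2,)), ("pop", ()), ("pop", ())]]
transitions = extract_transitions(tests)
```

The resulting set of transitions is the edge list of a small state machine over the three abstract states, which is far easier to inspect than the concrete object states.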

Test Oracle Augmentation

UnitPlus: Assisting Developer Testing in Eclipse (ETX 2007, Proceedings of the Eclipse Technology eXchange Workshop at OOPSLA 2007, 2007)
This paper presents an Eclipse plugin for JUnit test cases, called UnitPlus, that helps developers write test code in unit test cases more efficiently. It runs in the background and recommends test-code pieces for developers to choose (and revise when needed) to put in test oracles or test inputs. The recommendation is based on static analysis of the class under test and already written unit test cases.
Towards a Framework for Differential Unit Testing of Object-Oriented Programs (AST 2007, Proceedings of the 2nd International Workshop on Automation of Software Test, 2007)
This paper presents a framework, called Diffut, that enables differential unit testing of object-oriented programs. Diffut enables “simultaneous” execution of the pairs of corresponding methods from the two versions: methods can receive the same inputs (consisting of the object graph reachable from the receiver and method arguments), and Diffut compares their outputs (consisting of the object graph reachable from the receiver and method return values).
An Empirical Comparison of Automated Generation and Classification Techniques for Object-Oriented Unit Testing (ASE 2006, Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering, 2006)
This paper presents an empirical study that compares all four combinations of two test-generation techniques and two test-classification techniques. The results show that the techniques are complementary (i.e., detect different faults) and illustrate their respective strengths and weaknesses.
Tool-Assisted Unit-Test Generation and Selection Based on Operational Abstractions (ASE Journal 2006, Automated Software Engineering Journal, A special issue of selected papers from the ASE 2003 conference, 2006)    
This paper proposes the operational violation approach for unit-test generation and selection, a black-box approach that does not require a priori specifications. The approach dynamically generates operational abstractions from executions of the existing unit test suite; these abstractions guide test generation tools to generate tests that violate them. The approach then selects the generated tests that violate operational abstractions for inspection.
Augmenting Automatically Generated Unit-Test Suites with Regression Oracle Checking (ECOOP 2006, Proceedings of the 20th European Conference on Object-Oriented Programming, 2006)    
This paper presents an automatic approach and its supporting tool, called Orstra, for augmenting an automatically generated unit-test suite with regression oracle checking. The augmented test suite has an improved capability of guarding against regression faults.
Checking Inside the Black Box: Regression Testing By Comparing Value Spectra (IEEE Transactions on Software Engineering, A special issue of selected papers from the ICSM 2004 conference, 2005)
This paper presents a new class of program spectra, value spectra, that enriches the existing program spectra family, and a new approach that compares the value spectra of a program’s old version and new version to detect internal behavioral deviations in the new version.

Regression Testing

Applying Interface-Contract Mutation in Regression Testing of Component-Based Software (ICSM 2007, Proceedings of the 23rd International Conference on Software Maintenance, 2007) 
This paper presents an approach that applies mutation to interface contracts, which can describe the rights and obligations between component users and providers, to simulate the faults that may occur in component-based software development. The mutation adequacy score for killing the mutants of interface contracts can serve as a test adequacy criterion. We performed an experimental study on three subject systems to evaluate the proposed approach together with four other criteria.
Towards a Framework for Differential Unit Testing of Object-Oriented Programs (AST 2007, Proceedings of the 2nd International Workshop on Automation of Software Test, 2007) 
This paper presents a framework, called Diffut, that enables differential unit testing of object-oriented programs. Diffut enables “simultaneous” execution of the pairs of corresponding methods from the two versions: methods can receive the same inputs (consisting of the object graph reachable from the receiver and method arguments), and Diffut compares their outputs (consisting of the object graph reachable from the receiver and method return values).
Towards Regression Test Selection for Aspect-Oriented Programs (WTAOP 2006, Proceedings of the 2nd Workshop on Testing Aspect-Oriented Programs, 2006)
This paper presents a regression test selection technique for AspectJ programs. The technique is based on various types of control flow graphs that can be used to select from the original test suite test cases that execute changed code for the new version of the AspectJ program.
Augmenting Automatically Generated Unit-Test Suites with Regression Oracle Checking (ECOOP 2006, Proceedings of the 20th European Conference on Object-Oriented Programming, 2006)    
This paper presents an automatic approach and its supporting tool, called Orstra, for augmenting an automatically generated unit-test suite with regression oracle checking. The augmented test suite has an improved capability of guarding against regression faults.
Checking Inside the Black Box: Regression Testing By Comparing Value Spectra (IEEE Transactions on Software Engineering, A special issue of selected papers from the ICSM 2004 conference, 2005)
This paper presents a new class of program spectra, value spectra, that enriches the existing program spectra family, and a new approach that compares the value spectra of a program’s old version and new version to detect internal behavioral deviations in the new version.
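
The differential unit testing idea listed in this section can be sketched as follows. This is a minimal illustration and not the Diffut framework: `AccountV1` and `AccountV2` are hypothetical old and new versions of a class, and the "output" compared for each call is simplified to the return value plus the receiver's fields after the call.

```python
# Illustrative sketch only: AccountV1, AccountV2, run, and
# differential_test are hypothetical names, not part of Diffut.

class AccountV1:
    """Old version."""
    def __init__(self, balance=0):
        self.balance = balance

    def withdraw(self, amount):
        if amount <= self.balance:
            self.balance -= amount
            return True
        return False


class AccountV2:
    """New version with a behavioral change: '<' instead of '<='."""
    def __init__(self, balance=0):
        self.balance = balance

    def withdraw(self, amount):
        if amount < self.balance:
            self.balance -= amount
            return True
        return False


def run(cls, init_balance, method, args):
    """Execute one method call; return (return value, receiver state after)."""
    obj = cls(init_balance)
    result = getattr(obj, method)(*args)
    return result, dict(vars(obj))


def differential_test(inputs):
    """Feed the same inputs to both versions and report differing outputs."""
    deviations = []
    for init_balance, method, args in inputs:
        old = run(AccountV1, init_balance, method, args)
        new = run(AccountV2, init_balance, method, args)
        if old != new:
            deviations.append((init_balance, method, args, old, new))
    return deviations


inputs = [
    (10, "withdraw", (5,)),
    (10, "withdraw", (10,)),  # boundary input that exposes the change
    (10, "withdraw", (20,)),
]
deviations = differential_test(inputs)
```

Only the boundary input is reported: the old version allows withdrawing the full balance, while the new version rejects it and leaves the balance unchanged.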

Mining Program Executions

DSD-Crasher: A hybrid analysis tool for bug finding (TOSEM 2008, ACM Transactions on Software Engineering and Methodology, 2008)
This paper presents DSD-Crasher, a bug-finding tool that follows a three-step approach to program analysis. (D) Capture the program's intended execution behavior with dynamic invariant detection; the derived invariants exclude many unwanted values from the program's input domain. (S) Statically analyze the program within the restricted input domain to explore many paths. (D) Automatically generate test cases that focus on reproducing the predictions of the static analysis; results confirmed in this way are feasible. This three-step approach yields benefits compared to past two-step combinations in the literature.
Mining for Software Reliability (ICDM 2007, Tutorial, Proceedings of the 2007 IEEE International Conference on Data Mining, 2007)
This tutorial presents a comprehensive overview of this area, examines representative studies, and lays out challenges for data mining researchers. In particular, every effort will be made to help data mining researchers appreciate the challenges and impact posed by software reliability and to stimulate them to contribute.
Mining Software Engineering Data (ICSE 2007, Tutorial, Proceedings of the 29th International Conference on Software Engineering, 2007)
This tutorial presents the latest research in mining Software Engineering (SE) data, discusses challenges associated with mining SE data, highlights SE data mining success stories, and outlines future research directions. Attendees will acquire the knowledge and skills needed to perform research or conduct practice in the field and to integrate data mining techniques in their own research or practice.
An Empirical Comparison of Automated Generation and Classification Techniques for Object-Oriented Unit Testing (ASE 2006, Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering, 2006)
This paper presents an empirical study that compares all four combinations of two test-generation techniques and two test-classification techniques. The results show that the techniques are complementary (i.e., detect different faults) and illustrate their respective strengths and weaknesses.
Tool-Assisted Unit-Test Generation and Selection Based on Operational Abstractions (ASE Journal 2006, Automated Software Engineering Journal, A special issue of selected papers from the ASE 2003 conference, 2006)    
This paper proposes the operational violation approach for unit-test generation and selection, a black-box approach that does not require a priori specifications. The approach dynamically generates operational abstractions from executions of the existing unit test suite; these abstractions guide test generation tools to generate tests that violate them. The approach then selects the generated tests that violate operational abstractions for inspection.
Data Mining for Software Engineering (KDD 2006, Tutorial, Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006)
The tutorial focuses on the inherent challenges of mining software engineering data, offers a shortcut to the current research and development frontier of data mining practice in software engineering, and illustrates a few case studies on data mining applications in software engineering.
Inferring Access-Control Policy Properties via Machine Learning (POLICY 2006, Proceedings of the 7th IEEE Workshop on Policies for Distributed Systems and Networks, 2006)
This paper proposes a policy testing approach to facilitate systematic policy testing through automatic request generation, request evaluation, and property inference by applying machine learning on request-response pairs. These inferred properties facilitate the inspection of the policy behavior.
Automatic Extraction of Abstract-Object-State Machines from Unit-Test Executions (ICSE 2006 DE, Proceedings of the 28th International Conference on Software Engineering, Research Demonstration, 2006)    
This paper presents an automatic test abstraction tool, called Abstra, to extract high level object-state-transition information from unit test executions, without requiring a priori specifications. The recovered information can help facilitate correctness inspection, program understanding, fault isolation, and test characterization.
Substra: A Framework for Automatic Generation of Integration Tests (AST 2006, Proceedings of the 1st Workshop on Automation of Software Test, 2006)
This paper proposes Substra, a framework for automatic integration test generation based on call-sequence constraints inferred from dynamic executions. Constraints are inferred based on two types of information: shared subsystem states and object def-use relationship.
Automatic Extraction of Abstract-Object-State Machines Based on Branch Coverage (RETR 2005, Proceedings of the 1st International Workshop on Reverse Engineering To Requirements, 2005)
This paper proposes a new approach, called Brastra, for extracting object state machines from unit-test executions. Brastra abstracts an object’s concrete state to an abstract state based on the branch coverage information exercised by methods invoked on the object.
Automatically Identifying Special and Common Unit Tests for Object-Oriented Programs (ISSRE 2005, Proceedings of the 16th IEEE International Symposium on Software Reliability Engineering, 2005)
This paper presents a new approach for automatically identifying special and common unit tests for a class without requiring any specification. The approach is based on statistical algebraic abstractions, program properties (in the form of algebraic specifications) dynamically inferred based on a set of predefined abstraction templates.
Helping Users Avoid Bugs in GUI Applications (ICSE 2005, Proceedings of the 27th International Conference on Software Engineering, 2005)
This paper proposes a method to help users avoid bugs in GUI applications based on machine learning and collaborative filtering techniques.
Automatic Extraction of Sliced Object State Machines for Component Interfaces (SAVCBS 2004, Proceedings of the 3rd Workshop on Specification and Verification of Component-Based Systems, 2004)
This paper presents an approach that automatically extracts sliced object state machines (OSMs) for component interfaces from the execution of generated tests. The approach slices concrete object states by each member field of the component and uses the sliced states to construct a set of sliced OSMs.
Automatic Extraction of Object-Oriented Observer Abstractions from Unit-Test Executions (ICFEM 2004, Proceedings of the 6th International Conference on Formal Engineering Methods, 2004)
This paper proposes the observer abstraction approach for automatically extracting object-state-transition information of a class from unit-test executions, without requiring a priori specifications. The approach produces the abstract state of an object based on the return values of a set of observers (public methods with non-void returns) invoked on the object.
Mutually Enhancing Test Generation and Specification Inference (FATES 03, Proceedings of the 3rd International Workshop on Formal Approaches to Testing of Software, 2003)
This paper proposes an approach that integrates dynamic specification inference and test generation. The approach mutually enhances the tests and specifications that are generated by iteratively applying each in a feedback loop. 
Software Component Protocol Inference (UW CSE General Exam Report, 2003)
This paper explores the research area of software component protocol inference with a focus on dynamic inference techniques. A framework is proposed to compare the existing dynamic inference techniques.

Mining Software Repositories

An Approach to Detecting Duplicate Bug Reports using Natural Language and Execution Information (ICSE 2008, Proceedings of the 30th International Conference on Software Engineering, 2008)
This paper presents a new approach that further involves execution information. In this approach, when a new bug report arrives, its natural language information and execution information are compared with those of the existing bug reports. Then, a small number of existing bug reports are suggested to the triager as the most similar bug reports to the new bug report. Finally, the triager examines the suggested bug reports to determine whether the new bug report duplicates an existing bug report.
SpotWeb: Detecting Framework Hotspots via Mining Open Source Repositories on the Web (MSR 2008, Proceedings of the 5th Working Conference on Mining Software Repositories, 2008)
This paper presents a code-search-engine-based approach that tries to detect hotspots in a given framework by mining code examples gathered from open source repositories available on the web; these hotspots are the APIs that are frequently reused. Hotspots can serve as starting points for developers in understanding and reusing the given framework. We developed a tool, called SpotWeb, for frameworks or libraries written in Java and conducted two case studies with two open source frameworks, JUnit and Log4j. We also show that the detected hotspots of Log4j and JUnit are consistent with their respective documentation.
Static Detection of API Error-Handling Bugs via Mining Source Code (NCSU CSC 2007, NCSU CSC Technical Report, 2007)
This paper presents a novel framework for statically mining specifications directly from software package repositories, without requiring any user input. The framework uses a compile-time push-down model checker to generate inter-procedural static traces, which approximate run-time API error behaviors. Data mining techniques are used on these static traces to mine specifications that define the correct handling of errors for relevant APIs used in the software packages. The mined specifications are then formally verified against the same (or other) software packages to uncover API error-handling bugs.
NEGWeb: Static Defect Detection via Searching Billions of Lines of Open Source Code (NCSU CSC 2007, NCSU CSC Technical Report, 2007)
This paper presents a novel framework, called NEGWeb, for substantially expanding the mining scope to billions of lines of open source code based on a code search engine. NEGWeb detects violations related to neglected conditions around individual API calls. We evaluated NEGWeb to detect violations in local code bases or open source code bases.
PARSEWeb: A Programmer Assistant for Reusing Open Source Code on the Web (ASE 2007, Proceedings of the 22nd IEEE/ACM International Conference on Automated Software Engineering, 2007)
This paper presents an approach that takes queries of the form “Source object type --> Destination object type” as input, and suggests relevant method-invocation sequences that can serve as solutions that yield the destination object from the source object given in the query. Our approach interacts with a code search engine (CSE) to gather relevant code samples and performs static analysis over the collected samples to extract required sequences.
Automated Detection of API Refactorings in Libraries (ASE 2007, Proceedings of the 22nd IEEE/ACM International Conference on Automated Software Engineering, Short Paper, 2007)
This paper presents a technique and its supporting tool, RefacLib, to automatically detect refactorings in libraries. RefacLib uses syntactic analysis in the first phase to quickly detect refactoring candidates across two versions of a library. In the second phase, RefacLib uses various heuristics to refine the results.
Mining for Software Reliability (ICDM 2007, Tutorial, Proceedings of the 2007 IEEE International Conference on Data Mining, 2007)
This tutorial presents a comprehensive overview of mining for software reliability, examines representative studies, and lays out challenges to data mining researchers. In particular, every effort will be made to help data mining researchers appreciate the challenges and impact posed by software reliability and to stimulate them to contribute.
Mining API Patterns as Partial Orders from Source Code: From Usage Scenarios to Specifications (ESEC/FSE 2007, Proceedings of European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2007)
This paper presents a framework to automatically extract usage scenarios among user-specified APIs as partial orders, directly from the source code (API client code). We adapt a model checker to generate interprocedural control-flow-sensitive static traces related to the APIs of interest. Different API usage scenarios are extracted from the static traces by our scenario extraction algorithm and fed to a miner. The miner summarizes different usage scenarios as compact partial orders. Specifications are extracted from the frequent partial orders using our specification extraction algorithm.
Mining Software Engineering Data (ICSE 2007, Tutorial, Proceedings of the 29th International Conference on Software Engineering, 2007)
This tutorial presents the latest research in mining Software Engineering (SE) data, discusses challenges associated with mining SE data, highlights SE data mining success stories, and outlines future research directions. Attendees will acquire the knowledge and skills needed to perform research or conduct practice in the field and to integrate data mining techniques in their own research or practice.
Automated Inference of Pointcuts in Aspect-Oriented Refactoring (ICSE 2007, Proceedings of the 29th International Conference on Software Engineering, 2007)
This paper proposes an automated framework that identifies aspect candidates in code and infers pointcut expressions for these aspects. Our framework mines for aspect candidates, identifies the join points for the aspect candidates, clusters the join points, and infers an effective pointcut expression for each cluster of join points.
Mining Interface Specifications for Generating Checkable Robustness Properties (ISSRE 2006, Proceedings of the 17th IEEE International Conference on Software Reliability Engineering, 2006)
This paper proposes a novel framework to automatically infer system-specific interface specifications from program source code. The framework uses a model checker to generate traces related to the interfaces. From these model checking traces, the framework infers interface specification details such as return value on success or failure. Based on these inferred specifications, the framework translates generically specified interface robustness rules to concrete robustness properties verifiable by static checking.
Data Mining for Software Engineering (KDD 2006, Tutorial, Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006)
The tutorial focuses on the inherent challenges of mining software engineering data, offers a shortcut to the current research and development frontier of data mining practice in software engineering, and illustrates a few case studies on data mining applications in software engineering.
Understanding Software Application Interfaces via String Analysis (ICSE 2006 ER, Proceedings of the 28th International Conference on Software Engineering, Emerging Results Track, 2006)
This paper proposes an approach to understanding software application interfaces through string analysis. The approach can help us understand the characteristics of interactions between software applications such as between database applications and databases.
MAPO: Mining API Usages from Open Source Repositories (MSR 2006, Proceedings of the 3rd International Workshop on Mining Software Repositories, 2006)
This paper proposes an API usage mining framework and its supporting tool called MAPO (for Mining API usages from Open source repositories). Given a query that describes a method, class, or package for an API, MAPO leverages the existing source code search engines to gather relevant source files and conducts data mining.
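
To make the idea of mining API orderings from traces more concrete, the following is a minimal sketch, not the algorithm of any paper listed above. It assumes hand-written traces in place of model-checker output, considers only the first occurrence of each call in a trace, and mines pairwise "a comes before b" orderings rather than full partial orders. The function name mine_order_pairs, the min_support parameter, and the example traces are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_order_pairs(scenarios, min_support=0.8):
    """Return pairs (a, b) such that a precedes b in at least min_support
    of the scenarios that contain both calls (first occurrences only)."""
    before = Counter()    # (a, b) -> number of scenarios where a precedes b
    together = Counter()  # sorted (a, b) -> number of scenarios with both
    for trace in scenarios:
        first_pos = {}
        for i, call in enumerate(trace):
            first_pos.setdefault(call, i)  # keep the first occurrence only
        for a, b in combinations(sorted(first_pos), 2):
            together[(a, b)] += 1
            if first_pos[a] < first_pos[b]:
                before[(a, b)] += 1
            else:
                before[(b, a)] += 1
    rules = set()
    for (a, b), n in together.items():
        for x, y in ((a, b), (b, a)):
            if before[(x, y)] / n >= min_support:
                rules.add((x, y))
    return rules


# Hypothetical usage scenarios written by hand for illustration.
scenarios = [
    ["fopen", "fread", "fclose"],
    ["fopen", "fwrite", "fclose"],
    ["fopen", "fread", "fwrite", "fclose"],
    ["fopen", "fwrite", "fread", "fclose"],
]

for rule in sorted(mine_order_pairs(scenarios, min_support=1.0)):
    print(rule[0], "before", rule[1])
```

In this sketch, fread and fwrite stay unordered because each precedes the other in only half of the scenarios that contain both, which is the kind of freedom a partial order can express and a single total sequence cannot.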

Testing and Analysis of Security Policies

XEngine: A Fast and Scalable XACML Policy Evaluation Engine (SIGMETRICS 2008, Proceedings of the International Conference on Measurement and Modeling of Computer Systems, 2008)
This paper presents XEngine for efficient XACML policy evaluation. XEngine works in three steps: first, it converts a textual XACML policy to a numerical policy; second, it converts the numerical policy with complex structures to a numerical policy with a normalized structure; third, it converts the normalized numerical policy to tree data structures for efficient processing of requests. To evaluate the performance of XEngine, extensive experiments were conducted on both real-life and synthetic XACML policies. The experimental results show that for XACML policies of small sizes, XEngine is one to two orders of magnitude faster than the widely deployed Sun PDP; for XACML policies of large sizes, XEngine is three to five orders of magnitude faster than Sun PDP.
Assessing Quality of Policy Properties in Verification of Access Control Policies (NCSU CSC 2007, NCSU CSC Technical Report, 2007)
This paper presents a novel approach called Mutaver to assess the quality of properties specified for a policy and, in doing so, the quality of the verification itself. Similar to the way mutation testing is used to assess the quality of a test suite in terms of fault-detection capability, we propose mutation verification to assess the quality of a set of properties.
Conformance Checking of Access Control Policies Specified in XACML (IWSSE 2007, Proceedings of the 1st IEEE International Workshop on Security in Software Engineering, 2007)
Often, common properties for specific access control policies may not be satisfied when these policies are specified in XACML, causing a discrepancy between what the policy authors intend to specify and what the actually specified XACML policies reflect. In this position paper, we propose an approach for conducting conformance checking of access control policies specified in XACML based on existing verification and testing tools for XACML policies.
Automated Test Generation for Access Control Policies via Change-Impact Analysis (SESS 2007, Proceedings of the 3rd International Workshop on Software Engineering for Secure Systems, 2007)
This paper presents a novel framework and its supporting tool called Cirg that generates tests based on change-impact analysis. Our experimental results show that Cirg can effectively generate tests to achieve high structural coverage of policies with reasonable runtime cost, and outperforms random test generation in terms of structural coverage.
A Fault Model and Mutation Testing of Access Control Policies (WWW 2007, Proceedings of International Conference on World Wide Web, 2007)
This paper presents a fault model for access control policies and a mutation testing framework to investigate it. The framework includes mutation operators used to implement the fault model, mutant generation, equivalent-mutant detection, and mutant-killing determination. This framework allows us to investigate our fault model, evaluate coverage criteria for test generation and selection, and determine a relationship between structural coverage and fault-detection effectiveness.
Defining and Measuring Policy Coverage in Testing Access Control Policies (ICICS 2006, Proceedings of the 8th International Conference on Information and Communications Security, 2006)
This paper presents a first step toward systematic policy testing by defining and measuring policy coverage. A coverage-measurement tool is developed to measure policy coverage given a set of XACML policies and a set of requests.
Automated Test Generation for Access Control Policies (ISSRE 2006 Fast Abstract, Supplemental Proceedings of the 17th IEEE International Conference on Software Reliability Engineering, 2006)
This paper presents an efficient test generation approach and its supporting tool called Targen. Targen can effectively generate tests that outperform existing random test generation in terms of structural coverage and fault-detection capability.
Inferring Access-Control Policy Properties via Machine Learning (POLICY 2006, Proceedings of 7th IEEE Workshop on Policies for Distributed Systems and Networks, 2006)
This paper proposes a policy testing approach to facilitate systematic policy testing through automatic request generation, request evaluation, and property inference by applying machine learning on request-response pairs. These inferred properties facilitate the inspection of the policy behavior.
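
The notions of policy coverage and policy mutation testing that recur in this subarea can be illustrated with a minimal sketch. It is an assumption-laden toy, not the tooling from the papers: policies here are plain Python lists rather than XACML, evaluation is assumed to be first-applicable, and the only mutation operator is a hypothetical effect flip (the abstracts above do not list the actual operators). All names (Rule, evaluate, rule_coverage, flip_effect_mutants, mutant_kill_ratio) are made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Rule:
    role: str
    action: str
    effect: str  # "Permit" or "Deny"


def evaluate(policy, request):
    """First-applicable evaluation: return (decision, index of matching rule)."""
    role, action = request
    for i, rule in enumerate(policy):
        if rule.role == role and rule.action == action:
            return rule.effect, i
    return "NotApplicable", None


def rule_coverage(policy, requests):
    """Fraction of rules that at least one request matches."""
    covered = {evaluate(policy, r)[1] for r in requests} - {None}
    return len(covered) / len(policy)


def flip_effect_mutants(policy):
    """Yield one mutant per rule, with that rule's effect flipped."""
    for i, rule in enumerate(policy):
        flipped = "Deny" if rule.effect == "Permit" else "Permit"
        yield policy[:i] + [replace(rule, effect=flipped)] + policy[i + 1:]


def mutant_kill_ratio(policy, requests):
    """Fraction of mutants whose decision differs from the original
    policy's decision on at least one request."""
    mutants = list(flip_effect_mutants(policy))
    killed = sum(
        any(evaluate(m, r)[0] != evaluate(policy, r)[0] for r in requests)
        for m in mutants
    )
    return killed / len(mutants)


# Hypothetical policy and request set.
policy = [
    Rule("faculty", "assign_grade", "Permit"),
    Rule("student", "assign_grade", "Deny"),
    Rule("student", "view_grade", "Permit"),
]
requests = [("faculty", "assign_grade"), ("student", "view_grade")]

print(rule_coverage(policy, requests))
print(mutant_kill_ratio(policy, requests))
```

In this toy setting the two measures move together: a rule that no request reaches cannot have its flipped mutant killed, which gives some intuition for why a relationship between structural coverage and fault-detection effectiveness is worth investigating.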

Application Security Testing

SQLUnitGen: SQL Injection Testing Using Static and Dynamic Analysis (ISSRE 2006 Student Program, Supplemental Proceedings of the 17th IEEE International Conference on Software Reliability Engineering, 2006)
This paper presents an approach to facilitate the identification of true input manipulation vulnerabilities via automated testing based on static analysis. A prototype SQL injection vulnerability detection tool, called SQLUnitGen, has been implemented for the approach.

Testing and Analysis of Aspect-Oriented Programs

Automated Inference of Pointcuts in Aspect-Oriented Refactoring (ICSE 2007, Proceedings of the 29th International Conference on Software Engineering, 2007)
This paper proposes an automated framework that identifies aspect candidates in code and infers pointcut expressions for these aspects. Our framework mines for aspect candidates, identifies the join points for the aspect candidates, clusters the join points, and infers an effective pointcut expression for each cluster of join points.
Perspectives on Automated Testing of Aspect-Oriented Programs (WTAOP 2007, Proceedings of the 3rd Workshop on Testing Aspect-Oriented Programs, 2007)
This position paper presents our perspectives on automated testing techniques from three dimensions: testing aspectual behavior or aspectual composition, unit tests or integration tests, and test-input generation or test-behavior checking. We illustrate automated testing techniques primarily through the last dimension in the perspectives. By classifying these automated testing techniques in the perspectives, we provide a better understanding of these techniques and identify future directions for automated testing of aspect-oriented programs.
Detecting Redundant Unit Tests for AspectJ Programs (ISSRE 2006, Proceedings of the 17th IEEE International Conference on Software Reliability Engineering, 2006)
This paper proposes Aspectra, a framework for detecting redundant unit tests for AspectJ programs. The framework introduces three levels of units in testing AspectJ programs: advised methods, advice, and intertype methods, and detects at each level redundant tests that do not exercise new behavior.
Efficient Mutant Generation for Mutation Testing of Pointcuts in Aspect-Oriented Programs (MUTATION 2006, Proceedings of the 2nd Workshop on Mutation Analysis, 2006)
This paper proposes a framework that automatically generates mutants for a pointcut expression and identifies mutants that closely resemble the original expression. Developers can then apply test data against these mutants to perform mutation testing.
APTE: Automated Pointcut Testing for AspectJ Programs (WTAOP 2006, Proceedings of the 2nd Workshop on Testing Aspect-Oriented Programs, 2006)
This paper proposes APTE, an automated framework that tests pointcuts in AspectJ programs with the help of AJTE, an existing unit-testing framework without weaving. Our new APTE framework identifies joinpoints that satisfy a pointcut expression and a set of boundary joinpoints.
Towards Regression Test Selection for Aspect-Oriented Programs (WTAOP 2006, Proceedings of the 2nd Workshop on Testing Aspect-Oriented Programs, 2006)
This paper presents a regression test selection technique for AspectJ programs. The technique is based on various types of control flow graphs that can be used to select from the original test suite test cases that execute changed code for the new version of the AspectJ program.
A Framework and Tool Supports for Generating Test Inputs of AspectJ Programs (AOSD 2006, Proceedings of the 5th International Conference on Aspect-Oriented Software Development, 2006)
This paper proposes Aspectra, a framework for generating test inputs to exercise aspectual behavior in AspectJ programs. The framework includes a wrapper mechanism to leverage existing test-generation tools to generate test inputs for the classes woven with aspects. It also defines and measures aspectual branch coverage (branch coverage within aspects) and interaction coverage to guide test generation.

Testing and Analysis of Web Services and Applications

Automated Testing and Response Analysis of Web Services (ICWS 2007 Industry, Proceedings of the IEEE International Conference on Web Services, Research Demonstration, 2007)
This paper presents a framework and its supporting tool for automatically generating and executing web-service requests and analyzing the subsequent request-response pairs. Given a service provider's Web Service Description Language (WSDL) specification, we automatically generate necessary Java code to implement a client (service requestor), leverage automated unit test generation tools for Java to generate unit tests, execute the generated unit tests, and analyze the large number of request-response pairs to identify robustness problems.
WebSob: A Tool for Robustness Testing of Web Services (ICSE 2007 DE, Proceedings of the 29th International Conference on Software Engineering, Research Demonstration, 2007)
This paper presents WebSob, a tool for automatically generating and executing web-service requests. Given a service provider’s Web Service Description Language (WSDL) specification, WebSob first automatically generates necessary Java code to implement a client (service requestor). WebSob then leverages existing automated unit test generation tools for Java to generate unit tests and finally executes the generated unit tests, which in turn invoke the service under test.

Testing and Analysis of Software Designs

A Framework and Tool Supports for Testing Modularity of Software Design (ASE 2007, Proceedings of the 22nd ACM/IEEE International Conference on Automated Software Engineering, Short Paper, 2007)
This paper presents a novel framework for testing modularity of a software design and tool supports for conducting modularity testing. In our framework, the software artifact under test is a software design. A test input is a potential change to the design. The test output is a modularity vector, which precisely and quantitatively captures the extent to which the design can accommodate the test input (the potential change).

Software Verification

Static Detection of API Error-Handling Bugs via Mining Source Code (NCSU CSC 2007, NCSU CSC Technical Report, 2007)
This paper presents a novel framework for statically mining specifications directly from software package repositories, without requiring any user input. The framework uses a compile-time push-down model-checker to generate inter-procedural static traces, which approximate run-time API error behaviors. Data mining techniques are used on these static traces to mine specifications that define the correct handling of errors for relevant APIs used in the software packages. The mined specifications are then formally verified against the same (or other) software packages to uncover API error-handling bugs.
NEGWeb: Static Defect Detection via Searching Billions of Lines of Open Source Code (NCSU CSC 2007, NCSU CSC Technical Report, 2007)
This paper presents a novel framework, called NEGWeb, for substantially expanding the mining scope to billions of lines of open source code based on a code search engine. NEGWeb detects violations related to neglected conditions around individual API calls. We evaluated NEGWeb to detect violations in local code bases or open source code bases.
Mining for Software Reliability (ICDM 2007, Tutorial, Proceedings of the 2007 IEEE International Conference on Data Mining, 2007)
This tutorial presents a comprehensive overview of mining for software reliability, examines representative studies, and lays out challenges to data mining researchers. In particular, every effort will be made to help data mining researchers appreciate the challenges and impact posed by software reliability and to stimulate them to contribute.
Mining API Patterns as Partial Orders from Source Code: From Usage Scenarios to Specifications (ESEC/FSE 2007, Proceedings of European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2007)
This paper presents a framework to automatically extract usage scenarios among user-specified APIs as partial orders, directly from the source code (API client code). We adapt a model checker to generate interprocedural control-flow-sensitive static traces related to the APIs of interest. Different API usage scenarios are extracted from the static traces by our scenario extraction algorithm and fed to a miner. The miner summarizes different usage scenarios as compact partial orders. Specifications are extracted from the frequent partial orders using our specification extraction algorithm.
Mining Interface Specifications for Generating Checkable Robustness Properties (ISSRE 2006, Proceedings of the 17th IEEE International Conference on Software Reliability Engineering, 2006)
This paper proposes a novel framework to automatically infer system-specific interface specifications from program source code. The framework uses a model checker to generate traces related to the interfaces. From these model checking traces, the framework infers interface specification details such as return value on success or failure. Based on these inferred specifications, the framework translates generically specified interface robustness rules to concrete robustness properties verifiable by static checking.
Effective Generation of Interface Robustness Properties for Static Analysis (ASE 2006, Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering, 2006)
This paper proposes a novel framework that effectively generates interface properties from a few generic, high-level robustness rules that capture interface behavior. We implement our framework for an existing static analyzer with our data flow extensions and apply it to the well-known POSIX-API system interfaces.

Dynamic Analysis for Program Comprehension

Automatic Extraction of Abstract-Object-State Machines from Unit-Test Executions (ICSE 2006 DE, Proceedings of the 28th International Conference on Software Engineering, Research Demonstration, 2006)    
This paper presents an automatic test abstraction tool, called Abstra, to extract high level object-state-transition information from unit test executions, without requiring a priori specifications. The recovered information can help facilitate correctness inspection, program understanding, fault isolation, and test characterization.
Substra: A Framework for Automatic Generation of Integration Tests (AST 2006, Proceedings of the 1st Workshop on Automation of Software Test, 2006)
This paper proposes Substra, a framework for automatic integration test generation based on call-sequence constraints inferred from dynamic executions. Constraints are inferred based on two types of information: shared subsystem states and object def-use relationship.
Automatic Extraction of Abstract-Object-State Machines Based on Branch Coverage (RETR 2005, Proceedings of the 1st International Workshop on Reverse Engineering To Requirements, 2005)
This paper proposes a new approach, called Brastra, for extracting object state machines from unit-test executions. Brastra abstracts an object’s concrete state to an abstract state based on the branch coverage information exercised by methods invoked on the object.
Automatic Extraction of Sliced Object State Machines for Component Interfaces (SAVCBS 2004, Proceedings of the 3rd Workshop on Specification and Verification of Component-Based Systems, 2004)
This paper presents an approach that automatically extracts sliced object state machines (OSMs) for component interfaces from the execution of generated tests. The approach slices concrete object states by each member field of the component and uses the sliced states to construct a set of sliced OSMs.
Automatic Extraction of Object-Oriented Observer Abstractions from Unit-Test Executions (ICFEM 2004, Proceedings of the 6th International Conference on Formal Engineering Methods, 2004)
This paper proposes the observer abstraction approach for automatically extracting object-state-transition information of a class from unit-test executions, without requiring a priori specifications. The approach produces the abstract state of an object based on the return values of a set of observers (public methods with non-void returns) invoked on the object.
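
The observer abstraction idea described above (mapping a concrete object state to the return values of its observers, then recording transitions between abstract states) can be sketched as follows. This is a minimal illustration under stated assumptions, not the published tool: BoundedStack, the chosen observers is_empty and is_full, and the helper names abstract_state and extract_transitions are all hypothetical, and the "unit tests" are simply hand-written method-call sequences.

```python
class BoundedStack:
    """Hypothetical class under test."""

    def __init__(self, capacity=2):
        self._items = []
        self._capacity = capacity

    def push(self, x):
        if len(self._items) < self._capacity:
            self._items.append(x)

    def pop(self):
        if self._items:
            self._items.pop()

    # Observers: public methods with non-void returns.
    def is_empty(self):
        return not self._items

    def is_full(self):
        return len(self._items) == self._capacity


OBSERVERS = ("is_empty", "is_full")


def abstract_state(obj):
    """Abstract state = tuple of observer return values."""
    return tuple(getattr(obj, name)() for name in OBSERVERS)


def extract_transitions(test_sequences):
    """Run each call sequence on a fresh object and collect
    (abstract state before, method name, abstract state after)."""
    transitions = set()
    for seq in test_sequences:
        obj = BoundedStack()
        for method, args in seq:
            src = abstract_state(obj)
            getattr(obj, method)(*args)
            transitions.add((src, method, abstract_state(obj)))
    return transitions


# Hand-written call sequences standing in for unit-test executions.
tests = [
    [("push", (1,)), ("push", (2,)), ("pop", ()), ("pop", ())],
    [("pop", ())],
]

for src, method, dst in sorted(extract_transitions(tests)):
    print(src, "--" + method + "-->", dst)
```

The resulting transition set is only as complete as the executed tests: calling push on a full stack never occurs in these sequences, so that self-loop is absent from the extracted machine.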

Static Analysis for Program Comprehension

Automated Detection of API Refactorings in Libraries (ASE 2007, Proceedings of the 22nd IEEE/ACM International Conference on Automated Software Engineering, Short Paper, 2007)
This paper presents a technique and its supporting tool, RefacLib, to automatically detect refactorings in libraries. RefacLib uses syntactic analysis in the first phase to quickly detect refactoring candidates across two versions of a library. In the second phase, RefacLib uses various heuristics to refine the results.
Understanding Software Application Interfaces via String Analysis (ICSE 2006 ER, Proceedings of the 28th International Conference on Software Engineering, Emerging Results Track, 2006)
This paper proposes an approach to understanding software application interfaces through string analysis. The approach can help us understand the characteristics of interactions between software applications such as between database applications and databases.
MAPO: Mining API Usages from Open Source Repositories (MSR 2006, Proceedings of the 3rd International Workshop on Mining Software Repositories, 2006)
This paper proposes an API usage mining framework and its supporting tool called MAPO (for Mining API usages from Open source repositories). Given a query that describes a method, class, or package for an API, MAPO leverages the existing source code search engines to gather relevant source files and conducts data mining.
A Model-based Approach to Object-Oriented Software Metrics (Journal of Computer Science and Technology, 2002)
This paper proposes a model-based approach to object-oriented software metrics. The approach guides metrics users in adopting the quality metrics model to measure object-oriented software products.
JBOORET: an Automated Tool to Recover OO Design and Source Models (COMPSAC 2001, Proceedings of the 25th Anniversary Annual International Computer Software and Applications Conference, 2001)
This paper presents a reverse engineering tool, JBOORET (Jade Bird Object-Oriented Reverse Engineering Tool). The tool implements a parser-based approach to assist the activity of extracting high level design and source models from system artifacts.
JBOOMT: Jade Bird Object-Oriented Metrics Tool (Chinese Journal of Electronics, 2000)
This paper presents the Jade Bird Object-Oriented Metrics Tool (JBOOMT), which provides automated software metrics support for users and managers to measure the design or source code of object-oriented programs.
C++ Program Information Database for Analysis Tools (TOOLS 1998, Proceedings of the 1998 Conference on Technology of Object-Oriented Languages and Systems, 1998) 
This paper presents a C++ program information database, which is comprehensive enough to support many analysis tools based on common program information.

 

