CSC News

August 19, 2015

NSF Supports Gu’s Research on Intelligent Production System Debugging

Dr. Xiaohui (Helen) Gu, associate professor of computer science at NC State University, has been awarded $800,000 by the National Science Foundation (NSF) to support her research proposal entitled “CSR: Medium: Collaborative Research: Holistic, Cross-site, Hybrid System Anomaly Debugging for Large Scale Hosting Infrastructures.” The project is a collaboration with Dr. Shan Lu, associate professor of computer science at the University of Chicago. NC State is the lead institution, and its share of the award is $518,000.
The award will run from August 1, 2015 to July 31, 2019. 
Abstract: Large-scale shared hosting infrastructures such as multi-tenant cloud systems have become increasingly popular by offering a cost-effective resource leasing solution. However, due to their inherent complexity and shared nature, hosting infrastructures are prone to various system anomalies caused by either external environment issues (e.g., interference from co-located applications) or internal software bugs. This problem is further exacerbated by the fact that effective system anomaly diagnosis requires knowledge of both the infrastructure environment and the applications running inside the hosting infrastructure.
Existing system anomaly diagnosis work can be broadly classified into two categories: 1) black-box schemes, which do not require source code; and 2) white-box schemes, which require source code and expensive code instrumentation. Black-box schemes are lightweight and non-intrusive, which makes them suitable for online production-site diagnosis. However, they can provide only coarse-grained fault localization and limited cause analysis. In contrast, white-box schemes can provide more powerful debugging support but are suitable only for offline, development-site diagnosis because of their high overhead and source code requirements.
The overarching objective of this proposed project is to explore an innovative cross-site system anomaly debugging approach that intelligently integrates production-site black-box diagnosis with development-site white-box debugging into a more powerful hosting infrastructure debugging framework. We will focus on diagnosing non-crashing system anomalies (e.g., performance degradation, service outage, software hang, unexpected halt) that are common in real-world hosting infrastructures but are difficult to debug using existing techniques.
Techniques developed in this project are expected to significantly improve the diagnosability and robustness of real-world hosting infrastructures. The PIs will develop new course modules on hosting infrastructure debugging for the graduate and undergraduate classes they regularly teach, along with programming courseware based on the research prototypes developed in this project. The PIs will draw on their standing as role models, together with a set of outreach activities, to recruit more female students to pursue systems research. The PIs will disseminate their results and collected data broadly through publication and technology transfer. Developed software artifacts and experimental datasets will be released for public use.
