CSC News

May 17, 2013

Li and Xie Win IEEE Software Award for Best SEIP Paper

Congratulations to PhD student Sihan Li and Dr. Tao Xie, associate professor of computer science in the NC State Department of Computer Science, for winning the 2013 IEEE Software Best Software Engineering in Practice (SEIP) Paper Award at the International Conference on Software Engineering (ICSE 2013) to be held in San Francisco, CA on May 18-26, 2013.
 
The winning paper, titled “A Characteristic Study on Failures of Production Distributed Data-Parallel Programs,” will appear in Proceedings of the 35th International Conference on Software Engineering (ICSE 2013), Software Engineering in Practice (SEIP), San Francisco, CA, May 2013. 
 
The paper was co-authored with researchers from Microsoft Research Asia: Hucheng Zhou, Haoxiang Lin, Tian Xiao, Haibo Lin, and Wei Lin.
 
Abstract:  SCOPE is adopted by thousands of developers from tens of different product teams in Microsoft Bing for daily web-scale data processing, including index building, search ranking, and advertisement display. A SCOPE job is composed of declarative SQL-like queries and imperative C# user-defined functions (UDFs), which are executed in pipeline by thousands of machines. There are tens of thousands of SCOPE jobs executed on Microsoft clusters per day, while some of them fail after a long execution time and thus waste tremendous resources. Reducing SCOPE failures would save significant resources.
 
This paper presents a comprehensive characteristic study on 200 SCOPE failures/fixes and 50 SCOPE failures with debugging statistics from Microsoft Bing, investigating not only major failure types, failure sources, and fixes, but also current debugging practice. Our major findings include (1) most of the failures (84.5%) are caused by defects in data processing rather than defects in code logic; (2) table-level failures (22.5%) are mainly caused by programmers’ mistakes and frequent data-schema changes while row-level failures (62%) are mainly caused by exceptional data; (3) 93% of fixes do not change data processing logic; (4) 8% of failures have their root cause not at the failure-exposing stage, making current debugging practice insufficient in this case. Our study results provide valuable guidelines for future development of data-parallel programs. We believe that these guidelines are not limited to SCOPE, but can also be generalized to other similar data-parallel platforms.
 
About ICSE 2013: The International Conference on Software Engineering (ICSE) provides programs where researchers, practitioners, and educators present, discuss, and debate the most recent innovations, trends, experiences, and challenges in the field of software engineering. ICSE 2013, the 35th in the conference series, encourages contributors from academia, industry, and government to share leading-edge software engineering ideas with inspirational leaders in the field.

~coates~ 
