Seminars & Colloquia
IBM T. J. Watson Research Center
"System support for large-scale on-line data analysis"
Friday, April 13, 2007, 10:00 AM
Location: 3211, EB II NCSU Centennial Campus
(Visitor parking instructions)
This talk is part of the System Research Seminar series
Abstract: On-line data analysis has become increasingly important as many emerging applications, such as stock trading surveillance, sensor data analysis, network traffic monitoring, and video surveillance, call for sophisticated real-time processing over data streams. One major challenge for large-scale on-line data analysis is to provide a scalable and resilient system infrastructure that can support computation-intensive continuous queries over high-volume, time-varying data streams. In this talk, I will present several new system management techniques that address this challenge. First, I will present an adaptive load diffusion system for the multi-way windowed stream join (MWSJ), a core operator in continuous query systems. The MWSJ operator can be used to discover correlations across different streaming sources, with important applications in video surveillance, network intrusion detection, and sensor monitoring. The load diffusion system can adaptively scale up stream join processing in response to stream rate changes while preserving the accuracy of the join results. Second, I will present a distributed stream processing overlay system that achieves quality-aware, load-balanced, and self-healing composition of stream processing applications over wide-area networks in a fully distributed and self-organizing fashion. Our experimental results, obtained from a real prototype implementation and extensive simulations, show the feasibility and efficiency of our systems. Finally, I will briefly mention my other research in large-scale networked systems and mobile systems, including adaptive multi-source stream dissemination, multi-attribute range queries in overlay networks, software failure prediction for cluster systems, and adaptive application offloading for resource-constrained mobile devices.
Short Bio: Xiaohui Gu is currently a research staff member at the IBM T. J. Watson Research Center, New York. Her general research interests include distributed systems, operating systems, and computer networks, with a current focus on large-scale data stream processing, autonomic system management using machine learning methods, peer-to-peer systems, and mobile systems. She received the ILLIAC Fellowship, the David J. Kuck Best Master Thesis Award, and the Saburo Muroga Fellowship from the University of Illinois at Urbana-Champaign. She received the IBM first Invention Award in 2004 and the first Invention Plateau Award in 2006. She has served on the program and/or organizing committees of PerCom 2006-07, RTSS 2006, ICPS 2005-07, the ACM Multimedia 2005 service composition workshop, the ACM Multimedia Modeling Conference 2006, and the ICDE 2007 AIMS workshop. She received her PhD degree in 2004 and her MS degree in 2001 from the Department of Computer Science, University of Illinois at Urbana-Champaign. She received her BS degree in computer science from Peking University, Beijing, China.
Host: Xiaosong Ma, Department of Computer Science