Seminars & Colloquia

Ali Butt

Virginia Tech

"A Toolkit for Evaluating MapReduce Cluster Design"

Friday, September 17, 2010, 1:00 PM
Location: 3211 EB2, NCSU Centennial Campus
This talk is part of the System Research Seminar series


MapReduce has emerged as a model of choice for supporting modern data-intensive applications, and is a key enabler for cloud computing. Setting up and operating a large MapReduce cluster entails careful evaluation of various design choices and run-time parameters to achieve high efficiency. However, this design space has not been explored in detail. 
In this talk, I will discuss a simulation approach to systematically understanding the performance of MapReduce setups. I will present MRPerf, a toolkit that captures such aspects of MapReduce setups as node, rack and network configurations, disk parameters and performance, data layout and application I/O characteristics, among others, and uses this information to predict expected application performance. I will also discuss the challenges faced in obtaining realistic traces to drive our simulations, and present tricks and tips we have used. The overall goal is to realize a tool for optimizing existing MapReduce setups as well as designing new ones. 
Short Bio:

Ali R. Butt is an Assistant Professor of Computer Science at Virginia Tech. Ali received his Ph.D. in Electrical and Computer Engineering from Purdue University in 2006. His research interests are in experimental computer systems, especially in data-intensive high-performance computing. His current work focuses on I/O and storage issues faced in modern HPC systems. Ali is a recipient of the NSF CAREER Award (2008), an IBM Faculty Award (2008), an IBM SUR Award (2009), and a Virginia Tech College of Engineering Outstanding New Assistant Professor Award (2009). Ali was also an invited participant (2009) and an organizer (2010) for the NAE's US Frontiers of Engineering Symposium.

Host: Frank Mueller, Computer Science, NCSU

