Seminars & Colloquia
"CRA-W Distinguished Lecture: Programming at scale: making it easier and why it's still hard"
Thursday, November 17, 2011, 10:00 AM
Location: 3211, EBII NCSU Centennial Campus
This talk is part of the System Research Seminar series
Developing computer programs to process large amounts of data efficiently is a major challenge. Many important applications, ranging from scientific analysis of environmental sensor data to the indexing of the World Wide Web, need to run over terabytes or petabytes of data. Over the last seven to eight years, frameworks like MapReduce and Dryad have greatly simplified distributed data-parallel computing and enabled programs to run at immense scale.
In this talk I will describe the key ideas behind these frameworks and explain how they provide the programmer with an elegant, if constrained, programming model. Under the hood, each framework automatically handles the low-level details of parallelism, data distribution and fault tolerance, which are very difficult for the programmer to get right.
Recently, researchers have been extending these frameworks with richer programming models, for example to support streaming and iteration. I will also describe some of these advances and the challenges that remain.
Rebecca Isaacs is a researcher at Microsoft Research, where she spent over 9 years in Cambridge, UK, before moving to MSR Silicon Valley in 2010. Her interests span systems performance analysis and debugging, operating systems and distributed systems. She received a PhD on network resource management from Cambridge University and a computer science BSc from the University of Glasgow.
Host: Xiaohui (Helen) Gu, Computer Science