Seminars & Colloquia
EECS, University of Michigan, Ann Arbor
"Sirius: An Open End-to-End Voice and Vision Personal Assistant (Like Siri) and Its Implications for Future Warehouse Scale Computers"
Friday May 01, 2015 11:00 AM
Location: 3211, EBII NCSU Centennial Campus
(Visitor parking instructions)
This talk is part of the System Research Seminar series
As user demand scales for intelligent personal assistants (IPAs) such as Apple's Siri, Google's Google Now, and Microsoft's Cortana, we are approaching the computational limits of current datacenter architectures. It is an open question how future server architectures should evolve to enable this emerging class of applications, and the lack of an open-source IPA workload is an obstacle in addressing this question.
In this presentation, we present the design of Sirius, an open end-to-end IPA web-service application that accepts queries in the form of voice and images, and responds with natural language. This workload is then used to investigate the implications of four points in the design space of future accelerator-based server architectures spanning traditional CPUs, GPUs, manycore throughput co-processors, and FPGAs. To investigate future server designs for Sirius, we decompose Sirius into a suite of 7 benchmarks (Sirius Suite) comprising the computationally intensive bottlenecks of Sirius. We port Sirius Suite to a spectrum of accelerator platforms and use the performance and power trade-offs across these platforms to perform a total cost of ownership (TCO) analysis of various server design points. In our study, we find that accelerators are critical for the future scalability of IPA services. Our results show that GPU- and FPGA-accelerated servers improve the query latency on average by 10x and 16x, respectively. For a given throughput, GPU- and FPGA-accelerated servers can reduce the TCO of datacenters by 2.6x and 1.4x, respectively.
Lingjia Tang is an assistant professor of CSE at the University of Michigan. Prior to joining the University of Michigan, she was a research faculty member in the UCSD CSE Department from 2012 to 2013. Her research focuses on computer architecture and software systems, especially such systems for large-scale datacenters. Recently, her publications at HPCA '13 and ISCA '13 on optimizing datacenter infrastructures were reported in Wired, The Register, and ACM TechNews. Her publication at MICRO '11 was selected as an IEEE Micro Top Pick. She received a best paper award at the IEEE/ACM International Symposium on Code Generation and Optimization (CGO) 2012. In addition, her publication at the International Symposium on Computer Architecture was selected as one of the excellent papers of 2011 by Research at Google. More information can be found at: www.lingjia.org and http://clarity-lab.org.
Jason Mars received his Ph.D. from the University of Virginia in 2012 and joined the CSE department at the University of Michigan as an assistant professor in July 2013. Mars' work focuses on the optimization of warehouse scale computers (WSCs). Such systems are becoming increasingly popular: companies like Google, Apple, Facebook, Amazon, IBM, and Microsoft are deploying hundreds of such systems throughout the world in order to power the next generation of internet services. Such systems now form the backbone of the emerging area of 'cloud computing.' Mars' work is at the intersection of architectures, compilers, and runtime systems. Specifically, his work proposes a set of techniques to optimize the interaction and sharing of resources in datacenter systems while maintaining high levels of performance. More information can be found at: www.jasonmars.org and http://clarity-lab.org.
Special Instructions: The talk will be given by two speakers together. The form was filled out with information for both speakers.
Host: Xipeng Shen, CSC