Seminars & Colloquia

Yong Zhao

Department of Computer Science, University of Chicago

"A Virtual Data Language and System for Scientific Workflow Management in Data Grid Environments"

Wednesday, February 28, 2007, 10:00 AM
Location: 3211, EBII NCSU Centennial Campus

Abstract: With advances in scientific instrumentation and simulation, scientific data is growing rapidly in both size and analytical complexity. So-called Data Grids aim to provide high-performance, distributed data analysis infrastructure for data-intensive sciences, in which scientists distributed worldwide need to extract scientific information from large collections of data and would like to share both data products and the resources needed to produce and store them.

However, the description, composition, and execution of even logically simple scientific workflows are often complicated by the need to deal with 'messy' issues such as heterogeneous storage formats and ad hoc file system structures. We show how these difficulties can be overcome via a typed workflow notation called the virtual data language, within which issues of physical representation are cleanly separated from logical typing, and by implementing this notation within a powerful runtime system that supports distributed execution. The resulting virtual data language and system are capable of expressing complex workflows in a simple, compact form, enacting those workflows in distributed environments, monitoring and recording the execution processes, and tracing the derivation history of data products.

We describe the motivation, design, implementation, and evaluation of the virtual data language and system, and the application of the virtual data paradigm to various science disciplines, including astronomy, cognitive neuroscience, high-energy physics, and science education.

Short Bio: Yong Zhao is a doctoral student in the Department of Computer Science at the University of Chicago. His research interests include scientific workflow and data management, high-performance computing, Web services composition, the Semantic Web, and knowledge discovery. He holds an M.S. in Computer Science from the University of Chicago.

Host: Rada Chirkova, Computer Science, NCSU
