Seminars & Colloquia
Computer Science & Engineering, University of Washington
"Probabilistic Answers to Queries "
Friday October 29, 2004 11:00 AM
Location: 313, EGRC NCSU Centennial Campus
(Visitor parking instructions)
This talk is part of the Taming the Data Seminar series
Abstract: This talk promotes a new query paradigm, in which incomplete and conflicting evidences are collected from a variety of data sources and used to answer complex queries. The system assigns a probability to each tuple in the answer and uses it to rank the results. The first part of the talk contains several examples showing how inexact matches between data values, inexact matches between schema components, lack of sufficient evidence to answer a given query, and violations of constraints across data sources, can all be modeled into a single, unified probabilistic framework. It also illustrates how the same probabilistic framework can be used to reason about information disclosure. The second half of the talk describes a new theoretical result that is a key piece in this new probabilistic framework: how to compute the asymptotic conditional probabilities of queries. The connection to random graphs will also be explained.
Short Bio: Dan Suciu is an associate professor in the Department of Computer Science and Engineering at the University of Washington. Before that he was a researcher at AT&T Shannon Laboratories. His research is on data management for large scale data exchange. Previous achievements include a first query language for XML, a public XML compressor (XMill), and a toolkit for stream processing of XML data. His current research is on data security and query answering in the presence of uncertainties.
Host: Rada Chirkova, Computer Science Department, NCSU