Seminars & Colloquia
"Data Integration Systems and the Plethora of Standards: Joining Research and Pragmatics"
Thursday March 23, 2017 10:30 AM
Location: 3211, EB2 NCSU Centennial Campus
This talk is part of the Taming the Data Seminar series
Abstract: This talk will have two parts, both of which try to crystallize general, pragmatic lessons.
• Extending and targeting research to improve adoptability: Over the years, we have seen much excellent algorithmic research that has failed to make it to product or routine practice. This phenomenon has certainly harmed data integration (to which we give a brief introduction). Using simple examples from data integration research, we identify general patterns by which research choices could increase the chances for adoption and generate additional research questions.
• Disrupting today’s limits on interoperability: The government organizations we know rarely use modern tools. The default tool for data engineering is Microsoft Office, and artifacts rapidly become shelfware. We explain why organizations such as the CDC or SEC that run submission hubs (collecting and forwarding data from many sources) may be best placed to break the logjam, but report that even they are too conservative.
Data standards are the alternative to integration tools, and are more powerful in that they guide organizations in deciding what information they need to collect. However, each release takes years, requires agreement on a broad range of data, and offers little flexibility to serve urgent or niche needs. We sketch an alternative mode that is attuned to the fundamental challenges: independent actors, diverse preferences, and a need for local simplicity. Our approach is incremental and radically decentralized: a web of (overlapping, loosely-coupled) topic-ontologies, with automated mediation and provision for incremental change. We close by identifying the research and practical challenges in carrying out such an approach.
Short Bio: Arnon Rosenthal has consulted and published in data sharing and administration, databases, clouds, data security, policy-based systems, and graph algorithms. He has (according to ResearchGate) 150+ publications and 4000+ citations. His work tries to address many sides of a problem simultaneously, clarify and decompose the challenges, understand the pragmatics, and simplify and generalize components of a solution, for a realistically imperfect world. He has worked at The MITRE Corporation, Computer Corporation of America, and Sperry Research; was a visiting researcher at IBM Almaden Research and ETH Zurich; and was a faculty member at the University of Michigan (Ann Arbor). He holds a Ph.D. from the University of California, Berkeley.
Host: Rada Chirkova, CSC