May 31, 2011

Anyanwu-Ogan Receives Faculty Research Award

Dr. Kemafor Anyanwu-Ogan, assistant professor of computer science at NC State University, has received a Faculty Research and Professional Development (FRPD) Award in the amount of $7,000 to support her research proposal titled “Optimization Techniques for Large Scale Analytics of Graph Structured Data on MapReduce Platforms.” 

Abstract - The phrase "Big Data" is commonly used to refer to data environments experiencing exponential growth. An important problem, known as "Big Data" analytics, is how to efficiently convert very large amounts of data into "actionable knowledge" that can be used for decision-making. While data analytics has traditionally been supported by relational database, data warehousing, and data mining systems, such systems were not designed to scale to the size of data sets in today’s data-rich environments. Current approaches to this problem focus on massively parallel software running on tens, hundreds, or even thousands of commodity-grade servers.
A recent proposal by Google called MapReduce is fast becoming the de facto standard computational paradigm for large-scale analytics because of its ease of use and scalability. Currently, systems for large-scale data processing based on MapReduce are in their nascent stages and only support tasks of limited complexity on either structured or unstructured data. However, a growing amount of the available data is not fully structured but rather partially or semi-structured. Examples include data represented in XML documents, commonly used by business organizations to enable interoperability, as well as social networking and other Web-generated content enabled by Web 2.0 technologies. Additional examples come from the growing number of communities that are adopting Semantic Web tenets and representing their data using RDF, the W3C standard for Web data representation. These include general-purpose communities (e.g., Wikipedia, GeoNames), scientific research communities such as the biomedical community (e.g., DrugBank, Linked Clinical Trials), government organizations, and corporations (e.g., BBC, New York Times, Freebase).
Graph models can provide a unifying data abstraction for most semi-structured data; therefore, analytics for graph models offers solutions for a wide variety of available data. However, graph analytics on MapReduce platforms incurs significant overhead in disk I/O and communication costs due to the nature of the MapReduce computational model. This proposal aims to advance the state of the art in graph-structured analytics on MapReduce platforms by investigating and developing optimization techniques that address these I/O and communication bottlenecks.
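
To make the overhead described in the abstract concrete, the following is a minimal single-process sketch of an iterative graph computation (breadth-first distances) written in the MapReduce style. It is an illustration only and not the technique proposed in the award; the function names (map_phase, reduce_phase, run_round, bfs) and the record layout are hypothetical. The point it shows is that every round re-emits the full graph structure through the shuffle, which on a real cluster corresponds to disk I/O and network traffic.

```python
from collections import defaultdict

INF = float("inf")

def map_phase(node, state):
    """Emit the node's own record plus a tentative distance for each neighbor."""
    dist, neighbors = state
    # The adjacency list is re-emitted every round so the reducer can rebuild the node.
    yield node, ("node", dist, neighbors)
    if dist != INF:
        for nbr in neighbors:
            yield nbr, ("dist", dist + 1)

def reduce_phase(node, values):
    """Keep the smallest distance seen and recover the adjacency list."""
    best, neighbors = INF, []
    for value in values:
        if value[0] == "node":
            neighbors = value[2]
        best = min(best, value[1])
    return node, (best, neighbors)

def run_round(graph):
    """Simulate one map/shuffle/reduce round and count the shuffled records."""
    shuffled = defaultdict(list)
    records = 0
    for node, state in graph.items():
        for key, value in map_phase(node, state):
            shuffled[key].append(value)
            records += 1
    new_graph = dict(reduce_phase(k, v) for k, v in shuffled.items())
    return new_graph, records

def bfs(adjacency, source):
    """Repeat rounds until no distance changes; every node must be a key of adjacency."""
    graph = {n: (0 if n == source else INF, nbrs) for n, nbrs in adjacency.items()}
    rounds = 0
    total_records = 0
    while True:
        new_graph, records = run_round(graph)
        rounds += 1
        total_records += records
        if new_graph == graph:
            break
        graph = new_graph
    distances = {n: d for n, (d, _) in graph.items()}
    return distances, rounds, total_records

if __name__ == "__main__":
    adjacency = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    distances, rounds, total_records = bfs(adjacency, "a")
    print(distances, rounds, total_records)
```

Even on this four-node example, the computation needs several full rounds, and each round shuffles one record per node in addition to the distance messages. Reducing that repeated cost is the kind of bottleneck the abstract refers to.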
The FRPD program is a funding partnership between the NC State University Office of Research and Innovation and the 10 academic colleges. It is intended to assist faculty in initiating research and professional development activities. A primary objective of the award is to serve as “seed” money leading to support from outside granting agencies.
