Current Research Projects (by faculty)

The funded projects listed below are active; the running funding total for active projects appears in the left navigation menu.

EAGER: Collaborative Research: Enhancing Impact of Broadening Participation in Computing Efforts through the STARS Cohort Conference Attendance Program
Tiffany Barnes

$47,432 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2020

In a recent Dear Colleague Letter [cite], NSF outlined its commitment to “enhance the community's awareness of and barriers to broadening participation in computing (BPC) as well as to provide information and resources to principal investigators (PIs) so that they can develop interest, skills, and activities in support of BPC at all levels”. As part of this new initiative, NSF has launched a small pilot of a new requirement that PIs include plans for meaningful BPC activities in their proposals. In March 2018, NSF sponsored the BPCNet workshop to bring together BPC researchers and practitioners, including those from the NSF-funded BPC Alliances program, to identify approaches to help new PIs engage in BPC efforts in meaningful ways. Our experience in leading the STARS Computing Corps demonstrates the value of cohort-based efforts to broaden participation in computing for individual students and computing departments. In response to the call for action presented to attendees at the NSF BPCNet workshop, the goal of this proposal is to provide a scaffolded approach to engaging faculty in cohort-based BPC efforts. As a starting point, the STARS Computing Corps proposes to prepare faculty to support diverse cohorts of students to attend diversity-oriented conferences through scholarships and targeted activities that build community and a sense of belonging for the attending students and ignite efforts to broaden participation within their own computing departments.

Collaborative Research: Integrating Computing in STEM: Designing, Developing and Investigating a Team-based Professional Development Model for Middle- and High-school Teachers
Tiffany Barnes

$861,773 by National Science Foundation (NSF)
09/ 1/2017 - 08/31/2020

Integrated Snap! is a comparison study between the traditional single-teacher professional development model and a community-of-practice professional development model in which teachers attend as a group. The groups comprise CTE/CS, Math, Engineering, and Science teachers, an arrangement intended to foster the integration of the Computer Science Principles curriculum and the science curriculum in high school classrooms. The professional development will take place using Borko’s (2004) phases. We will develop a professional development model centered on helping content teachers learn best practices for integrating computing in their classrooms. That PD will be piloted, replicated, and then modified for other sites and contexts. Teachers will work together to build simulation and programming tools, with corresponding activities, that their classrooms can use to explore computational concepts in the context of their discipline. An integral part of Integrated Snap! is to have teams of teachers from the same school or district attend the PD together and build a community among the teachers and their students doing similar work across their classrooms.

REU Site: Socially Relevant Computing and Analytics
Tiffany Barnes

$359,997 by National Science Foundation
03/ 1/2017 - 02/29/2020

The REU Site at NC State University will immerse a diverse group of undergraduates in a vibrant research community of faculty and graduate students building and analyzing cutting-edge human-centric applications including games, tutors, and mobile apps. We will recruit students from underrepresented groups and colleges and universities with limited research opportunities through the STARS Computing Corps, an NSF-funded national consortium of institutions dedicated to broadening participation in computing. Using the Affinity Research Groups and STARS Training for REUs models, we will engage faculty and graduate student mentors with undergraduates to create a supportive culture of collaboration while promoting individual contributions to research through just-in-time training for both mentors and students throughout the summer.

REU Site: Interactive and Intelligent Media
Tiffany Barnes

$359,999 by National Science Foundation
04/ 1/2013 - 03/31/2019

The REU Site at NC State University will immerse a diverse group of undergraduates in a vibrant research community of faculty and graduate students working on cutting-edge games, intelligent tutors, and mobile applications. We will recruit students from underrepresented groups and colleges and universities with limited research opportunities through the STARS Alliance, an NSF-funded national consortium of institutions dedicated to broadening participation in computing. Using the Affinity Research Groups and STARS Training for REUs models, we will engage faculty and graduate student mentors with undergraduates to create a supportive culture of collaboration while promoting individual contributions to research through just-in-time training for both mentors and students throughout the summer.

EXP: Data-driven Support for Novice Programmers
Tiffany Barnes ; Min Chi

$549,874 by National Science Foundation
09/ 1/2016 - 08/31/2019

While intelligent tutors have been shown to increase student learning in programming and other domains, and creative, exploratory programming environments are assumed to promote novice interest and motivation to learn to program, there are no environments that provide both creative tasks and intelligent support. We propose to extend our methods for data-driven hint generation, model tracing, and knowledge tracing to augment Snap and Java programming environments to be more supportive for novice programmers doing open-ended creative tasks.

Scaling a Rigorous CS Principles Curriculum: A Supplement to Beauty and Joy of Computing in New York City STEM-C MSP
Tiffany Barnes ; Glenn Kleiman

$568,967 by Education Development Center via National Science Foundation
01/ 1/2017 - 12/31/2018

In recent decades, coding has evolved from a professional activity of a few million developers to a near universally-needed skill. However, there are still fewer than 2000 teachers prepared to teach computer science to high school students. In this supplement we propose to engage 162 teachers in BJC professional development and Train the Trainer workshops.

GRIP: Computer Science for All K-12 Students
Tiffany Barnes, Co-PI ; James Lester, Co-PI ; Glenn Kleiman, PI (Friday Institute) ; Eric Wiebe, Co-PI (Friday Institute)

$547,187 by Game-Changing Research Incentive Program (GRIP)
02/ 1/2017 - 01/31/2019

This project addresses the critical national need to “Empower all American students from kindergarten through high school to learn computer science and be equipped with the computational thinking skills they need to be creators in the digital economy, not just consumers, and to be active citizens in our technology-driven world” (President Obama, 2016 State of the Union Address). We take a systemic approach in which we will:
- Conduct research to determine how students develop the key concepts and processes of computer science and the effectiveness of alternative teaching approaches.
- Design learning resources that guide and support the teaching and learning of computer science, using emerging technologies such as artificial intelligence-driven tutors and interactive, game-like, virtual learning environments.
- Influence the practice of teaching computer science in K-12 schools by providing both professional learning and mentoring opportunities for teachers.
- Inform policy decisions at the state, district and school levels by providing information that principals, district leaders, local school board members and state policymakers can use when deciding whether to add computer science to the K-12 curriculum.

Evaluation For Actionable Change: A Data-Driven Approach
Tiffany Barnes, Co-PI ; Collin Lynch, Co-PI

$799,837 by National Science Foundation
01/ 1/2016 - 12/31/2018

Randomized controlled trials (RCTs) are expensive and often show small effects. Even RCTs of widely adopted digital learning platforms can show disappointing results, and these results have little impact on subsequent adoptions of programs already entrenched in the educational landscape. New methods are needed both to estimate effects and to indicate ways of improving outcomes for already-adopted digital learning tools. With platforms currently in wide-scale use, novel approaches to assessing use patterns and their relations with outcomes can both evaluate maximal effectiveness and provide means for improved effectiveness. Our study will use a data-driven approach to identify patterns of both student use and teacher implementation in the widely adopted software Spatial Temporal Mathematics (ST Math). By linking these patterns with important learning and motivational outcomes, we can form recommendations regarding promising actions teachers and administrators can take in implementing ST Math and refinements program developers can make in guiding students toward successful patterns. This work has the potential not only to transform use and success of the platform studied, but also to create methods that can be refined and transferred to the study and implementation of other platforms.

CAREER: Improving Adaptive Decision Making in Interactive Learning Environments
Min Chi

$547,810 by National Science Foundation
03/ 1/2017 - 02/28/2022

For many forms of interactive environments, the system's behaviors can be viewed as a sequential decision process wherein, at each discrete step, the system is responsible for selecting the next action to take from a set of alternatives. The objective of this CAREER proposal is to learn robust interaction strategies that will lead to desirable outcomes in complex interactive environments. The central idea of this project is that strategies should not only be effective in complex interactive environments but they should also be efficient, focusing solely on the key features of the domain and the crucial decision points. These are the features and decisions that are not only associated with desirable outcomes, but without which the desirable outcomes are unlikely to occur.

DIP: Integrated Data-driven Technologies for Individualized Instruction in STEM Learning Environments
Min Chi ; Tiffany Barnes

$1,999,438 by National Science Foundation
08/15/2017 - 07/30/2022

In this proposed work, our goal is to automatically design effective personalized intelligent tutoring systems (ITSs) directly from log data. We will combine Co-PI Dr. Barnes's data-driven approach to learning what to teach with PI Dr. Chi’s data-driven work on learning how to teach. More specifically, we will explore three important undergraduate STEM domains: discrete math, probability, and programming.

PFI: BIC: FEEED: Flexible, Equitable, Efficient and Effective Distribution
Min Chi ; Julie Ivy

$400,000 by UNC - NC A & T State University
08/ 1/2017 - 07/31/2020

Nonprofit hunger relief organizations operate in a complex environment consisting of a large and diverse donor base and a dynamic distribution network. These organizations generate a large amount of unstructured and complex data on food collection, inventory management, and distribution activities. However, existing information systems lack the infrastructure to interpret this large-scale data to provide real-time policy recommendations and support operational and strategic decision-making. The proposed smart service system will synthesize data from disparate sources to create a real-time perspective of the environment and learn from the actions of the decision maker. Specifically, this system will automatically predict, visualize, and recommend decisions that will advance operational effectiveness of food collection, distribution, and resource management in a way that is efficient and equitable and will identify opportunities to improve a food bank’s capability to satisfy hunger need.

Using Real-Time Multichannel Self-Regulated Learning Data to Enhance Student Learning and Teachers' Decision-Making with MetaDash
Min Chi - Co-PI ; Roger Azevedo (Psychology) ; Soonhye Park - Co-PI EDUCA

$914,585 by National Science Foundation
04/ 1/2017 - 03/31/2020

This 3-year project will focus on laboratory and classroom research in the Raleigh, Chapel Hill, and Durham areas. A team of interdisciplinary researchers from NCSU's Department of Psychology (Dr. Roger Azevedo), Computer Science (Dr. Min Chi), and STEM Education (Dr. Soonhye Park) will conduct empirical and observational research aimed at improving teachers' decision-making based on their analyses of students' real-time, multi-channel self-regulated learning data. We will use multi-channel data to understand the nature of self-regulatory processes in students while they use MetaTutor to understand challenging science topics (e.g., human biological systems). This will be accomplished by aligning and conducting complex computational and statistical analyses of a multitude of trace data (e.g., log-files, eye-tracking, facial expression of emotions), behavioral measures (e.g., human-pedagogical agent dialogue moves), physiological measures (e.g., EDA), learning outcomes, and classroom data (e.g., teacher-student interactions, gaze behavior of teachers’ attention and use of data presented by the visualization tool). The proposed research, in the context of using MetaTutor and a visualization tool for teachers, is extremely challenging and will help us to better understand the nature and temporal dynamics of these processes in classroom contexts and how they contribute to various types of learning and use of self-regulatory skills, and will provide an empirical basis for designing an intelligent teacher analytics tool.
The results from this grant will contribute significantly to models of social and cognitive bases of student-machine-teacher interactions; statistical and computational methods used to make inferences from complex multi-channel data; theoretical and conceptual understanding of temporally-aligned data streams; enhancing students’ understanding of complex science topics by making more sensitive and intelligent advanced learning technologies; and, enhanced understanding of how teachers use real-time student data to enhance their instructional decision-making, based on data presented in teacher analytic tools.

IUCRC Pre-proposal Phase I NC State University: Center for Accelerated Real Time Analytics (CARTA)
Rada Chirkova

$747,647 by National Science Foundation (NSF)
06/ 1/2018 - 05/31/2023

Real-time analytics is the leading edge of a smart data revolution, pushed by Internet advances in sensor hardware on one side, and AI/ML streaming acceleration on the other. We propose creation of a Center of Accelerated Real Time Analytics (CARTA) to explore the realm of streaming applications of analytics. This center will be led by University of Maryland, Baltimore County with partners from NCSU, Rutgers, and other affiliated universities. The proposed center will work with next generation hardware technologies, like the IBM Minsky with on board GPU accelerated processors and Flash RAM, and Smart Cyber Physical Sensor Systems to build Cognitive Analytics systems and Active storage devices for real time analytics. This will lead to the automated ingestion and simultaneous analytics of Big Datasets generated in various domains including Cyberspace, Healthcare, Internet of Things (IoT) and the Scientific arena, and the creation of self-learning, self-correcting “smart” systems. At the core of these technologies are the techniques of data wrangling that enable this end-to-end real-time data processing and the infrastructure of the next generation of high-performance analytics systems.

I/UCRC: Site application to join I/UCRC known as CHMPR
Rada Chirkova

$298,533 by National Science Foundation
09/ 1/2016 - 08/31/2019

The objective of this proposal is to indicate that North Carolina State University (NCSU) will join, as a site, the Center of Hybrid Multicore Productivity Research (CHMPR) in Year 2 of its Phase II I/UCRC renewal. The focus of NCSU within the center will be the science of technologies for end-to-end enablement of data. The research at NCSU complements well the work being done at the other I/UCRC centers and at the other sites of the CHMPR center. NCSU has had extensive positive experience with I/UCRC centers over the years, and is very comfortable with the model.

Membership in Center of Hybrid Multicore Productivity Research (CHMPR) - Full Member
Rada Chirkova

$80,000 by Merck & Company
01/ 1/2017 - 12/31/2018

Exploring and analyzing the available data is key to making the right decisions. It is well known that “data wrangling,” which includes many kinds of end-to-end data enablement, makes up 60-80% of the total effort in analytics on large-scale data. We look to address the challenge of maximizing the usefulness of the available data, by providing tools, science, and talent for next-generation technologies and infrastructure. We focus on empowering organizations that wish to unlock the value of decisions based on their data, and envision a future where technologies and tools for data enablement provide significant business advantages to such organizations. At NCSU, we will lead national and international efforts in this space, by developing and providing technologies and tools for bridging the time gap between the acquisition of data and real-time and long-term decision making.

Membership in Center of Hybrid Multicore Productivity Research (CHMPR) - Full Member
Rada Chirkova

$80,000 by SAS Institute, Inc
01/ 1/2017 - 12/31/2018

Exploring and analyzing the available data is key to making the right decisions. It is well known that “data wrangling,” which includes many kinds of end-to-end data enablement, makes up 60-80% of the total effort in analytics on large-scale data. We look to address the challenge of maximizing the usefulness of the available data, by providing tools, science, and talent for next-generation technologies and infrastructure. We focus on empowering organizations that wish to unlock the value of decisions based on their data, and envision a future where technologies and tools for data enablement provide significant business advantages to such organizations. At NCSU, we will lead national and international efforts in this space, by developing and providing technologies and tools for bridging the time gap between the acquisition of data and real-time and long-term decision making.

Development of a Nearly Autonomous Management and Control System for Advanced Reactors
Nam Dinh ; Maria Avramova ; Abhinav Gupta ; Min Chi

$2,686,834 by US Dept. of Energy (DOE) - Advanced Research Projects Agency - Energy (ARPA-E)
10/ 1/2018 - 03/31/2021

The proposed project seeks to establish a technical basis for, and preliminary development of, a Nearly Autonomous Management and Control (NAMAC) system in advanced reactors. The system is intended to provide recommendations to operators during all modes of plant operation except shutdown operations: plant evolutions ranging from normal operation to accident management. These recommendations are to be derived within a modern, artificial-intelligence (AI) guided system, making use of continuous extensive monitoring of plant status, knowledge of current component status, and plant parameter trends; the system will continuously predict near-term evolution of the plant state, and recommend a course of action to plant personnel.

CSR: Medium: Collaborative Research: Holistic, Cross-Site, Hybrid System Anomaly Debugging for Large Scale Hosting Infrastructures
Xiaohui (Helen) Gu

$518,000 by National Science Foundation
08/ 1/2015 - 07/31/2019

Hosting infrastructures provide users with cost-effective computing solutions by obviating the need for users to maintain complex computing infrastructures themselves. Unfortunately, due to their inherent complexity and sharing nature, hosting infrastructures are prone to various system anomalies caused by various external or internal faults. The goal of this project is to investigate a holistic, cross-site, hybrid system anomaly debugging framework that intelligently integrates production-site black-box diagnosis and developer-site white-box analysis into a more powerful hosting infrastructure anomaly debugging system.

Identification of Translational Hormone-Response Gene Networks and Cis-Regulatory Elements
Steffen Heber (co-PI) ; Jose Alonso (Lead PI-CALS) ; Anna Stepanova (CALS) ; Cranos Williams (ECE)

$897,637 by National Science Foundation
08/ 1/2015 - 07/31/2020

Plants, as sessile organisms, need to constantly adjust their intrinsic growth and developmental programs to the environmental conditions. These environmentally triggered “adjustments” often involve changes in the developmentally predefined patterns of one or more hormone activities. In turn, these hormonal changes result in alterations at the gene expression level and the concurrent alterations of the cellular activities. In general, these hormone-mediated regulatory functions are achieved, at least in part, by modulating the transcriptional activity of hundreds of genes. The study of these transcriptional regulatory networks not only provides a conceptual framework to understand the fundamental biology behind these hormone-mediated processes, but also the molecular tools needed to accelerate the progress of modern agriculture. Although often overlooked, understanding of the translational regulatory networks behind complex biological processes has the potential to empower similar advances in both basic and applied plant biology arenas. By taking advantage of the recently developed ribosome footprinting technology, genome-wide changes in translation activity in response to ethylene were quantified at codon resolution, and new translational regulatory elements have been identified in Arabidopsis. Importantly, the detailed characterization of one of the regulatory elements identified indicates that this regulation is NOT miRNA dependent, and that the identified regulatory element is also responsive to the plant hormone auxin, suggesting a role in the interaction between these two plant hormones. These findings not only confirm the basic biological importance of translational regulation and its potential as a signal integration mechanism, but also open new avenues to identifying, characterizing and utilizing additional regulatory modules in plant species of economic importance.
Towards that general goal, a plant-optimized ribosome footprinting methodology will be deployed to examine the translation landscape of two plant species, tomato and Arabidopsis, in response to two plant hormones, ethylene and auxin. A time-course experiment will be performed to maximize the detection sensitivity (strong vs. weak) and diversity (early vs. late activation) of additional translational regulatory elements. The large amount and dynamic nature of the generated data will also be utilized to generate hierarchical transcriptional and translational interaction networks between these two hormones and to explore the possible use of these types of diverse information to identify key regulatory nodes. Finally, the comparison between two plant species will provide critical information on the conservation of the regulatory elements identified and, thus, inform research on future practical applications. Intellectual merit: The identification and characterization of signal integration hubs and cis-regulatory elements of translation will allow us not only to better understand how information from different origins (environment and developmental programs) is integrated, but also to devise new strategies to control this flow for the advance of agriculture. Broader Impacts: A new outreach program will promote interest among middle and high school students in combining biology, computers, and engineering. To implement this program, we will use our current NSF-supported Plants4kids platform (ref) with web-based bilingual dissemination tools and monthly demos at the science museum and local schools. Examples of demonstration modules will include comparison between simple electronic and genetic circuits.

Collaborative Research: Transforming Computer Science Education Research Through Use of Appropriate Empirical Research Methods: Mentoring and Tutorials
Sarah Heckman

$406,557 by National Science Foundation
09/ 1/2015 - 08/31/2020

The computer science education (CSEd) research community consists of a large group of passionate CS educators who often contribute to other disciplines of CS research. There has been a trend in other disciplines toward more rigorous and empirical evaluation of various hypotheses. However, many of the practices that we apply to demonstrate rigor in our disciplinary research are ignored or actively avoided when performing research in CSEd. This suggests that CSEd is “theory scarce” because most publications are not research and do not provide the evidence or replication required for meta-analysis and theory building. An increase in empiricism in CSEd research will move the field from “scholarly teaching” to the “scholarship of teaching and learning” (SoTL), providing the foundation for meta-analysis and the generation of theories about teaching and learning in computer science. We propose the creation of training workshops and tutorials to educate the educators about appropriate research design and evaluation of educational interventions. The creation of laboratory packages, “research-in-a-box,” will support sound evaluation and replication leading to meta-analysis and theory building in the CSEd community.

SaTC: CORE: Medium: Collaborative: Taming Web Content Through Automated Reduction in Browser Functionality
Alexandros Kapravelos

$406,609 by National Science Foundation
09/ 1/2017 - 08/31/2021

The browser is constantly evolving to meet the demands of Web applications. Although this evolution supports the innovation that we see on the internet, there are security implications that we need to consider, such as attacks against the browser that leverage bugs arising from rapid development. In this project, we plan to examine how certain web applications work and associate their behavior directly with the corresponding browser functionality. Our goal is to be able to characterize what functionality is needed from the browser when rendering a page and certain components. By building such a system we will be able to identify, for example, what is needed from the browser to render a web advertisement. To better protect internet users, we are going to leverage that information so that we can identify when web applications diverge from their expected behavior and attack the users' browsers. We will use this information to limit the functionality exposed to web applications and thereby eliminate multiple classes of attacks, such as browser fingerprinting and drive-by downloads.

XS-Shredder: A Cross-Layer Framework for Removing Code Bloat in Web Applications
Alexandros Kapravelos

$300,000 by Arizona State University via Office of Naval Research
07/ 1/2017 - 06/30/2019

Modern web applications are incredibly complex pieces of software, with frameworks and libraries that help web developers write their applications quickly. However, these frameworks and libraries increase the attack surface of the web application. In this proposal, we present the design of a framework, called XS-Shredder, which is able to debloat all layers of the web application software stack: client-side code, server-side code, database, and operating system. This framework will perform inter- and intra-layer analysis, ultimately resulting in a web application that is semantically identical, yet with a significantly reduced attack surface.

Collaborative Research: Big Data from Small Groups: Learning Analytics and Adaptive Support in Game-based Collaborative Learning
James Lester

$1,249,611 by National Science Foundation
10/ 1/2016 - 09/30/2021

The proposed project focuses on integrating models of game-based and problem-based learning in a computer-supported collaborative learning environment (CSCL). As groups of students solve problems in these environments, their actions generate rich and dynamic streams of fine-grained multi-channel data that can be instrumented for investigating students' learning processes and outcomes. Using the big data generated by small groups, we will leverage learning analytics to provide adaptive support for collaboration that will allow these models to be used at larger scales in real classrooms. The project will study CSCL in the context of an environmental-science-based digital game that will employ specific strategies to support the problem-based learning goals of helping students construct explanations, reason effectively, and become self-directed learners. In problem-based learning, students are active, intentional learners who collaboratively negotiate meaning. The project will embed models induced using learning analytic techniques inside of a digital game environment to enable students to cultivate collaborative learning competencies that translate to non-digital classroom settings.

Learning Environments Across Disciplines LEADS: Supporting Technology Rich Learning Across Disciplines: Affect Generation and Regulation During Co-Regulated Learning in Game-Based Learning Environments (Supplement)
James Lester

$114,672 by McGill University/Social Sciences and Humanities Research Council of Canada
04/ 1/2012 - 02/28/2020

Contemporary research on multi-agent learning environments has focused on self-regulated learning (SRL) while relatively little effort has been made to use co-regulated learning as a guiding theoretical framework (Hadwin et al., 2011). This oversight needs to be addressed given the complex role that self- and other-regulatory processes play when human learners and artificial pedagogical agents (APAs) interact to support learners' internalization of cognitive, affective, and metacognitive (CAM) SRL processes. We will use the Crystal Island learning environment to investigate these issues.

REFLECT: Improving Science Problem Solving with Adaptive Game-Based Reflection Tools
James Lester ; Roger Azevedo (Psychology)

$1,300,000 by National Science Foundation
04/15/2017 - 03/31/2020

Reflection has long been recognized as a central component of effective learning. With the overarching goal of improving middle school students' science problem solving and learning outcomes, the REFLECT project has the objective of investigating a suite of theoretically grounded, adaptive game-based reflection tools to scaffold students' cognitive and metacognitive processes. The project will center on the design, development, and investigation of game-based learning tools for science education that adaptively scaffold students’ reflection through both embedded and retrospective support. It will culminate in a classroom experiment to study the impact of the adaptive reflection tools on both problem solving and learning. The results from this project will contribute significantly to theoretical and computational models of reflection, and produce both design principles and learning technologies that support the creation of effective learning environments.

ENGAGE: A Game-Based Curricular Strategy for Infusing Computational Thinking into Middle School Science.
James Lester ; Brad Mott ; Eric Wiebe (Friday Institute)

$2,498,862 by National Science Foundation
08/15/2016 - 07/31/2019

Recent years have seen a growing recognition that computer science is vital for scientific inquiry. The middle school grade band is critical for shaping students’ aspirations and skills, and many issues relating to workforce underproduction and underrepresentation of diverse students in computer science can be traced back to middle school. To address this problem, the project will deeply integrate computer science into middle school science classrooms. Centered on a game-based learning environment that features collaborative learning, the project will have a specific focus on addressing gender issues in middle school computer science education with the goal of creating learning interactions that are both effective and engaging for all students.

Collaborative Research: PRIME: Engaging STEM Undergraduate Students in Computer Science with Intelligent Tutoring Systems
James Lester ; Bradford Mott ; Eric Wiebe (Friday Instit

$1,499,828 by National Science Foundation
09/ 1/2016 - 08/31/2020

Significant advances in intelligent tutoring systems have paved the way for engaging STEM undergraduates in computer science. This research has spawned a new generation of personalized learning environments that offer significant promise for providing students with adaptive learning experiences that are crafted to their individual needs. Spurred by this significant promise and building on a research infrastructure developed by the project team in a series of NSF-supported projects, the PRIME project will transform introductory computer science education with state-of-the-art intelligent tutoring systems technologies.

Guiding Understanding via Information from Digital Environments (GUIDE)
James Lester Co-PI ; Eric Wiebe Lead PI

$1,238,549 by Concord Consortium via National Science Foundation
09/15/2015 - 08/31/2019

This project will utilize research and development groups at the Concord Consortium and NC State University. Educational software for teaching high school multi-level genetics developed by the Concord Consortium will be enhanced by intelligent agents and machine-based tutoring system technologies developed at NC State to help enhance the learning experience for students. These groups will collaborate closely to develop and research a hybrid system that combines technological intervention and teacher pedagogical expertise to illuminate and guide student learning in deeply digital curricula and classrooms.

Health Quest: Engaging Adolescents in Health Careers with Technology-Rich Personalized Learning
James Lester, II

$1,301,820 by National Institutes of Health (NIH)
08/ 1/2017 - 07/31/2022

Leveraging intelligent game-based learning technologies, the Health Quest project focuses on developing and disseminating technology-rich resources to broaden the interests of adolescents in biomedical, behavioral and clinical research careers. The project centers on the development of technology-rich learning resources. These include a game-based learning environment featuring health careers as well as an online community that includes a speaker series featuring a broad range of health professionals. The final year of the project will see a full evaluation of the Health Quest program and its impact on students’ interest in biomedical, behavioral and clinical research careers.

Supporting Student Planning with Open Learner Models in Middle Grades Science
James Lester, II

$1,499,183 by National Science Foundation (NSF)
08/15/2018 - 07/31/2021

The ability to plan is a key element of learning. With the objective of improving middle school students' science learning, the project will investigate open learner models to scaffold student planning. The project will see the design, development, and investigation of an open learner model for student goal setting and planning. In contrast to the "classic" student models of intelligent tutoring systems, which are opaque, open learner models are inspectable: they enable students to inspect a learning environment's representation of their knowledge and competencies. Using the Future Worlds learning environment, the project will feature classroom studies that will investigate the impact of open learner models on both problem solving and learning in middle grades science.

Collaborative Research: FW-HTF: Augmented Cognition for Teaching: Transforming Teacher Work with Intelligent Cognitive Assistants
James Lester, II ; Bradford Mott

$1,499,736 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2022

Effective teaching is the cornerstone of K-12 education. However, effective teaching occurs in complex workplaces that require teachers to cope with the real-time demands of providing effective learning experiences for large classrooms of students by skillfully bringing to bear their expertise in pedagogy and classroom management. Although there is enormous potential for enhancing teaching with technology-rich support that leverages artificial intelligence (AI), limited work has been done to investigate how emerging AI technologies can bring about fundamental improvements to the teaching profession. With recent advances in AI technologies for natural language processing, machine learning, and user-adaptive support, the time is ripe for transforming the professional lives of teachers. The objective of the proposed research is to design, develop, and evaluate the Intelligent-Augmented Cognition for Teaching (I-ACT) framework featuring intelligent cognitive assistants for K-12 teachers. A unique feature of I-ACT afforded by recent advances in machine learning will be its ability to optimize teacher support for collaborative learning at the individual student, group, and classroom levels.

DeepGen: Dynamic Generation of Training Simulation Scenarios with Deep Reinforcement Learning (W911NF-17-S-0003: CCE-HS-3: Training)
James Lester, II ; Bradford Mott ; Jonathan Rowe

$1,398,004 by US Army - Army Research Laboratory
12/ 8/2017 - 12/ 7/2021

Automated scenario generation offers considerable promise for addressing the needs of simulation-based training. Given recent advances in machine learning, including artificial neural networks and deep learning, data-driven approaches to automatically generating scenarios that are customized to the cognitive and affective characteristics of individual learners hold great potential. This project will investigate a critical research question for adaptive simulation-based training: how can we devise generalizable, data-driven scenario generation models that dynamically adapt training events to achieve target learning objectives in simulation-based virtual training environments? To answer this question, the project will investigate a data-driven framework for dynamic scenario generation that formalizes the task as a deep reinforcement learning problem. We will demonstrate the generalizability of the approach by investigating its implementation in multiple distinct simulation-based virtual training environments.

Investigating Emergency Response Performance with VR-Based Intelligent User Interfaces
James Lester, II ; Bradford Mott ; Randall Spain

$1,112,175 by National Institute of Standards & Technology
06/ 1/2018 - 05/31/2020

First responders are seeing a significant increase in the amount and types of data available when responding to emergencies. To maximize the value of these data, user interfaces need to be designed that provide first responders with critical real-time information. Intelligent user interface design, in which the data and information presented to the user are adapted and tailored to the needs of individual users based on analytic information (e.g., expertise, task state, location), offers significant potential for improving performance, reducing mental workload, and facilitating effective decision-making. This project builds on a decade of research by the project team in developing intelligent game-based virtual learning environments. The goal of the project is to develop a virtual reality emergency response scenario that will serve as a test bed for evaluating the impact of intelligent user interfaces on first responder performance. In addition, the project will investigate the impact of providing adaptive support on task proficiency and whether alternative interaction methods (gesture-based vs. voice-based) reduce cognitive load and improve system interaction.

Multimodal Visitor Analytics: Investigating Naturalistic Engagement with Interactive Tabletop Science Exhibits
James Lester, II ; Jonathan Rowe ; James Minogue

$1,951,956 by National Science Foundation (NSF)
03/ 1/2018 - 02/28/2021

Recent advances in multimodal learning analytics present new opportunities for investigating learning and engagement in informal education settings. In this project, we will investigate visitors’ learning experiences in science museums using multimodal visitor analytics, which marry the rich multi-channel data streams produced by fully-instrumented exhibit spaces and the data-driven modeling functionalities afforded by recent advances in machine learning. The project will leverage Future Worlds, a fully-instrumented prototype digital interactive exhibit about sustainability, which was developed and piloted by the project team in a previously funded NSF Informal Science Education proof-of-concept project. The research team will conduct a series of museum studies to investigate how learners interact with Future Worlds and other exhibits in a science museum, and will utilize learning analytic techniques to model visitors’ cognitive, affective, and behavioral components of learning and engagement. The project will produce a detailed empirical account of visitors’ learning experiences in a science museum, as well as an open-source software platform for conducting multimodal visitor analytics, which will help other informal education researchers utilize learning analytics with their own datasets.

Developing Integrated Teaching Platforms to Enhance Blended Learning in STEM
Collin Lynch ; Tiffany Barnes ; Sarah Heckman

$599,992 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2021

Modern computer science classrooms support student learning by integrating a large number of educational tools and platforms, from forums to intelligent tutors and automated tests. This proliferation of tools can overwhelm students who have difficulty navigating across platforms or connecting the information between platforms. We propose to develop an open platform for student and instructor guidance that supports data integration and analysis. The system will be designed to synthesize student interaction data from a rich set of educational tools. The integrated data will provide the basis for automated analysis of students' work and work habits, allowing for automated and instructor-driven interventions to support student learning. Additionally, the integrated system will provide pedagogical guidance on how students can best use classroom learning tools not individually, but in concert.

Collaborative Research: Fostering Collaborative Computer Science Learning with Intelligent Virtual Companions for Upper Elementary Students
Collin Lynch (co-PI) ; Eric Wiebe

$1,399,088 by National Science Foundation
08/15/2017 - 07/31/2021

The University of Florida and North Carolina State University jointly propose FLECKS, a Design and Development proposal for the NSF's Discovery Research PreK-12 (DRK-12) program. FLECKS (Friendly Learning Environment for Kids' Computer Science) addresses the pressing need for the development of fundamental computer science competencies in upper elementary-school children. The goal of the proposed project is to design, develop, and investigate FLECKS, an intelligent learning environment to teach collaborative computer science problem solving. Collaboration is a central academic and professional practice in computational thinking, yet it presents many challenges for elementary school students. Students often struggle to collaborate successfully due to individual differences in academic status; gender; cultural background; personality; attitudes toward collaboration; or attitudes toward learning. In order to address these challenges, FLECKS will provide dyads of students with a rich, scaffolded environment where they use an interactive online coding environment to engage in computer science challenges related to their STEM subject areas. Central to the innovation is the way in which the dyads are supported. FLECKS are animated virtual characters that take a rich set of multimodal features as input, and then adapt to students’ patterns of collaboration, including who has control of the keyboard and mouse; who speaks when; and the problem-solving actions the students take in the online environment.

CRII: SHF: Supporting Domain-Specific Inquiry with Rule-Based Modeling
Christopher Martens

$161,142 by National Science Foundation (NSF)
03/ 1/2018 - 02/29/2020

An increasingly common method for communicating and critiquing the emergent behavior of complex systems is interactive simulation, which can teach interactors about the way a system works by revealing system-level properties like feedback loops and tension between objectives. The Ceptre programming language provides a way to author interactive simulations in a rule-based way, amenable to both intuitive understanding and analysis. We propose to expand Ceptre’s audience by implementing a user interface that enforces syntax-level and type-level correctness of programs, which can be run and deployed in the browser for rapid prototyping.

EAGER: Empirical Software Engineering for Computational Science
Timothy Menzies

$124,628 by National Science Foundation (NSF)
05/ 1/2018 - 05/31/2019

If we could improve the methods of Computational Science, then this would also improve our understanding and utilization of the physical processes studied by Computational Scientists (such as molecular dynamics, quantum chemistry, and computational materials). Much of the work in Computational Science is related to the software that implements it. Hence, in this work, we apply state-of-the-art empirical software engineering methods to Computational Science.

SHF:Medium:Scalable Holistic Autotuning for Software Analytics
Timothy Menzies ; Xipeng Shen

$898,349 by National Science Foundation
07/ 1/2017 - 06/30/2021

This research proposes to advance the state of the art to holistic scalable autotuners, which tune all levels of options for multiple optimization objectives at the same time. It will achieve this ambitious goal through the development of a set of novel techniques that efficiently handle the tremendous tuning space. These techniques take advantage of the synergies between all those options and goals by exploiting relevancy filtering (to quickly dispose of unhelpful options), locality of inference (which enables faster updates to outdated tunings), and redundancy reduction (which reduces the search space for better tunings). This new autotuner will be a faster method for finding better tunings that satisfy more goals. To test this claim, this research will assess whether this new tool can reduce the total computational resources required for effective SE data analytics by orders of magnitude.

Large-Scale Automatic Analysis of the OAI Magnetic Resonance Image Dataset
Frank Mueller

$331,603 by UNC - UNC Chapel Hill
08/15/2017 - 07/31/2022

The goal of this proposal is to optimize and to openly provide to the OA community a new technology to rapidly and automatically measure cartilage thickness, appearance and changes on magnetic resonance images (MRI) of the knee for huge image databases. This will allow assessment of trajectories of cartilage loss over time and associations with clinical outcomes on an unprecedented scale; future work will focus on incorporating additional disease markers, ranging from MRI-derived biomarkers for bone and synovial lesions, to biochemical biomarkers, to genetic information.

SaTC: CORE: Small: Enhanced Security and Reliability for Embedded Control Systems
Frank Mueller

$500,000 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2021

CPS and IoT devices are inherently networked, which exposes them to malware attacks. We propose to significantly increase the cyber security specifically of CPS and IoT computing devices by developing real-time monitoring techniques that defeat cyber-attacks.

Software Support for Heterogeneous Memories in HPC
Frank Mueller

$315,429 by TRIAD National Security, LLC Formerly Los Alamos National Laboratory (LANL)
11/ 9/2018 - 09/30/2021

This project proposes to explore different software support solutions for heterogeneous memory architectures in future supercomputers. Its primary aim will be to seamlessly integrate the new memory technologies into the existing architecture support, taking full advantage of their characteristics for existing HPC applications and making the programmer's job easier when developing new applications for future systems.

HPC Power Modeling and Active Control
Frank Mueller

$386,290 by Lawrence Livermore National Laboratory via US Department of Energy
10/25/2016 - 09/30/2019

As we approach the exascale era, power has become a primary bottleneck. The US Department of Energy has set a power constraint of 20 MW on each exascale machine. To achieve one exaflop within 20 MW, it is necessary that we use power intelligently to maximize performance under a power constraint. In this work, we propose to alleviate the shortcomings of current HPC systems in addressing power constraints by (1) power-aware machine partitioning, (2) power-constrained job scheduling, (3) systematic provisioning and procurement of hardware under a power cap, (4) modeling of network, deep memories, and storage, as well as (5) investigating the inter-dependence between power and cooling.

Auto-Tuned Per-Loop Compilation (Phase 2)
Frank Mueller

$21,202 by Lawrence Livermore National Laboratory
10/ 4/2018 - 08/30/2019

HPC applications require careful tuning to achieve close to peak performance on cutting-edge hardware platforms. This work hypothesizes that traditional per-module optimizations fall short of fully exploiting a compiler's capabilities, even when interprocedural optimizations complement local and global ones. This project proposes to investigate the viability of separately compiling major loops in an auto-tuning effort. Such an ensemble of loop units, when linked together, has the potential to improve not only single-loop but also overall application performance, thereby edging closer to peak performance for a given platform.

Auto-Tuned Per-Loop Compilation
Frank Mueller

$50,000 by Lawrence Livermore National Laboratory
01/24/2018 - 01/31/2019

HPC applications require careful tuning to achieve close to peak performance on cutting-edge hardware platforms. This work hypothesizes that traditional per-module optimizations fall short of fully exploiting a compiler's capabilities, even when interprocedural optimizations complement local and global ones. This project proposes to investigate the viability of separately compiling major loops in an auto-tuning effort. Such an ensemble of loop units, when linked together, has the potential to improve not only single-loop but also overall application performance, thereby edging closer to peak performance for a given platform.

SHF: Small: Retrospective and Prospective Studies of the Effects of Gender Bias in Software Engineering
Emerson Murphy-Hill

$498,461 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2021

Gender bias in the workplace has been documented in a variety of studies, and recent work in software engineering has likewise revealed how bias affects the tech industry. This project seeks to better understand the causes and effects of gender bias in software engineering by studying the people who practice it and the artifacts they produce.

SHF:Small: Enabling Scalable and Expressive Program Analysis Notifications
Emerson Murphy-Hill ; Sarah Heckman

$265,853 by National Science Foundation (NSF)
08/15/2017 - 07/31/2020

Previous research shows that existing notifications produced by integrated development environments are poorly understood and overwhelming to programmers. We propose building on our prior work to create a new architecture for program analysis notifications that enables toolsmiths to create scalable and understandable notifications for a variety of program analysis tools.

CSR:Medium:SmartChainDB - Enabling Smart Marketplaces With A Scalable Semantically-Enhanced Blockchain Platform
Kemafor Ogan ; Alessandra Scafuro ; Binil Starly

$499,773 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2021

This project seeks to develop a platform, SmartChainDB, for supporting Smart Marketplaces in trustless environments. Such marketplaces should enable efficient assessment of bids in response to service requests without a priori trust establishment between parties. In domains like Digital Manufacturing, job bid assessments are very time-consuming efforts that take on the order of months. The platform will be developed by extending a BlockChain database for managing trust with the transaction types necessary to support a protocol for service requests and response bids. Another key extension will be semantic enablement of the BlockChain database. A proof-of-concept prototype in Smart Manufacturing will be developed using SmartChainDB.

SHF: SMALL: DockerizeME: Automatic Inference and Repair of Computing Environments
Christopher Parnin

$345,875 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2021

Data scientists perform analysis that society increasingly relies on. As data and analysis grow in complexity, so do the computing environments required to run that analysis. A wide variety of tools have been created to help data scientists do their jobs, yet the computing environments used for these analyses are often difficult to reproduce and vary significantly from tool to tool. As a result, data scientists may waste time trying to share, reproduce, and scale these computing environments, instead of making their analysis more thorough and reliable. In this project, we describe how we can infer configurations from existing code snippets and computing environments and automatically create application containers for running and scaling the computations. Further, based on our infrastructure, we can provide the ability to repair code and code environments in order to perform automated maintenance of computing environments.

CRII: SHF: Building Visibility into the Cognitive Processes of Software Engineers via Biosensors
Christopher Parnin

$159,662 by National Science Foundation (NSF)
02/ 1/2018 - 01/31/2020

Despite its vast capacity and associative powers, the human brain does not deal well with interruptions. Particularly in situations where information density is high, such as during a programming task, recovering from an interruption requires extensive time and effort. Although researchers recognize this problem, no programming tool takes into account the brain’s structure and limitations in its design. In this project, we measure the cognitive load of programmers during different programming tasks. To measure cognitive load, we collect both biometrics and metrics from sensors and brain imaging devices. We apply our measures to applications in the software engineering domain: 1) measuring cognitive load during technical interviews, and 2) correlating complexity measures of code with higher measures of cognitive load.

REU Site: Science of Software
Christopher Parnin ; Emerson Murphy-Hill ; Sarah Heckman

$355,365 by National Science Foundation
01/ 1/2016 - 01/31/2019

There are not enough skilled data science researchers, particularly in software engineering. Hence, this REU Site in Science of Software (SOS) will engage undergraduates as data scientists studying exciting and meaningful SE research problems. Students work side-by-side with faculty mentors to gain experience in qualitative and quantitative research methods in SOS. Activities include dataset challenges, pair research, literature reviews, and presentations. Ultimately, each student works independently toward a published research result with their faculty mentors.

Wearable Physiological Monitors for Measuring Canine Stimulus Responses
David Roberts ; Alper Bozkurt ; Margaret Gruen

$191,859 by Oak Ridge National Laboratories - UT-Battelle LLC
07/ 1/2018 - 12/31/2018

Investigators at Oak Ridge National Lab (ORNL) are interested in measuring physiological characteristics of dogs as they are exposed to environmental stimuli. We will develop 1) custom wearable sensing units capable of aggregating measurements from working dogs in controlled environments during experiments, 2) corpora of data from experiments conducted by ORNL in a format amenable to statistical analysis by ORNL, and 3) software to enable ORNL to extract and process the data describing those physiological measurements automatically. We will provide input into experimental design to facilitate effective use of sensing devices for analyzing physiological responses.

Graduate Industrial Traineeship-Genworth-Risk Modeling Intern-ML Focus
George Rouskas

$32,692 by Genworth Mortgage Insurance Corporation
08/22/2018 - 12/31/2018

NCSU through the Genworth GA will provide research and analysis to Genworth as set forth in this Agreement. Such research and analysis shall include, but is not limited to, research, generation, testing, and documentation of risk modeling software. Genworth GA will provide such services for Genworth's offices in Raleigh, North Carolina, at such times as have been mutually agreed upon by the parties.

Consortium for Nonproliferation Enabling Capabilities
Nagiza Samatova, co-PI ; Robin Gardner (Nuclear Eng

$9,744,249 by US Department of Energy
07/31/2014 - 07/30/2019

NC State University, in partnership with University of Michigan, Purdue University, University of Illinois at Urbana Champaign, Kansas State University, Georgia Institute of Technology, NC A&T State University, Los Alamos National Lab, Oak Ridge National Lab, and Pacific Northwest National Lab, proposes to establish a Consortium for Nonproliferation Enabling Capabilities (CNEC). The vision of CNEC is to be a pre-eminent research and education hub dedicated to the development of enabling technologies and technical talent for meeting the grand challenges of nuclear nonproliferation in the next decade. CNEC research activities are divided into four thrust areas: 1) Signatures and Observables (S&O); 2) Simulation, Analysis, and Modeling (SAM); 3) Multi-source Data Fusion and Analytic Techniques (DFAT); and 4) Replacements for Potentially Dangerous Industrial and Medical Radiological Sources (RDRS). The goals are: 1) Identify and directly exploit signatures and observables (S&O) associated with special nuclear material (SNM) production, storage, and movement; 2) Develop simulation, analysis, and modeling (SAM) methods to identify and characterize SNM and facilities processing SNM; 3) Apply multi-source data fusion and analytic techniques to detect nuclear proliferation activities; and 4) Develop viable replacements for potentially dangerous existing industrial and medical radiological sources. In addition to research and development activities, CNEC will implement educational activities with the goal of developing a pool of future nuclear nonproliferation and other nuclear security professionals and researchers.

SaTC: CORE: Small: Collaborative: A Broad Treatment of Privacy in Blockchains
Alessandra Scafuro

$249,922 by National Science Foundation (NSF)
09/ 1/2017 - 08/31/2020

A blockchain is a public, distributed, append-only database whose consistency is maintained by the combined work of users across the world rather than a single party, thus avoiding single points of failure and trust. The public nature of the blockchain, however, raises important privacy concerns. Existing work has partially addressed privacy concerns for the restricted case of blockchains used for financial transactions. As blockchains are set to be used in a variety of contexts, the proposed work will initiate a broad treatment of privacy definitions and provide constructions achieving new privacy goals that can be implemented across different blockchains.

Analysis of Hash-Based Signatures Schemes in the QROM
Alessandra Scafuro

$96,752 by Silicon Valley Community Foundation
12/ 1/2017 - 12/31/2018

Digital signatures are fundamental cryptographic building blocks that guarantee the authenticity and integrity of digital communications. Currently adopted signature schemes (such as ECDSA) leverage number-theoretic properties, and their security relies on the intractability of solving number-theoretic problems such as integer factorization and the discrete logarithm problem. Shor's algorithm, however, shows that such problems are efficiently solvable on quantum computers. The threat of quantum attacks triggered NIST's competition for the development and standardization of signature schemes that are secure in the presence of quantum adversaries. There exist several candidates for this competition, and the Cisco research team is participating with the LMS scheme. The advantage of the LMS scheme over the other candidates is its simplicity. The goal of the proposed research is to develop a formal treatment of the post-quantum security of the LMS scheme.

NeTS: Small: Fine-grained Measurement of Performance Metrics in the Internet of Things
Muhammad Shahzad

$449,999 by National Science Foundation
10/ 1/2016 - 09/30/2019

The PI proposes to develop a framework for passive and fine-grained measurements of performance metrics in the Internet of Things, which include both Quality of Service metrics such as latency, loss, and throughput, and Resource Utilization metrics such as power consumption, storage utilization, and radio-on time. Measurements of these performance metrics can be used reactively by network operators to perform tasks such as detecting and localizing offending flows that are responsible for causing delay bursts, throughput deterioration, or even power surges. These measurements can also be used proactively by network operators to locate and preemptively update any potential bottlenecks.

CSR:Small:Collaborative Research: Scalable Fine-Grained Cloud Monitoring for Empowering IoT
Muhammad Shahzad

$257,996 by National Science Foundation
09/15/2016 - 08/31/2019

Due to the rapid adoption of the cloud computing model, the size of data centers and the variety of cloud services are increasing at an unprecedented rate. As a result, fine-grained monitoring of the health and usage of data center resources is becoming increasingly important and challenging. In this work, we address the problem of efficiently acquiring and transporting cloud management and monitoring data. For data acquisition, we address the crucial challenge of controlling data size. For data transportation, we focus on efficiently moving the data from the point where it is collected inside the data center to the point where it needs to be stored for processing.

XPlacer: Extensible and Portable Optimization of Data Placement in Memory
Xipeng Shen

$235,398 by Lawrence Livermore National Laboratory
01/29/2018 - 09/30/2020

Modern supercomputers with heterogeneous components (e.g., GPUs) feature complex memory systems to meet processors' ever-growing demands for data. Putting data into the proper part of a memory system is essential for program performance, but is difficult to do. To address this challenge, we propose a new paradigm featuring three interacting components: 1) an extensible memory specification language to describe memory properties, 2) a compiler for analyzing data access patterns and transforming code for runtime adaptation, and 3) a data placement runtime to find and materialize the best data placements on the fly. The result will be a software framework (named XPlacer) that transforms OpenMP code to automatically place its data in memory in the way that best suits the GPU architecture, inputs, and program phases.

CSR:Small:Supporting Position Independence and Reusability of Data on Byte-Addressable Non-Volatile Memory
Xipeng Shen

$499,998 by National Science Foundation (NSF)
08/16/2017 - 07/31/2020

Byte-Addressable Non-Volatile Memory (NVM) is the upcoming next generation of memory with tremendous potential benefits. This proposal is about offering programming system-level support of persistency on NVM. Particularly, it focuses on effective support of the usage of dynamic data structures on NVM.

CCF:SHF:Small: Non-Uniformity-Centric Program Optimizations for Dynamic Computations on Chip Multiprocessors
Xipeng Shen

$404,956 by National Science Foundation
06/16/2014 - 12/31/2018

In this project, Dr. Xipeng Shen and his team are building a new paradigm of program optimizations. It is motivated by a growing gap between trends in processor development and the needs of modern data-intensive dynamic applications. This class of applications, ranging from differential equation solvers to data mining tools to particle dynamics simulations, plays an essential role in science and humanity. But these applications feature tremendous numbers of data accesses and complex patterns in data accesses or control flows. These properties make them a great challenge for modern processors, which are evolving in exactly the opposite direction from these applications' needs: a chip's aggregate computing power is rapidly outgrowing memory bandwidth, and the rise of throughput-oriented manycores makes system throughput even more sensitive to irregular computations. The paradigm being built by Dr. Shen's team, namely "non-uniformity-centric optimizations", distinctively takes the non-uniform inter-core relations in modern systems as the first-order constraint for program optimizations. Specifically, they are developing a framework named PipeReg as a new way to reorganize data accesses and threads at run time to reduce the influence of irregular computations on the throughput of massively parallel processors. Meanwhile, they are investigating a novel kind of program transformation called neighborhood-aware transformations, which exploit the non-uniform interactions among threads in on-chip storage (e.g., shared cache) in a multi-socket multicore system. Together, the two techniques will synergistically remove some important barriers that keep data-intensive dynamic applications from tapping into the full power of future computing systems. The outcome of this research will provide essential support for enhancing the computing efficiency of data-intensive dynamic applications in the era of heterogeneous parallel systems.
Because of the critical roles of these applications, this research will help foster sustained advancement in science, commerce, health, and so on. Beyond its technical content, this project stresses technology transfer, develops new teaching materials and tools, emphasizes demographic diversity, and improves the training of both graduate and undergraduate students.

Workshop on Inter-Disciplinary Research Challenges in Computer Systems
Xipeng Shen ; James Tuck

$44,954 by National Science Foundation (NSF)
03/ 1/2018 - 02/28/2019

This proposal requests travel support funds to enable invited participants to defray the costs of traveling to and attending the first ASPLOS Grand Challenges and Synergy Workshop (GCSW) collocated with ASPLOS-23. The PIs will ensure the creation and release of a comprehensive report to capture the discussions at the workshop.

Realizing Cyber Inception: Toward a Science of Personalized Deception for Cyber Defense
Munindar Singh

$375,360 by University of Southern California via US Army Research Office
09/ 1/2017 - 12/31/2018

Frequent security breaches have highlighted both the growing importance of cybersecurity and the weaknesses of traditional methods such as firewalls, malware detection, intrusion detection, and prevention technologies. To leap ahead of attackers, we must move beyond passive defense strategies toward a new science of interactive personalized deception for cyberdefense. Our proposed approach involves (1) building models of attackers and their propensities and (2) characterizing computers, networks, users, and their relationships and interactions so as to enable realistic deception. We will develop a modular framework, consisting of a pluggable game-based scaffolding, for evaluating the key deception techniques.

CAREER: On the Foundations of Semantic Code Search
Kathryn Stolee

$500,000 by National Science Foundation (NSF)
08/ 1/2018 - 07/31/2023

Semantic code search uses behavioral specifications, such as input/output examples, to identify code in a repository that matches the specification. Challenges include handling scenarios when 1) there are too few solutions, 2) it is difficult to understand how solutions differ, and 3) there are too many solutions. I propose techniques to 1) expand the scope of code that can be modeled and find approximate solutions when an exact one does not exist, 2) determine the differences between two code fragments, and 3) navigate a large space of possible solutions by selecting inputs that maximally divide the solution space.

SHF: Small: Supporting Regular Expression Testing, Search, Repair, Comprehension, and Maintenance
Kathryn Stolee

$499,996 by National Science Foundation (NSF)
08/15/2017 - 07/31/2020

Regular expressions (regexes) are responsible for numerous faults in many software products, and yet, static bug finders and automated program repair techniques generally ignore this common language feature. First, I propose to explore and characterize regex-related bugs in bug repositories. From there, I propose to develop approaches for detecting regex-related bugs using static analysis and patching regex-related bugs using automated program repair. The proposed detection and patching techniques both depend on similarity analysis of regexes. The expected research outcomes include a publicly available data set of regex-related faults, new regex-related bug patterns for static bug finders like FindBugs and PMD, and in the best case, an open source tool for automated patch generation for regular expressions.

SHF: Medium: Collaborative Research: Semi and Fully Automated Program Repair and Synthesis via Semantic Code Search
Kathryn Stolee

$387,661 by National Science Foundation
07/ 1/2016 - 06/30/2020

Software plays an integral role in our society. However, software bugs are common, routinely cause security breaches, and cost our economy billions of dollars annually. The software industry struggles to overcome this challenge: Software is so inherently complex, and mistakes so common, that new bugs are typically reported faster than developers can fix them. Recent research has demonstrated the potential of automated program repair techniques to address this challenge. However, these techniques often produce low-quality repairs that break existing functionality. In this research, we develop new techniques to fix bugs and implement new features automatically, producing high-quality code.

Algorithms for Exploiting Approximate Network Structure Research Area 10: Network Science
Vida Blair Sullivan

$538,199 by US Army-Army Research Office
05/15/2017 - 05/14/2020

We propose a new framework for efficient, robust, and noise-tolerant network algorithms that guarantee near-optimal solutions to NP-hard problems by exploiting structure inherent in real-world networks. We model networks as consisting of a majority that belongs to a structural graph class, plus a few deviations resulting from measurement errors, unusual behaviors, and/or unexplained exceptions. We will develop algorithms which exploit this more approximate form of graph structure and guarantee near-optimal solutions and polynomial running time for any network that is "close" to a structural graph class, initially focusing on hierarchical/tree-like networks (e.g., those arising in biology and social behavior).

Moore Foundation Data-Driven Discovery Investigator
Vida Blair Sullivan

$1,500,000 by Gordon and Betty Moore Foundation
11/10/2014 - 12/ 1/2019

Understanding and identifying intermediate-scale structure is key to designing robust tools for data analysis, just as the interdependence of local interactions and global behavior is key in many science domains. We thus focus on constructing a theory and tools for using this structure to improve analysis and identification of relationships in massive graph data. Through careful integration of tools from graph theory, computational complexity, statistics, and parallel algorithm design, the proposed work will derive novel measures of graph similarity based on structural representations and application-inspired features of interest. We will design efficient, scalable sampling algorithms which leverage inherent sparsity and structure to de-noise and improve accuracy of parameter estimation. As a specific example of science domain impact, we focus on improving understanding of the brain. Applying our new tools for characterizing graph-theoretic structure in such networks, scientists will be able to build higher fidelity models of brain network formation and evolution. Additionally, efficient algorithms from the associated parameterized framework will enable rapid comparison of regions and identification of discrepancies, abnormalities, and influential components for specific tasks.

CSR: Small: IOQL: an I/O Interface for Near-Data Processing
Hung-Wei Tseng

$499,515 by National Science Foundation (NSF)
08/15/2018 - 07/31/2021

As datasets grow, the overhead of moving data among different system components becomes the new performance bottleneck. The fundamental solution is to avoid moving data by placing computing resources near data storage or I/O devices. However, existing near-data processing (NDP) approaches are all ad hoc and highly device dependent, which makes it harder to apply NDP to applications.

CRII: CSR: Rethinking the FTL in SSDs -- a file translation layer instead of a flash translation layer
Hung-Wei Tseng

$174,998 by National Science Foundation
03/15/2017 - 02/28/2019

SSDs (solid state drives) have become popular in all kinds of computing systems. However, these systems still rely on the existing block interface to manage SSDs, resulting in multiple layers of indirection, under-utilized parallelism inside the SSD, overheads for in-storage computing, and difficulties in sharing files among heterogeneous computing devices. This work will reshape the current system stack and simplify the software/hardware interface for SSDs by using the SSD to directly map files into physical block addresses on the SSD. This work will demonstrate the effect of the proposed system in applications ranging from file systems to datacenter storage to virtual machines.

Deep Learning on Remote Sensing Imagery
Ranga Vatsavai

$50,000 by Lenovo
02/ 1/2018 - 12/31/2018

Massive amounts of remote sensing data are being collected and archived from satellites and airborne platforms (including drones) on a daily basis. This data supports a wide range of applications of national importance. Examples include crop type mapping, forest mapping, urban neighborhood mapping, assessing damage due to flooding, hailstorms, and forest fires, assessing the impacts of climate change on crops, unusual crop detection (e.g., poppy plantations), tracking changes in biomass, and understanding complex interactions among food, energy, and water. Classification of these high-resolution images requires object-based and arbitrary patch-based classification to capture relevant spatial context. The advent of multiple instance learning and deep learning took the natural image processing community by storm. However, their application to satellite images has been slow due to training data and computational requirements. In this project, we develop deep learning algorithms for classification of satellite images and scale these algorithms on Lenovo/Intel’s new architectures and software infrastructure (e.g., Neon, Caffe, Theano, and MXNet).

Big Data & Machine Learning
Ben Watson

$5,000 by Caterpillar, Inc.
12/15/2017 - 12/31/2018

Analyze the data and video provided by Caterpillar to develop possible measures of user experience such as frustration, engagement, or flow; compare results to the survey results the University collects as part of the funded project. Surveys will ideally be well-established instruments in the public domain, or custom built for the project. If necessary, surveys may be licensed by the University, as long as the licenses do not restrict use of the resulting data. The University will provide Caterpillar with the results of this comparison.

Science of Security Lablet: Impact through Research, Scientific Methods, and Community Development
Laurie Williams ; Munindar Singh

$467,750 by US Dept. of Defense (DOD)
04/ 4/2018 - 09/14/2022

This project proposes the continuation of the Science of Security Lablet at NC State University. Science of Security refers to the study of cybersecurity from an explicitly scientific perspective. Cybersecurity encompasses elements of technology, human behavior, and policy. Science of Security seeks to identify and apply the appropriate scientific principles to cybersecurity problems, enhancing rigor and reproducibility, thereby improving the transfer of research to practice. This Lablet provides a home for investigations into diverse topics pertaining to a Science of Security. The Lablet will support the three major elements of a Science of Security: research, scientific methods, and community engagement.