Department of Computer Science at North Carolina State University

Research Projects 2020 (by faculty)

The funded projects listed below were active during the 2020 calendar year; the running funded total for that year appears in the left navigation menu.

Analysis of a Simple, Low-cost Intervention's Impact on Retention of Women in Computer Science
Bita Akram ; Tiffany Barnes ; Thomason Price ; Tzvetelina Battestilli
$174,938 by National Science Foundation (NSF)
07/ 1/2020 - 06/30/2022

Existing research suggests that institutions may be able to increase the persistence of women in STEM by increasing their self-assessed STEM ability. We propose conducting both a longitudinal field experiment (in Computer Science [CS] classes) and a lab experiment (with novice programmers) to assess the impact of unambiguous, direct performance feedback on women’s and men’s self-assessed CS ability and CS persistence. Beyond the support for our research provided by social-psychological theory, mediation analysis of pilot data from a field experiment found the predicted causal chain: the intervention increased women’s self-assessed CS ability, which then increased women’s CS persistence intentions.

Collaborative Research: Beyond CS Principles: Engaging Female High School Students in New Frontiers of Computing
Tiffany Barnes
$555,000 by National Science Foundation (NSF)
05/ 1/2020 - 04/30/2023

There is a real need for a follow-on course that high school students, and especially girls, can take once their interest in computing has been elevated by the new Computer Science Principles course. We propose to design and study Beyond CSP, a new course focused on CS concepts that have broad appeal but are traditionally considered advanced and are only taught to CS majors in college. The course topics will include distributed computing, computer networking, cybersecurity, machine learning, the Internet of Things, and others. We theorize that a course that teaches these advanced computational methods in disciplinary contexts across a variety of STEAM fields will make the connection to the skills a modern workforce needs readily apparent. Moreover, it will have a much broader appeal to young learners. In fact, we propose to tailor the curriculum especially to appeal to girls by focusing on specific disciplines such as healthcare and climate change, and by emphasizing collaboration and teamwork.

REU Site: Socially Relevant Computing and Analytics
Tiffany Barnes
$405,000 by National Science Foundation (NSF)
03/ 1/2020 - 02/28/2023

The REU Site at NC State University will immerse a diverse group of undergraduates in a vibrant research community of faculty and graduate students building and analyzing cutting-edge human-centric applications including games, tutors, and mobile apps. We will recruit students from underrepresented groups and colleges and universities with limited research opportunities through the STARS Computing Corps, an NSF-funded national consortium of institutions dedicated to broadening participation in computing. Using the Affinity Research Groups and STARS Training for REUs models, we will engage faculty and graduate student mentors with undergraduates to create a supportive culture of collaboration while promoting individual contributions to research through just-in-time training for both mentors and students throughout the summer.

Collaborative: The STARS Aligned: How the STARS Computing Corps Broadens Participation in Computing
Tiffany Barnes
$199,390 by National Science Foundation (NSF)
10/ 1/2020 - 09/30/2021

The goal of this STARS proposal is to demonstrate the impact of the STARS Computing Corps on broadening the participation of women and underrepresented minorities in computing, at a national scale. To do so, STARS will research its cohort-based model that has engaged departments, faculty, and students in an inclusive community of practice, continue its thriving RESPECT BPC research and STARS Celebration BPC leadership conferences and improve recognition and the research base for BPC and STARS efforts. Collectively, these efforts will demonstrate how STARS has served as a national resource that catalyzes, develops, and celebrates BPC leadership for computing students, alumni, and faculty.

EAGER: Collaborative Research: Enhancing Impact of Broadening Participation in Computing Efforts through the STARS Cohort Conference Attendance Program
Tiffany Barnes
$47,432 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2020

In a recent Dear Colleague Letter [cite], NSF outlined its commitment to “enhance the community's awareness of and barriers to broadening participation in computing (BPC) as well as to provide information and resources to principal investigators (PIs) so that they can develop interest, skills, and activities in support of BPC at all levels”. As part of this new initiative, NSF has launched a small pilot of a new requirement that PIs include plans for meaningful BPC activities in their proposals. In March 2018, NSF sponsored the BPCNet workshop to bring together BPC researchers and practitioners, including those from the NSF-funded BPC Alliances program, to identify approaches to support new PIs to engage in BPC efforts in meaningful ways. Our experience in leading the STARS Computing Corps demonstrates the value of cohort-based efforts to broaden participation in computing for individual students and computing departments. In response to the call for action presented to attendees at the NSF BPCNet workshop, the goal of this proposal is to provide a scaffolded approach to engaging faculty in cohort-based BPC efforts. As a starting point, the STARS Computing Corps proposes to prepare faculty to support diverse cohorts of students to attend diversity-oriented conferences through scholarships and targeted activities that build community and a sense of belonging for the attending students and ignite efforts to broaden participation within their own computing depart

Collaborative Research: Integrating Computing in STEM: Designing, Developing and Investigating a Team-based Professional Development Model for Middle- and High-school Teachers
Tiffany Barnes
$861,773 by National Science Foundation (NSF)
09/ 1/2017 - 08/31/2020

Integrated Snap! is a comparison study between the traditional single-teacher professional development model and a community-of-practice professional development model in which teachers attend as a group. The groups will bring together CTE/CS, math, engineering, and science teachers who can foster the integration of the Computer Science Principles curriculum and the science curriculum in high school classrooms. The professional development will take place using Borko’s (2004) phases. We will develop a professional development (PD) model centered on helping content teachers learn best practices for integrating computing in their classrooms. That PD will be piloted, replicated, and then modified for other sites and contexts. Teachers will work together to build simulation and programming tools, and corresponding classroom activities, for exploring computational concepts in the context of their discipline. An integral part of Integrated Snap! is to have teams of teachers from the same school or district attend the PD together and build a community among the teachers and their students doing similar work across their classrooms.

REU Site: Socially Relevant Computing and Analytics
Tiffany Barnes
$359,997 by National Science Foundation
03/ 1/2017 - 02/29/2020

Collaborative Research: Scaling the Early Research Scholars Program
Veronica Catete ; Bita Akram ; Sarah Heckman ; Tiffany Barnes ; Tzvetelina Battestilli
$20,000 by University of California - San Diego
09/21/2020 - 08/31/2023

The Early Research Scholars Program (ERSP) is a group-based, dual-mentored research structure designed to provide a supportive and inclusive research experience using equity-based practices to grow the confidence and foundational skills of early-career students, particularly African Americans, Hispanics, Native Americans and women. For this NSF subaward from UC San Diego, we plan to add ERSP to our course catalog and start implementing it in Fall 2021. As part of their full-year apprenticeship, teams of students will learn about graduate school, be matched to research mentors, observe the mentor's lab, participate in the ERSP course, and propose an independent research project.

CAREER: Improving Adaptive Decision Making in Interactive Learning Environments
Min Chi
$547,810 by National Science Foundation
03/ 1/2017 - 02/28/2022

For many forms of interactive environments, the system's behaviors can be viewed as a sequential decision process wherein, at each discrete step, the system is responsible for selecting the next action to take from a set of alternatives. The objective of this CAREER proposal is to learn robust interaction strategies that will lead to desirable outcomes in complex interactive environments. The central idea of this project is that strategies should not only be effective in complex interactive environments but they should also be efficient, focusing solely on the key features of the domain and the crucial decision points. These are the features and decisions that are not only associated with desirable outcomes, but without which the desirable outcomes are unlikely to occur.

DIP: Integrated Data-driven Technologies for Individualized Instruction in STEM Learning Environments
Min Chi ; Tiffany Barnes
$1,999,438 by National Science Foundation
08/15/2017 - 07/30/2022

In this proposed work, our goal is to automatically design effective personalized intelligent tutoring systems (ITSs) directly from log data. We will combine Co-PI Dr. Barnes's data-driven approach to learning what to teach with PI Dr. Chi’s data-driven work on learning how to teach. More specifically, we will explore this combination across three important undergraduate STEM domains: discrete math, probability, and programming.

Generalizing Data-Driven Technologies to Improve Individualized STEM Instruction by Intelligent Tutors
Min Chi ; Tiffany Barnes ; Thomason Price
$1,999,578 by National Science Foundation (NSF)
08/15/2020 - 07/31/2025

This project will develop generalizable data-driven tools that address the conceptually and practically complex activity of constructing adaptive support for individualized learning in STEM domains.

PFI: BIC: FEEED: Flexible, Equitable, Efficient and Effective Distribution
Min Chi ; Julie Ivy
$400,000 by UNC - NC A & T State University
08/ 1/2017 - 07/31/2021

Nonprofit hunger relief organizations operate in a complex environment consisting of a large and diverse donor base and a dynamic distribution network. These organizations generate a large amount of unstructured and complex data on food collection, inventory management, and distribution activities. However, existing information systems lack the infrastructure to interpret this large-scale data to provide real-time policy recommendations and support operational and strategic decision-making. The proposed smart service system will synthesize data from disparate sources to create a real-time perspective of the environment and learn from the actions of the decision maker. Specifically, this system will automatically predict, visualize, and recommend decisions that will advance operational effectiveness of food collection, distribution, and resource management in a way that is efficient and equitable and will identify opportunities to improve a food bank’s capability to satisfy hunger need.

Analytics-Based Platform for Diabetic Retinopathy Care Management
Min Chi ; Julie Ivy ; Maria Mayorga
$121,305 by Retinal Care
06/ 1/2019 - 04/30/2020

This proposal will demonstrate the feasibility and necessity of integrating data analytics and systems science into a cost-effective, end-to-end solution (Retinal Care DR™) for diabetic retinopathy care. Over 30 million people in the US have diabetes. Of these, 25% will develop diabetic retinopathy (DR) and 4% will develop vision-threatening diabetic retinopathy (VTDR). Ninety percent of all blindness from DR is preventable. However, prevention requires timely VTDR diagnosis and patient compliance with treatment recommendations. Currently, only 50% of patients receive an annual dilated eye exam, leaving half of those at risk for vision loss undiagnosed and untreated. Further, of those diagnosed with VTDR, only 30% undergo timely treatment. These two barriers have established DR as the leading cause of blindness in working-age Americans and highlight the need for a comprehensive solution that integrates the entire care path, from accurate, accessible diagnosis to effective care coordination and evidence-based treatment. Currently, Retinal Care Inc. can identify 65% of a population of patients with diabetes as not having VTDR with 99% negative predictive value using a handheld electroretinography (ERG) and pupillography device (RETeval, LKC Technologies) in a primary care setting as part of the Retinal Care DR program. This 2-minute test reduces the proportion of patients requiring the care coordination component of the service to 35% of the initial population. However, according to the published literature, only 4% of these patients should have VTDR, while the remaining 31% will undergo costly, time-consuming, and ultimately unnecessary care coordination to ensure follow-up with an eye care provider.
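The screening percentages quoted above can be checked with a short calculation. The following is a minimal sketch, not part of the proposal: the function name and the cohort size of 2,000 are hypothetical, and it follows the abstract's own simplification of counting all of the expected 4% VTDR cases within the referred group.

```python
# Illustrative arithmetic only; the two percentages come from the abstract above.
SCREENED_NEGATIVE_PCT = 65  # identified by the 2-minute test as not having VTDR
EXPECTED_VTDR_PCT = 4       # share of the population expected to have VTDR


def referral_breakdown(population: int) -> dict:
    """Split a population into the groups described in the abstract.

    `population` is a hypothetical cohort size chosen for illustration.
    """
    referred_pct = 100 - SCREENED_NEGATIVE_PCT          # 35% need care coordination
    unnecessary_pct = referred_pct - EXPECTED_VTDR_PCT  # 31% referred without VTDR
    return {
        "referred": population * referred_pct // 100,
        "expected_vtdr": population * EXPECTED_VTDR_PCT // 100,
        "unnecessary_referrals": population * unnecessary_pct // 100,
    }


breakdown = referral_breakdown(2000)
print(breakdown)
```

Under these assumptions, most referred patients would not have VTDR, which is the cost the abstract describes as unnecessary care coordination.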
Given the critical role of care coordination in reducing blindness from DR, minimizing the cost and maximizing the effectiveness of this effort is required to optimize the patient experience, patient outcomes, and the commercial viability of the company. The following specific aims are designed to achieve these goals:
Aim 1: Improve the ability to accurately identify patients at increased risk by including patient health and demographic attributes in a predictive algorithm for VTDR diagnosis. The accuracy of the RETeval device in identifying VTDR was established in a 468-patient study using the universally accepted ETDRS gold standard as truth. However, additional patient variables that are potentially correlated with DR severity are available but not yet incorporated into the prediction algorithm, including gender, medication, type of diabetes, diagnosis date, and additional features of the ERG and pupillary waveforms. Diagnostic accuracy will be improved through machine learning classification techniques, including multiple logistic regression and Generalized Linear Models. All patient variables will be candidates for independent variables. Variables for final inclusion in the model will be selected based on model fit while accounting for variable interactions. Cutoff values will be varied and chosen to maximize accuracy (positive and negative predictive values). Principal component analysis (PCA) and variable cluster analysis will be used to identify features of patients with similar attributes, potentially resulting in subpopulation-specific models.
Aim 2: Optimize care coordination, from diagnosis to treatment, for patients at increased risk for VTDR.
An ongoing, multi-year study of a diverse set of 2,000 patients with diabetes conducted and self-funded by Retinal Care (ClinicalTrials.gov: NCT03094819) has reinforced the critical role of care coordination in effective DR care, and the importance of minimizing the cost associated with care coordination efforts. Information collected from 500 currently-enrolled patients includes diagnostic results, primary care visits, care coordination (e.g. phone calls, text messages), follow-up and treatment compliance, sociodemographic information, and other pote

Development of a Nearly Autonomous Management and Control System for Advanced Reactors
Min Chi - (Co-PI) ; Maria Avramova ; Abhinav Gupta ; Nam Dinh
$2,686,834 by US Dept. of Energy (DOE) - Advanced Research Projects Agency - Energy (ARPA-E)
10/ 1/2018 - 03/31/2021

The proposed project seeks to establish a technical basis for, and preliminary development of, a Nearly Autonomous Management and Control (NAMAC) system in advanced reactors. The system is intended to provide recommendations to operators during all modes of plant operation except shutdown operations: plant evolutions ranging from normal operation to accident management. These recommendations are to be derived within a modern, artificial-intelligence (AI) guided system, making use of continuous extensive monitoring of plant status, knowledge of current component status, and plant parameter trends; the system will continuously predict near-term evolution of the plant state, and recommend a course of action to plant personnel.

Using Real-Time Multichannel Self-Regulated Learning Data to Enhance Student Learning and Teachers' Decision-Making with MetaDash
Min Chi - Co-PI ; Roger Azevedo (Psychology) ; Soonhye Park - co-PI EDUCA
$914,585 by National Science Foundation
04/ 1/2017 - 03/31/2020

This 3-year project will focus on laboratory and classroom research in the Raleigh, Chapel Hill, and Durham areas. A team of interdisciplinary researchers from NCSU's Department of Psychology (Dr. Roger Azevedo), Computer Science (Dr. Min Chi), and STEM Education (Dr. Soonhye Park) will conduct empirical and observational research aimed at improving teachers' decision-making based on their analyses of students' real-time, multi-channel self-regulated learning data. We will use multi-channel data to understand the nature of self-regulatory processes in students while they use MetaTutor to understand challenging science topics (e.g., human biological systems). This will be accomplished by aligning and conducting complex computational and statistical analyses of a multitude of trace data (e.g., log-files, eye-tracking, facial expression of emotions), behavioral measures (e.g., human-pedagogical agent dialogue moves), physiological measures (e.g., EDA), learning outcomes, and classroom data (e.g., teacher-student interactions, gaze behavior of teachers’ attention and use of data presented by the visualization tool). The proposed research, in the context of using MetaTutor and a visualization tool for teachers, is extremely challenging. It will help us to better understand the nature and temporal dynamics of these processes in classroom contexts and how they contribute to various types of learning and use of self-regulatory skills, and it will provide an empirical basis for designing an intelligent teacher analytics tool.
The results from this grant will contribute significantly to models of social and cognitive bases of student-machine-teacher interactions; statistical and computational methods used to make inferences from complex multi-channel data; theoretical and conceptual understanding of temporally-aligned data streams; enhancing students’ understanding of complex science topics by making more sensitive and intelligent advanced learning technologies; and, enhanced understanding of how teachers use real-time student data to enhance their instructional decision-making, based on data presented in teacher analytic tools.

IUCRC Phase I NC State University: Center for Accelerated Real Time Analytics (CARTA)
Rada Chirkova
$747,647 by National Science Foundation (NSF)
06/ 1/2018 - 05/31/2023

Real-time analytics is the leading edge of a smart data revolution, pushed by Internet advances in sensor hardware on one side, and AI/ML streaming acceleration on the other. We propose creation of a Center for Accelerated Real Time Analytics (CARTA) to explore the realm of streaming analytics applications. This center will be led by University of Maryland, Baltimore County with partners from NCSU, Rutgers, and other affiliated universities. The proposed center will work with next-generation hardware technologies, like the IBM Minsky with on-board GPU-accelerated processors and Flash RAM, and with Smart Cyber Physical Sensor Systems, to build Cognitive Analytics systems and Active storage devices for real-time analytics. This will lead to the automated ingestion and simultaneous analytics of Big Datasets generated in various domains including Cyberspace, Healthcare, Internet of Things (IoT) and the Scientific arena, and the creation of self-learning, self-correcting “smart” systems. At the core of these technologies are the techniques of data wrangling that enable this end-to-end real-time data processing and the infrastructure of the next generation of high-performance analytics systems.

Phase I IUCRC NC State University: Center for Accelerated Real Time Analytics (CARTA)
Rada Chirkova
$32,000 by National Science Foundation (NSF)
06/ 1/2018 - 05/31/2023

LAS DO1 Chirkova - 3.3 Computational Social Science
Rada Chirkova
$186,023 by Laboratory for Analytic Sciences
01/ 1/2019 - 12/31/2020

CRII: SaTC: Analyzing Information Leak in Smart Homes
Anupam Das
$174,995 by National Science Foundation (NSF)
06/ 1/2019 - 05/31/2021

We live in an increasingly connected world where we spend a large part of our time interacting with a wide range of Internet of Things (IoT) devices. While all these IoT devices provide convenience through automation of appliances, such conveniences often come at the cost of sharing very personal data about our lifestyles. This personal data can then not only be used to serve targeted ads, but can also be misused by repressive governments and cybercriminals. In this project, the PI proposes to analyze the extent to which IoT devices, commonly found in smart homes, leak sensitive information about ourselves.

Challenges and Opportunities in Noise-Aware Implementations of Quantum Field Theories on Near-Term Quantum Computing Hardware
Patrick Dreher ; Alexander Kemper
$385,000 by Oak Ridge National Laboratories - UT-Battelle LLC
10/21/2019 - 10/31/2022

Quantum computing (QC) offers the potential to use recent advances in lattice field theories (LFT) to explore aspects of high-energy physics (HEP) that have been inaccessible using digital computers. Unfortunately, these quantum computers have noise and systematic errors that can complicate the performance of basic quantum field theory (QFT) implementations. Very little effort has been directed toward understanding how these factors impact LFT simulations on QC platforms. This proposal will explore how noise and systematic errors may be identified and mitigated to extend the coherence lifetimes of the qubits and the capabilities of HEP LFT simulations.

SaTC: CORE: Small: Detecting Vulnerabilities and Remediations in Software Dependencies
William Enck ; Bradley Reaves
$499,928 by National Science Foundation (NSF)
10/ 1/2020 - 09/30/2023

The goal of this work is to detect, measure, and remediate a software project's use of external, open source software dependencies with security flaws. First, we will introduce two new static analysis primitives: a global dependency graph (GDG) and a global vulnerable-dependency graph (GVDG) to simplify the detection and measurement of the extent and effects of vulnerable dependencies. We will then create novel techniques for analyzing code and textual artifacts of software projects to identify when a new version has fixed a vulnerability, even if a security advisory has not been announced. In doing so, we will help developers know when dependencies must be updated, ultimately leading to more secure software.
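To illustrate the general idea of measuring the reach of a vulnerable dependency, the following minimal sketch walks a small dependency graph to find every package that transitively depends on a known-vulnerable one. It is not the project's GDG or GVDG construction: the package names, the `DEPENDS_ON` and `KNOWN_VULNERABLE` data, and the `affected_packages` helper are all hypothetical.

```python
from collections import deque

# Hypothetical data: each package maps to the packages it depends on directly.
DEPENDS_ON = {
    "app": ["web-framework", "logger"],
    "web-framework": ["http-parser"],
    "logger": [],
    "http-parser": [],
    "cli-tool": ["logger"],
}
# Hypothetical set of packages with a known security flaw.
KNOWN_VULNERABLE = {"http-parser"}


def affected_packages(depends_on, vulnerable):
    """Return every package that reaches a vulnerable one through its dependencies."""
    # Invert the edges so we can walk from a vulnerable package to its dependents.
    dependents = {name: [] for name in depends_on}
    for package, dependencies in depends_on.items():
        for dependency in dependencies:
            dependents.setdefault(dependency, []).append(package)

    affected = set()
    queue = deque(vulnerable)
    while queue:  # breadth-first traversal over reverse dependency edges
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


result = affected_packages(DEPENDS_ON, KNOWN_VULNERABLE)
print(sorted(result))
```

In this toy graph, `app` is affected only indirectly through `web-framework`, while `cli-tool` is not affected at all, which is the kind of distinction a developer needs when deciding which dependencies must be updated.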

Defining Security Policy in Distributed Environments using Network Views
William Enck ; Bradley Reaves
$1,033,306 by US Navy-Office Of Naval Research
12/ 1/2019 - 11/30/2022

Existing networking technologies are primarily focused on functionality, not security. Consequently, requirements of these technologies, such as fixed network topologies, lead to rigid architectures that fail to enable the network access control requirements of current and future computing environments. We propose the creation of a novel primitive called network views that allows a physical or virtual host to have a different set of accessible peers, regardless of network address or topological placement of those peers. We seek to explore and characterize the utility and practicality of network views in different network environments, ranging from traditional LANs to multi-site, multi-tenant networks such as those emerging in cloud and cellular networks. Our proposed design combines concepts from software-defined networking (SDN), operating systems access control, and distributed consensus protocols. Through these efforts, we seek to provide a new security foundation for the growing security needs of both public and private sector network operations.

CSR: Medium: Collaborative Research: Holistic, Cross-Site, Hybrid System Anomaly Debugging for Large Scale Hosting Infrastructures
Xiaohui (Helen) Gu
$518,000 by National Science Foundation
08/ 1/2015 - 07/31/2020

Hosting infrastructures provide users with cost-effective computing solutions by obviating the need for users to maintain complex computing infrastructures themselves. Unfortunately, due to their inherent complexity and shared nature, hosting infrastructures are prone to system anomalies caused by various external or internal faults. The goal of this project is to investigate a holistic, cross-site, hybrid system anomaly debugging framework that intelligently integrates production-site black-box diagnosis and developer-site white-box analysis into a more powerful hosting infrastructure anomaly debugging system.

AERPAW: Aerial Experimentation and Research Platform for Advanced Wireless
Ismail Guvenc ; Rudra Dutta ; Mihail Sichitiu ; Brian Floyd ; Thomas Zajkowski
$6,000,290 by PAWR, LLC.
12/16/2019 - 09/30/2020

We propose AERPAW: Aerial Experimentation and Research Platform for Advanced Wireless, a first-of-its-kind aerial wireless experimentation platform to be developed in close partnership between NCSU, Wireless Research Center of North Carolina (WRCNC), Mississippi State University (MSU), University of South Carolina (USC), City of Raleigh, Town of Cary, Town of Holly Springs, North Carolina Department of Transportation (NCDOT), and numerous other project partners. With a major focus being on aerial communications within low altitude airspace, AERPAW will develop a software defined, reproducible, and open-access advanced wireless platform with experimentation features spanning 5G technologies and beyond. NCSU, USC, and MSU researchers have existing UAS experimentation capabilities and ongoing experimental research activities involving wireless technologies spanning software defined radios (SDRs), LTE, WiFi, ultra-wideband (UWB), IoT, and millimeter wave (mmWave), which will form the initial baseline framework for the AERPAW platform. To deploy AERPAW, NCSU will work closely with NCDOT’s Integration Pilot Program, a three-year FAA project that allows BVLOS UAS experimentation for medical supply delivery in North Carolina, in close collaboration with NCSU, several UAS companies, municipalities, and a medical institution. Initial flight tests have already started within the Raleigh area, and will be expanding to other parts of the state in 2019 and beyond. Any additional FAA permits, as necessary, will be secured by AERPAW team in close collaboration with NCDOT.

Identification of Translational Hormone-Response Gene Networks and Cis-Regulatory Elements
Steffen Heber (co-PI) ; Jose Alonso (Lead PI-CALS) ; Anna Stepanova (CALS) ; Cranos Williams (ECE)
$897,637 by National Science Foundation
08/ 1/2015 - 07/31/2020

Plants, as sessile organisms, need to constantly adjust their intrinsic growth and developmental programs to the environmental conditions. These environmentally triggered “adjustments” often involve changes in the developmentally predefined patterns of one or more hormone activities. In turn, these hormonal changes result in alterations at the gene expression level and the concurrent alterations of the cellular activities. In general, these hormone-mediated regulatory functions are achieved, at least in part, by modulating the transcriptional activity of hundreds of genes. The study of these transcriptional regulatory networks not only provides a conceptual framework to understand the fundamental biology behind these hormone-mediated processes, but also the molecular tools needed to accelerate the progress of modern agriculture. Although often overlooked, understanding of the translational regulatory networks behind complex biological processes has the potential to empower similar advances in both basic and applied plant biology arenas. By taking advantage of the recently developed ribosome footprinting technology, genome-wide changes in translation activity in response to ethylene were quantified at codon resolution, and new translational regulatory elements have been identified in Arabidopsis. Importantly, the detailed characterization of one of the regulatory elements identified indicates that this regulation is NOT miRNA dependent, and that the identified regulatory element is also responsive to the plant hormone auxin, suggesting a role in the interaction between these two plant hormones. These findings not only confirm the basic biological importance of translational regulation and its potential as a signal integration mechanism, but also open new avenues to identifying, characterizing and utilizing additional regulatory modules in plant species of economic importance.
Towards that general goal, a plant-optimized ribosome footprinting methodology will be deployed to examine the translation landscape of two plant species, tomato and Arabidopsis, in response to two plant hormones, ethylene and auxin. A time-course experiment will be performed to maximize the detection sensitivity (strong vs. weak) and diversity (early vs. late activation) of additional translational regulatory elements. The large amount and dynamic nature of the generated data will also be utilized to generate hierarchical transcriptional and translational interaction networks between these two hormones and to explore the possible use of these types of diverse information to identify key regulatory nodes. Finally, the comparison between two plant species will provide critical information on the conservation of the regulatory elements identified and, thus, inform research on future practical applications.
Intellectual merit: The identification and characterization of signal integration hubs and cis-regulatory elements of translation will make it possible not only to better understand how information from different origins (environment and developmental programs) is integrated, but also to devise new strategies to control this flow for the advancement of agriculture.
Broader Impacts: A new outreach program will promote interest among middle and high school students in combining biology, computers, and engineering. We will use our current NSF-supported Plants4kids platform (ref), with web-based bilingual dissemination tools and monthly demos at the science museum and local schools, to implement this new outreach program. Examples of demonstration modules will include comparisons between simple electronic and genetic circuits.

Collaborative Research: Transforming Computer Science Education Research Through Use of Appropriate Empirical Research Methods: Mentoring and Tutorials
Sarah Heckman
$406,557 by National Science Foundation
09/ 1/2015 - 08/31/2022

The computer science education (CSEd) research community consists of a large group of passionate CS educators who often contribute to other disciplines of CS research. There has been a trend in other disciplines toward more rigorous and empirical evaluation of various hypotheses. However, many of the practices that we apply to demonstrate rigor in our discipline research are ignored or actively avoided when performing research in CSEd. This suggests that CSEd is “theory scarce” because most publications are not research and do not provide the evidence or replication required for meta-analysis and theory building. An increase in empiricism in CSEd research will move the field from “scholarly teaching” to the “scholarship of teaching and learning” (SoTL), providing the foundation for meta-analysis and the generation of theories about teaching and learning in computer science. We propose the creation of training workshops and tutorials to educate the educators about appropriate research design and evaluation of educational interventions. The creation of laboratory packages, “research-in-a-box,” will support sound evaluation and replication leading to meta-analysis and theory building in the CSEd community.

CUE: Collaborative Research: Effective Peer Teaching Across Computing Pathways
Sarah Heckman ; Tzvetelina Battestilli ; Anna Howard
$98,987 by National Science Foundation (NSF)
01/ 1/2020 - 06/30/2022

Demand for computing is increasing across pathways: majors, minors, and computing in discipline. Peer teachers are critically needed to support student learning outcomes. This proposal builds on an existing NIC in peer teaching to expand support across computing pathways. We will assess the impact of peer teaching, particularly related to support with debugging, on 1) departmental and non-major course culture, student learning, and support and 2) broadening participation of underrepresented groups in computing courses and as peer teachers. We will share our results and build a larger community through a second offering of the Peer Teaching Summit.

SHF:Small: Enabling Scalable and Expressive Program Analysis Notifications
Sarah Heckman ; Emerson Murphy-Hill
$265,853 by National Science Foundation (NSF)
08/15/2017 - 07/31/2021

Previous research shows that existing notifications produced by integrated development environments are poorly understood and overwhelming to programmers. We propose building on our prior work to create a new architecture for program analysis notifications that enables toolsmiths to create scalable and understandable notifications for a variety of program analysis tools.

LAS DO1 OY- Jhala 4.0 Triage
Arnav Jhala
$64,712 by Laboratory for Analytic Sciences
05/ 1/2020 - 12/31/2020

We are interested in projects that will develop and demonstrate methods and tools to identify data relevant to a given analytic question within diverse and potentially large data sets. The process of finding the “right” data to support an analysis could potentially introduce significant errors and biases, and so methods and tools for data triage must be characterized in a way that makes clear their appropriate use in rigorous analytic processes. Finally, data increasingly must be discovered within less traditional formats, such as audio or speech recordings or video, that challenge more common triage and search techniques.

SHF: Small: Inter-Request Workflow and Dataflow in Web Applications: a Modeling Framework and its Applications
Guoliang Jin
$350,000 by National Science Foundation (NSF)
08/15/2020 - 07/31/2023

Web applications play an important role in the current software ecosystem, and these web applications are usually built with certain supporting frameworks. While these frameworks ease the development of web applications, they bring several challenges to the analysis of web applications. Existing techniques analyze each request independently, leading to suboptimal results. In this project, we propose inter-request analysis to go beyond the boundaries of individual requests, design a framework that can capture and express inter-request data and control dependencies, and develop several program analyses leveraging the framework for performance bug diagnosis, performance optimization, and flow integrity monitoring.

CHECRS: Cognitive Human Enhancements for Cyber Reasoning Systems
Alexandros Kapravelos
$884,817 by Arizona State University/DARPA
11/29/2018 - 05/29/2022

The recent Cyber Grand Challenge (CGC) showed progress in the ability of computers to discover and patch vulnerabilities, but these programs are still far from being able to compete against human players. In order to cope with the state-explosion problem that is now limiting our ability to automatically analyze binary programs, we need to design a new class of solutions inspired by expert human behavior. In this project, instead of blindly analyzing as many nodes as possible in an attempt to explore the search space exhaustively, we are going to develop new techniques to explore it more intelligently.

SaTC: CORE: Medium: Collaborative: Taming Web Content Through Automated Reduction in Browser Functionality
Alexandros Kapravelos
$406,609 by National Science Foundation
09/ 1/2017 - 08/31/2021

The browser is constantly evolving to meet the demands of Web applications. Although this evolution supports the innovation that we see on the internet, there are security implications that we need to consider, such as attacks against the browser that leverage bugs arising from rapid development. In this project, we plan to examine how certain web applications work and associate their behavior directly with the corresponding browser functionality. Our goal is to be able to characterize what functionality is needed from the browser when rendering a page and certain components. By building a system like this, we will be able to identify, for example, what is needed from the browser to render a web advertisement. To better protect internet users, we are going to leverage that information so that we can identify when web applications diverge from their expected behavior and attack the users' browser. We will use this information to limit the functionality exposed to web applications and thereby eliminate multiple classes of attacks, such as browser fingerprinting and drive-by downloads.

XS-Shredder: A Cross-Layer Framework for Removing Code Bloat in Web Applications
Alexandros Kapravelos
$300,000 by Arizona State University via Office of Naval Research
07/ 1/2017 - 06/28/2020

Modern web applications are incredibly complex pieces of software, with frameworks and libraries that help web developers write their applications quickly. However, these frameworks and libraries increase the attack surface of the web application. In this proposal, we present the design of a framework, called XS-Shredder, which is able to debloat all layers of the web application software stack: client-side code, server-side code, database, and operating system. This framework will perform inter- and intra-layer analysis, ultimately resulting in a web application that is semantically identical, yet with a significantly reduced attack surface.

Collaborative Research: Big Data from Small Groups: Learning Analytics and Adaptive Support in Game-based Collaborative Learning
James Lester
$1,249,611 by National Science Foundation
10/ 1/2016 - 09/30/2021

The proposed project focuses on integrating models of game-based and problem-based learning in a computer-supported collaborative learning environment (CSCL). As groups of students solve problems in these environments, their actions generate rich and dynamic streams of fine-grained multi-channel data that can be instrumented for investigating students' learning processes and outcomes. Using the big data generated by small groups, we will leverage learning analytics to provide adaptive support for collaboration that will allow these models to be used at larger scales in real classrooms. The project will study CSCL in the context of an environmental-science-based digital game that will employ specific strategies to support the problem-based learning goals of helping students construct explanations, reason effectively, and become self-directed learners. In problem-based learning, students are active, intentional learners who collaboratively negotiate meaning. The project will embed models induced using learning analytic techniques inside of a digital game environment to enable students to cultivate collaborative learning competencies that translate to non-digital classroom settings.

Learning Environments Across Disciplines LEADS: Supporting Technology Rich Learning Across Disciplines: Affect Generation and Regulation During Co-Regulated Learning in Game-Based Learning Environments (Supplement)
James Lester
$114,672 by McGill University/Social Sciences and Humanities Research Council of Canada
04/ 1/2012 - 02/28/2020

Contemporary research on multi-agent learning environments has focused on self-regulated learning (SRL), while relatively little effort has been made to use co-regulated learning as a guiding theoretical framework (Hadwin et al., 2011). This oversight needs to be addressed given the complex role that self- and other-regulatory processes play when human learners and artificial pedagogical agents (APAs) interact to support learners' internalization of cognitive, affective, and metacognitive (CAM) SRL processes. We will use the Crystal Island learning environment to investigate these issues.

REFLECT: Improving Science Problem Solving with Adaptive Game-Based Reflection Tools
James Lester ; Roger Azevedo (Psychology)
$1,300,000 by National Science Foundation
04/15/2017 - 03/31/2022

Reflection has long been recognized as a central component of effective learning. With the overarching goal of improving middle school students' science problem solving and learning outcomes, the REFLECT project has the objective of investigating a suite of theoretically grounded, adaptive game-based reflection tools to scaffold students' cognitive and metacognitive processes. The project will center on the design, development, and investigation of game-based learning tools for science education that adaptively scaffold students’ reflection through both embedded and retrospective support. It will culminate in a classroom experiment to study the impact of the adaptive reflection tools on both problem solving and learning. The results from this project will contribute significantly to theoretical and computational models of reflection, and produce both design principles and learning technologies that support the creation of effective learning environments.

ENGAGE: A Game-Based Curricular Strategy for Infusing Computational Thinking into Middle School Science.
James Lester ; Brad Mott ; Eric Wiebe (Friday Instit
$2,498,862 by National Science Foundation
08/15/2016 - 10/31/2020

Recent years have seen a growing recognition that computer science is vital for scientific inquiry. The middle school grade band is critical for shaping students’ aspirations and skills, and many issues relating to workforce underproduction and underrepresentation of diverse students in computer science can be traced back to middle school. To address this problem, the project will deeply integrate computer science into middle school science classrooms. Centered on a game-based learning environment that features collaborative learning, the project will have a specific focus on addressing gender issues in middle school computer science education with the goal of creating learning interactions that are both effective and engaging for all students.

Collaborative Research: PRIME: Engaging STEM Undergraduate Students in Computer Science with Intelligent Tutoring Systems
James Lester ; Bradford Mott ; Eric Wiebe (Friday Instit
$1,499,828 by National Science Foundation
09/ 1/2016 - 08/31/2021

Significant advances in intelligent tutoring systems have paved the way for engaging STEM undergraduates in computer science. This research has spawned a new generation of personalized learning environments that offer significant promise for providing students with adaptive learning experiences that are crafted to their individual needs. Spurred by this significant promise and building on a research infrastructure developed by the project team in a series of NSF-supported projects, the PRIME project will transform introductory computer science education with state-of-the-art intelligent tutoring systems technologies.

Health Quest: Engaging Adolescents in Health Careers with Technology-Rich Personalized Learning
James Lester, II
$1,378,755 by National Institutes of Health (NIH)
08/ 1/2017 - 07/31/2022

Leveraging intelligent game-based learning technologies, the Health Quest project focuses on developing and disseminating technology-rich resources to broaden the interests of adolescents in biomedical, behavioral and clinical research careers. The project centers on the development of technology-rich learning resources. These include a game-based learning environment featuring health careers as well as an online community that includes a speaker series featuring a broad range of health professionals. The final year of the project will see a full evaluation of the Health Quest program and its impact on students’ interest in biomedical, behavioral and clinical research careers.

Supporting Student Planning with Open Learner Models in Middle Grades Science
James Lester, II
$1,499,183 by National Science Foundation (NSF)
08/15/2018 - 07/31/2022

The ability to plan is a key element of learning. With the objective of improving middle school students' science learning, the project will investigate open learner models to scaffold student planning. The project will see the design, development, and investigation of an open learner model for student goal setting and planning. In contrast to the "classic" student models of intelligent tutoring systems, which are opaque, open learner models are inspectable: they enable students to inspect a learning environment's representation of their knowledge and competencies. Using the Future Worlds learning environment, the project will feature classroom studies that will investigate the impact of open learner models on both problem solving and learning in middle grades science.

EAGER: Collaborative Research: Building Capacity for K-12 Artificial Intelligence Education Research
James Lester, II
$99,976 by National Science Foundation (NSF)
08/15/2019 - 01/31/2022

The goal of this project is to build capacity for education research for K-12 artificial intelligence education. In particular, it will bring together experts in AI and learning sciences to develop a shared understanding and create a research agenda to bring evidence-based AI education to K-12 classrooms. We will organize and facilitate a series of two workshops focused on answering what and how to teach AI for K-12, with findings from the first workshop informing the second. We will also conduct broad analyses of how K-12 AI education research should respond to AI’s far-reaching societal impact.

Collaborative Research: FW-HTF: Augmented Cognition for Teaching: Transforming Teacher Work with Intelligent Cognitive Assistants
James Lester, II ; Bradford Mott
$1,499,736 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2022

Effective teaching is the cornerstone of K-12 education. However, effective teaching occurs in complex workplaces that require teachers to cope with the real-time demands of providing effective learning experiences for large classrooms of students by skillfully bringing to bear their expertise in pedagogy and classroom management. Although there is enormous potential for enhancing teaching with technology-rich support that leverages artificial intelligence (AI), limited work has been done to investigate how emerging AI technologies can bring about fundamental improvements to the teaching profession. With recent advances in AI technologies for natural language processing, machine learning, and user-adaptive support, the time is ripe for transforming the professional lives of teachers. The objective of the proposed research is to design, develop, and evaluate the Intelligent-Augmented Cognition for Teaching (I-ACT) framework featuring intelligent cognitive assistants for K-12 teachers. A unique feature of I-ACT afforded by recent advances in machine learning will be its ability to optimize teacher support for collaborative learning at the individual student, group, and classroom levels.

Collaborative Research: PrimaryAI: Integrating Artificial Intelligence into Upper Elementary Science with Immersive Problem-Based Learning
James Lester, II ; Bradford Mott
$985,585 by National Science Foundation (NSF)
09/ 1/2019 - 08/31/2022

Artificial intelligence has emerged as a technology that promises to have unprecedented societal impact. Integrating AI into the science curriculum holds significant potential for introducing students to deep science inquiry while simultaneously providing them with an experiential understanding of the role that AI can play in science problem solving. The proposed project will center on the design, development, and investigation of PrimaryAI, a curricular framework that integrates science and AI for upper elementary science education. Featuring an immersive game-based learning environment, PrimaryAI will use problem-based learning as the foundation for science inquiry in which students in grades 3-5 will utilize AI tools to solve complex ecosystem problems within an immersive science adventure. Students will engage in scientific problem solving that tightly integrates AI and science to learn about ecosystem phenomena, mechanisms, and the components that comprise a system, and to make inferences about change over time for biological systems. The project will use design-based research to understand how best to integrate AI and science in upper-elementary science classrooms.

TCAT and TeamCoach: Tools for Natural Language-Based Team Communication Assessment and Team Feedback in Collective Synthetic Training Environments
James Lester, II ; Bradford Mott ; Jonathan Rowe ; Randall Spain
$2,018,810 by US Army - Army Research Laboratory
09/ 5/2019 - 01/ 4/2024

Developing adaptive instruction for teams requires a new generation of Adaptive Instructional Systems that can accurately assess team behaviors in real time. Effectively adapting tutoring to the complex dynamics of teams calls for the creation of computational models that can operationalize and assess team performance and deliver coaching and feedback to team members as they complete simulated training events. Recent advancements in deep learning-driven natural language processing and reinforcement learning offer significant promise for achieving these capabilities. The goal of this project is to develop tools and methods that can be used by team training researchers to automatically analyze team communication data and devise tutorial planners that can deliver run-time feedback during team training tasks in synthetic environments. In particular, the project will (1) investigate how advances in deep learning-driven natural language processing can be leveraged to analyze team discourse in order to help researchers automatically assess team communication and team performance and (2) investigate how data-driven machine learning approaches can be leveraged to devise tutorial planning models that can automatically deliver run-time feedback during team training tasks in simulated environments.

Investigating Emergency Response Performance with VR-Based Intelligent User Interfaces
James Lester, II ; Bradford Mott ; Randall Spain
$1,112,175 by National Institute of Standards & Technology
06/ 1/2018 - 12/31/2021

First responders are seeing a significant increase in the amount and types of data available when responding to emergencies. To maximize the value of these data, user interfaces need to be designed that provide first responders with critical real-time information. Intelligent user interface design, in which the data and information presented to the user are adapted and tailored to the needs of individual users based on analytic information (e.g., expertise, task state, location), offers significant potential for improving performance, reducing mental workload, and facilitating effective decision-making. This project builds on a decade of research by the project team in developing intelligent game-based virtual learning environments. The goal of the project is to develop a virtual reality emergency response scenario that will serve as a test bed for evaluating the impact of intelligent user interfaces on first responder performance. In addition, the project will investigate the impact of providing adaptive support on task proficiency and whether alternative interaction methods (gesture-based vs. voice-based) reduce cognitive load and improve system interaction.

SCH: ChangeGradients: Promoting Adolescent Health Behavior Change with Clinically Integrated Sample-Efficient Policy Gradient Methods
James Lester, II ; Jonathan Rowe
$224,021 by University of California - San Francisco
12/13/2019 - 11/30/2021

The objective of the proposed research is to design, implement, and investigate ChangeGradients, a clinically integrated health behavior change system for adolescents. In a partnership with the University of California, San Francisco School of Medicine, we will create a computational behavior change framework based on sample-efficient policy gradient methods for reinforcement learning. The project will investigate a critical research question in health behavior change: how can a computational framework produce dynamically tailored interactive narratives that promote health behavior change for adolescents? ChangeGradients will support behavior change by generating personalized interactive narratives and delivering analytics to healthcare providers in a data-driven clinical intervention. ChangeGradients’ impact on health behavior change will be evaluated in a clinical study at the UCSF Benioff Children's Hospital.

Multimodal Visitor Analytics: Investigating Naturalistic Engagement with Interactive Tabletop Science Exhibits
James Lester, II ; Jonathan Rowe ; James Minogue
$1,951,956 by National Science Foundation (NSF)
03/ 1/2018 - 02/28/2022

Recent advances in multimodal learning analytics present new opportunities for investigating learning and engagement in informal education settings. In this project, we will investigate visitors’ learning experiences in science museums using multimodal visitor analytics, which marry the rich multi-channel data streams produced by fully-instrumented exhibit spaces and the data-driven modeling functionalities afforded by recent advances in machine learning. The project will leverage Future Worlds, a fully-instrumented prototype digital interactive exhibit about sustainability, which was developed and piloted by the project team in a previously funded NSF Informal Science Education proof-of-concept project. The research team will conduct a series of museum studies to investigate how learners interact with Future Worlds and other exhibits in a science museum, and will utilize learning analytic techniques to model visitors’ cognitive, affective, and behavioral components of learning and engagement. The project will produce a detailed empirical account of visitors’ learning experiences in a science museum, as well as an open-source software platform for conducting multimodal visitor analytics, which will help other informal education researchers utilize learning analytics with their own datasets.

Collaborative Research:CNS Core:Small:Towards Efficient Cloud Services
Xu Liu
$249,840 by National Science Foundation (NSF)
08/19/2020 - 09/30/2023

Cloud environments employ various microservices and serverless functions to handle web or database requests. Although the cloud provides a uniform infrastructure for resource management, it can easily suffer from performance inefficiency across the entire cloud software stack. To address this issue, we will develop CloudProf. This project has the following goals. First, it will break the abstraction introduced by the runtime systems of managed languages for intra-application optimization. Second, it will identify problematic interactions across microservices for inter-service optimization. Third, it will break the abstraction introduced by virtual machines and containers for the optimization of the entire cloud software stack.

Developing Integrated Teaching Platforms to Enhance Blended Learning in STEM
Collin Lynch ; Tiffany Barnes ; Sarah Heckman
$599,992 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2021

Modern computer science classrooms support student learning by integrating a large number of educational tools and platforms, from forums to intelligent tutors and automated tests. This proliferation of tools can overwhelm students who have difficulty navigating across platforms or connecting the information between platforms. We propose to develop an open platform for student and instructor guidance that supports data integration and analysis. The system will be designed to synthesize student interaction data from a rich set of educational tools. The integrated data will provide the basis for automated analysis of students' work and work habits, allowing for automated and instructor-driven interventions to support student learning. Additionally, the integrated system will provide pedagogical guidance on how students can best use classroom learning tools not individually, but in concert.

Collaborative Research: Fostering Collaborative Computer Science Learning with Intelligent Virtual Companions for Upper Elementary Students
Collin Lynch (co-PI) ; Eric Wiebe
$1,399,088 by National Science Foundation
08/15/2017 - 07/31/2021

The University of Florida and North Carolina State University jointly propose FLECKS, a Design and Development proposal for the NSF's Discovery Research PreK-12 (DRK-12) program. FLECKS (Friendly Learning Environment for Kids' Computer Science) addresses the pressing need for the development of fundamental computer science competencies in upper elementary-school children. The goal of the proposed project is to design, develop, and investigate FLECKS, an intelligent learning environment to teach collaborative computer science problem solving. Collaboration is a central academic and professional practice in computational thinking, yet it presents many challenges for elementary school students. Students often struggle to collaborate successfully due to individual differences in academic status; gender; cultural background; personality; attitudes toward collaboration; or attitudes toward learning. In order to address these challenges, FLECKS will provide dyads of students with a rich, scaffolded environment where they use an interactive online coding environment to engage in computer science challenges related to their STEM subject areas. Central to the innovation is the way in which the dyads are supported. FLECKS are animated virtual characters that take a rich set of multimodal features as input, and then adapt to students’ patterns of collaboration, including who has control of the keyboard and mouse; who speaks when; and the problem-solving actions the students take in the online environment.

CAREER: Explorable Formal Models of Privacy Policies and Regulations
Christopher Martens
$555,000 by National Science Foundation (NSF)
06/15/2019 - 05/31/2024

Regulatory policies, especially those governing data privacy, must satisfy seemingly contradictory requirements of precision and transparency. Prior research on usable privacy has led to annotating policies with information designed to assist user understanding; meanwhile, the desire for provable guarantees generated efforts on encoding policies in formal logic to answer questions about specific scenarios. The PI proposes to unify these approaches through formalizations amenable to analysis, interactive exploration, and question-answering. This work will enable stakeholders to formulate and answer questions about regulations, protocols, and scenarios, to generate counterexamples and recommendations for policy repair, leading to improved understanding and minimized risk.

CAREER: Explorable Formal Models of Privacy Policies and Regulations (Supplement)
Christopher Martens
$16,000 by National Science Foundation (NSF)
06/15/2019 - 05/31/2024

Simulating Social Influence Based on Real-World Geographic Data: Emergent Narratives and Interactive Hypothesis Testing
Christopher Martens
$577,574 by US Air Force - Office of Scientific Research (AFOSR)
09/15/2020 - 09/14/2023

Computers are increasingly being used to simulate and analyze complex social phenomena, but do not account for the geographical, cultural, economic, and sociopolitical systems that influence social relationships. We identify the need to account for real-world, localized information in social simulation. Our research objectives are to create computational models of social influence and opinion change that support believable social simulation and facilitate novel insights for experts through scaffolded interaction. This project, if successful, will contribute fundamental advances in computational social science, including advances individually in both computer science and social science as well as bidirectional exchange of ideas across disciplines.

CRII: SHF: Supporting Domain-Specific Inquiry with Rule-Based Modeling (Supplement)
Christopher Martens
$16,000 by National Science Foundation (NSF)
03/ 1/2018 - 02/28/2022

An increasingly common method for communicating and critiquing the emergent behavior of complex systems is interactive simulation, which can teach interactors about the way a system works by revealing system-level properties like feedback loops and tension between objectives. The Ceptre programming language provides a way to author interactive simulations in a rule-based way, amenable to both intuitive understanding and analysis. We propose to expand Ceptre’s audience by implementing a user interface that enforces syntax-level and type-level correctness of programs, which can be run and deployed in the browser for rapid prototyping.

CRII: SHF: Supporting Domain-Specific Inquiry with Rule-Based Modeling
Christopher Martens
$161,142 by National Science Foundation (NSF)
03/ 1/2018 - 02/29/2020

Collaborative Research: Cyberinfrastructure for Robust Learning of Interconnected Knowledge
Noboru Matsuda
$386,884 by National Science Foundation (NSF)
07/15/2020 - 06/30/2023

We propose to develop learning-engineering methods to efficiently build an effective online STEM learning environment, in the form of adaptive online courseware called CyberBook, to promote robust mathematics learning with understanding. The proposed CyberBook is a combination of traditional online courseware (that promotes conceptual understanding) and intelligent tutoring systems (that support guided learning-by-doing). We hypothesize that these two well-established technologies can be combined by identifying the shared latent learning constructs, i.e., skills and concepts to be learned. We further hypothesize that the resulting cyberlearning space will promote synergetic learning that, by definition, will foster the desired proficiency.

Collaborative Research: Cyberinfrastructure for Robust Learning of Interconnected Knowledge
Noboru Matsuda
$16,000 by National Science Foundation (NSF)
07/15/2020 - 06/30/2023

Developing An Online Environment For Learning Algebra By Teaching A Synthetic Peer
Noboru Matsuda
$1,399,947 by US Dept. of Education (DED)
05/ 9/2019 - 08/31/2021

In this project, researchers will develop and evaluate an online game-like environment in which middle school students learn to solve linear equations in algebra by teaching a simulated peer student. Learning by teaching is a promising style of instruction, with evidence that when students engage in peer tutoring there is a benefit for both the tutee and the tutor.

SHF:Small: Mega Transfer: On the Value of Learning from 10,000+ Software Projects
Timothy Menzies
$472,024 by National Science Foundation (NSF)
10/ 1/2019 - 09/30/2022

Software analytics is a workflow that distills large amounts of low-value data into small chunks of very high-value data. A typical research paper in software analytics studies fewer than a few dozen projects. Such small samples can never be representative of something as diverse as software engineering. Perhaps it is time to stop making limited conclusions from tiny sets of software projects. To that end, we will apply innovative transfer learning methods (based on very fast clustering and transfer learners based on very fast stream mining algorithms that use incremental hyperparameter optimizers) to the 10,000+ projects currently on GitHub.

Elements: Can Empirical SE be Adapted to Computational Science?
Timothy Menzies
$592,129 by National Science Foundation (NSF)
10/ 1/2019 - 09/30/2022

Standard methods in empirical software engineering (SE) need to be adapted before they can be safely deployed in other domains like computational science. But what adaptation methods are useful/useless? Are they cost effective? Do they work effectively across multiple data sets? We have some preliminary results suggesting that they work for (a) defect prediction, but can we also adapt other tasks such as (b) test case prioritization, (c) effort estimation, (d) learning to avoid spurious false negatives from static code analysis, etc.? Why is this important? Well, building software is hard. Building good software is even harder when developers have not formally studied SE (i.e., as in the case of many developers of computational science software). How can we capture and maintain expertise about software development, then make that expertise more widely available?

LAS DO1 Menzies - 2.4 Analytics, AI and Machine Learning
Timothy Menzies
$255,684 by Laboratory for Analytic Sciences
01/17/2019 - 12/31/2020

LAS DO1 Menzies - 2.4 Analytics, AI and Machine Learning

SMOKE (anyone can see the fire, but when can you notice the smoke?)
Timothy Menzies
$47,082 by LexisNexis
05/ 1/2020 - 12/31/2020

This work builds time series over industrial scale text mining data, looking for predictors for significant business events.

LR2: Ultra-fast Novelty Recognition and Repair for Deep Learning
Timothy Menzies
$72,675 by Quantum Ventura Inc.
01/30/2020 - 05/28/2020

Deep Learning has problems (CPU cost and the incomprehensibility of its models) which can be solved by sampling the rate of change in internal network weights as the deep learner streams over the data. Our LR2 algorithm detects novel inputs, then repairs existing models as appropriate. Using adaptive instance-based reasoning, LR2’s model-based sequential optimizer continually improves local models across the decision boundary. These models can report anomalies and also generate explanations about why particular examples lead to one conclusion or another. LR2 should also significantly reduce training times for Deep Learning.

SHF:Medium:Scalable Holistic Autotuning for Software Analytics
Timothy Menzies ; Xipeng Shen
$898,349 by National Science Foundation
07/ 1/2017 - 06/30/2022

This research proposes to advance the state of the art to holistic scalable autotuners, which tune all levels of options for multiple optimization objectives at the same time. It will achieve this ambitious goal through the development of a set of novel techniques that efficiently handle the tremendous tuning space. These techniques take advantage of the synergies between all those options and goals by exploiting relevancy filtering (to quickly dispose of unhelpful options), locality of inference (that enables faster updates to outdated tunings) and redundancy reduction (that reduces the search space for better tunings). This new autotuner will be a faster method for finding better tunings that satisfy more goals. To test this claim, this research will assess if this new tool can reduce the total computational resources required for effective SE data analytics by orders of magnitude.

Collaborative Research: Building a Computational Thinking Foundation in Upper Elementary Science with Narrative-Centered Maker Environments
Bradford Mott ; James Minogue ; Kevin Oliver
$1,599,339 by National Science Foundation (NSF)
08/ 1/2019 - 07/31/2022

Recent years have seen a growing recognition of the importance of computer science experience for today's K-12 students. Knowledge of computing is essential for students' success throughout their academic and professional careers. Engaging elementary students in computational thinking through the creation of rich interactive computational narratives offers an innovative approach to building students’ computational thinking practices and interest in computing. This project will engage students in a broad range of computing activities centered on creating digital interactive narratives. The project will see the development of a narrative-centered maker environment that introduces computational thinking into upper elementary science education emphasizing connections to the Next Generation Science Standards.

SaTC: CORE: Small: Enhanced Security and Reliability for Embedded Control Systems
Frank Mueller
$500,000 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2022

CPS and IoT devices are inherently networked, which exposes them to malware attacks. We propose to significantly increase the cyber security specifically of CPS and IoT computing devices by developing real-time monitoring techniques that defeat cyber-attacks.

Large-Scale Automatic Analysis of the OAI Magnetic Resonance Image Dataset
Frank Mueller
$331,603 by UNC - UNC Chapel Hill
08/15/2017 - 07/31/2022

The goal of this proposal is to optimize and to openly provide to the OA community a new technology to rapidly and automatically measure cartilage thickness, appearance and changes on magnetic resonance images (MRI) of the knee for huge image databases. This will allow assessment of trajectories of cartilage loss over time and associations with clinical outcomes on an unprecedented scale; future work will focus on incorporating additional disease markers, ranging from MRI-derived biomarkers for bone and synovial lesions, to biochemical biomarkers, to genetic information.

HPC Power Modeling and Active Control
Frank Mueller
$386,290 by Lawrence Livermore National Laboratory via US Department of Energy
10/25/2016 - 09/30/2021

As we approach the exascale era, power has become a primary bottleneck. The US Department of Energy has set a power constraint of 20MW on each exascale machine. To be able to achieve one exaflop in 20MW, it is necessary that we use power intelligently to maximize performance under a power constraint. In this work, we propose to alleviate the shortcomings of current HPC systems in addressing power constraints by (1) power-aware machine partitioning, (2) power-constrained job scheduling, (3) systematic provisioning and procurement of hardware under a power cap, (4) modeling of network, deep memories, and storage, as well as (5) investigating the inter-dependence between power and cooling.

Software Support for Heterogeneous Memories in HPC
Frank Mueller
$315,429 by TRIAD National Security, LLC Formerly Los Alamos National Laboratory (LANL)
11/ 9/2018 - 09/30/2021

This project proposes to explore different solutions of software support for heterogeneous memory architectures for future supercomputers. Its primary aim will be to seamlessly integrate the new memory technologies into the existing architecture support while taking full advantage of their characteristics for existing HPC applications and making the programmer's job easier when developing new applications for future systems.

EAGER: Curricula Development of a Quantum Programming Class with Hardware Access
Frank Mueller ; Patrick Dreher ; Gregory Byrd
$100,000 by National Science Foundation (NSF)
09/ 1/2019 - 08/31/2021

Quantum Computing (QC) has reached an early state of device maturity with the availability of several hardware platforms and corresponding programming environments. The potential of QC is significant as algorithms, such as Shor's prime factoring, have the potential to break the barriers of classical complexity classes and thus provide "quantum supremacy" for such algorithms. We propose to create a curriculum for a quantum programming class with access to cutting-edge quantum computing platforms. Specifically, we propose to utilize cloud-based access to one gate-based platform and one annealing-based platform to provide hands-on experience with programming actual quantum hardware. Curricular material will include the fundamentals in physics and mathematics required to understand quantum computing, introductory material to the quantum field, and programming environments for two cloud-based platforms. We also propose to develop training material suitable for tutorials at major conferences/symposia across different fields as well as online courses for faculty, staff and students. As a means to gauge success, the suitability of the material will be thoroughly evaluated statistically via surveys at the end of educational units for both classes and tutorials.

SHF: Small: Retrospective and Prospective Studies of the Effects of Gender Bias in Software Engineering
Emerson Murphy-Hill
$498,461 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2021

Gender bias in the workplace has been documented in a variety of studies, and recent work in software engineering has likewise revealed how bias affects the tech industry. This project seeks to better understand the causes and effects of gender bias in software engineering, by studying the people and artifacts who practice it.

CSR:Medium:SmartChainDB - Enabling Smart Marketplaces With A Scalable Semantically-Enhanced Blockchain Platform
Kemafor Ogan ; Alessandra Scafuro ; Binil Starly
$499,773 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2021

This project seeks to develop a platform, SmartChainDB, for supporting Smart Marketplaces in trustless environments. Such marketplaces should enable efficient assessment of bids in response to service requests without a priori trust establishment between parties. In domains like Digital Manufacturing, job bid assessments are very time-consuming efforts that take on the order of months. The platform will be developed by extending a BlockChain database for managing trust, with transaction types necessary to support a protocol for service requests and response bids. Another key extension will be semantic-enablement of the BlockChain database. A proof-of-concept prototype in Smart Manufacturing will be developed using SmartChainDB.

SHF: SMALL: Effective and Equitable Technical Interviews in Software Engineering
Christopher Parnin
$300,000 by National Science Foundation (NSF)
10/ 1/2020 - 09/30/2023

Software engineering candidates commonly participate in high-pressure technical interviews, or whiteboard interviews. Critics have argued that these types of interviews unnecessarily stress and filter out otherwise qualified candidates, yet they remain a standard hiring practice. This project proposes a series of randomized controlled trials to understand how these practices influence performance of candidates, identify any bias or confounding factors in the process, and develop interventions to make problem-solving assessment more equitable and inclusive.

Collaborative Research: SaTC: TTP: Small: eSLIC: Enhanced Security Static Analysis for Detecting Insecure Configuration Scripts
Christopher Parnin
$199,978 by National Science Foundation (NSF)
10/ 1/2020 - 09/30/2023

Configuration scripts are used to manage system configurations and provision infrastructure at scale. Configuration scripts are susceptible to security weaknesses such as hard-coded passwords, which can facilitate large-scale data breaches, as well as provisioned systems being compromised. We propose an automated technique to identify security weaknesses so that configuration scripts do not cause large-scale security attacks and data breaches. We will build upon our recent research and construct eSLIC, which will overcome previous limitations of our initial prototype and facilitate widespread security static analysis of infrastructure. We will make eSLIC available to the OSS community and to practitioners in industry.

SHF: SMALL: DockerizeME: Automatic Inference and Repair of Computing Environments
Christopher Parnin
$345,875 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2021

Data scientists perform analysis that society increasingly relies on. As data and analysis grow in complexity, so do the computing environments required to run that analysis. A wide variety of tools have been created to help data scientists do their jobs, yet the computing environments used for these analyses are often difficult to reproduce and vary significantly from tool to tool. As a result, data scientists may waste time trying to share, reproduce, and scale these computing environments, instead of making their analysis more thorough and reliable. In this project, we describe how we can infer configurations from existing code snippets and computing environments and automatically create application containers for running and scaling the computations. Further, based on our infrastructure, we can provide the ability to repair code and code environments in order to perform automated maintenance of computing environments.

SHF: SMALL: DockerizeME: Automatic Inference and Repair of Computing Environments (Supplement)
Christopher Parnin
$3,000 by National Science Foundation (NSF)
10/ 1/2018 - 09/30/2021

CRII: SHF: Building Visibility into the Cognitive Processes of Software Engineers via Biosensors
Christopher Parnin
$159,662 by National Science Foundation (NSF)
02/ 1/2018 - 01/31/2020

Despite its vast capacity and associative powers, the human brain does not deal well with interruptions. Particularly in situations where information density is high, such as during a programming task, recovering from an interruption requires extensive time and effort. Although researchers recognize this problem, no programming tool takes into account the brain’s structure and limitations in its design. In this project, we measure cognitive load of programmers during different programming tasks. To measure cognitive load, we collect both biometrics and metrics collected from sensors and brain imaging devices. We apply our measures to applications in the software engineering domain: 1) Measuring cognitive load during technical interviews, and 2) Correlating complexity measures of code with higher measures of cognitive load.

REU Site: Science of Software
Christopher Parnin ; Emerson Murphy-Hill ; Sarah Heckman
$355,365 by National Science Foundation
01/ 1/2016 - 01/31/2020

There are not enough skilled data science researchers, particularly in software engineering. Hence, this REU Site in Science of Software (SOS) will engage undergraduates as data scientists studying exciting and meaningful SE research problems. Students work side-by-side with faculty mentors to gain experience in qualitative and quantitative research methods in SOS. Activities include dataset challenges, pair research, literature reviews, and presentations. Ultimately, each student works independently toward a published research result with their faculty mentors.

SHF: SMALL: Automated Discovery of Cross-Language Program Behavior Inconsistency
Christopher Parnin ; Kathryn Stolee
$499,994 by National Science Foundation (NSF)
08/ 1/2020 - 07/31/2023

This project advances the state of knowledge about how to infer misconceptions and generate explanations without any explicit models of a programming language. In contrast to existing approaches, which involve manual identification of misconceptions in programming languages, or cross-language migrations (which provide translations but no explanations), our technique automatically discovers inconsistencies across languages and supports automatic resolution for problematic translations.

Intelligent Support for Creative, Open-ended Programming Projects
Thomason Price ; Tiffany Barnes ; Christopher Martens
$749,920 by National Science Foundation (NSF)
08/ 1/2019 - 07/31/2022

We will develop new data-driven methods to support students automatically as they create novel, open-ended and creative computational artifacts. Specifically, we will develop techniques to adaptively scaffold project design and planning, detect students' programming goals, offer on-demand example-based support and tailor help to students' needs through an interactive help interface. We will augment the popular Snap programming environment, which is already used in hundreds of high school and college classrooms, with these features and evaluate their effectiveness in a series of experiments designed to explore how students approach open-ended tasks and how best to support them.

CRII: SaTC: Techniques for Measuring and Characterizing Robocalls
Bradley Reaves
$174,999 by National Science Foundation (NSF)
05/ 1/2019 - 04/30/2021

Robocalls are unwanted spam calls, and consumers are plagued by billions of calls annually. These calls have significantly degraded the usefulness of the global telephone network. The goal of this project is to develop techniques to measure the prevalence and characteristics of robocalls. This work will pave the way for a more trustworthy and useful phone network.

SFS: A Cybersecurity Educational Partnership for the Government Workforce
Douglas Reeves ; Sarah Heckman
$2,748,558 by National Science Foundation (NSF)
01/ 1/2020 - 12/31/2024

Educating the next generation of cybersecurity professionals is a critical need for the State of North Carolina and the United States. We are utilizing our expertise in cybersecurity research to prepare undergraduate and Master's computer science students at NC State for cybersecurity jobs. Scholarship for Service (SFS) will provide students from North Carolina and the United States, especially from underrepresented groups, the opportunity to receive a high-quality, cybersecurity-focused degree. SFS students will be part of a larger cohort of cybersecurity students who will participate in supplemental activities, events, and conferences as part of their educational experience.

Joint Faculty Appointment Between Oak Ridge National Laboratory (ORNL) and North Carolina State University (NCSU) for Dr. Xu Liu for 2020-2021
Gregg Rothermel
$50,208 by Oak Ridge National Laboratories - UT-Battelle LLC
09/17/2020 - 09/18/2021

Dr. Xu Liu JFA with ORNL.

CNS Core:Small:On Parallelizing Optical Network Design Problems:Towards Network Optimization as a Service
George Rouskas
$439,148 by National Science Foundation (NSF)
10/ 1/2019 - 09/30/2022

Planning, deploying, and engineering the networks that make up the Internet infrastructure involves complex problems that we will refer to generically as "network design" problems. Effective and efficient solutions to network design problems are crucial to the operation and economics of the Internet and its ability to support critical and reliable communication services. With this research project we aim to make contributions that will lead to new approaches for tackling network design problems in a scalable manner. In particular, we will develop parallel solutions that are applicable to a wide range of problems by exploiting a feature common to all, namely, that the optimization process incorporates both a routing aspect and a resource allocation aspect.

Graduate Industrial Trainee-ship-Genworth Mortgage Insurance Corporation-Risk July 2019 - May 2020
George Rouskas
$54,536 by Genworth Mortgage Insurance Corporation
07/ 1/2019 - 05/15/2020

NCSU through the Genworth GA will provide research and analysis to Genworth as set forth in this Agreement. Such research and analysis shall include, but is not limited to, research, generation, testing, and documentation of risk modeling software. Genworth GA will provide such services for Genworth's offices in Raleigh, North Carolina, at such times as have been mutually agreed upon by the parties.

Investigating the Role of Interest in Middle Grade Science with a Multimodal Affect-Sensitive Learning Environment
Jonathan Rowe
$414,761 by National Science Foundation (NSF)
07/15/2020 - 06/30/2023

The proposed project will see the design, development, and investigation of a multimodal affect-sensitive learning environment for generating student interest in middle school science. We will capture rich multi-channel data (eye gaze, facial expression, posture, interaction traces) on student problem solving with an inquiry learning environment. We will utilize multimodal machine learning to induce affect recognition models, which will drive run-time affect-sensitive interventions to trigger and sustain student interest. The project will culminate in a classroom experiment to evaluate the impact of the multimodal affect-sensitive learning environment on student learning and science interest.

SaTC: CORE: Small: Collaborative: A Broad Treatment of Privacy in Blockchains
Alessandra Scafuro
$249,922 by National Science Foundation (NSF)
09/ 1/2017 - 08/31/2020

A blockchain is a public, distributed, append-only database whose consistency is maintained by the combined work of users across the world rather than a single party, thus avoiding single points of failure and trust. The public nature of the blockchain, however, raises important privacy concerns. Existing work partially addressed privacy concerns for the restricted case of blockchains used for financial transactions. As blockchains are set to be used in a variety of contexts, the proposed work will initiate a broad treatment of privacy definitions and provide constructions achieving new privacy goals that can be implemented across different blockchains.

NeTS: Small: Fine-grained Measurement of Performance Metrics in the Internet of Things
Muhammad Shahzad
$449,999 by National Science Foundation
10/ 1/2016 - 09/30/2021

The PI proposes to develop a framework for passive and fine-grained measurements of the performance metrics in the Internet of Things, which include both Quality of Service metrics such as latency, loss, and throughput and Resource Utilization metrics such as power consumption, storage utilization, and radio-on time. Measurements of these performance metrics can be used reactively by network operators to perform tasks such as detecting and localizing offending flows that are responsible for causing delay bursts, throughput deterioration, or even power surges. These measurements can also be used proactively by network operators to locate and preemptively update any potential bottlenecks.

WiFi based Indoor Mapping and Human Discovery
Muhammad Shahzad
$329,335 by University of North Carolina System Office (formerly UNC - General Administration)
02/15/2019 - 02/14/2021

In this project, our objective is to use WiFi’s wireless channel metrics to generate an indoor map of any given building without entering it. In generating these maps, our secondary objective is to discover any humans that are present in the building, identify their locations, and determine which of them are stationary and which are mobile.

CSR:Small:Collaborative Research: Scalable Fine-Grained Cloud Monitoring for Empowering IoT
Muhammad Shahzad
$257,996 by National Science Foundation
09/15/2016 - 08/31/2020

Due to the rapid adoption of the cloud computing model, the size of data centers and the variety of cloud services are increasing at an unprecedented rate. As a result, fine-grained monitoring of the health and the usage of data center resources is becoming increasingly important and challenging. In this work, we address the problem of efficiently acquiring and transporting cloud management and monitoring data. For data acquisition, we address the crucial challenge of controlling data size. For data transportation, we focus on efficiently moving the data from the point it is collected inside the data center to the point it needs to be stored for processing.

Accurate Position Tracking Through Smart Fusion of Inertial Sensors with Ambient WiFi
Muhammad Shahzad
$150,000 by Sony
07/ 1/2019 - 06/30/2020

In this project, we propose to perform indoor position tracking using inertial sensors that are already built into most commodity handheld and wearable devices. We plan to improve the accuracy of the position tracking by leveraging the ambient WiFi signals already present in most modern buildings.

CRII: CSR: Pervasive Gesture Recognition Using Ambient Light
Muhammad Shahzad
$174,878 by National Science Foundation
05/ 1/2016 - 04/30/2020

The PI proposes to use ambient light for recognizing human gestures. The intuition behind the proposed approach is that as a user performs a gesture in a room that is lit with light, the amount of light that he/she reflects and blocks changes, resulting in a change in the intensity of light in all parts of the room. This change can be measured and the pattern of change in the intensity of light is different for different gestures. Leveraging this observation, the proposed approach first learns these patterns for different gestures and then recognizes the gestures in real-time.

CAREER: Algorithmic Challenges and Opportunities in Spatial Data Analysis
Donald Sheehy
$277,465 by National Science Foundation (NSF)
08/23/2019 - 01/31/2022

Spatial data takes many forms including configuration spaces of robots or proteins, collections of shapes, and physical models. These data sets often contain intrinsic, nonlinear, low-dimensional structure hidden in complex high-dimensional input representations. To uncover such structure one needs to adapt to local changes in scale, recognize multiscale structure, represent the intrinsic space underlying the data, compute with coarse approximate distances, and integrate heterogeneous data into meaningful distance functions. There is a need for algorithms and data structures that can search, represent, and summarize such data sets efficiently. The PI will develop new data structures, models of computation, sampling theories, and metrics for addressing these challenges.

HPC-FAIR: A Framework Managing Data and AI Models for Analyzing and Optimizing Scientific Applications
Xipeng Shen
$508,977 by US Dept. of Energy (DOE)
09/23/2020 - 09/22/2023

The overarching goal of this proposal is to develop a generic HPC data registration and retrieval framework (named HPC-FAIR) to make both training data and AI models of scientific applications findable, accessible, interoperable, and reusable. This framework is intended to significantly speed up the research and development of ML-based approaches for analyzing and optimizing scientific applications running on heterogeneous supercomputers. The datasets and AI models from HPC-FAIR will also serve as common baselines to quickly, consistently, and fairly evaluate new AI models for quality, complexity, and overhead.

CSR:Small:Supporting Position Independence and Reusability of Data on Byte-Addressable Non-Volatile Memory
Xipeng Shen
$499,998 by National Science Foundation (NSF)
08/16/2017 - 07/31/2022

Byte-Addressable Non-Volatile Memory (NVM) is the upcoming next generation of memory with tremendous potential benefits. This proposal is about offering programming system-level support of persistency on NVM. Particularly, it focuses on effective support of the usage of dynamic data structures on NVM.

Collaborative Research: PPoSS: Planning: Scaling Secure Serverless Computing on Heterogeneous Datacenters
Xipeng Shen
$76,611 by National Science Foundation (NSF)
10/ 1/2020 - 09/30/2021

The project will try to explore innovative approaches to enhance the scalability, security, and performance of serverless computing on datacenters that are equipped with heterogeneous computing units and memory.

XPlacer: Extensible and Portable Optimization of Data Placement in Memory
Xipeng Shen
$235,398 by Lawrence Livermore National Laboratory
01/29/2018 - 09/30/2020

Modern supercomputers with heterogeneous components (e.g., GPUs) feature complex memory systems to meet the ever growing demands for data by processors. Putting data into the proper part of a memory system is essential for program performance, but is difficult to do. To address this challenge, we propose a new paradigm featuring three interacting components: 1) an extensible memory specification language to describe memory properties, 2) a compiler for analyzing data access patterns and transforming code for runtime adaptation, and 3) a data placement runtime to find and materialize the best data placements on the fly. The result will be a software framework (named XPlacer) that transforms OpenMP code to automatically place its data in memory in a way best suiting the GPU architecture, inputs, and program phases.

SHF: Small: Improving Memory Performance on Fused Architectures through Compiler and Runtime Innovations
Xipeng Shen ; Frank Mueller
$470,000 by National Science Foundation
08/ 1/2015 - 07/31/2020

Contemporary architectures are adopting an integrated design of conventional CPUs with accelerators on the same die with access to the same memory, albeit with different coherence models. Examples include AMD's Fusion architecture, Intel's integrated main-stream CPU/GPU product line, and NVIDIA Tegra's integrated graphics processor family. Integrated GPUs feature shared caches and a common memory interconnect with multicore CPUs, which intensify resource contention in the memory hierarchy. This creates new challenges for data locality, task partitioning and scheduling, as well as program transformations. Most significantly, a program running on GPU warps and CPU cores may adversely affect performance and power of one another. The objective of this work is to understand these novel implications of fused architectures by studying their effects, qualifying their causes and quantifying the impacts on performance and energy efficiency. We propose to advance the state-of-the-art by creating spheres of isolation between CPU and GPU execution via novel systems mechanisms and compiler transformations that reduce cross-boundary contention with respect to shared hardware resources. This synergy between systems and compiler techniques has the potential to significantly improve performance and power guarantees for co-scheduling program fragments on fused architectures. Impact: The proposed work, if successful, has the potential to transform resource allocation and scheduling at the systems level and compiler optimizations at the program level to create a synergistic development environment with significant performance and power improvements and vastly increased isolation suitable for synergistic co-deployment of programs crossing boundaries on innovative fused architectures.

Foureye: Cyber Defensive Deception based on Hypergame Theory for Tactical Networks
Munindar Singh
$95,000 by Virginia Polytechnic Institute and State University (aka Virginia Tech)
12/ 1/2020 - 04/30/2023

This project investigates a form of active cyberdefense based on defensive deception against an attacker. It applies a form of game theory called hypergame theory that enables a natural representation of situations where an attacker and a defender can be understood as playing different games. This project will develop computational models of hypergames that reflect cyber attack and defense strategies to support the investigation of tradeoffs such as between defense effectiveness and cost. If successful, the project will yield representations and algorithms that defenders could apply to disrupt an attacker's beliefs and thus cause attacks to fail.

RI: Small: Principles of Normative Multiagent Systems for Decentralized Applications
Munindar Singh
$450,000 by National Science Foundation (NSF)
10/ 1/2019 - 09/30/2022

This project will investigate theoretical models and programming techniques for decentralized applications. Emerging technologies show great potential in helping bring about a new era of automated contracts that enable flexible transactions among independent parties, with applications in finance, healthcare, and pharmaceuticals, among other domains. This project will go beyond current approaches by providing a new declarative model that is able to handle the challenging computing-related aspects of real-life contracts. This model is accompanied by techniques that provide guidance on how to specify and enact contracts in a manner that is precise, flexible, and eliminates unnecessary information sharing.

Realizing Cyber Inception: Toward a Science of Personalized Deception for Cyber Defense
Munindar Singh
$375,360 by University of Southern California via US Army Research Office
09/ 1/2017 - 12/31/2021

Frequent security breaches have highlighted both the growing importance of cybersecurity and weaknesses of traditional methods such as firewalls, malware detection, intrusion detection, and prevention technologies. To leap ahead of attackers, we must move beyond passive defense strategies toward a new science of interactive personalized deception for cyberdefense. Our proposed approach involves (1) building models of attackers and their propensities and (2) characterizing computers, networks, users, and their relationships and interactions so as to enable realistic deception. We will develop a modular framework for evaluation of the key deception techniques consisting of a pluggable game-based scaffolding.

LAS DO1 Singh - 2.1 Human-Machine Collaboration
Munindar Singh
$210,406 by Laboratory for Analytic Sciences
01/17/2019 - 12/31/2020

LAS DO1 Singh - 2.1 Human-Machine Collaboration - Workflows

Data Access and Governance over Shared Ledger, CHMPR Core Project
Munindar Singh
$37,149 by Center of Hybrid Multicore Productivity Research (CHMPR) - NCSU Research Site
01/ 1/2020 - 06/30/2020

How can we help organizations share access to their data with their business partners and users effectively (providing precise controls, including the ability to terminate access as needed) and efficiently (with low administrative overhead)? Specifically, how can organizations negotiate and “prosecute” (i.e., enact, monitor, enforce) business contracts in real time to support real-time analytics? This project investigates an approach that combines a logic-based treatment of legal norms with a blockchain architecture for a shared ledger. RQ1. How can we express sharing policies formally, including policies referencing other policies? RQ2. How can we represent the requisite information in an information architecture that combines an immutable ledger with mutable information stores? RQ3. How can we reason with sharing policies in our information architecture, using smart contracts on the blockchain to verify compliance and guide the reasoning of the off-blockchain agents?

CAREER: On the Foundations of Semantic Code Search
Kathryn Stolee
$500,000 by National Science Foundation (NSF)
08/ 1/2018 - 07/31/2023

Semantic code search uses behavioral specifications, such as input/output examples, to identify code in a repository that matches the specification. Challenges include handling scenarios when 1) there are too few solutions, 2) it is difficult to understand how solutions differ, and 3) there are too many solutions. I propose techniques to 1) expand the scope of code that can be modeled and find approximate solutions when an exact one does not exist, 2) determine the differences between two code fragments, and 3) navigate a large space of possible solutions by selecting inputs that maximally divide the solution space.
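
The idea of selecting inputs that maximally divide the solution space can be illustrated with a minimal sketch. It is not the project's method: the scoring rule (prefer the input whose largest group of agreeing candidates is smallest), the function name `best_splitting_input`, and the candidate functions are all assumptions made for illustration.

```python
from collections import Counter

def best_splitting_input(candidates, inputs):
    """Return the input whose outputs split the candidates most evenly.

    An input is scored by the size of the largest group of candidates
    that agree on its output; a smaller score means a finer split.
    (Assumed scoring rule, chosen only for illustration.)
    """
    def largest_group(x):
        outputs = Counter(repr(f(x)) for f in candidates)
        return max(outputs.values())
    return min(inputs, key=largest_group)

# Hypothetical candidate solutions, as a code search might return them.
candidates = [
    lambda s: s.upper(),
    lambda s: s.capitalize(),
    lambda s: s.title(),
    lambda s: s.swapcase(),
]
inputs = ["a", "ab", "a b", "Ab c"]

# "a" cannot tell any candidates apart, while "Ab c" yields four distinct outputs.
chosen = best_splitting_input(candidates, inputs)
print(chosen)
```

Asking a user for the expected output on the chosen input would then eliminate as many candidates as possible in one step.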

SHF: Small: Supporting Regular Expression Testing, Search, Repair, Comprehension, and Maintenance
Kathryn Stolee
$499,996 by National Science Foundation (NSF)
08/15/2017 - 07/31/2021

Regular expressions (regexes) are responsible for numerous faults in many software products, and yet, static bug finders and automated program repair techniques generally ignore this common language feature. First, I propose to explore and characterize regex-related bugs in bug repositories. From there, I propose to develop approaches for detecting regex-related bugs using static analysis and patching regex-related bugs using automated program repair. The proposed detection and patching techniques both depend on similarity analysis of regexes. The expected research outcomes include a publicly available data set of regex-related faults, new regex-related bug patterns for static bug finders like FindBugs and PMD, and in the best case, an open source tool for automated patch generation for regular expressions.
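
The abstract does not say how regex similarity is measured. As one possible reading, the sketch below compares two regexes by the sets of strings they fully match in a sample corpus (Jaccard similarity). The function names and the corpus are hypothetical, and the project may use an entirely different notion of similarity.

```python
import re

def match_set(pattern, corpus):
    """Return the subset of corpus strings fully matched by pattern."""
    compiled = re.compile(pattern)
    return {s for s in corpus if compiled.fullmatch(s)}

def behavioral_similarity(p1, p2, corpus):
    """Jaccard similarity of the two patterns' match sets over corpus."""
    a, b = match_set(p1, corpus), match_set(p2, corpus)
    if not a and not b:
        return 1.0  # both match nothing: indistinguishable on this corpus
    return len(a & b) / len(a | b)

# Hypothetical sample corpus; a real study would need a far larger one.
corpus = ["7", "42", "007", "4a", "", "-3"]

sim_same = behavioral_similarity(r"\d+", r"[0-9]+", corpus)  # same behavior here
sim_diff = behavioral_similarity(r"\d+", r"\d*", corpus)     # differ on ""
print(sim_same, sim_diff)
```

A score below 1.0 points to strings on which the two patterns disagree, which is the kind of evidence a candidate patch could be checked against.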

SHF: Medium: Collaborative Research: Semi and Fully Automated Program Repair and Synthesis via Semantic Code Search
Kathryn Stolee
$387,661 by National Science Foundation (NSF)
07/ 1/2016 - 06/30/2021

Software plays an integral role in our society. However, software bugs are common, routinely cause security breaches, and cost our economy billions of dollars annually. The software industry struggles to overcome this challenge: Software is so inherently complex, and mistakes so common, that new bugs are typically reported faster than developers can fix them. Recent research has demonstrated the potential of automated program repair techniques to address this challenge. However, these techniques often produce low-quality repairs that break existing functionality. In this research, we develop new techniques to fix bugs and implement new features automatically, producing high-quality code.

Near Real-time Analytics on Embedded Edge Computing Devices, CARTA Core Project
Ranga Vatsavai
$60,000 by Center for Accelerated Real Time Analytics (CARTA) - NCSU Research Site
07/ 1/2019 - 06/30/2020

In many real-world applications, data loses its value if it is not analyzed in near real time. Examples include natural disasters, crop disease identification and bioterrorism, traffic monitoring, and monitoring of human activities and public places. Edge computing refers to pushing computing power to the edge of the network, that is, bringing it closer to the sensors. We envision that embedded supercomputers (e.g., Jetson TX1 and TX2; 1 Teraflop; ~10 Watts) allow computing at the edge (e.g., on UAVs). This framework would then allow near real-time analytics on streaming data, which is critical for users ranging from first responders to national security agencies, and would compress or reduce data before it is transmitted to the cloud or data centers. In this project, we propose to develop novel machine learning algorithms that run on the embedded supercomputers while the data is still in device memory and to demonstrate the technology in two real-world applications: crop monitoring and traffic monitoring. The proposed technical work involves the following three key stages: (i) generate a statistical model from historical data (e.g., spectral signatures of different crops) by using a statistically principled mixture model (e.g., a Gaussian Mixture Model (GMM)); (ii) as the data is being acquired, compare new (streaming) data with the GMM to identify any anomalous patterns (e.g., weeds); and (iii) generate an event signal about the anomaly before the data is compressed and transferred out of device memory.
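
The three stages can be sketched in miniature as follows. This is an illustration under strong assumptions, not the project's algorithm: it fits a one-dimensional mixture with plain EM on the CPU, uses synthetic values in place of spectral signatures, and flags a reading when its log-likelihood falls below the 1st percentile of the historical data. The threshold rule and all names are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, k, iterations=50):
    """Stage (i): fit a k-component 1-D Gaussian mixture with plain EM."""
    data = sorted(data)
    n = len(data)
    # Initialise means at evenly spaced quantiles, with a shared variance.
    means = [data[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall = sum(data) / n
    variances = [sum((x - overall) ** 2 for x in data) / n] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data))
            variances[j] = max(spread / nj, 1e-6)
    return weights, means, variances

def log_likelihood(x, model):
    weights, means, variances = model
    density = sum(w * normal_pdf(x, m, v)
                  for w, m, v in zip(weights, means, variances))
    return math.log(max(density, 1e-300))

# Synthetic "historical" readings for two crop types (made-up values).
rng = random.Random(0)
history = ([rng.gauss(0.2, 0.02) for _ in range(200)]
           + [rng.gauss(0.6, 0.03) for _ in range(200)])
model = fit_gmm_1d(history, k=2)

# Assumed rule: anything less likely than the 1st percentile of history is anomalous.
scores = sorted(log_likelihood(x, model) for x in history)
threshold = scores[len(scores) // 100]

# Stages (ii) and (iii): score streaming readings and emit an event per anomaly.
stream = [0.21, 0.59, 0.40]
anomalies = [x for x in stream if log_likelihood(x, model) < threshold]
print(anomalies)
```

On an embedded GPU the same scoring step would run over multi-band pixels while they are still in device memory; the sketch only shows the statistical logic.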

CHS: Small: Adaptive Rendering and Display for Emerging Immersive Experiences
Ben Watson
$497,177 by National Science Foundation (NSF)
10/ 1/2020 - 09/30/2023

This project will develop adaptive rendering and display technologies supporting emerging immersive displays, including wall-spanning, glasses-free stereo windows and lightweight AR/VR glasses. Such displays demand high-bandwidth, low-latency input not available today. We will attack this problem with software and hardware that exploit perceptual asymmetries and spatiotemporal redundancies. The resulting immersive displays will help realize nascent applications such as immersive entertainment and simulation, socially engaging conferencing and first-person wayfinding.

LAS DO1 Watson - 2.1 Human-Machine Collaboration
Ben Watson
$175,581 by Laboratory for Analytic Sciences
01/17/2019 - 12/31/2020

LAS DO1 Option Year Williams - 3.0 Machine Learning Integrity
Laurie Williams
$116,075 by Laboratory for Analytic Sciences
01/17/2019 - 12/31/2020

The goal of the projects examining Machine Learning Integrity is to identify and address issues that impact the timeliness, objectivity, reliability, explainability, and quality of machine learning approaches. These issues may arise from the specifics of an application, from a lack of data, from computational or policy constraints, or from adversarial actions.

SHF: Small: Detecting the 1%: Growing the Science of Vulnerability Detection
Laurie Williams ; Timothy Menzies
$499,998 by National Science Foundation (NSF)
10/ 1/2019 - 09/30/2022

Software practitioners need methods to prioritize security verification efforts through the development of practical vulnerability prediction models. The PIs of this project have conducted extensive research on software analytics and vulnerability prediction algorithms. Based on that work, we can assert that vulnerability predictors usually use old data mining technology, some of which dates back several decades. This proposal will explore numerous better ways to build vulnerability predictors.

Science of Security Lablet: Impact through Research, Scientific Methods, and Community Development
Laurie Williams ; Munindar Singh
$467,750 by US Dept. of Defense (DOD)
04/ 4/2018 - 09/14/2022

This project proposes the continuation of the Science of Security Lablet at NC State University. Science of Security refers to the study of cybersecurity from an explicitly scientific perspective. Cybersecurity encompasses elements of technology, human behavior, and policy. Science of Security seeks to identify and apply the appropriate scientific principles to cybersecurity problems, enhancing rigor and reproducibility and thereby improving the transfer of research to practice. This Lablet provides a home for investigations into diverse topics pertaining to a Science of Security. The Lablet will support the three major elements of a Science of Security: research, scientific methods, and community engagement.

Science of Security Lablet: Impact through Research, Scientific Methods, and Community Development - Additional Funding
Laurie Williams ; Munindar Singh
$537,539 by National Security Agency
04/ 4/2018 - 09/14/2022

Science of Security Lablet: Impact through Research, Scientific Methods, and Community Development - Additional Funding
Laurie Williams ; Munindar Singh
$356,447 by National Security Agency
04/ 4/2018 - 09/14/2022

FARM BILL: NRI: INT: Towards the Development of a Customizable Fleet of Autonomous Co-Robots for Advancing Aquaculture Production
Sierra Young ; Steven Hall ; John-Paul Ore ; Celso Castro Bolinaga ; Natalie Nelson
$499,245 by US Dept. of Agriculture (USDA) - National Institute of Food and Agriculture
11/ 1/2020 - 10/31/2024

Aquaculture, the rearing and harvesting of organisms in water environments, is a rapidly expanding industry that now produces more seafood than all wild caught fisheries worldwide. This inevitable growth must be steered towards sustainable production practices, which requires intensive monitoring in areas that are difficult and potentially dangerous to access. The vision of this project is to improve the efficiency and sustainability of near-shore aquaculture production through integrating a flexible, customizable, multi-task vehicle fleet, consisting primarily of unmanned aerial vehicles (UAVs) and unmanned surface vehicles (USVs), with a biologically-relevant framework for accelerated prototyping. This project will use oyster production along the Eastern US shoreline as a case study and testbed.

Collaborative Research: CNS Core: Small: Robust Resource Planning and Orchestration to Satisfy End-to-End SLA Requirements in Mobile Edge Networks.
Ruozhou Yu
$142,500 by National Science Foundation (NSF)
10/ 1/2020 - 09/30/2023

The potential of modern real-time applications, while enabled by advances in wireless communication technologies, is limited by the poor and unpredictable performance of the cloud backend as an Internet-based service. Edge computing is believed to be the magic bullet for this problem, but after years of research, we have yet to witness the first large-scale deployment and utilization of edge computing. We believe the barrier is the lack of SLA-based performance guarantees, due to the inevitable risk of SLA violation. This project aims to take the first step in modeling and optimizing SLA violation risks in mobile edge computing.