This page contains a description of the funded research projects for which David Whalley is a principal or co-principal investigator. The publications listed for each project are only those that are directly related to the project's goals. As described below, Dr. Whalley's research spans compiler optimizations, computer architecture, embedded systems, compilation tools, security, and worst-case execution time analysis. His recent research has focused on making processors more energy efficient and on techniques for improving instruction-level parallelism.
This project develops a Statically Controlled Asynchronous Lane Execution (SCALE) approach that supports separate asynchronous execution lanes, where dependencies between instructions in different lanes are statically identified by the compiler to provide inter-lane synchronization. As implied by its name, the SCALE approach can scale to different types and levels of parallelism by allowing the compiler to generate code for different modes of execution, adapting to the type of parallelism available at each point within an application.
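The static inter-lane synchronization idea can be illustrated with a minimal sketch: two "lanes" run independently, and where the compiler has identified a cross-lane dependence it inserts an explicit send/receive pair rather than relying on dynamic dependence checking. The lane bodies and channel here are invented for illustration, not taken from the SCALE design.

```python
from queue import Queue
from threading import Thread

# Hypothetical two-lane program: the "compiler" has statically determined
# that lane 1 consumes a value produced in lane 0, so it inserts an
# explicit send (put) in lane 0 and a matching receive (get) in lane 1.
def lane0(chan, result):
    x = 2 * 3            # independent work in lane 0
    chan.put(x)          # statically scheduled send to lane 1
    result["lane0"] = x

def lane1(chan, result):
    y = 10 + 5           # independent work in lane 1, overlaps with lane 0
    x = chan.get()       # statically scheduled receive from lane 0
    result["lane1"] = x + y

result = {}
chan = Queue()
threads = [Thread(target=lane0, args=(chan, result)),
           Thread(target=lane1, args=(chan, result))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(result)   # lane0 produces 6; lane1 combines it with its own work: 21
```

Because the receive blocks until the matching send occurs, the result is deterministic even though the lanes otherwise execute asynchronously.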
Instruction-level parallelism (ILP) in computing allows different machine-level instructions within an application to execute in parallel within a microprocessor. Exploitation of ILP has provided significant performance benefits in computing, but there has been little improvement in ILP in recent years. This project proposes a new approach called "eager execution" that could significantly increase ILP. The approach offers the following advantages: (1) immediately dependent consumer instructions can be more quickly delivered to functional units for execution; (2) an instruction whose source register values have not changed since its last execution can be detected and its redundant computation avoided; (3) the dependency between a producer/consumer pair of instructions can sometimes be collapsed so that the pair can be simultaneously dispatched for execution; (4) consumer instructions from multiple paths may be speculatively executed, and their results are naturally retained in the paradigm to avoid re-execution after a branch misprediction; and (5) critical instructions, including loads that prefetch cache lines and pre-computations of branch results that avoid branch misprediction delays, can be eagerly executed to improve performance.
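Advantage (2), avoiding redundant computation, can be sketched as a small reuse table: if an instruction's source operand values are unchanged since its last execution, the cached result is returned instead of recomputing. The table structure and operation set below are illustrative assumptions, not the project's actual mechanism.

```python
# Toy model of value reuse: results are cached by (opcode, operand
# values), so a re-executed instruction with unchanged sources skips
# the functional unit entirely.
class ReuseTable:
    def __init__(self):
        self.cache = {}   # (op, a, b) -> previously computed result
        self.hits = 0     # count of redundant computations avoided

    def execute(self, op, a, b):
        key = (op, a, b)
        if key in self.cache:
            self.hits += 1          # sources unchanged: reuse the result
            return self.cache[key]
        result = {"add": a + b, "mul": a * b}[op]
        self.cache[key] = result
        return result

rt = ReuseTable()
r1 = rt.execute("add", 3, 4)   # computed: 7
r2 = rt.execute("add", 3, 4)   # reused: 7, no recomputation
```

A real design would bound the table and invalidate entries when source registers are rewritten; this sketch only shows the reuse check itself.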
Existing program execution models based on the von Neumann execution paradigm are not scalable, mainly due to the difficulty of scaling the communication, synchronization, and naming aspects of the model. This project aims to develop an alternative execution model, namely, demand-driven execution of imperative programs. In our model, a program that is compiled from a contemporary language is executed in a demand-driven fashion in such a way that both instruction and data-level parallelism can be harvested through compiler-architecture collaboration. Key to this collaboration is our single-assignment form representation of programs, where the form not only provides an appropriate program representation to be used by an optimizing compiler, but also represents the instruction-set architecture of the machine. This is a collaborative project with Soner Onder at Michigan Technological University, sponsored by NSF from 08/15 to 07/19.
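The key property of a single-assignment form is that every value has exactly one producer, which is what lets the compiler and architecture name values consistently. A minimal sketch of the renaming step for straight-line code (the project's actual form also handles control flow, which this sketch does not):

```python
# Rename each assignment to a fresh version of its destination variable,
# and rewrite uses to refer to the current version, so every value has
# exactly one producing instruction.
def to_single_assignment(stmts):
    version = {}   # variable -> current version number
    out = []
    for dest, srcs in stmts:   # each statement: dest = f(srcs)
        renamed = [f"{v}{version[v]}" if v in version else v for v in srcs]
        version[dest] = version.get(dest, 0) + 1
        out.append((f"{dest}{version[dest]}", renamed))
    return out

prog = [("x", ["a", "b"]),   # x = a op b
        ("x", ["x", "c"]),   # x = x op c  (redefines x)
        ("y", ["x", "x"])]   # y = x op x
print(to_single_assignment(prog))
# [('x1', ['a', 'b']), ('x2', ['x1', 'c']), ('y1', ['x2', 'x2'])]
```

After renaming, the second definition of `x` no longer overwrites the first, so a demand for `y1` can be traced unambiguously back through `x2` to its producers.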
This project will enable Florida State University (FSU) students to visit Chalmers University of Technology (CTH) to conduct research on the development of efficient and secure mobile systems. Over the three years of this project, 15 students will visit CTH, which is in Gothenburg, Sweden, for a period of 10 weeks during May, June, and July. While in residence, the students will work closely with the faculty and students in the research groups of Professors Per Larsson-Edefors, Sally A. McKee, Alejandro Russo, and Per Stenstrom. This is a collaborative project with Gary Tyson and Andy Wang and is sponsored by NSF from 09/14 to 08/17.
As mobile embedded systems become more prevalent, there is an increasing demand to make processor pipelines more power efficient. Conventional pipeline inefficiencies include unnecessary accesses to the register file due to duplication, avoidable computation from constantly checking for forwarding and hazards at points where they cannot possibly occur, repeated calculation of invariant values, and so on. It is desirable to develop an alternative processor design that can avoid these wasteful energy consumption aspects of a traditionally pipelined processor while still achieving comparable performance. A statically pipelined processor is expected to achieve these goals by having the control during each cycle for each portion of the processor explicitly represented in each instruction. The pipelining is in effect statically determined by the compiler, which has several potential benefits, such as reducing energy consumption without degrading performance, supporting a less complex design with a lower production cost, and being able to apply more effective compiler optimizations due to instructions having more explicit control of the processor.  This is a collaborative project with Gary Tyson at Florida State University, sponsored by NSF from 05/10 to 04/14 with an extension year to 04/15.
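The idea of each instruction explicitly encoding the control for every datapath structure can be sketched with a toy interpreter. The field names and two-instruction program below are invented for illustration; the point is that the interpreter performs no dynamic forwarding or hazard checks, because each instruction says exactly which values to route where.

```python
# Toy statically pipelined datapath: each instruction explicitly encodes
# its register reads, ALU operation, operand routing, and write-back.
# No forwarding/hazard logic is modeled; the compiler encodes it all.
def run(program, regs):
    alu_out = 0                          # internal register, compiler-managed
    for insn in program:
        # Operand 1 comes from the ALU output only if the instruction says so,
        # avoiding a register file read entirely in that case.
        a = alu_out if insn.get("src1_fwd") else regs[insn["src1"]]
        b = regs[insn["src2"]]           # read performed only as encoded
        alu_out = a + b if insn["aluop"] == "add" else a - b
        if "dest" in insn:               # write-back only when explicitly encoded
            regs[insn["dest"]] = alu_out
    return regs

# The "compiler" routes the first result straight from the ALU output into
# the second add, skipping a register file read of r3.
prog = [{"aluop": "add", "src1": "r1", "src2": "r2", "dest": "r3"},
        {"aluop": "add", "src1_fwd": True, "src1": "r3", "src2": "r2", "dest": "r3"}]
print(run(prog, {"r1": 5, "r2": 3, "r3": 0}))   # {'r1': 5, 'r2': 3, 'r3': 11}
```

The energy saving comes from the reads and checks that never happen: the hardware does only what each instruction explicitly requests.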
Mobile computer systems and software are increasingly subject to a host of security threats and malicious software (malware) attacks due to vulnerabilities in their coding. Traditional approaches have sought to provide an absolute defense to specific malware attacks by patching software vulnerabilities or detecting and blocking malware. The current situation also represents a programmatic arms race between patching existing vulnerabilities and exploiting vulnerabilities in new application code. This research develops a new secure mobile computing environment, based on current mobile technology widely available in consumer end products, that uses program differentiation to reduce the propagation rate of malware when a software vulnerability exists. This results not in the direct elimination of security vulnerabilities, but in a dramatic reduction in the scope of any security exploit, preventing it from infecting large numbers of systems. By constraining the outbreak to only a few systems, countermeasures can be employed before significant economic damage can result. By modifying aspects of the execution of the application, application executables can be permuted into unique versions for each distributed instance. Differentiation is achieved using hardware and/or systems software modifications. This is a collaborative project with Gary Tyson at Florida State University, sponsored by NSF from 09/09 to 08/12.
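One simple form of differentiation is to derive a distinct layout for each distributed instance from an instance-specific seed, so an exploit hard-coded against one instance's layout is unlikely to work on another. The sketch below permutes a made-up symbol table rather than actual machine code, and the function names, sizes, and base address are illustrative assumptions.

```python
import random

# Toy program differentiation: each instance seed yields its own function
# layout, so addresses differ across distributed copies of the program.
def differentiate(functions, instance_seed):
    layout = list(functions)
    random.Random(instance_seed).shuffle(layout)  # per-instance permutation
    base = 0x1000
    table = {}
    for name, size in layout:
        table[name] = base   # assign this instance's address for the function
        base += size
    return table

funcs = [("auth", 64), ("parse", 128), ("log", 32)]
instance_a = differentiate(funcs, instance_seed=1)
instance_b = differentiate(funcs, instance_seed=2)
print(instance_a)
print(instance_b)
```

The same seed always reproduces the same layout, which matters in practice: the distributor must be able to regenerate any instance's version, e.g. for debugging or patching.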
An instruction register file (IRF) can be used to hold the most frequently occurring instructions within an application. The instructions in the IRF can be referenced by a single packed instruction in ROM or an L1 instruction cache (IC). Use of an IRF can decrease code size due to referencing multiple IRF instructions within a packed instruction, reduce energy consumption since a 32-entry IRF requires much less power to access than an L1 IC, and reduce execution time due to a smaller footprint in the IC and the ability to execute instructions during an IC miss. In this project we are evaluating compiler and architectural techniques to address limitations on the number of packed instructions and how an IRF can be used to complement other compiler optimizations and architectural features that save energy and reduce code size. This is a collaborative project with Gary Tyson at Florida State University, sponsored by NSF from 09/06 to 08/09.
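The packing step can be sketched as follows: the most frequent instructions are promoted into the IRF, and each run of consecutive IRF-resident instructions is replaced by one packed instruction holding their IRF indices. The parameters here (table size, five slots per packed instruction) are assumptions for illustration, not necessarily the project's configuration.

```python
from collections import Counter

# Sketch of IRF packing: promote the most frequent instructions into the
# IRF, then replace each run of up to `slots` consecutive IRF-resident
# instructions with a single packed instruction of IRF indices.
def pack(instructions, irf_size=32, slots=5):
    irf = [insn for insn, _ in Counter(instructions).most_common(irf_size)]
    index = {insn: n for n, insn in enumerate(irf)}
    packed, run = [], []
    for insn in instructions:
        if insn in index and len(run) < slots:
            run.append(index[insn])          # extend the current pack
        else:
            if run:                          # flush the finished pack
                packed.append(("packed", tuple(run)))
                run = []
            if insn in index:
                run.append(index[insn])      # start a new pack
            else:
                packed.append(insn)          # non-IRF instruction stays as-is
    if run:
        packed.append(("packed", tuple(run)))
    return irf, packed

irf, packed = pack(["add", "sub", "add", "mul", "sub", "add"], irf_size=2)
print(irf)      # ['add', 'sub']  -- the two most frequent instructions
print(packed)   # [('packed', (0, 1, 0)), 'mul', ('packed', (1, 0))]
```

Six original instructions become three fetched units, which is the source of both the code-size and the fetch-energy savings the paragraph describes.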
Predicting the WCET (worst-case execution time) of tasks is needed for real-time systems. One issue that limits the set of applications whose WCET can be predicted is determining the worst-case number of times that each loop will iterate. We are developing a timing analyzer that can predict the WCET of applications where the number of loop iterations is known only at run time. Rather than predicting a constant number of cycles, we will instead produce a WCET formula, which can be quickly evaluated at run time to support dynamic scheduling decisions. We have also integrated a timing analyzer with a compiler to support the development of compiler optimizations to decrease WCET. This is a collaborative project with Chris Healy at Furman University and Frank Mueller at North Carolina State University, sponsored by NSF from 09/03 to 08/06.
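A parametric WCET formula of the kind described is typically linear in the loop bound, so evaluating it at run time is cheap enough for a scheduler's admission test. The coefficients below (fixed entry/exit cost, per-iteration cost) are made-up values for illustration.

```python
# Parametric WCET sketch: the timing analyzer emits a formula in the loop
# bound n rather than a single constant, and the scheduler evaluates it
# once the bound becomes known at run time.
def wcet_cycles(n, entry_exit=120, per_iteration=45):
    # WCET(n) = fixed entry/exit cost + worst-case cost of n iterations
    return entry_exit + per_iteration * n

deadline_cycles = 5000
n = 100                                    # loop bound known only at run time
print(wcet_cycles(n))                      # 120 + 45*100 = 4620
print(wcet_cycles(n) <= deadline_cycles)   # True: the task can be admitted
```

The same formula also shows why a constant bound is inadequate here: without knowing `n`, the analyzer would have to assume the largest possible bound and reject tasks that are in fact schedulable.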
Conditional branches are expensive since they are frequently executed, consume cycles, and can be mispredicted, causing pipeline stalls. We are investigating how to reduce the number of executed branches by merging branch conditions. This is a collaborative project with Mark Bailey at Hamilton College and Robert van Engelen and Xin Yuan at Florida State University, sponsored by NSF from 09/02 to 08/05.
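One classic instance of condition merging, chosen here purely as an illustration of the general idea: two tests against zero can be combined into a single test on the bitwise OR of the operands, replacing two conditional branches with one.

```python
# Two branches: test a, then test b.
def both_zero_two_branches(a, b):
    if a == 0:           # branch 1
        if b == 0:       # branch 2
            return True
    return False

# One branch: a | b is zero exactly when both a and b are zero,
# so the two conditions merge into a single test.
def both_zero_merged(a, b):
    return (a | b) == 0

# The merged version is equivalent for all integers, including negatives.
for a in (0, 1, -3):
    for b in (0, 2, -7):
        assert both_zero_two_branches(a, b) == both_zero_merged(a, b)
print("merged condition matches on all sampled inputs")
```

The merged form executes one branch instead of two and exposes only a single prediction point to the branch predictor.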
VISTA (VPO Interactive System for Tuning Applications) allows embedded application developers to interact with a low-level compiler to tune their code. Features include visualizing the program representation, selecting the order and scope of compiler optimizations, entering code-improving transformations by hand, automatically selecting optimization phase sequences according to specified performance criteria, and undoing previously applied transformations. This is a collaborative project with Jack Davidson at University of Virginia, Doug Jones at University of Illinois at Urbana-Champaign, and Kyle Gallivan at Florida State University, sponsored by NSF from 10/00 to 09/05.
Embedded applications are often developed entirely or partially in assembly code. In this project we investigated techniques for validating both manually and automatically applied code-improving transformations. This was a collaborative project with Robert van Engelen and Xin Yuan at Florida State University, sponsored by NSF from 09/99 to 08/03.
This project involved the development of optimization techniques that can be used on large applications. This was a collaborative project with Rajiv Gupta at University of Arizona, Lori Pollock at University of Delaware, and Mary Lou Soffa at University of Pittsburgh, sponsored by NSF from 10/98 to 09/02.
Predicting the worst-case execution time (WCET) of tasks is required for real-time systems. This project involved the development of timing analysis techniques that could automatically predict the WCET of programs in the presence of modern architectural features such as caches and pipelines. This was a collaborative project with Marion Harmon at Florida A&M University, sponsored by ONR from 10/93 to 09/96.