This page describes the funded research projects for which David Whalley is a principal or co-principal investigator. The publications listed for each project are only those that are directly related to the project's goals. As described below, Dr. Whalley's research spans compiler optimizations, computer architecture, embedded systems, compilation tools, security, and worst-case execution time analysis. His recent research has focused on making processors more energy efficient and on techniques for improving instruction-level parallelism.


Sphinx: Combining Data and Instruction Level Parallelism through Demand Driven Execution of Imperative Programs

Existing program execution models based on the von Neumann execution paradigm do not scale well, mainly because of the difficulty of scaling the communication, synchronization, and naming aspects of the model. This project aims to develop an alternative execution model, namely, demand-driven execution of imperative programs. In our model, a program that is compiled from a contemporary language is executed in a demand-driven fashion in such a way that both instruction and data-level parallelism can be harvested through compiler-architecture collaboration. Key to this collaboration is our single-assignment form representation of programs, where the form not only provides an appropriate program representation to be used by an optimizing compiler, but also represents the instruction-set architecture of the machine. This is a collaborative project with Soner Onder at Michigan Technological University, sponsored by NSF from 08/15 to 07/19.
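
As a rough illustration of demand-driven execution over a single-assignment program, the sketch below computes a value only when something requests it. The dictionary encoding, the `demand` function, and the variable names are all hypothetical and are not the project's actual representation or instruction set.

```python
from operator import add, mul

# Hypothetical single-assignment program: every name is defined exactly once.
# Each entry maps a name to (operation, operand names); constants take no operands.
program = {
    "a": (lambda: 2, ()),
    "b": (lambda: 3, ()),
    "c": (add, ("a", "b")),
    "d": (mul, ("c", "c")),
    "unused": (mul, ("a", "a")),  # never demanded, so never executed
}

def demand(name, program, cache, trace):
    """Compute `name` only when requested, demanding its operands first."""
    if name not in cache:
        op, operands = program[name]
        args = [demand(x, program, cache, trace) for x in operands]
        cache[name] = op(*args)
        trace.append(name)  # record the order in which values were produced
    return cache[name]

cache, trace = {}, []
result = demand("d", program, cache, trace)
print(result, trace)
```

Because each name has a single definition, independent operands such as `a` and `b` have no ordering constraint between them, which is the kind of parallelism such a model could expose; this sketch evaluates them sequentially for simplicity.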


Supporting FSU Student Research with CTH Faculty on Efficient and Secure Mobile Systems

This project will enable Florida State University (FSU) students to visit Chalmers University of Technology (CTH) to conduct research on the development of efficient and secure mobile systems. Over the three years of this project, 15 students will visit CTH, which is in Gothenburg, Sweden, for a period of 10 weeks during May, June, and July. While in residence, the students will work closely with the faculty and students in the research groups of Professors Per Larsson-Edefors, Sally A. McKee, Alejandro Russo, and Per Stenstrom. This is a collaborative project with Gary Tyson and Andy Wang and is sponsored by NSF from 09/14 to 08/17.

  1. "Redesigning a Tagless Access Buffer That Requires Minimal ISA Changes" by C. Sanchez, P. Gavin, D. Moreau, M. Sjalander, D. Whalley, P. Larsson-Edefors, S. McKee in the Proceedings of the IEEE/ACM International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, October 2016.

Static Pipelining, an Approach for Ultra-Low Power Embedded Processors

As mobile embedded systems become more prevalent, there is an increasing demand to make processor pipelines more power efficient. Conventional pipeline inefficiencies include unnecessary accesses to the register file due to duplication or avoidable computation from constantly checking for forwarding and hazards at points where they cannot possibly occur, repeated calculation of invariant values, etc. It is desirable to develop an alternative processor design that can avoid these wasteful energy consumption aspects of a traditionally pipelined processor while still achieving comparable performance. A statically pipelined processor is expected to achieve these goals by having the control during each cycle for each portion of the processor explicitly represented in each instruction. The pipelining is in effect statically determined by the compiler, which has several potential benefits, such as reducing energy consumption without degrading performance, supporting a less complex design with a lower production cost, and being able to apply more effective compiler optimizations due to instructions having more explicit control of the processor. This is a collaborative project with Gary Tyson at Florida State University, sponsored by NSF from 05/10 to 04/14 with an extension year to 04/15.

  1. "Improving Low Power Processor Efficiency with Static Pipelining" by I. Finlayson, G. Uh, D. Whalley, G. Tyson in the Proceedings of the Workshop on Interaction between Compilers and Computer Architecture (INTERACT), February 2011.
  2. "An Overview of Static Pipelining" by I. Finlayson, G. Uh, D. Whalley, G. Tyson in IEEE Computer Architecture Letters (CAL), June 2012, pages 17-20.
  3. "Improving Processor Efficiency by Statically Pipelining Instructions" by I. Finlayson, B. Davis, P. Gavin, G. Uh, D. Whalley, M. Sjalander, G. Tyson in the Proceedings of the ACM Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), June 2013, pages 33-43.
  4. "Towards a Performance and Energy-Efficient Data Filter Cache" by A. Bardizbanyan, M. Sjalander, D. Whalley, and P. Larsson-Edefors in the Proceedings of the ACM Workshop on Optimizations for DSPs and Embedded Systems (ODES), February 2013, pages 21-28.
  5. "Improving Data Access Efficiency by Using a Tagless Access Buffer (TAB)" by A. Bardizbanyan, P. Gavin, D. Whalley, M. Sjalander, P. Larsson-Edefors, S. McKee, P. Stenstrom in the Proceedings of the ACM/IEEE International Symposium on Code Generation and Optimization (CGO), February 2013, pages 269-279.
  6. "Speculative Tag Access for Reduced Energy Dissipation in Set-Associative L1 Data Caches" by A. Bardizbanyan, M. Sjalander, D. Whalley, P. Larsson-Edefors in the Proceedings of the IEEE International Conference on Computer Design (ICCD), October 2013, pages 302-308.
  7. "Designing a Practical Data Filter Cache to Improve Both Energy Efficiency and Performance" by A. Bardizbanyan, M. Sjalander, D. Whalley, and P. Larsson-Edefors in ACM Transactions on Architecture and Code Optimization (TACO), vol 10, no 4, December 2013.
  8. "Reducing Instruction Fetch Energy in Multi-Issue Processors" by P. Gavin, D. Whalley, and M. Sjalander in ACM Transactions on Architecture and Code Optimization (TACO), vol 10, no 4, December 2013.
  9. "Reducing Set-Associative L1 Data Cache Energy by Early Load Data Dependence Detection (ELD3)" by A. Bardizbanyan, M. Sjalander, D. Whalley, P. Larsson-Edefors in the Proceedings of the IEEE/ACM Design Automation and Test in Europe (DATE) Conference, March 2014.
  10. "Optimizing Transfers of Control in the Static Pipeline Architecture" by R. Baird, P. Gavin, M. Sjalander, D. Whalley, G. Uh in the Proceedings of the ACM Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), June 2015.
  11. "Improving Data Access Efficiency by Using Context-Aware Loads and Stores" by A. Bardizbanyan, M. Sjalander, D. Whalley, P. Larsson-Edefors in the Proceedings of the ACM Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), June 2015.
  12. "Scheduling Instruction Effects for a Statically Pipelined Processor" by B. Davis, P. Gavin, R. Baird, M. Sjalander, I. Finlayson, F. Rasapour, G. Cook, G. Uh, D. Whalley, G. Tyson in the Proceedings of the IEEE/ACM International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), October 2015.

Reducing Virus Propagation in Mobile Devices

Mobile computer systems and software are increasingly subject to a host of security threats and malicious software (malware) attacks due to vulnerabilities in their coding. Traditional approaches have sought to provide an absolute defense to specific malware attacks by patching software vulnerabilities or detecting and blocking malware. The current situation also represents a programmatic arms race between patching existing vulnerabilities and exploiting vulnerabilities in new application code. This research develops a new secure mobile computing environment, based on current mobile technology widely available in consumer end products, that uses program differentiation to reduce the propagation rate of malware when a software vulnerability exists. This results not in the direct elimination of security vulnerabilities, but in a dramatic reduction in the ability of any security exploit to infect large numbers of systems. By constraining the outbreak to only a few systems, countermeasures can be employed before significant economic damage can result. By modifying aspects of the execution of the application, application executables can be permuted into unique versions for each distributed instance. Differentiation is achieved using hardware and/or systems software modifications. This is a collaborative project with Gary Tyson at Florida State University, sponsored by NSF from 09/09 to 08/12.
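
One way to picture permuting an executable into a unique version per instance is sketched below. It assumes, purely for illustration, that the differentiated aspect is the numbering of system calls; the functions `make_syscall_map`, `differentiate`, and `invert` and the seed values are hypothetical and do not describe the project's actual mechanism.

```python
import random

def make_syscall_map(device_seed, n_syscalls=8):
    """Return a per-device permutation of system call numbers (hypothetical scheme)."""
    rng = random.Random(device_seed)
    numbers = list(range(n_syscalls))
    rng.shuffle(numbers)
    return numbers  # numbers[canonical] is the device-specific number

def differentiate(program, syscall_map):
    """Rewrite the canonical syscall numbers of a program into one device's numbering."""
    return [syscall_map[n] for n in program]

def invert(syscall_map):
    """Build the table a device would use to map its own numbers back to canonical ones."""
    inverse = [0] * len(syscall_map)
    for canonical, device_number in enumerate(syscall_map):
        inverse[device_number] = canonical
    return inverse

canonical_program = [1, 4, 4, 2]  # canonical syscall numbers used by an application
device_a = make_syscall_map(device_seed=101)
device_b = make_syscall_map(device_seed=202)
print(differentiate(canonical_program, device_a))
print(differentiate(canonical_program, device_b))
```

Each device decodes its own version correctly through its inverse table, while injected code that hard-codes the numbering of one device would generally request the wrong services on a device with a different permutation.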

  1. "Program Differentiation" by D. Chang, S. Hines, P. West, G. Tyson, D. Whalley in the Proceedings of the Workshop on Interaction between Compilers and Computer Architecture (INTERACT), March 2010.

  2. "Program Differentiation" by D. Chang, S. Hines, P. West, G. Tyson, D. Whalley in the Journal of Circuits, Systems, and Computers, vol 21, no 2, April 2012.

Enhancing the Effectiveness of Utilizing an Instruction Register File

An instruction register file (IRF) can be used to hold the most frequently occurring instructions within an application. The instructions in the IRF can be referenced by a single packed instruction in ROM or an L1 instruction cache (IC). Use of an IRF can decrease code size due to referencing multiple IRF instructions within a packed instruction, reduce energy consumption since a 32-entry IRF requires much less power to access than an L1 IC, and reduce execution time due to a smaller footprint in the IC and the ability to execute instructions during an IC miss. In this project we are evaluating compiler and architectural techniques to address limitations on the number of packed instructions and to explore how an IRF can be used to complement other compiler optimizations and architectural features that save energy and reduce code size. This is a collaborative project with Gary Tyson at Florida State University, sponsored by NSF from 09/06 to 08/09.
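
The packing idea can be sketched as follows: pick the most frequent instructions for the IRF, then replace runs of IRF-resident instructions with a single packed instruction that holds their indices. The 32-entry size comes from the description above; the limit of five references per packed instruction, the greedy strategy, and the function names are assumptions made for this sketch.

```python
from collections import Counter

IRF_ENTRIES = 32        # IRF size taken from the description above
MAX_REFS_PER_PACK = 5   # hypothetical limit on IRF references per packed instruction

def select_irf(instructions, entries=IRF_ENTRIES):
    """Choose the most frequently occurring instructions to reside in the IRF."""
    return [ins for ins, _ in Counter(instructions).most_common(entries)]

def pack(instructions, irf, max_refs=MAX_REFS_PER_PACK):
    """Greedily replace runs of IRF-resident instructions with packed references."""
    index = {ins: i for i, ins in enumerate(irf)}
    packed, run = [], []
    for ins in instructions:
        if ins in index:
            run.append(index[ins])
            if len(run) == max_refs:      # packed instruction is full
                packed.append(("pack", tuple(run)))
                run = []
        else:
            if run:                       # close the current run first
                packed.append(("pack", tuple(run)))
                run = []
            packed.append(("plain", ins))
    if run:
        packed.append(("pack", tuple(run)))
    return packed

def expand(packed, irf):
    """Recover the original instruction stream, as the fetch logic would."""
    out = []
    for kind, payload in packed:
        out.extend(irf[i] for i in payload) if kind == "pack" else out.append(payload)
    return out

code = ["lw", "add", "sw", "add", "lw", "mul", "add", "lw"]
irf = select_irf(code, entries=2)         # tiny IRF to keep the example readable
packed = pack(code, irf)
print(len(code), "->", len(packed), "fetched instructions")
```

In this toy stream, eight instructions shrink to five fetched words, which mirrors the code size and fetch energy argument made above.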

  1. "Improving Program Efficiency by Packing Instructions into Registers" by S. Hines, J. Green, G. Tyson, D. Whalley in the Proceedings of the IEEE/ACM International Symposium on Computer Architecture (ISCA), June 2005, pages 260-271.

  2. "Improving the Energy and Execution Efficiency of a Small Instruction Cache by Using an Instruction Register File" by S. Hines, G. Tyson, D. Whalley, in the Proceedings of the Watson Conference on Interaction between Architecture, Circuits, and Compilers, September 2005, pages 160-169.

  3. "Reducing Instruction Fetch Cost by Packing Instructions into Register Windows" by S. Hines, G. Tyson, D. Whalley, in the Proceedings of the IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2005, pages 19-29.

  4. "Adapting Compilation Techniques to Enhance the Packing of Instructions into Registers" by S. Hines, D. Whalley, G. Tyson in the Proceedings of the International IEEE/ACM Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), October 2006, pages 43-53.

  5. "Addressing Instruction Fetch Bottlenecks by Using an Instruction Register File" by S. Hines, G. Tyson, D. Whalley in the Proceedings of the ACM Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), June 2007.

SPARTA: Static Parametric Timing Analysis to Support Dynamic Decisions in Embedded Systems

Predicting the WCET (worst-case execution time) of tasks is needed for real-time systems. One issue that limits the set of applications whose WCET can be predicted is determining the worst-case number of iterations of each loop. We are developing a timing analyzer that can predict the WCET of applications where the number of loop iterations is known only at run time. Rather than predicting a constant number of cycles, we will instead produce a WCET formula, which can be quickly evaluated at run time to support dynamic scheduling decisions. We have also integrated a timing analyzer with a compiler to support the development of compiler optimizations to decrease WCET. This is a collaborative project with Chris Healy at Furman University and Frank Mueller at North Carolina State University, sponsored by NSF from 09/03 to 08/06.
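
A WCET formula of this kind might look like the following minimal sketch, in which the bound for one loop is a closed-form function of its iteration count. The shape of the formula, the cycle counts, and the names `LoopWcetFormula` and `can_schedule` are illustrative assumptions, not output of the project's timing analyzer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LoopWcetFormula:
    """WCET bound for a loop as a closed-form function of its iteration count n."""
    setup_cycles: int        # cycles spent before entering the loop
    first_iter_cycles: int   # first iteration, e.g. with a cold instruction cache
    steady_iter_cycles: int  # every later iteration

    def evaluate(self, n: int) -> int:
        """Evaluate the bound in constant time once n is known at run time."""
        if n <= 0:
            return self.setup_cycles
        return (self.setup_cycles + self.first_iter_cycles
                + (n - 1) * self.steady_iter_cycles)

def can_schedule(formula: LoopWcetFormula, n: int, cycles_until_deadline: int) -> bool:
    """Dynamic scheduling decision: admit the task only if its bound meets the deadline."""
    return formula.evaluate(n) <= cycles_until_deadline

# Hypothetical cycle counts produced offline by a static analysis.
formula = LoopWcetFormula(setup_cycles=12, first_iter_cycles=40, steady_iter_cycles=25)
print(formula.evaluate(10), can_schedule(formula, 10, 300), can_schedule(formula, 12, 300))
```

The expensive analysis happens offline; at run time the scheduler only performs a few arithmetic operations per decision.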

  1. "Parametric Timing Analysis" by E. Vivancos, C. Healy, F. Mueller, and D. Whalley in the Proceedings of the ACM SIGPLAN Workshop on Language, Compilers, and Tools for Embedded Systems (LCTES), June 2001, pages 88-93.

  2. "Tuning the WCET of Embedded Applications" by W. Zhao, D. Whalley, C. Healy, F. Mueller, and G. Uh in the Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), May 2004, pages 472-481.

  3. "WCET Code Positioning" by W. Zhao, D. Whalley, C. Healy, and F. Mueller in the Proceedings of the IEEE Real-Time Systems Symposium (RTSS), December 2004.

  4. "Improving WCET by Optimizing Worst-Case Paths" by W. Zhao, W. Kreahling, D. Whalley, C. Healy, F. Mueller in the Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), March 2005, pages 138-147.

  5. "Timing Analysis for Sensor Network Nodes of the Atmega Processor Family" by S. Mohan, F. Mueller, D. Whalley, C. Healy in the Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), March 2005, pages 405-414.

  6. "Parascale: Exploiting Parametric Timing Analysis for Real-Time Schedulers and Dynamic Voltage Scaling" by S. Mohan, F. Mueller, C. Healy, D. Whalley, W. Hawkins, M. Root in the Proceedings of the IEEE Real-Time Systems Symposium (RTSS), December 2005, pages 233-242.

  7. "Improving WCET by Applying a WC Code Positioning Optimization" by W. Zhao, D. Whalley, C. Healy, F. Mueller, in ACM Transactions on Architecture and Code Optimization (TACO), December 2005, pages 335-365.

  8. "Improving WCET by Applying Worst-Case Path Optimizations" by W. Zhao, W. Kreahling, D. Whalley, C. Healy, F. Mueller in Real-Time Systems (RTS), October 2006, pages 129-152.

Branch Elimination by Condition Merging

Conditional branches are expensive since they are frequently executed, consume cycles, and can be mispredicted, which can cause pipeline stalls. We are investigating how to reduce the number of executed branches by merging branch conditions. This is a collaborative project with Mark Bailey at Hamilton College and Robert van Engelen and Xin Yuan at Florida State University, sponsored by NSF from 09/02 to 08/05.
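
A simple, well-known instance of merging conditions on multiple variables is testing two values against zero with one comparison. The sketch below is only meant to convey the flavor of the idea; it is not claimed to be the transformation developed in this project, and the function names are hypothetical.

```python
def both_zero_two_branches(a: int, b: int) -> bool:
    """Original form: two conditional branches, one per variable."""
    if a != 0:   # first conditional branch
        return False
    if b != 0:   # second conditional branch
        return False
    return True

def both_zero_merged(a: int, b: int) -> bool:
    """Merged form: a | b is zero exactly when every bit of both operands is zero."""
    return (a | b) == 0   # one test replaces two

# Check that the merged condition agrees with the original on a small range.
agree = all(
    both_zero_two_branches(a, b) == both_zero_merged(a, b)
    for a in range(-4, 5)
    for b in range(-4, 5)
)
print(agree)
```

Merging pays off when the common outcome is that both original branches fall through, since the extra bitwise operation is cheaper than a second branch that could be mispredicted.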

  1. "Branch Elimination via Multi-Variable Condition Merging" by W. Kreahling, D. Whalley, M. Bailey, X. Yuan, G. Uh, R. van Engelen in the Proceedings of the European Conference on Parallel and Distributed Computing, August 2003, pages 261-270.

  2. "Branch Elimination by Condition Merging" by W. Kreahling, D. Whalley, M. Bailey, X. Yuan, G. Uh, R. van Engelen in Software Practice & Experience (SP&E), January 2005, pages 51-74.

  3. "Reducing the Cost of Conditional Transfers of Control by Using Comparison Specifications" by W. Kreahling, S. Hines, D. Whalley, G. Tyson in the Proceedings of the ACM Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), June 2006, pages 64-71.

A Comprehensive, Retargetable Embedded Systems Software Development Environment

VISTA (VPO Interactive System for Tuning Applications) allows embedded application developers to interact with a low-level compiler to tune their code. Features include visualization of the program representation, selecting the order and scope of compiler optimizations, entering code-improving transformations by hand, automatic selection of optimization phase sequences according to specified performance criteria, and undoing of previously applied transformations. This is a collaborative project with Jack Davidson at University of Virginia, Doug Jones at University of Illinois at Urbana-Champaign, and Kyle Gallivan at Florida State University, sponsored by NSF from 10/00 to 09/05.
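
Automatic selection of a phase sequence can be pictured as a search over orderings of optimization phases, scored by a chosen criterion. The sketch below uses a toy three-instruction program, three toy phases, code size as the criterion, and a brute-force search; none of these correspond to VPO's actual phases or to the search algorithms used in VISTA.

```python
from itertools import product

# Toy IR: (dest, op, args); args are variable names (str) or integer constants.
PROGRAM = [
    ("x", "mov", (2,)),
    ("y", "add", ("x", 3)),
    ("result", "add", ("y", 1)),
]

def const_prop(code):
    """Replace uses of variables known to hold a constant with that constant."""
    known, out = {}, []
    for dest, op, args in code:
        args = tuple(known.get(a, a) if isinstance(a, str) else a for a in args)
        if op == "mov" and isinstance(args[0], int):
            known[dest] = args[0]
        out.append((dest, op, args))
    return out

def const_fold(code):
    """Evaluate additions whose operands are all constants."""
    return [
        (dest, "mov", (sum(args),))
        if op == "add" and all(isinstance(a, int) for a in args)
        else (dest, op, args)
        for dest, op, args in code
    ]

def dead_code(code):
    """Drop definitions that are never used; 'result' is the program output."""
    used = {a for _, _, args in code for a in args if isinstance(a, str)}
    return [ins for ins in code if ins[0] == "result" or ins[0] in used]

PHASES = {"prop": const_prop, "fold": const_fold, "dce": dead_code}

def apply_sequence(code, sequence):
    for name in sequence:
        code = PHASES[name](code)
    return code

def best_sequence(code, max_len):
    """Try every phase sequence up to max_len; keep the first one with the smallest code."""
    best = ((), code)
    for length in range(1, max_len + 1):
        for seq in product(PHASES, repeat=length):
            candidate = apply_sequence(code, seq)
            if len(candidate) < len(best[1]):
                best = (seq, candidate)
    return best

seq, tuned = best_sequence(PROGRAM, max_len=4)
print(seq, tuned)
```

Even in this toy setting a fixed order such as `prop`, `fold`, `dce` leaves two instructions, while the search finds a sequence that repeats a phase and leaves one, which is why the order and repetition of phases are worth exploring.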

  1. "VISTA: A System for Interactive Code Improvement" by W. Zhao, B. Cai, D. Whalley, M. Bailey, R. van Engelen, X. Yuan, J. Hiser, J. Davidson, K. Gallivan, and D. Jones in the Proceedings of the ACM SIGPLAN Conference on Language, Compilers, and Tools for Embedded Systems (LCTES), June 2002, pages 155-164.

  2. "Finding Effective Optimization Phase Sequences" by P. Kulkarni, W. Zhao, H. Moon, K. Cho, D. Whalley, J. Davidson, M. Bailey, Y. Paek, K. Gallivan, D. Jones in the Proceedings of the ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), June 2003, pages 12-23.

  3. "Tuning the WCET of Embedded Applications" by W. Zhao, P. Kulkarni, D. Whalley, C. Healy, F. Mueller, G. Uh in the Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), May 2004.

  4. "Fast Searches for Effective Optimization Phase Sequences" by P. Kulkarni, S. Hines, J. Hiser, D. Whalley, J. Davidson, D. Jones in the Proceedings of the ACM SIGPLAN Conference on Programming Languages Design & Implementation (PLDI), June 2004, pages 171-182.

  5. "Tuning High Performance Kernels through Empirical Compilation" by R. Whaley, D. Whalley in the Proceedings of the International Conference on Parallel Processing (ICPP), June 2005, pages 89-98.

  6. "Fast and Efficient Searches for Effective Optimization Phase Sequences" by P. Kulkarni, S. Hines, D. Whalley, J. Hiser, J. Davidson, D. Jones, in ACM Transactions on Architecture and Code Optimization (TACO), June 2005, pages 165-198.

  7. "Improving the Energy and Execution Efficiency of a Small Instruction Cache by Using an Instruction Register File" by S. Hines, G. Tyson, D. Whalley, in the Proceedings of the Watson Conference on Interaction between Architecture, Circuits, and Compilers, September 2005, pages 160-169.

  8. "Using De-optimization to Re-optimize Code" by S. Hines, P. Kulkarni, D. Whalley, J. Davidson, in the Proceedings of the Embedded Software (EMSOFT) Conference, September 2005.

  9. "Reducing Instruction Fetch Cost by Packing Instructions into Register Windows" by S. Hines, G. Tyson, D. Whalley, in the Proceedings of the International Symposium on Microarchitecture (MICRO), November 2005, pages 19-29.

  10. "Exhaustive Optimization Phase Order Search Exploration" by P. Kulkarni, D. Whalley, G. Tyson, J. Davidson in the Proceedings of the International Symposium on Code Generation and Optimization (CGO), March 2006.

  11. "On the Use of Compilers in DSP Laboratory Instruction" by M. Kleffner, D. Jones, J. Hiser, P. Kulkarni, J. Parent, S. Hines, D. Whalley, J. Davidson, K. Gallivan in the Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2006.

  12. "VISTA: VPO Interactive System for Tuning Applications" by P. Kulkarni, W. Zhao, D. Whalley, X. Yuan, R. van Engelen, K. Gallivan, J. Hiser, J. Davidson, B. Cai, M. Bailey, H. Moon, K. Cho, Y. Paek, D. Jones, in ACM Transactions on Embedded Computing Systems (TECS), November 2006, pages 819-863.

  13. "In Search of Near-Optimal Optimization Phase Orderings" by P. Kulkarni, D. Whalley, G. Tyson, J. Davidson in the Proceedings of the ACM Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), June 2006.

  14. "Evaluating Heuristic Optimization Phase Order Search Algorithms" by P. Kulkarni, D. Whalley, G. Tyson, J. Davidson in the Proceedings of the International Symposium on Code Generation and Optimization (CGO), March 2007.

Automatic Validation of Code-Improving Transformations and Related Applications

Embedded applications are often developed entirely or partially in assembly code. In this project we investigated techniques for validating both manual and automatic code-improving transformations. This was a collaborative project with Robert van Engelen and Xin Yuan at Florida State University, sponsored by NSF from 09/99 to 08/03.

  1. "Automatic Validation of Code-Improving Transformations" by R. van Engelen, D. Whalley, and X. Yuan in the Proceedings of the ACM SIGPLAN Workshop on Language, Compilers, and Tools for Embedded Systems (LCTES), June 2000.

  2. "Validation of Code-Improving Transformations for Embedded Systems" by R. van Engelen, D. Whalley, and X. Yuan, in the Proceedings of the ACM SIGAPP Symposium on Applied Computing, March 2003, pages 684-691.

  3. "Automatic Validation of Code-Improving Transformations on Low-Level Program Representations" by R. van Engelen, D. Whalley, and X. Yuan, in Science of Computer Programming, August 2004, vol 52, pages 257-280.

Experimental Evaluation of Scalable Optimization Techniques

This project involved the development of optimization techniques that can be used on large applications. This was a collaborative project with Rajiv Gupta at University of Arizona, Lori Pollock at University of Delaware, and Mary Lou Soffa at University of Pittsburgh, sponsored by NSF from 10/98 to 09/02.

  1. "Coalescing Conditional Branches into Efficient Indirect Jumps" by G. Uh and D. Whalley in the Proceedings of the Static Analysis Symposium (SAS), September 1997, pages 315-329.

  2. "Improving Performance by Branch Reordering" by M. Yang, G. Uh, and D. Whalley in the Proceedings of the SIGPLAN '98 Conference on Programming Language Design and Implementation (PLDI), June 1998, pages 130-141.

  3. "Effectively Exploiting Indirect Jumps" by G. Uh and D. Whalley, in Software Practice & Experience (SPE), December 1999, pages 1061-1101.

  4. "Efficient and Effective Branch Reordering Using Profile Data" by M. Yang, G. Uh, and D. Whalley, in ACM Transactions on Programming Languages and Systems (TOPLAS), vol 24, no 6, November 2002, pages 667-697.

Predicting Execution Time of Large Code Segments

Predicting the worst-case execution time (WCET) of tasks is required for real-time systems. This project involved the development of timing analysis techniques that could automatically predict the WCET of programs in the presence of modern architectural features such as caches and pipelines. This was a collaborative project with Marion Harmon at Florida A&M University, sponsored by ONR from 10/93 to 09/96.

  1. "Predicting Instruction Cache Behavior" by F. Mueller, D. B. Whalley, M. G. Harmon in the Proceedings of the ACM SIGPLAN Workshop on Language, Compiler, and Tool Support for Real-Time Systems (LCTRTS), June 1994.

  2. "Bounding Worst-Case Instruction Cache Performance" by R. D. Arnold, F. Mueller, D. B. Whalley, and M. G. Harmon in the Proceedings of the IEEE Real-Time Systems Symposium (RTSS), December 1994, pages 172-181.

  3. "Supporting User-Friendly Analysis of Timing Constraints" by L. Ko, D. B. Whalley, M. G. Harmon in the Proceedings of the ACM SIGPLAN Workshop on Language, Compilers, and Tools for Real-Time Systems (LCTRTS), June 1995, pages 107-115.

  4. "Integrating the Timing Analysis of Pipelining and Instruction Caching" by C. A. Healy, D. B. Whalley, and M. G. Harmon in the Proceedings of the IEEE Real-Time Systems Symposium (RTSS), December 1995, pages 288-297.

  5. "Supporting the Specification and Analysis of Timing Constraints" by L. Ko, C. Healy, E. Ratliff, R. Arnold, D. Whalley, and M. G. Harmon in the Proceedings of the IEEE Real-Time Technology and Applications Symposium (RTAS), June 1996, pages 170-178.

  6. "Timing Analysis for Data Caches and Set-Associative Caches" by R. White, F. Mueller, C. Healy, D. Whalley, and M. G. Harmon in the Proceedings of the IEEE Real-Time Technology and Applications Symposium (RTAS), June 1997, pages 192-202.

  7. "Bounding Loop Iterations for Timing Analysis" by C. Healy, M. Sjodin, V. Rustagi, and D. Whalley in the Proceedings of the IEEE Real-Time Technology and Applications Symposium (RTAS), June 1998, pages 12-21.

  8. "Bounding Pipeline and Instruction Cache Performance" by C. A. Healy, R. D. Arnold, F. Mueller, D. B. Whalley, and M. G. Harmon in IEEE Transactions on Computers (ToC), January 1999, pages 53-70.

  9. "Timing Constraint Specification and Analysis" by L. Ko, N. Al-Yaqoubi, C. Healy, E. Ratliff, R. Arnold, D. Whalley, and M. Harmon in Software Practice & Experience (SP&E), January 1999, pages 77-98.

  10. "Tighter Timing Predictions by Automatic Detection and Exploitation of Value-Dependent Constraints" by C. Healy and D. Whalley in the Proceedings of the IEEE Real-Time Technology and Applications Symposium (RTAS), June 1999, pages 79-88.

  11. "A General Approach for Tight Timing Predictions of Non-Rectangular Loops" by C. Healy, R. van Engelen and D. Whalley in the WIP Proceedings of the IEEE Real-Time Technology and Applications Symposium (RTAS), June 1999, pages 11-14. (This was a Work in Progress (WIP) paper.)

  12. "Timing Analysis for Data Caches and Wrap-Around Fill Caches" by R. White, F. Mueller, C. Healy, D. Whalley, and M. Harmon, in Real-Time Systems (RTS), November 1999, pages 209-233.

  13. "Supporting Timing Analysis by Automatic Bounding of Loop Iterations" by C. Healy, M. Sjodin, V. Rustagi, D. Whalley, and R. van Engelen, in Real-Time Systems (RTS), May 2000, pages 121-148.

  14. "Automatic Detection and Exploitation of Branch Constraints for Timing Analysis" by C. Healy and D. Whalley, in IEEE Transactions on Software Engineering (ToSE), August 2002, pages 763-781.