Low-level code improvement and loop optimizations made simple

Many low-level code and loop optimization techniques can be simplified with the use of the the Chains of Recurrences (CR) algebra applied to SSA forms of intermediate code. We applied the method to automatic loop parallelization and SIMD vectorization. We also implemented the SSA-CR method in GCC 4.1 for low-level code optimizations and produced better code optimization results compared to the highest level of optimization in GCC 4.1.

Compiler transformations are applied under strict conditions to ensure the semantic equivalence of the code before and after a code-optimizing transformation. The conditions for loop transformations are based on extensive loop code analysis to find linear and nonlinear induction variables, determine cross-iteration data dependences, and other loop-related properties. Loop parallelization for multicore machines gives reasonable speedups, because most of the execution time of (scientific) applications is spent in loops. New loop analysis techniques based on the Chains of Recurrences (CR) algebra are applied to automatic loop parallelization, SIMD vectorization, and strength reduction. We are also investigating efficient methods for loop iteration space bounding, with applications to worst-case execution time (WCET) estimation.