# Efficient Symbolic Analysis for Optimizing Compilers

## Symbolic Analysis

Generalized induction variable recognition
E.g. [Ammerguallat et al. 1990, Eigenmann et al. 1991, Haghighat et al. 1992, Singh & Hennessy 1992, Wolfe 1992]

Non-Linear & symbolic
data dependence analysis
E.g. [Blume & Eigenmann 1995, Fahringer 1998] and many others...

## Linear Induction Variable

• Loop counter variable I
• Variable with single update operation

 DO ¯I = A, B  ...  J = J+H  ...

where H is a loop-invariant expression

## Generalized Induction Variable (giv)

• GIV values form polynomial and geometric progressions through loop iterations
• A GIV can have multiple update operations
• GIVs can be coupled

 DO ¯I = A, B  ...  J = J+K       polynomial  ...  K = K+H       linear  ...  J = J+1       2nd update  ...  P = 2*P       geometric  ...

## giv Recognition

GIV recognition amounts to finding the closed-form characteristic function c of a GIV:

 c(n)=j(n)+r an
where

n is the iteration number
j is a polynomial
r and a are loop-invariant expressions

## giv RecognitionExample

 DO ¯I = 0, N  ...  J = J+K  ...  K = K+H  ...  J = J+1  ...  P = 2*P  ...

 J
 =
 cJ(I)=J0 + H * (I2 - I)/2+ K0 * I + 1
 K
 =
 cK(I)=K0 + H * I
 P
 =
 cP(I)=P0 * 2I
J0, K0, and P0 are initial values of J, K, and P; H is loop invariant

## giv Recognition and Compiler Optimizations

Induction variable recognition is important to

• Data dependence analysis
• Loop parallelization and unimodular loop transformations of non-rectangular loops
• Loop transformations for data cache optimization, e.g. blocking
• Performance estimation by counting number of iterations of non-rectangular loops
• Parallelization of Perfect Benchmarktm codes which contain a number of GIVs

## givs and Data Dependence Analysis

 ¯ J = 0  K = 1  DO ¯I = 0, N S1:  J = J+K S2:  A[J] = B[K] S3:  K = K+2 S4:  B[2*I] = ...  ENDDO

ß

 ¯ J = 0  K = 1  DO ¯I = 0, N S1:  J = J+K S2:  A[I*(I+2)+1] = B[2*I+1] S3:  K = K+2 S4:  B[2*I] = ...  ENDDO

GCD-test shows no dependence S2 d= S4

## Parallelization by giv Substitution

 J = 0 K = 1 DO ¯I = 0, N  J = J+K  A[J] = B[K]  K = K+2  B[2*I] = ... ENDDO

ß

 DO ¯I = 0, N  A[I*(I+2)+1] = B[2*I+1]  B[2*I] = ... ENDDO

Elimination of GIV updates may enable loop parallelization and other optimizations

## Haghighat's Symbolic Differencing

Detects GIVs in loops by differencing the values of each variable through symbolically executing the first m loop iterations and by using Newton's interpolating formula to find the closed form

 DO ¯I = 0, N  T = a  a = b  b = c+2*b-T+2  c = c+d  d = d+I ENDDO

## Symbolic Differencing(cont'd)

Problem: possibility of incorrect closed form when user sets m too low

• For m=3: 3rd order polynomial closed form
T = a+(d*i**3+(3*c-3*d+6)*i**2+(6*b-9*c+2*d-6)*i)/6
• For m=5: 5th order polynomial closed form
T = a+(i**5-10*i**4+(20*d+35)*i**3+(60*c-60*d+70)*i**2
+(120*b-180*c+40*d-96)*i)/120
No feedback of this polynomial truncation error for this example!

## giv Recognition With Chains of Recurrences

• Safe method
• Simple to implement
• Efficient
• Detects larger class of induction variables
• Detects wrap-around variables
• Detects conditional GIVs
• Cannot detect GIVs from cyclic recurrences (these are very rare)

## Cyclic RecurrencesExample

Cyclic recurrences are introduced by variable updates that involve some form of exchange

 A = 1 B = -1 DO ¯I = 0, N  T=A  A=B  B=T  ... ENDDO

Closed form of T = cT(I) = (-1)I

Note: loop unrolling might help!

## Chains of Recurrences

A chain of recurrences (CR) is a compact representation of

• polynomials
• (hyper)geometric series
• factorials
• trigonometric functions
• combinations of these
[Bachmann 1994]

## CR Notation

 Fi={f0,Ä1,f1,Ä2,¼,Äk-1,fk-1,Äk,fk}i
where

i spans the index space (i ³ 0)
fj are loop-invariant expressions (0 £ j < k)
Äj=+ or Äj=*
fk is a loop-invariant expression or a CR in i

## CR Algebra

14 core rewrite rules on CRs

 E+{f0,+,f1}i
 Þ
 {E+f0,+,f1}i
 when E is loop invariant
 E*{f0,+,f1}i
 Þ
 {E*f0,+,E*f1}i
 when E is loop invariant
 E*{f0,*,f1}i
 Þ
 {E*f0,*,f1}i
 when E is loop invariant
 :
 :
CR term rewriting system is complete
(confluent and terminating)

## CR Construction

CRs are constructed by applying rewrite rules from the CR algebra to expressions

Loop variable I=A..B is replaced by {A,+,1}I

 DO ¯I = A, B  L = 2*I+H  P = I*I-I  Q = I*(I-1)  R = 2**I

 I
 Þ
 {A,+,1}I
 L = 2*I+H
 Þ
 2*{A,+,1}I+H
 Þ
 {2*A+H,+,2}I
 P = I*I-I
 Þ
 {A2-A,+,2*A,+,2}I
 Q = I*(I-1)
 Þ
 {A2-A,+,2*A,+,2}I
 R = 2I
 Þ
 {2A,*,2}I

## giv Recognition Step 1

Obtain a SSA form of code with single scalar variable updates ordered such that each scalar variable has its def after all its uses

 DO ¯I = A, B  J = J+K  K = K+H  J = J+1  ...

ß

 DO ¯I = A, B  J = J+K+1  K = K+H  ...

## giv Recognition Step 2

 V = V+X
 Þ
 {V0,+,X}i
 V = V-X
 Þ
 {V0,+,-X}i
 V = V*X
 Þ
 {V0,*,X}i
 V = V/X
 Þ
 {V0,*,1/X}i
where X is a loop-invariant expression or a CR

Replace all uses of V with the CR of V

 DO ¯I = A, B  K = {K0,+,H}I  J = J+{K0,+,H}I+1  ...

ß

 DO ¯I = A, B  K = {K0,+,H}I  J = {J0+1,+,K0,+,H}I  ...

## Induction Variable Substitution (ivs)

Apply CR inverse rules to obtain closed forms and normalize the loop

 DO ¯I = A, B  K = {K0,+,H}I  J = {J0+1,+,K0,+,H}I  ...

ß

 J0 = J K0 = K DO ¯I = 0, B-A  K = K0+H*I  J = J0+(H*I*(I-1))/2+I*K0+1  ...

## ivs Algorithm

• Programming model of algorithm in paper:
source-code level sequences, assignment, if-then-else, do-loop, while-loop
• Analyzes loop hierarchies
• Detects multivariate GIVs using multivariate CRs
• Applies GIV recognition and IVS to the innermost loops first

## ivs Worst-Case Complexity

 O(k n log(n) m2)
where

k is the maximum loop nesting level
n is the length of the source code
m is the maximum polynomial order of GIVs

## Wrap-Around Variables

 J = M DO ¯I = M, N  A[J] = B[K]  J = I+1  K = J ENDDO

ß

 J0 = M DO ¯I = 0, N-M  A[¯{J0-M,*,0}I+{M, +, 1}I] =         B[{K-M,*,0}I+{M,+,1}I] ENDDO

ß

 DO ¯I = 0, N-M  A[I+M]=B[(0**I)*(K-M)+I+M] ENDDO

## No Closed Form

GIVs (polynomials and (hyper)geometric series) always have closed forms. We generate index array for induction variables with no closed form

 DO ¯I = 1, N  X[J] = ...  J = J+K  K = K*P  P = P*4 ENDDO

ß

 A[0] = 0 DO ¯I = 1, N-1  A[I]=A[I-1]+(P**I)*(2**(I*(I-1))) ENDDO DO ¯I = 0, N-1  X[J+K*A[I]] = ... ENDDO

## Loop Strength Reduction

Transformation from closed form to CR to induction variable update operations [Bachmann 1994]

CR of a closed form c(I) (0 £ I £ N) is

 {f0,Ä1,f1,Ä2,¼,Äk-1,fk-1,Äk,fk}I

 cr0 = f0 cr1 = f1 : crk = fk DO ¯I = 0, N  c(I) = cr0  cr0 = cr0Ä1 cr1  cr1 = cr1Ä2 cr2  :  crk-1 = crk-1Äk crk ENDDO

## Conclusions

• GIV recognition method is safe
• IVS algorithm is as powerful as symbolic differencing
• IVS cannot handle cyclic recurrences
• Implementation requires expression rewriting comparable to constant folding
• CRs are normal forms for polynomials and geometric series

File translated from TEX by TTH, version 2.87.
On 10 Apr 2001, 13:38.