Loop parallelization (Term project) Part 3. Dependence analysis

Hope you are doing well in the first two parts of loop parallelization term project. In this assignment, you are going to do dependence analysis based on the previous implementation (if you failed to implement the previous two parts, please email me dchen39@cs.rochester.edu).

The deadline is 11:59pm Tuesday midnight, 11th April 2017.

Read Chapter 2 and 3 in Optimizing compilers for modern architectures.

To do:

1. List all references  (load/store) to the same array in the loop, assign a unique ID to each reference along with the type of operation that performed (load or store).

2. Pair all the references (load-load, load-store, store-load, store-store) to the same array in the loop.

3. Calculate distance vector and direction vector for each reference pair.

4. Output the classification, whether there is a dependence, whether the dependence is loop carried or loop independent, and whether the dependence is True, Anti, Ouput dependence.

5. Write README.txt file to briefly describe your code and list the testing result for the tests we provided in the course materials.

Example:

For (i …)

        a[i] = a[i] + a[i+1]

In LLVM IR form

forBody:

       load a[i]

       load a[i+1]

       store a[i]

Output:

=================================

An example loop with induction variable i:

References to a: a[i], a[i+1], a[i]; ID assigned & operation: <a0, load>, <a1, load>, <a2, store>;

Reference pairs in loop i: (a0, a1), (a0, a2), (a1, a0), (a1, a2), (a2, a0) , (a2, a1)

Distance and direction vector:

(a0, a1):  (1), (<)

(a0, a2): (0), (=)

…..

Dependence classification:

(a0, a1): No dependence

(a0, a2): Loop independent, True dependence

…..

=================================

Note that in this assignment you are only required to handle single loop with accesses to 1 dimensional array (no nested loops, no multi-dimensional array accesses). Be sure to finish the basic requirement before you start to think about the extra part which handles nested loops. 

Extra part reminder: A compiler that supports and can parallelize nested loops (and multi-dimensional arrays) will receive up to 30% extra credit for the project.  The extra credit part is graded only at the last phase, after the compiler is completed and can generate parallelized code.

Loop parallelization (Term project) Part 2. Induction variable analysis and array index analysis

Hope you are doing well in the first part of the term project. In this assignment, you are asked to do induction variable analysis and array index analysis.

The deadline is 11:59pm Tuesday March 28th. 

Induction variable analysis:

Read chapter 11 in the SSA book. Read slides from Princeton, to find the algorithm to detect loop induction variables. You can also find other algorithms beyond the ones described in the SSA book and slides provided. But be sure to describe the algorithms you use in README file (Do not just directly call built-in llvm pass to output induction variable). 

You are required to find both basic and derived induction variables. The definitions are described as follows, the example can be found in the slides above.

basic induction variables – variables (i) whose only definitions within the loop are of the form i = i + c or i = i – c, c is loop invariant.

derived induction variables – variables (j) defined only once within the loop, whose value is linear function of some basic induction variable .

Array index analysis:

Array accesses in C code will be compiled to getelementptr statement in LLVM IR. The index expression can be extracted by checking the operands of getelementptr instruction and follow the definition-use chain to find out the full expression.

For example:

a[i-1] in C code will be compiled to LLVM IR form as follows:

%13 = load i32* %i, align 4

%14 = sub nsw i32 %13, 1

%15 = sext i32 %14 to i64

%16 = load i32** %a, align 8

%17 = getelementptr inbounds i32* %16, i64 %15

There are two operands in getelementptr statement, one is %16 which indicates the array pointer. The other is %15 which indicates the array index expression. So your work is to construct the full expression of %15 which contains only constants, loop induction variables, global variables and parameters. Using isa<> and dyn_cast<> to check constants, arguments, global variable. The isa<>, cast<> and dyn_cast<> templates.

The approach is to follow the def-use chains to put together all the expressions. Iterating over def-use & use-def chains.

%15 = sext %14 = sext (%13 – 1) = sext (%i – 1), as sext is typecasting instruction, so we can ignore that and get the final expression: %i – 1.

Output:

Output the induction variable (basic and derived separately) you find and output the array name along with array index for each array access in the test program.

For example:

Loop induction variable (basic): i

Loop induction variable (derived): None

        Array access to a with index i – 1

        Array access to b with index i

Tests:

You are encouraged to implement your own tests and we released a number of tests cases which you can find in the course materials on blackboard.

Readme:

Write down the algorithm skeleton for your implementations and present the testing results(both success and fail).

Notice: Analysis for nested loops is not required for this and later phases of the program parallelization project.  However, a compiler that supports and can parallelize nested loops will receive up to 30% extra credit for the remaining phases of the project.  The extra credit part is graded only at the last phase, after the compiler is completed and can generate parallelized code.