Loop parallelization (Term project) Part 4. Parallelization

Hope you are doing well in the previous 3 parts of the term project. In this assignment, you are going to do the final step to finally do the parallelization.

The deadline is 11:59pm ,Monday May 1st, 2017

In this assignment, you need two steps:

(1) Merge previous three parts to analyze dependence for loops.

(2) Parallelization: There are two choices for parallelization part: one is to generate #pragma omp for loops (just the pragma, not generating runnable code), the other is to generate parallel IR by inserting detach, reattach and sync instructions provided by Tapir (The published paper won the best paper in PPoPP 2017). Tapir is installed in cycle2 local disk.


Expected output:


For OMP, your analysis pass is to generate annotation for OMP for each loop inside the program if there is no loop carried dependences, otherwise generate loop carried dependence information. You need to locate the line number of the loops in the source code from IR, to do this you need to (1) pass -O0 and -g to clang, clang -O0 -g -S -emit-llvm sample.c -o sample.ll and (2) check MDNode and DILocation classes to read line number for IR instructions.

For example:

loop1 (line 3-6): “#pragma omp parallel for”

    loop2 (line 7-10): Not parallelizable, loop carried dependence for array access pair (a[i], a[i-1]), (a[i], a[i+1]), …


For Tapir based implementation, you need to generate parallel IR by inserting detach, reattach and sync instructions (check the source code in lib/IR/instructions.cpp to know how to create these instructions). Examples are inside the cycle2 local disk:

For  the input code:


The reference output is :


Notes: Don’t forget the extra credits for nested loops.