Loop parallelization (Term project) Part 1. Finding loops

In the previous LLVM assignment, you used the interfaces that LLVM provides. Starting with this assignment, we begin a loop parallelization term project, which is divided into three assignments: (1) Loop detection. (2) Dependence analysis and induction variable analysis. (3) Parallelization.

Rust may also be used if there is enough interest.  Email Jacob Bisnett if you are interested.

IMPORTANT NOTES: Because the term project starts with this assignment, the problems we are going to solve are closer to real-world compiler development.  The assignment may have missing information, and you may encounter problems not covered by the textbooks or the lectures. Part of the project experience is to take initiative, formulate these problems, seek help from others (teaching staff, classmates, and online resources), and solve them.

The deadline is 11:59pm Friday March 10th. 

For this assignment, you need to use data flow analysis to find loops in the LLVM IR code.

The definition of a loop in the flow graph can be found in the dragon book, Chapter 8.4.5, p. 531. The loop identification approach can be found in these slides from CMU.

For implementation, three steps described in the slides are needed as follows.  Each step generates an output which will be used for grading.

(1) Calculate the dominance relation for basic blocks by data flow analysis.  Provide a function named dump to output the dominance relation.  (Do NOT use the dominator analysis in the LLVM compiler.)

  • To calculate dominance, you need to traverse the CFG.  Take a look at the class references for functions and basic blocks to see how to find the entry block and how to find the successors or predecessors of a given basic block.
  • You also need to know some C++ data structures, such as vector, map, pair, stack, and queue. They will simplify your implementation.
  • To find the name of a basic block, look at APIs such as hasName(), getValueName(), and setValueName() provided in the Value class.
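
The data flow formulation in step (1) can be tried out on a toy graph before writing the pass. The following is a minimal Python sketch, not the required C++ implementation: the name compute_dominators, the dictionary-based CFG, and the three-block example are assumptions made for illustration. It uses the usual iterative equations, Dom(entry) = {entry} and Dom(n) = {n} plus the intersection of Dom(p) over all predecessors p, and it assumes every block is reachable from the entry.

```python
def compute_dominators(succ, entry):
    """Return {block: set of blocks that dominate it} for a CFG.

    succ maps each block name to a list of successor names (hypothetical
    representation; an LLVM pass would walk BasicBlock successors instead).
    """
    nodes = set(succ)
    for targets in succ.values():
        nodes.update(targets)

    # Build the predecessor map from the successor map.
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)

    # Initialize: the entry is dominated only by itself, all others by everything.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}

    # Iterate to a fixed point.
    changed = True
    while changed:
        changed = False
        for n in sorted(nodes):
            if n == entry:
                continue
            pred_sets = [dom[p] for p in preds[n]]
            new = set.intersection(*pred_sets) if pred_sets else set()
            new = new | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# Toy CFG: BB1 -> BB2 -> BB3 -> BB1
cfg = {"BB1": ["BB2"], "BB2": ["BB3"], "BB3": ["BB1"]}
dominators = compute_dominators(cfg, "BB1")
```

Each set includes the block itself; a dump that lists only pairs of distinct blocks would skip those self entries.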

(2) Find back edges; Provide a dump function to output all back edges.

(3) Find natural Loops. Provide a dump function to dump the set of basic blocks in the loop.
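
Steps (2) and (3) can be sketched in the same toy setting. The Python below is an illustration under stated assumptions, not the reference solution: find_back_edges, natural_loop, the five-block CFG, and the hand-written dominator sets are all hypothetical. It treats an edge tail -> head as a back edge when head dominates tail, and builds the natural loop as the head plus every block that can reach the tail without passing through the head.

```python
def find_back_edges(succ, dom):
    """An edge tail -> head is a back edge if head dominates tail."""
    return sorted((tail, head)
                  for tail, heads in succ.items()
                  for head in heads
                  if head in dom[tail])


def natural_loop(succ, tail, head):
    """Blocks of the natural loop of back edge tail -> head."""
    preds = {}
    for n, targets in succ.items():
        for t in targets:
            preds.setdefault(t, set()).add(n)

    loop = {head}
    stack = []
    if tail not in loop:
        loop.add(tail)
        stack.append(tail)
    # Walk predecessors backwards from the tail; the head stops the walk
    # because it is already in the set.
    while stack:
        n = stack.pop()
        for p in preds.get(n, ()):
            if p not in loop:
                loop.add(p)
                stack.append(p)
    return loop


# Toy CFG: BB0 -> BB1 -> BB2 -> BB3, with BB3 -> BB1 and BB3 -> BB4
cfg = {"BB0": ["BB1"], "BB1": ["BB2"], "BB2": ["BB3"],
       "BB3": ["BB1", "BB4"], "BB4": []}
# Dominator sets for this CFG, written out by hand for the example.
dom = {
    "BB0": {"BB0"},
    "BB1": {"BB0", "BB1"},
    "BB2": {"BB0", "BB1", "BB2"},
    "BB3": {"BB0", "BB1", "BB2", "BB3"},
    "BB4": {"BB0", "BB1", "BB2", "BB3", "BB4"},
}
back_edges = find_back_edges(cfg, dom)
loops = [natural_loop(cfg, tail, head) for tail, head in back_edges]
```

A self loop (tail equal to head) yields a one-block loop, since every block dominates itself.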

The output should first give the number of loops, the dominance relation and all back edges. For each loop, output a set of basic blocks in the loop. For example:


Number of loops: 1

Dominance relation: BB1 -> BB2, BB1->BB3, BB2->BB3

Back edge: BB3 -> BB1

Basic blocks in the loop 1: BB1, BB2, BB3
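
For reference, a report in this shape can be produced from the three results with a few lines of code. This Python sketch is only an illustration: format_report and its argument layout are assumptions, and it prints arrows with uniform spacing, which may differ slightly from the sample above.

```python
def format_report(dom, back_edges, loops):
    """Build the four-part report from dominator sets, back edges, and loops."""
    # List only pairs of distinct blocks: dominator -> dominated block.
    pairs = sorted((d, n) for n, ds in dom.items() for d in ds if d != n)
    lines = [
        f"Number of loops: {len(loops)}",
        "Dominance relation: " + ", ".join(f"{d} -> {n}" for d, n in pairs),
        "Back edge: " + ", ".join(f"{t} -> {h}" for t, h in back_edges),
    ]
    for i, blocks in enumerate(loops, start=1):
        lines.append(f"Basic blocks in the loop {i}: " + ", ".join(sorted(blocks)))
    return "\n".join(lines)


# Hand-written inputs matching the sample output.
dom = {"BB1": {"BB1"}, "BB2": {"BB1", "BB2"}, "BB3": {"BB1", "BB2", "BB3"}}
report = format_report(dom, [("BB3", "BB1")], [{"BB1", "BB2", "BB3"}])
print(report)
```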


Write at least two test programs with at least one loop in each program. Provide a detailed README file that describes the design and its limitations, including what types of loops can be detected and what types cannot.

Extra credit (10%):

LLVM's built-in analyses, for example the dominator analysis in the LLVM compiler, may be useful for loop detection.  Implement a second loop detection pass in which you may use any LLVM API for loop detection. Compare the two implementations and write down the detection coverage for different loop structures.

Assignment 2 (LLVM/RUST)

Hope you are doing well in the first assignment and ready for the next one.
In this assignment, you will implement and test a compiler pass that instruments a program to report the number of intermediate-level executed instructions. The idea is to insert appropriate calls in the program (instrumentation). You can choose to implement this pass either in LLVM or RUST. For your convenience, these two compilers are already installed on the csug network. (For grads, please ask Marty Guenther <marty@cs.rochester.edu> for an undergrad account to access csug network)

Deadline is 11:59pm Friday, Feb 10th, 2017.

*********************************Instructions on RUST****************************************

Instructions for Writing an Instruction Counting Pass for Rust’s MIR

Environment setup:

In order to write an MIR transformation pass easily you need three things:

  1. A Rust nightly compiler
  2. Cargo: the Rust build system
  3. The source code for the Rust compiler.

Thankfully there is an easy way to get all three: rustup.

Rustup is the system most Rust hackers use to manage their Rust environment. It’s very easy to install:

  1. SSH into the cycle machines
  2. Copy and paste the following into your command line. It will run the install script and set your default Rust compiler to nightly

$ curl https://sh.rustup.rs -sSf | sh -s -- --default-toolchain nightly

  3. Follow the instructions to add rustc and cargo to your PATH
  4. Ensure your Rust compiler is on the correct version by making sure the output of the following command contains the word “nightly”:

$ rustc -V

  5. Make sure the same is true of the next command

$ cargo -V

  6. Type the following command to download the Rust source code into “~/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/” or something similar

$ rustup component add rust-src

Once you have done the above, you are ready to start writing your first MIR pass.

Start your pass:

This git repository is a Rust project that contains everything you need to start writing your own compiler pass. You need to finish the compiler pass in the “src/lib.rs” file so that it inserts dynamic instruction counting tooling into the compiled “src/main.rs” program. You can use “cargo build” to compile the project, and “cargo run” to run it. Your goal is to make the X static variable equal to the number of dynamically executed MIR statements, including Storage, Discriminant, and NOOPs.

Some sample code is in lib.rs to get you started.

If you have any questions, please email jbisnett@u.rochester.edu

*********************************Instructions on LLVM**************************************

Environment setup:

Log in to your csug account.
$ cp -r /u/cs255/cs255-llvm YOURUSERNAME-cs255-llvm
$ cd YOURUSERNAME-cs255-llvm

Start your pass:

Included files:
* lib/InstCounter.cpp
You must implement your compiler pass here.
This file already provides the skeleton and a related example.

* runtime/InstCounting.c
This file implements the runtime functions that you need for the instrumentation. init() is to initialize the counter to ZERO before counting, increase() is to increase the counter by 1 and print_stmt_cnt() is to print the value of the counter.

* test/test.c
This is a simple program to test your pass.
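
The idea behind the instrumentation can be illustrated with a toy model before touching the LLVM code. The Python below is only a sketch under stated assumptions: the Runtime class mimics the three runtime functions described above, while instrument, execute, the tuple-based instructions, and the block trace are hypothetical and have nothing to do with the actual skeleton in lib/InstCounter.cpp. It places one counter call before every original instruction and then counts along a given execution path.

```python
class Runtime:
    """Toy stand-in for init(), increase() and print_stmt_cnt()."""

    def __init__(self):
        self.count = None

    def init(self):
        self.count = 0

    def increase(self):
        self.count += 1

    def print_stmt_cnt(self):
        print(self.count)


INCREASE = ("increase",)  # marker for an inserted counter call


def instrument(blocks):
    """Return a copy of blocks with a counter call before each instruction."""
    return {name: [x for inst in insts for x in (INCREASE, inst)]
            for name, insts in blocks.items()}


def execute(blocks, trace, rt):
    """Walk the blocks in trace order; only the inserted calls have an effect."""
    rt.init()
    for name in trace:
        for inst in blocks[name]:
            if inst == INCREASE:
                rt.increase()
    return rt.count


# Toy function: an entry block, a loop body that runs three times, an exit.
blocks = {
    "entry": [("alloca", "i"), ("br", "loop")],
    "loop": [("add", "i", 1), ("icmp", "i", 3), ("br", "loop", "exit")],
    "exit": [("ret",)],
}
trace = ["entry", "loop", "loop", "loop", "exit"]
total = execute(instrument(blocks), trace, Runtime())
```

Here the count is 2 + 3 * 3 + 1 = 12. A cheaper variant would add the block's instruction count once per block execution, at the cost of a runtime function that takes an argument.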

After implementing your pass, compile it by running “make” in your top-level directory. Then cd into the “test” directory and run “make check” to test your pass. This gives you the instrumented program “test”. Run it on the string “cs255” and report your output. Make sure to explain your implementation and findings in a readme file.

Submission guideline:
Archive your working directory using the following command line, and submit on Blackboard.
tar --exclude='.svn' --exclude='autoconf' -czvf YOURUSERNAME-cs255-llvm.tar.gz YOURUSERNAME-cs255-llvm/

Helpful documentation:

(1) LLVM Programmer’s Manual: highlights some of the important classes and interfaces available in the LLVM source base (for example, how to iterate over basic blocks, and how to iterate over instructions inside a basic block or function).

(2) LLVM Language Reference Manual: reference manual for the LLVM assembly language.

(3) LLVM class references: reference for the interfaces of the classes needed (for example, instructions, basic blocks, and functions).

Note: you can use the llvm-dis tool (/u/cs255/build-llvm-38/bin/llvm-dis) to check your instrumentation at IR level. Run this tool on the llvm bitcode file that is generated by your pass:
/u/cs255/dc_llvm/build/bin/llvm-dis test.bc.opt

If you have any questions, please email dchen39@cs.rochester.edu



CS255 Assignment #1 (LVN)

Hi CS255/455 students:

Hope you all enjoyed the course! The first assignment is to implement local value numbering (LVN) and check for redundant expressions.

You are expected to handle commutativity for commutative operations. Recall that an operation is commutative if you can change the order of operands without changing the result. For example (+) is commutative but (-) is not. Your implementation must be able to assign the same value number to a+b and b+a.
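
One common way to get this behavior is to normalize the operand order before looking up an expression. The Python below is a minimal sketch of that idea, not the vn.rb code: the function lvn, the tuple-based statement format, and the variable names are assumptions made for illustration. For commutative operators it sorts the two operand value numbers, so a+b and b+a produce the same table key.

```python
COMMUTATIVE = {"+", "*"}


def lvn(block):
    """block: list of (dest, op, lhs, rhs).

    Returns a list of (dest, value_number, is_redundant).
    """
    var_vn = {}    # variable name -> value number
    expr_vn = {}   # (op, vn_left, vn_right) -> value number
    next_vn = 0
    result = []

    def vn_of(name):
        nonlocal next_vn
        if name not in var_vn:
            var_vn[name] = next_vn
            next_vn += 1
        return var_vn[name]

    for dest, op, lhs, rhs in block:
        left, right = vn_of(lhs), vn_of(rhs)
        # Canonical operand order for commutative operators only.
        if op in COMMUTATIVE and left > right:
            left, right = right, left
        key = (op, left, right)
        redundant = key in expr_vn
        if not redundant:
            expr_vn[key] = next_vn
            next_vn += 1
        var_vn[dest] = expr_vn[key]
        result.append((dest, var_vn[dest], redundant))
    return result


numbered = lvn([
    ("x", "+", "a", "b"),
    ("y", "+", "b", "a"),   # same value as x
    ("z", "-", "a", "b"),
    ("w", "-", "b", "a"),   # not the same value as z
])
```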

As the second requirement, improve LVN by adding the Stewart extension. The Stewart extension improves LVN by identifying additional redundancy in the following example form.

a = b + c

d = a - b

Specifically, it guides LVN to assign the same value number to both (c) and (d). The idea of the solution was first raised by Chris Stewart when he was taking the class around 2004.  His idea was to insert additional value relations into the value number table. You should first work out this idea and make it concrete as an extension to the basic value numbering algorithm.

Note 1: You are expected to apply the Stewart extension on four operations: ‘+’, ‘-’, ‘*’, and ‘/’.

Note 2: You should make sure that the Stewart extension can be applied on the following code as well.

a = b + c

e = a

d = e - b
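
One possible reading of "inserting additional value relations" is sketched below in Python. It is an illustration under stated assumptions, not the intended solution and not the vn.rb interface: stewart_lvn, the INVERSE table, and the statement tuples are hypothetical, and the identities are treated as exact algebra (no overflow, rounding, integer division, or division by zero). When a new expression v = l op r is numbered, the sketch also records the inverse relations, so a later lookup of v - l finds the value number of r. Copies simply reuse the source's value number, which covers the e = a case.

```python
COMMUTATIVE = {"+", "*"}
INVERSE = {"+": "-", "-": "+", "*": "/", "/": "*"}


def stewart_lvn(block):
    """block: list of (dest, op, lhs, rhs); op None means the copy dest = lhs.

    Returns {variable: value number}.
    """
    var_vn, expr_vn = {}, {}
    counter = [0]

    def fresh():
        counter[0] += 1
        return counter[0] - 1

    def vn_of(name):
        if name not in var_vn:
            var_vn[name] = fresh()
        return var_vn[name]

    def key(op, left, right):
        if op in COMMUTATIVE:
            return (op, min(left, right), max(left, right))
        return (op, left, right)

    for dest, op, lhs, rhs in block:
        if op is None:
            var_vn[dest] = vn_of(lhs)      # copy: same value number
            continue
        left, right = vn_of(lhs), vn_of(rhs)
        k = key(op, left, right)
        if k not in expr_vn:
            v = expr_vn[k] = fresh()
            inv = INVERSE[op]
            if op in COMMUTATIVE:
                # v = l op r  implies  v inv l = r  and  v inv r = l
                expr_vn.setdefault(key(inv, v, left), right)
                expr_vn.setdefault(key(inv, v, right), left)
            else:
                # v = l op r  implies  v inv r = l  and  l op v = r
                expr_vn.setdefault(key(inv, v, right), left)
                expr_vn.setdefault(key(op, left, v), right)
        var_vn[dest] = expr_vn[k]
    return var_vn


numbers = stewart_lvn([
    ("a", "+", "b", "c"),
    ("e", None, "a", None),
    ("d", "-", "e", "b"),
])
```

In this run c and d end up with the same value number, which is the behavior Note 2 asks for.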

Finally, transform the code sequence by removing the redundant expression(s) and print the transformed code sequence.

To complete this assignment, take the following steps:

1. From Blackboard, download the code from Course Materials: 02 Local value numbering demo programs

2. Implement and add commutativity and Stewart extension to the LVN class in the file vn.rb.

3. Implement code generation.

4. Make sure all the tests in vn_tests.rb pass.

5. Document any test failures, if there are any, and explain why, in README.txt in plain text.

6. Extra credit.  In addition to finding the statements with a redundant expression, generate optimized code where all redundant expressions are removed.  Demonstrate the optimizer with a set of tests and put them in opt_tests.rb.  The tests should include all three tests in vn_tests.rb.

7. Submit your assignment on Blackboard. Make sure to include all the ruby files in your submission and the file README.txt to document the submission.

Due time: Tuesday Jan 31st at 23:59:59 EST. (5% bonus points for submission before Friday Jan 27th at 23:59:59 EST.)

Late submission policy: Each student can use a total of two late days across all assignments. This means that if you submit the LVN assignment on Thursday, you will not be able to make any other late submission. But if you submit on Wednesday, you still have one more day to use for one other assignment.

Policy on academic honesty:  Every line of code of the LVN analyzer and optimizer must be written by the student.  Do not copy code.  Do not show your LVN code to others until 2 days (48 hours) past the assignment due time.  The teaching staff is required to report every violation of the policy or suspicion of violation to the university’s Academic Honesty Board.  Contact the teaching staff if you have questions on the policy.

CS 255/455 Spring 2017

CSC 255/455 Software Analysis and Improvement (Spring 2017)

Lecture slides (when used), demonstration programs, and some of the reading material will be distributed through Blackboard.  Assignments and projects will be listed here.


  • Trivia assignment.  Search slashdot.org for posts on GCC, LLVM, RUST, Scala or Haskell.  Select two posts, and read each post and all of its discussion.  Write a summary of 200 or more words for each of the two posts.  The summary should include at least one precise fact on the topic as well as an opinion with all supporting and disagreeing arguments pulled from the discussions.  Print and submit a paper copy Monday January 23rd at the start of the class.  Then see me in one of my office hours for feedback on the summary.   Meet me on or before February 3rd.  The grade is assigned after the meeting.  Bring a copy of your paper to the meeting (in addition to the one you submit).

 Course description

With the increasing diversity and complexity of computers and their applications, the development of efficient, reliable software has become increasingly dependent on automatic support from compilers & other program analysis and translation tools. This course covers principal topics in understanding and transforming programs by the compiler and at run time. Specific techniques include data flow and dependence theories and analyses; type checking and program correctness, security, and verification; memory and cache management; static and dynamic program transformation; and performance analysis and modeling.

Course projects include the design and implementation of program analysis and improvement tools.  Meets jointly with CSC 255, an undergraduate-level course whose requirement includes a subset of topics and a simpler version of the project.


 Instructor and grading

Teaching staff: Chen Ding, Prof., CSB Rm 720, x51373;  Dong Chen, Grad TA;  Jacob Bisnett, Undergrad TA.

Lectures: Mondays and Wednesdays, 10:25am-11:40am, CSB 601

Office hours: Ding, Fridays 11am to noon (and Mondays for any 15 minute period between 3:30pm and 5:30pm if pre-arranged).

TA Office hours: Dong Chen, Tuesdays 3:30pm to 4:30, CSB 720. Jacob Bisnett, Thursday 1:00 pm to 1:50 pm, CSB 720.

Grading (total 100%)

  • midterm and final exams are 15% and 20% respectively
  • the projects total to 40% (LVN 5%, GCC/LLVM/RUST 5%, local opt 10%, global opt 10%, final phase 10%)
  • written assignments are 25% (trivia 1%; 3 assignments 8% each)

 Textbooks and other resources

Optimizing Compilers for Modern Architectures (UR access through books24x7), Randy Allen and Ken Kennedy, Morgan Kaufmann Publishers, 2001. Chapters 1, 2, 3, 7, 8, 9, 10, 11. lecture notes from Ken Kennedy. On-line Errata

Engineering a Compiler, (2nd edition preferred, 1st okay), Keith D. Cooper and Linda Torczon, Morgan Kaufmann Publishers. Chapters 1, 8, 9, 10, 12 and 13 (both editions). lecture notes and additional reading from Keith Cooper. On-line Errata

Compilers: Principles, Techniques, and Tools (2nd edition), Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman, Pearson.

Static Single Assignment Book, Rastello et al. (in progress)

Let All Trees Grow

Tree type definitions in Haskell as answers to a homework question of CSC 253/453.  They make a good demonstration of how a programmer can use a type system to communicate the program design to the compiler, so the compiler can check the correctness of the implementation automatically.

It is also a good exercise to try writing the HasMap instance for these definitions.  Can you find which one of them has a “growth” problem and cannot be used in practice?

Robby Findler seminar and guest lecture


Macros matter: effectively building lots of programming languages
Robby Findler
Northwestern University & PLT
Monday, November 14, 2016

Building new programming languages from whole cloth is a difficult proposition at best. Macro systems provide an alternative; they support the construction of new programming languages from existing pieces, while still providing the flexibility to radically change the syntax and semantics of the programming language.

In this talk, I will give a high-level overview of the myriad of programming languages that Racket supports, as well as an overview of the research area of macros, showing what can be accomplished with them and introducing some of the associated technical challenges (and their solutions).

Robby Findler is currently an Associate Professor at Northwestern University, and received his PhD from Rice University in 2002. His research area is programming languages and he focuses on programming environments, software contracts, and tools for modeling operational semantics. He maintains DrRacket, the program development environment for the programming language Racket and he co-authored the book _How to Design Programs_, a textbook for teaching introductory programming.

(URCS seminar announcement)


(CSC 253/453 Guest Lecture)  Redex: A Language for Lightweight Semantics Engineering

Professor Robby Findler, Northwestern University

Redex is a programming language designed to support semantics engineers as they experiment with programming language models.  To explore a model, an engineer writes down grammars, type systems, and operational semantics in a notation inspired by the programming languages literature. Redex breathes life into the model, building typing derivations, running example expressions, and using random generation to falsify claims about the model.

This talk gives an overview of Redex, motivating its design choices and giving a sense of how it feels to program in Redex. Then the talk dives into some of the techniques that Redex uses to generate random expressions.

A video by Prof. Findler on Redex