CS 255/455 Spring 2017

CSC 255/455 Software Analysis and Improvement (Spring 2017)

Lecture slides (when used), demonstration programs, and some of the reading material will be distributed through Blackboard.  Assignments and projects will be listed here.


Assignments:

  • Trivia assignment.  Search slashdot.org for the posts on GCC, LLVM, RUST, Scala or Haskell.  Select two posts to read the posts and all discussions.  Write a summary with 200 or more words for each of the two posts.  The summary should include at least a precise fact on the topic as well as an opinion with all supporting and disagreeing arguments pulled from the discussions.  Print and submit a paper copy Monday January 23rd at the start of the class.  Then see me in one of my office hours for feedback on the summary.   Meet me on or before February 3rd.  The grade is assigned after the meeting.  Bring a copy of your paper to the meeting (in addition to the one you submit).

 Course description

With the increasing diversity and complexity of computers and their applications, the development of efficient, reliable software has become increasingly dependent on automatic support from compilers & other program analysis and translation tools. This course covers principal topics in understanding and transforming programs by the compiler and at run time. Specific techniques include data flow and dependence theories and analyses; type checking and program correctness, security, and verification; memory and cache management; static and dynamic program transformation; and performance analysis and modeling.

Course projects include the design and implementation of program analysis and improvement tools.  Meets jointly with CSC 255, an undergraduate-level course whose requirement includes a subset of topics and a simpler version of the project.

 

 Instructor and grading

Teaching staff: Chen Ding, Prof., CSB Rm 720, x51373;  Dong Chen, Grad TA;  Jacob Bisnett, Undergrad TA.

Lectures: Mondays and Wednesdays, 10:25am-11:40am, CSB 601

Office hours: Ding, Fridays 11am to noon (and Mondays for any 15 minute period between 3:30pm and 5:30pm if pre-arranged).

TA Office hours: Dong Chen, Tuesdays 3:30pm to 4:30, CSB 720. Jacob Bisnett, Thursday 1:00 pm to 1:50 pm, CSB 720.

Grading (total 100%)

  • midterm and final exams are 15% and 20% respectively
  • the projects total to 40% (LVN 5%, GCC/LLVM/RUST 5%, local opt 10%, global opt 10%, final phase 10%)
  • written assignments are 25% (trivial 1%; 3 assignments 8% each)

 Textbooks and other resources

Optimizing Compilers for Modern Architectures (UR access through books24x7), Randy Allen and Ken Kennedy, Morgan Kaufmann Publishers, 2001. Chapters 1, 2, 3, 7, 8, 9, 10, 11. lecture notes from Ken Kennedy. On-line Errata

Engineering a Compiler, (2nd edition preferred, 1st okay), Keith D. Cooper and Linda Torczon, Morgan Kaufmann Publishers. Chapters 1, 8, 9, 10, 12 and 13 (both editions). lecture notes and additional reading from Keith Cooper. On-line Errata

Compilers: Principles, Techniques, and Tools (2nd edition), Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman, Pearson.

Static Single Assignment Book, Rastello et al. (in progress)

Assignment 6 (Dead Code Elimination)

This assignment is due Monday April 18th at 11:59pm.

For this assignment you are expected to identify and eliminate dead code. Dead code is any instruction that does not affect the output of the program (a function, in particular). For DCE (dead code elimination), you will need to use def-use chains to reason about instructions that do not affect the outputs. Remember that the output of a function is not limited to what it is returning.

IMPORTANT NOTE:

Those who implemented assignment 4 in LLVM, and have relied on the SSA property to find def-use chains with a linear scan of the LLVM-IR, are expected to implement a further requirement for this assignment, whether they are choosing URCC or LLVM for this assignment. That is Common Subexpression Elimination (CSE). An example of this optimization has been illustrated below.

CSE-example

Adopted from [1]

For CSE, you must first do the AVAIL analysis and then reason about expressions that are redundant.

Note 1: Since LLVM does not permit copy instructions, you may need to use the “ReplaceAllUsesWith” method to replace all uses of c with t. Also, since LLVM uses the SSA representation, you might need to insert new phi nodes.

Note 2: You do not need to implement partial redundancy elimination (PRE) for this assignment.

Testing

Your are expected to test your optimization pass by counting the number executed instructions. To do so, you are provided with a test script. Download this script into your URCC repository.

If you are using URCC for this assignment, you can easily run “./test.rb all” to run all the tests. You can also run an individual test by running “./test.rb #{program_name}”.

If you are using LLVM for this assignment, you will need to reimplement Test.test method in the script above by following the current implementation of that method. This is where you need to compile a program and apply your optimization, and instrumentation. You are allowed to use your peers’ test scripts.

The “./test.rb all” command will give you a “urcc_test_results.txt” file that includes the program outputs (along with your dynamic instruction count. Remember to print the instruction count as a single line “INST COUNT: x”, where x is the number of total executed instructions). Include this file in your report, along with your test script (if modified) and instructions on how to run the tests. Do not forget to give a thorough explanation about your implementation.

Note: URCC appears to be failing on generating a correct executable for two of the test cases: sort.c and tax.c. You can exclude these test cases by removing them from the test directory.

Assignment 5: Def-Use Chains

This assignment is due on Wednesday March 30th at 11:59pm.

In this assignment, you are expected to build the def-use chains for a program. You can either use your own CFG pass (if you are sure about its correctness), or adopt any correct CFG pass that is available to you.

Your output must have the following format. For every definition of a variable a, and for every use of a which is reachable from that definition (without any intervening def), you must output

[a, def, use],

where def and use are the program statements corresponding to the definition and use of a.

Run your analysis on the test cases provided in the URCC test directory and report the results. Explain your implementation and findings in a readme file. Archive everything and submit on blackboard.

Assignment 4: LVN in URCC/LLVM

For this assignment you are expected to implement the local value numbering optimization in URCC or LLVM. The requirements are the same as in assignment 1, except that the redundancy elimination part is now required. For this assignment, you don’t need to implement Stewart extension.

Test your optimization pass on URCC test cases. Report your implementation details and your findings in a readme file and submit on blackboard.

If you are working on LLVM, remember to turn off optimizations by using the “-O0” flag when emitting LLVM bitcode (The base project directory in /home/hoover/u1/cs255/cs255-llvm has been modified accordingly).

This assignment is due Sunday March 20th at 11:59pm.

Assignment 3 (CFG in URCC/LLVM)

In this assignment you will implement a pass to build a program’s CFG from its linear code. This assignment is due on Sunday March 6th at 11:59pm.

For this assignment, you can choose to use either the URCC compiler or LLVM.

Instructions for URCC

  1. Checkout and install the URCC compiler (follow the instructions in the README file). Make sure you are using a version of clang between 3.3 and 3.6 or URCC may fail. If you decide to use the CSUG machines, you can use clang 3.6 that is installed in /u/cs255/build-llvm-36/bin.
  2. Follow the algorithm in section 5.3.4 of the Cooper & Torczon book (page 241) to write a pass that builds the CFG of the program and visualizes it using graphViz (instructions on how to write a URCC pass are included in the README file).
  3. Test your compiler pass on test cases included in the test directory.
  4. Explain your implementation in a Readme file. Report all test case failures.
  5. Archive your code and the Readme file and submit on Blackboard.

Note: A CFG pass is already implemented in the URCC compiler. You shall not look at that implementation.

Instructions for LLVM

  1. Following the same instructions as in assignment 2, create an LLVM project in your CSUG account.
  2. Implement a pass that builds the CFG of the program and visualizes it using graphviz. If you need to iterate over instructions in a function, you must use the following code:
    #include "llvm/IR/InstIterator.h"
    
    // F is a pointer to a Function instance
    for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
      errs() << *I << "\n";
  3. You can use BGL (Boost Graph Library), which is already installed on CSUG filesystem. Graphviz is included in this library.
  4. Test your compiler pass on test cases included in URCC’s repository (https://github.com/dcompiler/urcc/tree/master/tests).
  5. Explain your implementation in a Readme file. Report all test case failures.
  6. Archive your code and the Readme file and submit on Blackboard.

Note: A CFG pass is already implemented in LLVM. You shall not look at that implementation.

 

Assignment 2 (GCC/LLVM)

In this assignment, you will implement and test a compiler pass that instruments a program to report the number of intermediate-level executed instructions. You can choose to implement this pass either in gcc or llvm. For your convenience, these two compilers are already installed on the csug network.

The idea is to insert appropriate calls in the program (instrumentation).

*********************************Instructions on GCC****************************************
——————————————————————————————–
Log in to your csug account.
$ cp -r /u/cs255/cs255-gcc YOURUSERNAME-cs255-gcc
$ cd YOURUSERNAME-cs255-gcc

Included files:
———————–
* inst-cnt-plugin.cc
You must implement your compiler pass here.
This file already provides the skeleton and a related example.

* inst-cnt-rt.c
This file implements the runtime functions that you need for the instrumentation.

* test.c
This is a simple program to test your pass.

After implementing the pass, compile it by running “make” and test it using “make check”. This gives you the instrumented program “test”. Run it on the string “cs255” and report your output. Make sure to explain your implementation and findings in a readme file.

Submission guideline:
Archive your working directory using the following command line, and submit on Blackboard.
tar -czvf YOURUSERNAME-cs255-gcc.tar.gz YOURUSERNAME-cs255-gcc/

Note: you can use the gcc option -fdump-tree-all to inspect your instrumentation at the GIMPLE level. After running “make check” with this option, look for the file which has your pass name as a suffix in its filename.

*********************************Instructions on LLVM**************************************
——————————————————————————————–
Log in to your csug account.
$ cp -r /u/cs255/cs255-llvm YOURUSERNAME-cs255-llvm
$ cd YOURUSERNAME-cs255-llvm

Included files:
———————–
* lib/InstCounter.cpp
You must implement your compiler pass here.
This file already provides the skeleton and a related example.

* runtime/InstCounting.c
This file implements the runtime functions that you need for the instrumentation.

* test/test.c
This is a simple program to test your pass.

After implementing your pass, compile it by running “make” in your top-level directory. Then cd into the “test” directory and run “make check” to test your pass. This gives you the instrumented program “test”. Run it on the string “cs255” and report your output. Make sure to explain your implementation and findings in a readme file.

Submission guideline:
Archive your working directory using the following command line, and submit on Blackboard.
tar –exclude=’.svn’ –exclude=’autoconf’ -czvf YOURUSERNAME-cs255-llvm.tar.gz YOURUSERNAME-cs255-llvm/

Note: you can use the llvm-dis tool (/u/cs255/build-llvm-38/bin/llvm-dis) to check your instrumentation at IR level. Run this tool on the llvm bitcode file that is generated by your pass:
/u/cs255/build-llvm-38/bin/llvm-dis <test.bc.opt> test.bc.opt.ll
———————————————————————————————