RubyBOP Introduction

Behavior-oriented parallelization (BOP) provides a suggestion interface for a user to mark possible parallelism and run-time support to guarante correctness and efficient execution whether the hints are correct or not.  It enables program parallelization based on partial information and is useful for incrementally parallelizing a program or streamlining it for common uses.

This page contains a tutorial to introduce RubyBOP to a CS undergraduate student who will become a developer of the system.

Introduction

Modern programs often have dynamic parallelism at the high level but are hard to parallelize because of complex code such as the use of bit-level operations, unrestricted pointers, exception handling, custom memory management, and third-party libraries. Moreover, many programs have input-dependent behavior where both the degree and the granularity of parallelism are not guaranteed or even predictable. For manual parallelization, the complexity and the uncertain performance gain do little to warrant the investment of time and the risk of error.

Behavior-oriented parallelization (BOP) addresses these problems by providing a suggestion interface for a user to mark possible parallelism in a program and a run-time system to guarantee correctness and efficiency. If the hints are correct, the program will run in parallel. If the hints are not correct or the suggested parallelism is too fine-grained, the program still finishes as fast as sequential execution. In both cases, the program produces the same output as the sequential execution without hints. BOP is based on frequent, input-dependent behavior rather than definite behavior. It allows program parallelization based on partial information and is useful for incrementally parallelizing a program or streamlining it for common uses.

Paper Reading

Example (Unsafe) OpenMP Programming

Use the add example.  It computes on the N elements of an array and adds the results together.  It is parallelized in OpenMP.  Use GCC to build it and run it in parallel.   The executable has two parameters: the number of elements to compute (in millions) and the number of blocks (to divide the elements into and compute in parallel).  Change these two parameters and the number of OMP threads.  Observe the performance and the result.  Do you see a non-deterministic error?

C-BOP Manual

RubyBOP source code is a Mercurial repository.  Download it from cycle1.cs.rochester.edu://p/compiler/hg-repos/rubybop/  Here we refer to the root as [repos].

Read the programming manual of C-BOP.  C-BOP is the subdirectory [repos]/cbop/.  It is a C library that a C program can call to parallel the program.

A description of the library API calls is given in the programming manual in [repos]/cbop/prog_man/.  Use a web browser to open the base manual page index.html.  The information may be dated or missing.  We will update and improve the document as we go.

Write the First C-BOP Program

Use the programming manual and the C-BOP source code to parallel the add program using BOP programming hints.

Ruby C Extension

Use the most recent Ruby release (2.2).  Add a new trivial class in C and a public method in the new class to return some number or string.  To test your extension, write a Ruby program, standalone or in the interpreter, create an object of this new single-method class and call the method to check the results.

2 thoughts on “RubyBOP Introduction

  1. Hao and Chencheng walked in to see Ben’s code show. Hao questioned that since programmers are responsible for correct access annotation, what are the reasons that we still need speculative parallelization?

    There are two reasons. First, annotating data access is an easier task than parallelizing a program. The BOP approach divides the task of parallelization into two sub-problems. The first problem is data sharing. This is solved by annotation, which marks all accesses to shared data by a PPR task. The understanding is required just for each statement. The second problem is control flow, i.e. the relative order in which the statements in a task are executed and the interleaving in which the statements of different tasks are executed. Compared to data sharing, the control flow problem is more difficult especially the understanding and checking of an interleaved execution. BOP solves the second problem automatically through speculation.

    Access annotation can be automated using compiler support, and the cost of annotation can be reduced through profiling and compiler analysis (Tongxin Bai’s dissertation). A programmer can insert annotation manually and can use the knowledge about the program to minimize the cost.

    Access annotation can be performed statement by statement and function by function. Annotated modules can be used together without change. Therefore, the manual effort to solve the first problem is cumulative, since the task is modular. In contrast, the task of parallelization is not modular. For example, parallel functions that work individually may not work when they are used together.

    The second benefit is speculative parallelism, which is more general than static or dynamic parallelism. With speculation, we can parallelize possibly parallel tasks that we may not know whether they are actually parallel until we finish running them. The dependence hint paper (Ke et al. OOPSLA 2011) gives several examples that cannot be parallelized without speculation.

    The project of Ruby BOP is a test of the BOP approach and how far these two benefits can take us.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s