Data layout is critical for cache performance. An example of a poor data layout is one that maps frequently accessed data to the same set, which leads to an increase in conflict misses.
This famous piece of work by Petrank and Rawitz formally defines the problem and shows that computing the optimal data layout is extremely difficult, even if the full memory access sequence is known upfront. They study the problem in a cache conscious setting (all the cache parameters are known). The optimal data layout is a one-to-one mapping from data objects to the memory that gives the smallest number of cache misses.
For a t-way set associate cache of size k (a general specification of the cache), the paper calls the problem “minimum t-way k-cache misses.”
The paper proves that for any polynomial data layout algorithm there are sequences onwhich the algorithm performs extremely poorly. Specifically, it shows that for any e>0 there exists a “villain” sequence for which the data layout provided by the algorithm leads to more misses than a factor of N^(1-e) of the optimal (N is the length of the sequence).
The authors provide a reduction from the NP-hard “graph k-colorability” problem which is defined as follows.
Given a graph G, and k, is G k-colorable? i.e., does there exist a vertex coloring that leaves no monochromatic edges in the graph (edges that have both their endpoints colored the same are called monochromatic edges)?
Here is a sketch of the proof for a direct-mapped cache of size k.
For every instance of the graph k-coloring problem, we construct an instance of the minimum k-cache miss problem. For every vertex in the graph, we associate one memory object. For every edge we construct a memory access subsequence. Finally, we concatenate the subsequences to construct to full sequence. The key idea is to ensure that mapping the memory objects corresponding to the endpoints of an edge to the same location in the cache would incur a number of conflict misses that is polynomially related to the number of edges in the graph. This would guarantee that there is a huge gap between a data layout which corresponds to a valid k-coloring of the graph and one that is not. Therefore, if we are able to efficiently give a data layout that is within a decent bound of the optimal, we would be able to solve the k-coloring problem as well.
Following the same idea, the case can be proved for the general t-way set associative cache. The only difference is that in order to ensure the gap, we need to introduce more objects in the subsequences associated with the edges.
In the light of the negative result, a question is “how close we can get to the optimal layout?”. The authors present an algorithm that guarantees a data placement that always leads to a number of misses within a factor of (n/log n) of the optimal.
There are an assumptions in this paper:
1. uniform object size, which is not usually true.
My thinking:
Can we add more setup to the cache to obtain better results. How about cache partitioning? If we assume the cache can be partitioned, either way-partitioning or set-partitioning, can we improve the results?