[Harold +:TC92]Optimal Partitioning of Cache Memory

Harold S. Stone, Fellow, IEEE, John Turek, Member, IEEE, and Joel L. Wolf

While improving locality through cache management, one of the most significant problems is determining how close is our approach to the optimal one. For sequential program, we get OPT replacement algorithm. And for co-runed programs, we should have one. This paper tried. Authors figure out a model on analyzing quality of partition, figuring out optimal partition and dynamically close to it.

This paper contributes in 4 aspects:

  • Optimally partition cache for 2 programs
  • Find a partition which has similar performance as LRU
  • Theoretically show LRU is far from optimal for transient data allocation
  • Near optimally partition cache for N programs

Optimally partition cache for 2 programs

Authors start modeling with an insight that miss ratio is linear to cache size in log/log scaling. Thus they got a prediction function for miss ratio that MI(x)=aI*x^(bI*log10), where x is cache size assigned to program I, a and b are coefficient which got from profiling.

Since the total misses is TotalMisses=(MI(x)+MD(C-x))*T/2, where C is total size of cache, and they suppose the two programs I and D has same access rates T. Then within an assumption that the miss ratio is convexity according to cache size, they can got minimal point for TotalMisses when derivation equal to 0.

Combined with there miss ratio function, authors got optimal partition size with followed equation: bI*aI*log10*x^(bI*log10-1)-aD*bD*log10*(C-x)^(bD*log10-1)

Find a partition which has similar performance as LRU

To achieve this goal, authors firstly propose a concept named state of cache for 2 programs. A State x of cache means overall miss ratio without partition is equivalent to a partition, which  assign x cache for program I and C-x for program D. And S(x) is possibility function of state x appears.

Since the possibility of stat x transforms to  x+1 should be equal to the possibility of x+1 to x, then they have such equation that S(x)MI(x)=S(x+1)MD(C-x-1). According to the monotonic property of MI and MD, S(x) is unimodal.

Finally, for the most possible stat x’, authors showed it is close to optimal partition.

LRU is far from optimal for transient data allocation

With derivation of stats according to time, authors shows LRU needs time to achieve state x’, which is the most possible state as well as the state near to optimal partition.

Since derivation of state x can be represent as dx/dt = Rate of increasing x – Rate of decreasing x,  which is essentially equal to MI(x) – MD(C-x). This derivation shows velocity and time for LRU to achieve a near optimal partition state, and also shows when time is short enough, LRU cannot perform close to optimal.

Near optimally partition cache for N programs

With the assumption that miss ratio function is convexity, authors propose a greedy algorithm to achieve a near optimal solution according to the insight of the first section, which is optimal is achieved when the sum of derivation of miss ratio equal to 0. Thus they employ followed algorithm keep the sum as close to 0 as possible:

Let Ci to be cache size of partition i

Initialization: set C1=C2=…=CN=0

Induction step : Find the most benefit programs i, which would reduce the most cache misses with extra 1 cache block

let Ci=Ci+1, and keep other partition does not change, goto step 2 until sum of partition size equal to C

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s