Andreas Sembrant et al. proposed a phase-aware model that calculates an application’s miss ratio more efficiently. They combine online phase detection with phase-guided profiling to reduce overhead and increase accuracy.
StatCache, proposed by Berg and Hagersten, is a low-overhead statistical model that can estimate the miss ratio for caches of arbitrary size. However, it reports only the average miss ratio, which may not fully expose the application’s features. The miss ratio R is obtained by solving the following equation:
$R \cdot N = h_1 f(1 \cdot R) + h_2 f(2 \cdot R) + h_3 f(3 \cdot R) + \dots$
$N = h_1 + h_2 + h_3 + \dots$
$f(n) = 1 - (1 - 1/L)^n$
N is the number of reuse distance samples, and h_i is the number of samples with reuse distance i. f(n) is the probability that a cache line has been evicted from the cache once n cache misses have occurred since it was last accessed. L is the number of cache lines in the cache.
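Because R appears on both sides of the equation, it has to be solved numerically. The following is a minimal sketch, assuming a simple fixed-point iteration (not necessarily the solver used by the original authors). The names `evict_prob` and `statcache_miss_ratio` and the dictionary-based histogram are illustrative assumptions, and cold misses are ignored.

```python
def evict_prob(n, num_lines):
    """f(n) = 1 - (1 - 1/L)^n: chance a line is gone after n misses."""
    return 1.0 - (1.0 - 1.0 / num_lines) ** n


def statcache_miss_ratio(histogram, num_lines, iterations=200, tol=1e-9):
    """Solve R * N = sum_i h_i * f(i * R) for R by fixed-point iteration.

    histogram maps a reuse distance i to its sample count h_i
    (an assumed input format for this sketch).
    """
    total = sum(histogram.values())  # N
    if total == 0:
        raise ValueError("histogram contains no samples")
    r = 1.0  # start from the highest possible miss ratio and iterate down
    for _ in range(iterations):
        new_r = sum(
            h * evict_prob(d * r, num_lines) for d, h in histogram.items()
        ) / total
        converged = abs(new_r - r) < tol
        r = new_r
        if converged:
            break
    return r


if __name__ == "__main__":
    # Half the samples are reused immediately, half after 1000 accesses.
    print(statcache_miss_ratio({1: 50, 1000: 50}, num_lines=64))
```

Starting from R = 1 matters in this sketch: R = 0 always satisfies the equation trivially, so iterating downward from 1 finds the largest solution instead.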
One simple way to handle program phases is to divide the execution into windows and sample each window separately. However, the overhead increases significantly when the windows must be short enough to capture short application phases. To avoid this overhead, a phase detection algorithm divides the program into phases; each phase then needs to be sampled only once, and subsequent instances of the same phase can reuse the same profiled data.
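The saving can be illustrated with a small count. This is only a sketch, assuming the execution is already labeled with one phase ID per window; the function name `windows_to_profile` and the trace format are hypothetical.

```python
def windows_to_profile(phase_trace):
    """Compare profiling cost for a trace of per-window phase IDs.

    Returns (fixed, guided): windows sampled when every window is
    profiled, versus when each distinct phase is profiled only once.
    """
    fixed = len(phase_trace)        # every window is sampled
    guided = len(set(phase_trace))  # one profile per distinct phase
    return fixed, guided


if __name__ == "__main__":
    # Ten windows that belong to three recurring phases A, B and C.
    print(windows_to_profile(list("AABBAACAAB")))
```

The more often phases recur, the larger the gap between the two counts.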
Based on this idea, they proposed ScarPhase (Sample-based Classification and Analysis for Runtime Phases). They group the program’s reuse distances by phase and then apply the StatCache model to calculate the miss ratio for each group. The ScarPhase library detects and classifies phases based on the application’s execution history.
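The grouping step can be sketched as follows. This is an illustration under assumptions, not the ScarPhase API: the phase IDs are taken as given (in practice they would come from the phase detector), the sample format `(phase_id, reuse_distance)` is hypothetical, and the StatCache equation is solved with a simple fixed-point iteration.

```python
from collections import Counter, defaultdict


def statcache_miss_ratio(histogram, num_lines, iterations=200, tol=1e-9):
    """Solve R * N = sum_i h_i * (1 - (1 - 1/L)^(i * R)) for R."""
    total = sum(histogram.values())
    r = 1.0
    for _ in range(iterations):
        new_r = sum(
            h * (1.0 - (1.0 - 1.0 / num_lines) ** (d * r))
            for d, h in histogram.items()
        ) / total
        converged = abs(new_r - r) < tol
        r = new_r
        if converged:
            break
    return r


def per_phase_miss_ratios(samples, num_lines):
    """Group (phase_id, reuse_distance) samples by phase, then apply
    the StatCache model to each phase's own histogram."""
    histograms = defaultdict(Counter)
    for phase_id, distance in samples:
        histograms[phase_id][distance] += 1
    return {
        phase: statcache_miss_ratio(hist, num_lines)
        for phase, hist in histograms.items()
    }


if __name__ == "__main__":
    # Phase "A" reuses data immediately; phase "B" has very long reuses.
    samples = [("A", 1)] * 100 + [("B", 10000)] * 100
    for phase, ratio in per_phase_miss_ratios(samples, num_lines=64).items():
        print(phase, ratio)
```

With these made-up samples, the two phases get very different miss ratios, a difference that a single histogram over the whole run would average away.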
They also found that variations can occur inside phases and classified them into four types: no variations, transition variations, periodic variations and instance variations.
(1) No variations: the behavior changes little inside the phase, so only a small part of each phase needs to be profiled.
(2) Transition variations: the behavior changes slowly from one phase to another, so the transition should be divided into windows from start to end. The more windows are profiled, the higher the accuracy.
(3) Periodic variations: the behavior changes rapidly and periodically.
(4) Instance variations: instances of the same phase that operate on different data may have different miss ratios.
Compared with the StatCache model, ScarPhase increases accuracy by 39% and efficiency by 600%.

