Mache: No-Loss Trace Compaction

By

A. Dain Samples

Trace compaction has two goals: (1) reduce the space required to hold traces, (2) reduce the time to get results from simulations using the traces. The general approach to compressing traces in the past has been to discard "irrelevant characteristics" of the trace, thereby reducing the trace by one to two orders of magnitude. This approach may lead to errors in the simulations that must be understood and controlled, and because the "relevant characteristics" are only given for a particular simulation, the original full-size trace must be stored if it is to be used in any other simulations. Mache is a scheme that reduces the storage space for a trace by an order of magnitude by taking advantage of the characteristics of the trace data without any lost information.

Other methods for compressing traces were mentioned, but they were interested in decreasing the simulation time of the traces as well as decreasing the amount of space needed to hold the traces. These techniques all introduced errors to their simulations. Mache is only interested in reducing the space needed to store the trace without losing any information.

Assuming a trace composed of a sequence of addresses with associated labels, the trace is "Mached" by first creationg a "cache difference trace". The last address read from the trace for each label type is contained in an array and initialized to zero at the start of the trace. When an address is read, the difference is computed between this address and the oncontained in the array. If the difference is below a determined threshold, a 'hit' occurs and the difference and label are emitted, otherwise a 'miss' occurs and the entire address with the label preceded by a miss indication is emitted. This "cache difference trace" is then fed to a dynamically adaptive one-pass Lemple-Ziv(LZ) compression scheme which takes advantage of patterns in the cache difference stream. For this paper, the hits were encoded in two bytes and the misses were encoded in five bytes, given that the original trace addresses were of 32 bits. The value of the threshold is a function of how the data is packed and in this scheme it would by 8192 as two bits are used for the label and one bit for sign.

Mache was tested using seven different traces that were obtained from various programs that represented different computer applications. The traces were similar in their mix: 50-70% instruction references, 20-30% data reads and 10-20% data writes. The traces were compressed using the UNIX "compress" program as well as the Mache technique. Mache always produced a trace file ten times smaller than the original file and almost always three times smaller than the "compress" file. It was shown that a majority of the compression from Mache is due to locality in the instruction addresses. The instruction addresses were reduced to 0.69% to 3.2% of their original sizes, but the data addresses were only reduced to 5.4% to 15% of their original sizes.

Although Mache does require some extra overhead to difference the addresses, keep the last address and pass the difference to the LZ compression routine, Mache is often faster than "compress" at compressing traces due to the decreased workload for the compression scheme. When decompressing a file, the overhead in Mache of caching and adding addresses represents a greater proportion of the execution time so that to "uncompress" is slightly faster than to "unMache", but the difference is small.

Other variations of the technique that were initially tested but latter replaced by the simple technique described above, were all more complex. Instead of the array containing the last address of each label, the array was a cache with the number of lines varying from 2 to 256. Although slightly better results were obtained, the added complexity to the program was not worth the small improvement.

There are three reasons why Mache works: (1) encoding differences between addresses often requires fewer bits than encoding the entire addresses, (2) the "cache difference" trace is more regular than the address trace, and (3) the LZ compression technique works mostly by discovering and encoding common sub-sequences in the data. Producing a "cache difference" trace results in more redundancy, smaller numbers, and obvious patters.


Back to Stan Dixon's home page


Stan Dixon <sdixon@nmsu.edu>
Last modified: Tue Oct 3 08:34:24 1995