A Comparison of Trace-Sampling Techniques for Multi-Megabyte Caches
June 1994 (vol. 43 no. 6)
pp. 664-675

This paper compares two trace-sampling techniques: set sampling and time sampling. Using the multi-billion reference traces of A. Borg et al. (1990), we apply both techniques to multi-megabyte caches, where sampling is most valuable. We evaluate whether either technique meets a 10% sampling goal: a method meets this goal if, at least 90% of the time, it estimates the trace's true misses per instruction with ≤10% relative error using ≤10% of the trace. Results for these traces and caches show that set sampling meets the 10% sampling goal, while time sampling does not. We also find that cold-start bias in time samples is most effectively reduced by the technique of D.A. Wood et al. (1991). Nevertheless, overcoming cold-start bias requires tens of millions of consecutive references.
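The 10% sampling goal stated in the abstract can be read as a simple acceptance test. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the function names, the parameter names, and the idea of passing a list of per-sample estimates are all hypothetical and chosen only for illustration.

```python
def relative_error(estimate, true_value):
    """Relative error of an estimate against a nonzero true value."""
    return abs(estimate - true_value) / abs(true_value)


def meets_sampling_goal(estimates, true_mpi, trace_fraction,
                        max_error=0.10, max_fraction=0.10, min_success=0.90):
    """Check the 10% sampling goal as worded in the abstract.

    estimates      -- misses-per-instruction estimates, one per sample
                      (hypothetical input format)
    true_mpi       -- misses per instruction of the full trace
    trace_fraction -- fraction of the trace each sample uses
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    # The goal allows a sample to use at most 10% of the trace.
    if trace_fraction > max_fraction:
        return False
    # Count estimates whose relative error is at most 10%.
    within = sum(1 for e in estimates
                 if relative_error(e, true_mpi) <= max_error)
    # The goal requires this to hold at least 90% of the time.
    return within / len(estimates) >= min_success
```

For example, with a true value of 0.002 misses per instruction, ten samples of which nine fall within 10% relative error would satisfy the goal, provided each sample uses no more than 10% of the trace.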

[1] A. Agarwal, J. Hennessy, and M. Horowitz, "Cache performance of operating systems and multiprogramming workloads," ACM Trans. Comput. Syst., vol. 6, pp. 393-431, Nov. 1988.
[2] A. Agarwal and M. Huffman, "Blocking: Exploiting spatial locality for trace compaction," Proc. Conf. Measurement and Modeling of Computer Systems, 1990, pp. 48-57.
[3] A. Borg, R. E. Kessler, G. Lazana, and D. W. Wall, "Long address traces from RISC machines: Generation and analysis," Res. Rep. 89/14, Western Res. Lab., Digital Equipment Corp., Palo Alto, CA, Sept. 1989.
[4] A. Borg, R. E. Kessler, and D. W. Wall, "Generation and analysis of very long address traces," Proc. 17th Int'l Symp. Computer Architecture, May 1990, IEEE CS Press, Los Alamitos, Calif., Order No. 2047, pp. 270-279.
[5] W. G. Cochran, Sampling Techniques, 3rd ed. New York: John Wiley, 1977.
[6] M. C. Easton and R. Fagin, "Cold-start versus warm-start miss ratios," Commun. ACM, vol. 21, no. 10, pp. 866-872, Oct. 1978.
[7] P. Heidelberger and H. S. Stone, "Parallel trace-driven cache simulation by time partitioning," IBM Res. Rep. RC 15500, no. 68960, Feb. 1990.
[8] M. D. Hill and A. J. Smith, "Evaluating associativity in CPU caches," IEEE Trans. Comput., vol. 38, no. 12, pp. 1612-1630, Dec. 1989.
[9] N. P. Jouppi, "Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers," Proc. 17th Int'l Symp. Computer Architecture, CS Press, Los Alamitos, Calif., May 1990, pp. 364-373.
[10] R. E. Kessler, "Analysis of multi-megabyte secondary CPU cache memories," Ph.D. thesis, Comput. Sci. Tech. Rep. no. 1032, Univ. of Wisconsin-Madison, WI, July 1991.
[11] D. Kroft, "Lockup-free instruction fetch/prefetch cache organization," in Proc. 8th Annu. Symp. Comput. Architecture, June 1981, pp. 81-87.
[12] S. Laha, J. H. Patel, and R. K. Iyer, "Accurate low-cost methods for performance evaluation of cache memory systems," IEEE Trans. Comput., vol. 37, no. 11, pp. 1325-1336, Nov. 1988.
[13] S. Laha, "Accurate low-cost methods for performance evaluation of cache memory systems," Ph.D. dissertation, Univ. Illinois, IL, Nov. 1987.
[14] I. Miller, J. E. Freund, and R. Johnson, Probability and Statistics for Engineers, 4th ed. Englewood Cliffs, NJ: Prentice Hall, 1990.
[15] M. J. K. Nielsen, "Titan system manual," Res. Rep. 86/1, Western Res. Lab., Digital Equipment Corp., Palo Alto, CA, Sept. 1986.
[16] S. A. Przybylski, "Performance-directed memory hierarchy design," Ph.D. thesis, Tech. Rep. CSL-TR-88-366, Stanford Univ., Stanford, CA, Sept. 1988.
[17] S. Przybylski, M. Horowitz, and J. Hennessy, "Characteristics of performance optimal multilevel cache hierarchies," in Proc. 16th Annu. Int. Symp. Comput. Architecture, 1989, pp. 114-121.
[18] T. Puzak, "An analysis of cache replacement algorithms," Ph.D. dissertation, Univ. of Massachusetts, Feb. 1985.
[19] A. D. Samples, "Mache: No-loss trace compaction," in Proc. Int. Conf. Measurement and Modeling of Comput. Syst., 1989, pp. 89-97.
[20] A. J. Smith, "Two methods for the efficient analysis of memory address trace data," IEEE Trans. Software Eng., vol. SE-3, no. 1, pp. 94-101, Jan. 1977.
[21] A. Smith, "Cache memories," Computing Surveys, vol. 14, no. 3, pp. 473-530, Sept. 1982.
[22] H. S. Stone, High Performance Computer Architecture. Reading, MA: Addison-Wesley, 1990.
[23] W. Wang and J. Baer, "Efficient trace-driven simulation methods for cache performance analysis," in Proc. Conf. Measurement and Modeling of Comput. Syst., 1990, pp. 27-36.
[24] D. A. Wood, "The design and evaluation of in-cache address translation," Ph.D. thesis, Comput. Sci. Division, Tech. Rep. UCB/CSD 90/565, Univ. of California, Berkeley, CA, Mar. 1990.
[25] D. A. Wood, M. D. Hill, and R. E. Kessler, "A model for estimating trace-sample miss ratios," in Proc. ACM SIGMETRICS Conf. Measurement and Modeling of Comput. Syst., 1991, pp. 79-89.

Index Terms:
buffer storage; memory architecture; program diagnostics; performance evaluation; digital simulation; trace-sampling techniques; multi-megabyte caches; time sampling; reference traces; sampling goal; relative error; cold-start bias; consecutive references.
Citation:
R.E. Kessler, M.D. Hill, D.A. Wood, "A Comparison of Trace-Sampling Techniques for Multi-Megabyte Caches," IEEE Transactions on Computers, vol. 43, no. 6, pp. 664-675, June 1994, doi:10.1109/12.286300