Measuring the Performance of Multimedia Instruction Sets
November 2002 (vol. 51 no. 11)
pp. 1317-1332

Abstract: Many microprocessor instruction sets include instructions for accelerating multimedia applications such as DVD playback, speech recognition, and 3D graphics. Despite general agreement on the need to support this emerging workload, there are considerable differences between the instruction sets that have been designed to do so. In this paper, we study the performance of five instruction sets on kernels extracted from a broad multimedia workload. We compare contemporary implementations of each extension against one another and against the performance of the original compiled C code. From our analysis, we determine how well multimedia workloads map to current instruction sets, noting which features were useful and which were not. We also propose two enhancements: fat subwords and strided memory operations.

Index Terms:
SIMD, subword parallel, multimedia, performance measurement, benchmarking, MMX, SSE, AltiVec, VIS, MVI.
Nathan Slingerland, Alan Jay Smith, "Measuring the Performance of Multimedia Instruction Sets," IEEE Transactions on Computers, vol. 51, no. 11, pp. 1317-1332, Nov. 2002, doi:10.1109/TC.2002.1047756