Native Data Representation: An Efficient Wire Format for High-Performance Distributed Computing
December 2002 (vol. 13 no. 12)
pp. 1234-1246

Abstract: New trends in high-performance software development, such as tool- and component-based approaches, have increased the need for flexible and high-performance communication systems. Reaping the well-known benefits of these approaches raises the question of what communication infrastructure should link the various components. In this context, flexibility and high performance seem to be incompatible goals. Traditional HPC-style communication libraries, such as MPI, offer good performance but are not intended for loosely coupled systems. Object- and metadata-based approaches like XML offer the needed plug-and-play flexibility, but with significantly lower performance. We observe that the flexibility and baseline performance of data exchange systems are strongly determined by their wire formats, that is, by how they represent data for transmission in heterogeneous environments. After examining the performance implications of a number of different wire formats, we propose an alternative approach for flexible high-performance data exchange, Native Data Representation, and evaluate its current implementation in the Portable Binary I/O library.

Index Terms:
High-performance, distributed computing, communication, wire format.
Greg Eisenhauer, Fabián E. Bustamante, Karsten Schwan, "Native Data Representation: An Efficient Wire Format for High-Performance Distributed Computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 12, pp. 1234-1246, Dec. 2002, doi:10.1109/TPDS.2002.1158262