Using Automatic Process Clustering for Design Recovery and Distributed Debugging
June 1995 (vol. 21 no. 6)
pp. 515-527
Distributed applications written in Hermes typically consist of a large number of sequential processes. The use of a hierarchy of process clusters can facilitate the debugging of such applications. Ideally, such a hierarchy should be derived automatically. This paper discusses two approaches to automatic process clustering, one analyzing runtime information with a statistical approach and one utilizing additional semantic information. Tools realizing these approaches were developed and a quantitative measure to evaluate process clusters is proposed. The results obtained under both approaches are compared, and indicate that the additional semantic information improves the cluster hierarchies derived. We demonstrate the value of automatic process clustering with an example. It is shown how appropriate process clusters reduce the complexity of the understanding process, facilitating program maintenance activities such as debugging.
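The abstract does not specify the clustering algorithms used, so the following is only an illustrative sketch of what a statistical, runtime-based approach could look like. It is not the authors' method. It assumes that the runtime information is the number of messages exchanged between each pair of processes, and it builds a cluster hierarchy by greedy agglomerative merging with average linkage. The function name `cluster_hierarchy`, the input format, and the sample data are all hypothetical.

```python
from itertools import combinations


def cluster_hierarchy(message_counts):
    """Greedy agglomerative clustering of processes by communication volume.

    message_counts maps frozenset({p, q}) to the number of messages the two
    processes exchanged (an assumed form of runtime information).
    Returns the merges in order as (cluster_a, cluster_b, similarity) tuples.
    """
    processes = sorted({p for pair in message_counts for p in pair})
    clusters = [frozenset([p]) for p in processes]
    merges = []

    def similarity(a, b):
        # Average linkage: mean message count over all cross-cluster pairs.
        total = sum(
            message_counts.get(frozenset((p, q)), 0) for p in a for q in b
        )
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the two clusters that communicate most heavily on average.
        a, b = max(combinations(clusters, 2), key=lambda pair: similarity(*pair))
        merges.append((a, b, similarity(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# Hypothetical trace: A and B talk a lot, C and D talk a lot, B and C rarely.
counts = {
    frozenset(("A", "B")): 10,
    frozenset(("C", "D")): 8,
    frozenset(("B", "C")): 1,
}
merges = cluster_hierarchy(counts)
for left, right, score in merges:
    print(sorted(left), "+", sorted(right), "similarity", score)
```

Each merge corresponds to one level of the hierarchy, so a debugger could in principle collapse or expand clusters level by level. An approach that uses additional semantic information, as the abstract describes, would need inputs beyond the message counts assumed here.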

Index Terms:
Cluster analysis, design recovery, distributed debugging, Hermes, process clustering, reverse engineering.
Thomas Kunz, James P. Black, "Using Automatic Process Clustering for Design Recovery and Distributed Debugging," IEEE Transactions on Software Engineering, vol. 21, no. 6, pp. 515-527, June 1995, doi:10.1109/32.391378