Parallel and Distributed Processing Symposium, International (2011)
Anchorage, Alaska USA
May 16, 2011 to May 20, 2011
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/IPDPS.2011.60
We study the complexity of traversing tree-shaped workflows whose tasks require large I/O files. Such workflows typically arise in the multifrontal method of sparse matrix factorization. We target a classical two-level memory system, where the main memory is faster but smaller than the secondary memory. A task in the workflow can be processed if all its predecessors have been processed, and if its input and output files fit in the currently available main memory. The amount of available memory at a given time depends upon the ordering in which the tasks are executed. What is the minimum amount of main memory, over all post order schemes, or over all possible traversals, that is needed for an in-core execution? We establish several complexity results that answer these questions. We propose a new, polynomial time, exact algorithm which runs faster than a reference algorithm. Next, we address the setting where the required memory renders a pure in-core solution unfeasible. In this setting, we ask the following question: what is the minimum amount of I/O that must be performed between the main memory and the secondary memory? We show that this latter problem is NP-hard, and propose efficient heuristics. All algorithms and heuristics are thoroughly evaluated on assembly trees arising in the context of sparse matrix factorizations.
Y. Robert, M. Jacquelin, B. Uçar and L. Marchal, "On Optimal Tree Traversals for Sparse Matrix Factorization," 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2011)(IPDPS), Anchorage, AK, 2011, pp. 556-567.