Isabel Beichl, National Institute of Standards and Technology
Francis Sullivan, IDA Center for Computing Sciences

Pages: 12–13


Combinatorial questions and problems have been important to computational modeling since the earliest days of digital machines, but until recently, the computational science community has considered these topics to be a necessary substructure, not a main subject of interest. Here's an example of how things used to be: once upon a time, one of us worked as a beginning programmer for an entity called "The Westinghouse Astronuclear Lab." Much time and mental effort went into writing a code that read a matrix *A* from tape, did a few computations to reorder *A*'s entries into *A*1, and then transposed *A*1 and wrote the transpose onto tape. The matrices, of course, didn't fit in core—in fact, the computer couldn't even store one row. We had to invent various "tricks" for marking where the data came from and where it was to go. Interestingly, many of the engineers working on the problem had trouble understanding why reorganizing the data should be even slightly hard, but those who'd written programs for the IBM 7090 got the point right away.

In reality, of course, questions about how to arrange data to minimize computation and provide rapid access have always been crucial in determining the success or failure of computational projects and computing machines. In a 1970 paper appropriately entitled "von Neumann's First Computer Program" (*ACM Computing Surveys*, vol. 2, no. 4, 1970, pp. 247–260), Donald Knuth discussed what was probably the first program John von Neumann wrote for a stored-program machine—a program for sorting data into nondecreasing order. According to Knuth, von Neumann chose this example because he was confident that the proposed EDVAC could do arithmetic, but he wasn't so sure about logical control in a complex process such as sorting.

The current upsurge in work on combinatorial computing is happening for several reasons. Today's machines enable computational modeling of extremely complex physical situations, and the models almost always involve discretization of some set of differential equations, which in turn generates sparse matrices that we can (and should) think of as graphs. Moreover, the things being modeled have many interconnected parts—how the software handles these interconnections is a combinatorial problem. In addition to being able to perform extremely large calculations, modern hardware can also handle extremely large data sets. As a result, combinatorial problems generated by challenges in data mining and related topics are now central to computational science. Finally, there's the Internet itself, probably the largest graph-theory problem ever confronted.
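The sparse-matrix-as-graph view can be made concrete: treat each row/column index as a vertex and each nonzero off-diagonal entry as an edge. Here's a minimal Python sketch of that idea (the function and variable names are ours, purely illustrative); a one-dimensional Laplacian discretization yields a tridiagonal matrix whose nonzero pattern is exactly a path graph.

```python
# Illustrative sketch (not from any article in this issue): a sparse matrix,
# stored as {(i, j): value}, viewed as an undirected graph.

def matrix_to_graph(entries, n):
    """Build an adjacency list: each nonzero off-diagonal entry (i, j)
    of the n-by-n matrix becomes an edge between vertices i and j."""
    adj = {v: set() for v in range(n)}
    for (i, j), value in entries.items():
        if i != j and value != 0:
            adj[i].add(j)
            adj[j].add(i)
    return adj

# 1-D Laplacian (second-difference) matrix on n = 4 grid points:
# 2 on the diagonal, -1 on the sub- and superdiagonals.
n = 4
laplacian = {}
for i in range(n):
    laplacian[(i, i)] = 2.0
    if i + 1 < n:
        laplacian[(i, i + 1)] = -1.0
        laplacian[(i + 1, i)] = -1.0

graph = matrix_to_graph(laplacian, n)
print(graph)  # each vertex is linked to its grid neighbors: a path 0-1-2-3
```

The nonzero structure of the discretized operator reproduces the connectivity of the underlying grid, which is why graph algorithms (ordering, partitioning, coloring) apply directly to sparse matrices.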

Here are some of the main combinatorial metaproblems:

- Rearrange the data for easy access and/or to avoid extra computation.
- Find a specific item and show it to me.
- Count the number of items that have a specific property.

But for computational science, we should restate these metaproblems:

- Rearrange the data—and don't take too long doing it.
- Find a specific item—fast!
- Count the number of items—quickly!

Combinatorial problems tend to be about graphs because graphs are extremely general, and they also tend to be hard—in fact, NP-complete or #P-complete. Therefore, we must restate the last two metaproblems once more:

- Find an item that's probably the one I'm looking for.
- Approximate the number of items fairly accurately and very quickly.
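One standard way to "approximate the number of items fairly accurately and very quickly" is Monte Carlo sampling: draw random items from an easily counted universe and scale the observed fraction that has the property. A toy sketch of this idea, in Python (the function name and the divisibility example are ours, purely illustrative):

```python
import random

def approx_count(universe_size, has_property, samples=10_000, seed=1):
    """Estimate how many of the integers 0..universe_size-1 satisfy
    has_property, by uniform random sampling (basic Monte Carlo counting)."""
    rng = random.Random(seed)
    hits = sum(has_property(rng.randrange(universe_size))
               for _ in range(samples))
    return universe_size * hits / samples

# Toy property: integers below 10**6 divisible by 3 (exactly 333,334 of them).
estimate = approx_count(10**6, lambda x: x % 3 == 0)
exact = sum(1 for x in range(10**6) if x % 3 == 0)
```

With 10,000 samples the estimate lands within a few percent of the true count; the hard part in real counting problems (the subject of Bezáková's article) is sampling from universes that are themselves hard to describe uniformly.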

This issue's articles touch on several aspects of these metaproblems.

The article by Bruce Hendrickson and Jonathan W. Berry ("Graph Analysis with High-Performance Computing") describes some excellent examples of how to perform fast operations on graphs. It's likely that operations on graphs will soon be standard benchmarks for evaluating machine performance. For more about benchmarks, see the recent report, "The Landscape of Parallel Computing Research: A View from Berkeley" ( www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html).

In their article, "A Unified Framework for Numerical and Combinatorial Computing," John R. Gilbert, Steve Reinhardt, and Viral B. Shah describe tools and computing environments for working with combinatorial objects.

Ivona Bezáková ("Sampling Binary Contingency Tables") takes up some questions about the theory of algorithms for counting. This active research area is important to all parts of combinatorial computing.

Finally, Heiko Bauke's article, "Passing Messages to Lonely Numbers," is a beautiful example of combinatorial computing in action. He explains how to use a method called "message passing" or "belief propagation" to solve Sudoku puzzles. In addition to being amusing, his result is significant because Sudoku is known to be an NP-complete problem, so what he illustrates is one way to solve specific instances of NP-complete problems.

These articles sample some of the research areas in combinatorics in computing. They're an excellent sample, but, of course, there is much more to the field than what we present here—in fact, this topic will continue to grow as rapidly as computing grows. We expect you'll be seeing lots more on this subject over the next several years.

Isabel Beichl is a mathematician in the Information Technology Laboratory at the National Institute of Standards and Technology. Her research interests include probabilistic methods for physical problems, counting problems, and graph algorithms. Beichl has a PhD in mathematics from Cornell University. Contact her at isabel.beichl@nist.gov.

Francis Sullivan is the director of the IDA Center for Computing Sciences in Bowie, Maryland. He's also a former editor in chief of this magazine. His research interests include foundations of computing, statistical physics, and Monte Carlo approaches to combinatorial problems. Sullivan has a PhD in mathematics from the University of Pittsburgh. Contact him at fran@super.org.