Real Simulations
By David Alan Grier

Of all the branches or disciplines of computer science, the field of Monte Carlo simulation is perhaps the least understood. It is often viewed as something quite distant from computer science, an application for computers rather than a body of research that has contributed to the development of computers. Yet Monte Carlo simulation is deeply connected to the history of computing. Many early programs and many of the most sophisticated programs for the ENIAC, the first big all-electronic computing machine, were in fact Monte Carlo simulations of physical processes. “Monte Carlo methods proved to be of great importance to scientific computing and operations research after their computerized debut on ENIAC,” explained historians Tom Haigh, Mark Priestly, and Christian Rope.[1]

If we look beyond the fact that simulation was an early application for computing, we can discern a narrative in which simulation proved to be a model for computer science, a model that encouraged it to embrace “black-boxing,” a process that hides and abstracts details. By its very nature, simulation embraces the idea of black-boxing because it has the goal of presenting a simplified but realistic picture of a natural phenomenon. It was not obvious at the start of the computing era that computer science had the same goal.

Before I discuss the way that simulation and computer science interacted, it would be useful to examine the methods of Monte Carlo simulation and understand why it appealed to the early computer scientists. Monte Carlo simulation concepts predate the computing era by about two centuries and are grounded in probability theory and games of chance. In the 18th century, the French mathematician George Louis Leclerc, Comte de Buffon, noted that a certain game of chance could provide a good estimate for the number π (3.14159…). This game required a piece of paper ruled with parallel lines with each line a fixed distance from its neighbors, say 5 cm. It also required a needle that was as long as the distance between the lines, in this case, 5 cm.

The game of chance involved putting the paper on a tabletop and throwing the needle so that it would land on it. When the needle landed on the paper, it would either touch one of the lines or it would completely fall between two lines. The game would require multiple plays. At each step, the person playing it would record if it touched a line or if it fell between two. At the end of all the stops, that person would divide the total number of trials by the number of times that it touched one line. That ratio would prove to be an estimate of π. The more steps in the game, the closer that number would be.

## Background on Simulation

In the 19th century, this problem, which acquired the name of “Buffon’s Needle,” became an example of a mathematical theorem known as the “Law of Large Numbers.” That theorem said that when determining the value of a certain class of mathematical expressions, integrals could be estimated by random processes and that these random processes would converge, get closer and closer to the true value of the expression, the more the random processes were repeated. This process, estimating a mathematical quantity with random processes, came to be known as Monte Carlo integration or Monte Carle simulation after the name of the gambling casino in the principality of Monaco.

Prior to the development of the computer in the mid-1940s, Monte Carlo simulation was at best a curiosity. For most settings, it was not practicable to estimate the value of an integral by creating the appropriate random process and repeating that process multiple times. However, the electronic computer changed the economy of simulation for it could do repeated calculations easily. However, it was designed to be a deterministic machine, a machine that did the same thing every time it ran the same program. It was not clear that it could create and manage random processes.

At the time, the fields of physics, telecommunications, and production management created a series of problems that was difficult, important, and unable to be analyzed with traditional mathematical tools. The production management problem dated to the end of the 19th century and perhaps to the start of the industrial age. It involved studying how a product moved through a factory or a distribution network and identifying the crucial points in the process that might slow production. The communications problem, which dated to the 1910s and 1920s, dealt with a similar issue for telephone networks. It considered how messages moved through a grid and the points where they tended to stall.

Physics proposed several different problems, although all had common elements and all were involved in the development of atomic weaponry. One of the first of these problems involved looking at how neutrons moved through various materials and interacted with other particles. All these problems from production, communications, and physics could be analyzed mathematically, but all of the analyses were too complex to produce simple numerical answers.

Computer scientists began working on the physics problem in early 1946. Indeed, the ENIAC and many of the other early machines devoted a great deal of time to solving these kinds of problems. The production and communications problems became prominent about a decade later. One of the earliest of these simulations created a model of automobile traffic flowing through a neighborhood of Detroit, Michigan.[2] However, the field of simulation grew rapidly. The Institute for Numerical Analysis at the University of California, Los Angeles (UCLA), held the first conference on this method in 1951. By the middle of the decade, papers on simulation were being presented at the AFIPS Conference on Computing, the big computing conference of its day. In 1961, researchers had developed special software for simulation, and in 1965, they formed their own conference developed entirely to simulation. It was one of the first meetings devoted to a subfield of computing.

## Black Boxing

In the field of computer engineering, the process known as black boxing is usually associated with the 1972 paper by David Parnas, “On the Criteria to Be Used in Decomposing Systems into Modules.”[3] This paper largely introduced the concepts of information hiding and software modules. This work has been one of the foundations of software engineering and has supported a tremendous amount of software development.

However, the field of computer science, indeed most fields of science or technology, has undergone a more subtle form of black boxing, of answering fundamental questions in the field and closing the controversy over those questions. It is what philosopher Bruno Latour describes as moving from the “process of making science” to that of working with “ready made science.”[4] Once a field goes though one of these transitions, it puts aside the debate over a question and starts work based on the answer that has been accepted by the scientific community.

Latour cites the acceptance of the double helix structure of DNA as an example of this black-boxing process. Molecular biologists did not immediately embrace the idea of James Watson and Francis Crick when they announced that they had determined that DNA was organized in a double helix structure. Indeed, scientists continued to argue the nature of DNA structure for at least the next four years. Eventually, the field accepted Watson and Crick’s solution. After that, no researcher began the study of DNA without acknowledging its double helix structure.[4]

The field of computer science has not always accepted the idea that it has answered some fundamental questions and needs to move forward with the accepted answers. The introduction of high-level languages provides a good example. They start to appear in 1958 with Fortran. However, many computer scientists resisted them well into the 1980s. Members of this cohort argued that high-level languages were too inefficient and that they hid too much of the underlying structure.

By contrast, the Monte Carlo simulation community was relatively quick to accept answers to fundamental questions and start research based on the answers to those questions. The bulk of the simulation community, after all, was attempting to build models and understood that modeling involves ignoring, simplifying or hiding details. Historian Peter Gallison argued that the field was a “trading zone,” a discipline that had no central agreement but a loosely connected set of communities that operated to advance their own agenda. As a result, elements of the field were always willing to accept new ideas if those ideas helped them meet their goals.5[]

## Black Boxing Monte Carlo Simulation

Over the course of 25 years, the simulation made major decisions on four major aspects of its field: the acceptability of pseudorandom number generation,[6] the standard algorithm of events driven for priority queues,[7] the use of object oriented programming to represent simulated objects,[8] and the protocols for distributed simulation.[9]

The process of black boxing is most clearly seen in the acceptance of pseudorandom number generation. The controversy over whether deterministic computers could generate took over the first decade of the field. Many researchers felt that deterministic computers could not possibly produce a stream of digits that truly possessed the same properties as those derived from physical processes. Indeed, Derrick Lehmer, who is generally credited with inventing the first practical random generator noted, “A pseudo-random sequence is a vague notion embodying the idea of a sequence in which each term is unpredictable to the uninitiated.”[10] As an attempt to create an alternative to computer generated numbers, the RAND Corporation created a book of a million random digits using a roulette wheel.[11]

Yet, over the course of roughly a decade, researchers began to realize that the Monte Carlo field was built on a foundation that didn’t strictly require numbers to be truly random. The theorems would work if the numbers adhered to a certain statistical distribution and exhibited no autocorrelation, no relationships between successive numbers. Once the community recognized that fact, it ceased to debate random number generation and accepted it as a fact in the field, and random number generators were one of the tools for simulation.

Yet, the tool of the random number generator was not a static concept. Three times since the early 1960s, researchers have reopened the box of random number generation to fix problems in the current algorithms. In all cases, however, no one suggested that random number generation needed to be abandoned.

When the field accepted random number generation as a key tool, it made a subtle shift in methodology. Practitioners spent less time justifying the probability models behind their simulation and more time validating the output of their simulation. They would argue the correctness of their model not because the underlying mathematics was correct but because the simulation program could duplicate data that had been collected from a real phenomenon.

## Conclusion

For years, the oldest conference in the field, the IEEE’s Winter Simulation Conference, has slyly mocked the other parts of computer science that tended to dismiss it. That mockery came in the form of the award that it offered to those who made the greatest contributions to the conference. That award was a snow globe, a glass sphere that was filled with a miniature winter scene and plastic flakes that would swirl like snow whenever someone shook the globe. “We Don’t Simulate Winter,” read a plaque on the globe. “We Are Simulation.”

Ultimately, simulation has had a tremendous impact on computer science. It has helped to create much of the technology behind computer games and been a principal application for ideas from high-speed computing and computer graphics. It helped to refine the concepts of many computing languages and was the principal test bed for object-oriented programming. But perhaps it best served as an example for how a field could quickly solve difficult problems and end debate over the solutions to those problems. They were able to do this because they were more interested in modeling than in justifying the framework of modeling. They didn’t simulate winter. They weren’t trying to justify their work by how it advanced the foundation of computer science. They were trying to simulate the world.

### References

1. T. Haigh, M. Priestly, and C. Rope, “Los Alamos Bets on ENIAC: Nuclear Monte Carlo Simulations, 1947–1948,” Annals of the History of Computing, vol. 36, no. 3, 2014, pp. 42–63.
2. W. Hoffman and R. Pavley, “Applications of Digital Computers to Problems in the Study of Vehicular Traffic,” Proc. Western Joint IRE-AIEE-ACM Computer Conf., 1959, pp. 159–161.
3. D. Parnas, “On the Criteria for Decomposing Software into Modules,” Comm. ACM, vol. 15, no. 12, 1972, pp. 1053–1058.
4. B. Latour, “Opening Pandora’s Box,” Science in Action, Harvard Univ. Press, 1987.
5. P. Galison, “Computer Simulation and the Trading Zone,” The Disunity of Science: Boundaries, Contexts, and Power, Galison and Stump, eds., Stanford Univ. Press, 1996, p. 119.
6. D. Knuth, Semi-Numerical Algorithms, third edition, Addison-Wesley, 1997.
7. H. Harman, “Simulation: A Survey,” Proc. Western Joint IRE-AIEE-ACM Computer Conf., 1961, pp. 1–9.
8. R. Nance, “Simulation Programming Languages: An Abridged History,” Proc. 1995 Winter Simulation Conference, eds. C. Alexopoulos et al., IEEE, 1995.
9. A.R. Pope and R.L. Schaffer, The SIMNET Network and Protocols, BBN report 7627, Cambridge, June 1991.
10. D.H. Lehmer, “Mathematical Methods in Large-Scale Computing Units,” Proc. 2nd Symp. Large-Scale Digital Calculating, Harvard Computing Laboratory, Cambridge, 1949.
11. G. Brown, A History of Rand’s Random Digits, tech. report P-113, Rand Corporation, 1949.