Issue No. 03 - May/June (2007 vol. 9)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MCSE.2007.51
Paul F. Dubois, Contributing Editor
This issue's special theme is the computer programming language Python and the increasing role it plays in scientific projects. Free and universally available, Python comes with a vast standard library containing support for nearly every area of computer science. An even more extensive set of third-party tools and modules covers additional tasks, from managing a Web site to doing a fast Fourier transform to distributed or parallel programming. Python's motto, "batteries included," is meant to convey the idea that Python comes with everything you need.
Interpreted Doesn't Mean Slow or Only Interactive
Python is an interpreted language, and it can be used interactively. Some might assume that this limits its uses—for example, that an interpreted language can't possibly be fast enough for scientific programming—but as you'll see, this isn't true. Others might assume that an interactive language can't be used in a large code, or in a batch or parallel system, but that's not right either.
Python's numerical extension NumPy adds an array language similar in power to the one in modern Fortran, in which operations are performed in compiled code. Together with modules for numerical mathematics and graphics, Python by itself is a powerful computational tool. Moreover, it's easy to make your own compiled code callable from Python (and able to call Python itself). Various tools help you make this connection quickly and easily, and once you connect to Python, you have access to Python's "batteries."
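As a rough illustration of the array language described above, here is a minimal sketch. It assumes NumPy is installed and imported under its conventional alias np; the variable names and the sample function are invented for this example and are not taken from any particular application.

```python
import numpy as np

# Each expression below operates on a whole array at once, so the
# element-by-element loops run inside NumPy's compiled code rather
# than in the Python interpreter.
x = np.linspace(0.0, 1.0, 5)          # 5 evenly spaced points from 0 to 1
y = 3.0 * x**2 + 1.0                  # elementwise arithmetic, no Python loop

# Slices and masks play a role similar to Fortran array sections and WHERE.
interior = y[1:-1]                    # every point except the two endpoints
clipped = np.where(y > 2.0, 2.0, y)   # cap any value above 2.0 at 2.0

total = y.sum()                       # reduction over the whole array
```

The same calculation written as an explicit Python loop would give the same numbers, but the per-element work would then run in the interpreter instead of in compiled code.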
My own interest in Python focuses on using it for computational steering: Python serves as the input language to a scientific application, and the actual computations are performed both in Python itself and in compiled extensions. This approach gives users the chance to be creative, serves as a built-in symbolic debugger and interactive graphics capability, and reduces development time. Nobody I know who has experienced it has ever been willing to be without it in the future.
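To make the idea of steering concrete, here is a toy sketch in pure Python. The HeatRod class and everything in it are hypothetical, invented for this illustration; in a real steered code its step method would be a thin wrapper around compiled Fortran, C, or C++ routines, and this is not how Basis or any LLNL application is actually organized.

```python
# Hypothetical stand-in for a compiled physics package. In a real steered
# code, step() would call into compiled routines instead of looping here.
class HeatRod:
    def __init__(self, n_cells, left_temp, right_temp):
        self.temps = [0.0] * n_cells
        self.temps[0] = left_temp      # fixed boundary temperatures
        self.temps[-1] = right_temp
        self.cycle = 0

    def step(self, alpha=0.25):
        """Advance one explicit diffusion step on the interior cells."""
        old = self.temps
        new = old[:]
        for i in range(1, len(old) - 1):
            new[i] = old[i] + alpha * (old[i - 1] - 2.0 * old[i] + old[i + 1])
        self.temps = new
        self.cycle += 1


# Everything below plays the role of the user's input file. It is ordinary
# Python, so the user can loop, test conditions, and inspect state freely.
rod = HeatRod(n_cells=11, left_temp=100.0, right_temp=0.0)

history = []
while rod.cycle < 500:
    rod.step()
    if rod.cycle % 100 == 0:           # sample the midpoint every 100 cycles
        history.append((rod.cycle, rod.temps[5]))

midpoint = rod.temps[5]
```

Because the input is a program rather than a fixed list of parameters, the user decides when to stop, what to record, and what to examine. Typing the same statements at the interactive prompt gives the debugging and inspection capability described above. In this toy problem the midpoint temperature settles toward 50, halfway between the two boundary values.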
I first wrote a system for producing steered code in 1984 at Lawrence Livermore National Laboratory. This system, called Basis, was such a new idea that it was difficult at the time to explain it to people, but it proved very successful and developers have written at least 200 applications with it. Some of the larger ones are still in use today, and Basis is still an active project (http://basis.llnl.gov). The key to its success is that the interpreted language I wrote for the steering was an array language quite similar to what was eventually in Fortran 95. By using the array operations, I could do real work in the interpreter (besides calling the compiled routines to do the bulk of the work). Most importantly, the language was simple: it was enough like Fortran that users could easily read it and learn to write it.
However, trouble was coming. Basis supported Fortran 77, and I could see that not only was Fortran going to evolve but the object-oriented revolution was upon us. So in the early 1990s, I contemplated my "Act II." I even designed an object-oriented interpreter and implemented a prototype. We held periodic meetings with Basis users to discuss their requirements.
One day I found Python and saw that it had a great similarity to my prototype, was better thought out, and was much further along. The one thing it lacked was an array-language extension, but a special-interest group was already looking into that. At the next meeting, I mentioned it favorably, and David Grote, a member of the group, said that he, too, had just discovered it and thought it would do the job. I decided to throw my efforts into helping design the array extension that became Numerical Python. Jim Hugunin volunteered to write the code; he later moved on to create Jython and IronPython, the Java and .NET versions of Python. I took over as the project's coordinator, and five years later, I passed the torch to Perry Greenfield (whose article about telescopes appears in this issue). Now the project is led by Travis Oliphant, who describes it more fully on p. 10.
The happy ending here is that we made a good choice, and LLNL now has many Python-based efforts built from scratch or wrapped around legacy codes, and others that evolved from Basis codes: hundreds of thousands of lines of C++, Python, and Fortran 95, all working together just as we hoped, doing compute-intensive calculations on massively parallel computers.
In this Issue
We begin the issue with a basic introduction to Python in general and the SciPy project in particular. SciPy gathers the high-performance array extension together with many modules for common mathematical and statistical functions. Our next major article introduces the advanced interactive interpreter IPython and the matplotlib graphics package. IPython is the computational tool of choice for some people, used in much the same way as commercial products such as Matlab but with access to the full Python world and at no cost. Matplotlib is rapidly becoming accepted as the standard two-dimensional graphics utility for Python, and the Scientific Programming department on p. 90 discusses it in even greater detail.
After the larger introductory articles, we have a series of shorter pieces that present specific scientific, engineering, and educational applications. To show you a wide variety of applications, we tried to gather the basic material on the language and its tools into the first two articles, so we suggest you read those first, after sneaking a peek at the pretty pictures in the application pieces. An extra article on Python in the classroom appears in the Education department.
Although I asked the authors to state briefly why they find Python helpful, I also asked them not to extensively argue for it over some other technology choice. As in Field of Dreams, we think we've built it and that you will come once you see it.
I hope you enjoy our special issue and will try the Python approach to scientific computing. Table 1 should get you started, with a list of basic Python resources that are open source and available without charge. There are many, many more; start your hunt at the Python Cheese Shop (http://cheeseshop.python.org).
Paul F. Dubois is retired and lives in Pleasanton, California, where he contributes to open source projects and writes for CiSE. His column "Cafe Dubois" will return next issue. Contact him at firstname.lastname@example.org.