The Past, Present, and Future of Refactoring
Guest Editor’s Introduction • Emerson Murphy-Hill • December 2015
Translations by Osvaldo Perez and Tiejun Huang
Software, as they say, is eating the world. From personal computers to our cell phones, cars, and vacuum cleaners, software is growing in both the number of places we find it and the functionality we expect of it. Unfortunately, like most of us, as software matures and becomes more capable, it also deteriorates. Even software that starts out with a nice, clean design ends up a tangled mess as users demand that more features be bolted on and bugs be fixed more quickly. This mess becomes harder and more costly for its developers to maintain.
One of the primary ways that developers have coped with such deterioration over time is refactoring: the practice of changing existing code’s design without changing the way it behaves. When software’s design no longer fits its purpose, refactoring allows developers to change it. Refactoring can entail very small-scale restructuring, such as changing a method’s signature so that it accepts a new parameter, or it can involve much larger restructuring, such as replacing an algorithm with another equivalent one or rearranging an inheritance hierarchy. Although the word refactoring has come into fashion relatively recently, software has needed to be restructured ever since it has been structured. After all, no one gets design right the first time.
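To make the idea of changing design without changing behavior concrete, here is a minimal sketch in Python. The function names, the ten-percent member discount, and the sample inputs are all hypothetical, chosen only for illustration. The second version extracts a helper and replaces a hand-written loop with an equivalent built-in, and a small check compares the two versions on sample inputs.

```python
def total_before(prices_cents, is_member):
    """Original design: summing and discount logic are tangled together."""
    total = 0
    for price in prices_cents:
        total = total + price
    if is_member:
        total = total - total // 10
    return total


def member_discount(amount_cents):
    """Extracted helper (hypothetical rule: members get ten percent off)."""
    return amount_cents // 10


def total_after(prices_cents, is_member):
    """Refactored design: same results, but each step now has a name."""
    subtotal = sum(prices_cents)
    discount = member_discount(subtotal) if is_member else 0
    return subtotal - discount


def behaves_the_same(cases):
    """Compare the old and new versions on each (prices, is_member) case."""
    return all(
        total_before(prices, member) == total_after(prices, member)
        for prices, member in cases
    )


sample_cases = [([], False), ([100, 250], True), ([999], True), ([1, 2, 3], False)]
print(behaves_the_same(sample_cases))
```

Agreement on a handful of inputs is evidence that behavior was preserved, not proof of it.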
Tools and Techniques
Refactoring by hand can be tedious and error-prone, but advances in integrated development environments have made automated refactoring tools feasible. Indeed, the mark of a mature development environment is that it has refactoring tools; such environments now include IntelliJ, Eclipse, Visual Studio, and Xcode. Given a software developer’s input on what code to refactor, such tools automatically validate the refactoring and then make the code changes themselves. For example, the most popular refactoring tool across development environments renames program elements. With this rename refactoring tool, a developer selects, for example, a class name and chooses a new name; the tool then makes sure that no existing class has that name before making the change and updating all references accordingly. It’s a bit more complicated than that, of course, but that’s the gist.
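The gist of a rename refactoring can be sketched in a few lines of Python using the standard `ast` module. This is a toy illustration, not how any of the tools named above work: the function `rename_class` and the exception `RenameConflict` are hypothetical names, and the sketch assumes a single module in which the old name is not shadowed. It also ignores scoping rules and occurrences in strings or comments, and `ast.unparse` (Python 3.9 or later) discards comments and original formatting.

```python
import ast


class RenameConflict(Exception):
    """Raised when the requested new name is already in use."""


def rename_class(source, old, new):
    """Rename class `old` to `new` in `source`, updating references to it."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))

    # Validation step 1: the class being renamed must actually exist.
    class_names = {n.name for n in nodes if isinstance(n, ast.ClassDef)}
    if old not in class_names:
        raise ValueError(f"no class named {old!r}")

    # Validation step 2: the new name must not collide with an existing
    # class, function, or variable name anywhere in the module.
    taken = class_names | {
        n.name
        for n in nodes
        if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
    } | {n.id for n in nodes if isinstance(n, ast.Name)}
    if new in taken:
        raise RenameConflict(f"{new!r} is already defined")

    # Transformation: rename the definition and every reference to it.
    for n in nodes:
        if isinstance(n, ast.ClassDef) and n.name == old:
            n.name = new
        elif isinstance(n, ast.Name) and n.id == old:
            n.id = new
    return ast.unparse(tree)


example = "class Account:\n    pass\n\nclass Ledger:\n    pass\n\na = Account()\n"
print(rename_class(example, "Account", "Customer"))
```

Calling `rename_class(example, "Account", "Ledger")` raises `RenameConflict` instead of changing the code, which mirrors the check-then-change order described above.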
Originally born of the object-oriented programming community, refactoring has grown to encompass a variety of paradigms. From logic programming to functional programming and even databases, the practice of refactoring is applicable to every kind of software you can imagine. In the same way that you’ll see patterns everywhere if you look hard enough, you’ll see patterns of change in the form of refactorings everywhere, as well.
Despite recent gains in the practice of refactoring, two chief challenges continue to confront researchers. First, refactoring tools must better support the types of changes developers want to make. Second, we need better ways to be sure that a refactoring actually preserves a program’s behavior.
For Computing Now’s December 2015 issue, we present five articles that explore the state of the practice of refactoring.
In “The Birth of Refactoring: A Retrospective Look into the Nature of High-Impact Software Engineering Research,” Bill Griswold and Bill Opdyke reflect on their experiences as pioneers in the field: they helped coin the term and built the first refactoring tools as part of their PhD research. The authors advise that young researchers shouldn’t be overly concerned about competing ideas, that luck and circumstance play a significant role in innovation, and that one should learn from failure.
Munawar Hafiz and Jeff Overbey examine popular notions regarding tools that automate refactoring in “Refactoring Myths.” They consider the ideas that refactoring tools are useful because they help automate tedious design changes, that a refactoring tool’s key benefit is its ability to guarantee behavior preservation, and that refactoring tools are robust. All of these myths do contain some truth, but as Hafiz and Overbey explain, they aren’t entirely accurate.
“When and Why Your Code Starts to Smell Bad” is Michele Tufano and his colleagues’ award-winning paper from the 2015 International Conference on Software Engineering. In it, they investigate the idea of “code smells,” that is, symptoms of poor design that can be alleviated by refactoring. In analyzing more than 200 open source projects, Tufano and his colleagues found evidence that, contrary to conventional wisdom, most code smells are not introduced during evolutionary software development tasks.
In “An Empirical Study of Refactoring Challenges and Benefits at Microsoft,” Miryung Kim, Thomas Zimmermann, and Nachiappan Nagappan take a deep dive into refactoring at Microsoft by surveying and interviewing developers and analyzing their code. Among other details, Kim and her colleagues find that practitioners often blur the definition of refactoring, and that refactored modules in Windows 7 experienced a relative reduction in complexity.
The final article in this theme is “How We Refactor, and How We Know It,” a large-scale study of refactoring in which Chris Parnin, Andrew P. Black, and I build on our previous award-winning paper. We study refactoring using multiple techniques, in some cases confirming and in others disconfirming prior results about how developers refactor. For example, rather than refactoring in isolation, developers will often interleave refactorings with other types of changes, which makes traditional refactoring tools somewhat tricky to apply.
Refactoring continues to be a frequent and necessary practice and, at the same time, a burgeoning research topic. As you read these articles and explore further, please join me in building a future that enables developers to refactor effectively, efficiently, and correctly.
Emerson Murphy-Hill is an associate professor in North Carolina State University’s Department of Computer Science. His research interests include the intersection between human–computer interaction and software engineering. Murphy-Hill received a PhD in computer science from Portland State University. Contact him at email@example.com; http://people.engr.ncsu.edu/ermurph3.