Refactoring: Improving the Design of Existing Code

Preface

Once upon a time, a consultant made a visit to a development project in order to look at some of the code that had been written. As he wandered through the class hierarchy at the center of the system, the consultant found it rather messy. The higher-level classes made certain assumptions about how the classes would work—assumptions that were embodied in inherited code. That code didn’t suit all the subclasses, however, and was overridden quite heavily. Slight modifications to the superclass would have greatly reduced the need to override it. In other places, an intention of the superclass had not been properly understood, and behavior present in the superclass was duplicated. In yet other places, several subclasses did the same thing with code that could clearly be moved up the hierarchy.

The consultant recommended to the project management that the code be looked at and cleaned up—but the project management wasn’t enthusiastic. The code seemed to work and there were considerable schedule pressures. The managers said they would get around to it at some later point.

The consultant had also shown what was going on to the programmers working on the hierarchy. The programmers were keen and saw the problem. They knew that it wasn’t really their fault; sometimes, a new pair of eyes is needed to spot the problem. So the programmers spent a day or two cleaning up the hierarchy. When finished, they had removed half the code in the hierarchy without reducing its functionality. They were pleased with the result and found that it became quicker and easier both to add new classes and to use the classes in the rest of the system.

The project management was not pleased. Schedules were tight and there was a lot of work to do. These two programmers had spent two days doing work that added nothing to the many features the system had to deliver in a few months’ time. The old code had worked just fine. Yes, the design was a bit more “pure” and a bit more “clean.” But the project had to ship code that worked, not code that would please an academic. The consultant suggested that a similar cleanup should be done on other central parts of the system, which might halt the project for a week or two. All this was to make the code look better, not to make it do anything it didn’t already do.

How do you feel about this story? Do you think the consultant was right to suggest further cleanup? Or do you follow that old engineering adage, “if it works, don’t fix it”?

I must admit to some bias here. I was that consultant. Six months later, the project failed, in large part because the code was too complex to debug or tune to acceptable performance.

The consultant Kent Beck was brought in to restart the project—an exercise that involved rewriting almost the whole system from scratch. He did several things differently, but one of the most important changes was to insist on continuous cleaning up of the code using refactoring. The improved effectiveness of the team, and the role refactoring played, is what inspired me to write the first edition of this book—so I could pass on the knowledge that Kent and others have acquired by using refactoring to improve the quality of software.

Since then, refactoring has become an accepted part of the vocabulary of programming. And the original book has stood up rather well. However, eighteen years is an old age for a programming book, so I felt it was time to go back and rework it. Doing this had me rewrite pretty much every page in the book. But, in a sense, very little has changed. The essence of refactoring is the same; most of the key refactorings remain essentially the same. But I do hope that the rewriting will help more people learn how to do refactoring effectively.

What Is Refactoring?

Refactoring is the process of changing a software system in a way that does not alter the external behavior of the code yet improves its internal structure. It is a disciplined way to clean up code that minimizes the chances of introducing bugs. In essence, when you refactor, you are improving the design of the code after it has been written.

“Improving the design after it has been written.” That’s an odd turn of phrase. For much of the history of software development, most people believed that we design first, and only when done with design should we code. Over time, the code will be modified, and the integrity of the system—its structure according to that design—gradually fades. The code slowly sinks from engineering to hacking.

Refactoring is the opposite of this practice. With refactoring, we can take a bad, even chaotic, design and rework it into well-structured code. Each step is simple—even simplistic. I move a field from one class to another, pull some code out of a method to make it into its own method, or push some code up or down a hierarchy. Yet the cumulative effect of these small changes can radically improve the design. It is the exact reverse of the notion of software decay.
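
To make one of these small steps concrete, here is a minimal sketch of pulling some code out of a function to make it into its own function. The invoice shape and the names `owingReportBefore`, `owingReportAfter`, and `calculateOutstanding` are hypothetical, chosen only for illustration.

```javascript
// Hypothetical example: the invoice shape and function names are assumptions
// made for illustration; they do not come from the text.

// Before: one function both computes the total and formats the report.
function owingReportBefore(invoice) {
  let outstanding = 0;
  for (const order of invoice.orders) {
    outstanding += order.amount;
  }
  return `${invoice.customer} owes ${outstanding}`;
}

// After: the calculation is pulled out into its own named function.
function calculateOutstanding(invoice) {
  let outstanding = 0;
  for (const order of invoice.orders) {
    outstanding += order.amount;
  }
  return outstanding;
}

function owingReportAfter(invoice) {
  return `${invoice.customer} owes ${calculateOutstanding(invoice)}`;
}

const invoice = { customer: "Acme", orders: [{ amount: 30 }, { amount: 12 }] };

// External behavior is unchanged: both versions produce the same report.
console.log(owingReportBefore(invoice)); // Acme owes 42
console.log(owingReportAfter(invoice));  // Acme owes 42
```

The step is deliberately tiny: nothing a caller can observe has changed, but the calculation now has a name and can be reused or moved elsewhere in a later step.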

With refactoring, the balance of work changes. I found that design, rather than occurring all up front, occurs continuously during development. As I build the system, I learn how to improve the design. The result of this interaction is a program whose design stays good as development continues.

What’s in This Book?

This book is a guide to refactoring; it is written for a professional programmer. My aim is to show you how to do refactoring in a controlled and efficient manner. You will learn to refactor in such a way that you don’t introduce bugs into the code but methodically improve its structure.

Traditionally, a book starts with an introduction. I agree with that in principle, but I find it hard to introduce refactoring with a generalized discussion or definitions—so I start with an example. Chapter 1 takes a small program with some common design flaws and refactors it into a program that’s easier to understand and change. This will show you both the process of refactoring and a number of useful refactorings. This is the key chapter to read if you want to understand what refactoring really is about.

In Chapter 2, I cover more of the general principles of refactoring, some definitions, and the reasons for doing refactoring. I outline some of the challenges with refactoring. In Chapter 3, Kent Beck helps me describe how to find bad smells in code and how to clean them up with refactorings. Testing plays a very important role in refactoring, so Chapter 4 describes how to build tests into code.

The heart of the book—the catalog of refactorings—takes up the rest of its volume. While this is by no means a comprehensive catalog, it covers the key refactorings that most developers will likely need. It grew from the notes I made when learning about refactoring in the late 1990s, and I still use these notes now as I don’t remember them all. When I want to do something, such as Split Phase (154), the catalog reminds me how to do it in a safe, step-by-step manner. I hope this is the part of the book that you’ll come back to often.

A Web-First Book

The World-Wide Web has made an enormous impact on our society, particularly affecting how we gather information. When I wrote this book, most of the knowledge about software development was transferred through print. Now I gather most of my information online. This has presented a challenge for authors like myself: Is there still a role for books, and what should they look like?

I believe there is still a role for books like this, but they need to change. The value of a book is a large body of knowledge put together in a cohesive fashion. In writing this book, I tried to cover many different refactorings and organize them in a consistent and integrated manner.

But that integrated whole is an abstract literary work that, while traditionally represented by a paper book, need not be in the future. Most of the book industry still sees the paper book as the primary representation, and while we’ve enthusiastically adopted ebooks, they are just electronic representations of an original work based on the structure of a paper book.

With this book, I’m exploring a different approach. The canonical form of this book is its web site or web edition. Access to the web edition is included with the purchase of the print or ebook versions. (See note below about registering your product on InformIT.) The paper book is a selection of material from the web site, arranged in a manner that makes sense for print. It doesn’t attempt to include all the refactorings on the web site, particularly since I may well add more refactorings to the canonical web edition in the future. Similarly, the ebook is a different representation of the web book that may not include the same set of refactorings as the printed book—after all, ebooks don’t get heavy as I add pages and they can be easily updated after they are bought.

I don’t know whether you’re reading the web edition online, an ebook on your phone, a paper copy, or some other form I can’t imagine as I write this. I do my best to make this a useful work, whatever way you wish to absorb it.

For access to the canonical web edition and updates or corrections as they become available, register your copy of Refactoring, Second Edition, on the InformIT site. To start the registration process, go to informit.com/register and log in (or create an account if you don’t have one). Enter the ISBN 9780134757599 and click Submit. You will be asked a challenge question, so be sure to have your copy of the print or ebook available. After you’ve successfully registered your copy, open the “Digital Purchases” tab on your Account page and click on the link under this title to “Launch” the web edition.

JavaScript Examples

As in most technical areas of software development, code examples are very important to illustrate the concepts. However, the refactorings look mostly the same in different languages. There will sometimes be particular things that a language forces me to pay attention to, but the core elements of the refactorings remain the same.

I chose JavaScript to illustrate these refactorings, as I felt that this language would be readable by the largest number of people. You shouldn’t find it difficult, however, to adapt the refactorings to whatever language you are currently using. I try not to use any of the more complicated bits of the language, so you should be able to follow the refactorings with only a cursory knowledge of JavaScript. My use of JavaScript is certainly not an endorsement of the language.

Although I use JavaScript for my examples, that doesn’t mean the techniques in this book are confined to JavaScript. The first edition of this book used Java, and many programmers found it useful even though they never wrote a single Java class. I did toy with illustrating this generality by using a dozen different languages for the examples, but I felt that would be too confusing for the reader. Still, this book is written for programmers in any language. Outside of the example sections, I’m not making any assumptions about the language. I expect the reader to absorb my general comments and apply them to the language they are using. Indeed, I expect readers to take the JavaScript examples and adapt them to their language.

This means that, apart from discussing specific examples, when I talk about “class,” “module,” “function,” etc., I use those terms in the general programming meaning, not as specific terms of the JavaScript language model.

The fact that I’m using JavaScript as the example language also means that I try to avoid JavaScript styles that will be less familiar to those who aren’t regular JavaScript programmers. This is not a “refactoring in JavaScript” book—rather, it’s a general refactoring book that happens to use JavaScript. There are many interesting refactorings that are specific to JavaScript (such as refactoring from callbacks, to promises, to async/await) but they are out of scope for this book.

Who Should Read This Book?

I’ve aimed this book at a professional programmer—someone who writes software for a living. The examples and discussion include a lot of code to read and understand. The examples are in JavaScript, but should be applicable to most languages. I would expect a programmer to have some experience to appreciate what’s going on with this book, but I don’t assume much knowledge.

Although the primary target of this book is a developer seeking to learn about refactoring, this book is also valuable for someone who already understands refactoring—it can be used as a teaching aid. In this book, I’ve put a lot of effort into explaining how various refactorings work, so an experienced developer can use this material in mentoring their colleagues.

Although it is focused on the code, refactoring has a large impact on the design of a system. It is vital for senior designers and architects to understand the principles of refactoring and to use them in their projects. Refactoring is best introduced by a respected and experienced developer. Such a developer can best understand the principles behind refactoring and adapt those principles to the specific workplace. This is particularly true when you are using a language other than JavaScript, because you’ll have to adapt the examples I’ve given to other languages.

Here’s how to get the most from this book without reading all of it.

If you want to understand what refactoring is, read Chapter 1—the example should make the process clear.
If you want to understand why you should refactor, read the first two chapters. They will tell you what refactoring is and why you should do it.
If you want to find where you should refactor, read Chapter 3. It tells you the signs that suggest the need for refactoring.
If you want to actually do refactoring, read the first four chapters completely, then skip-read the catalog. Read enough of the catalog to know, roughly, what is in there. You don’t have to understand all the details. When you actually need to carry out a refactoring, read the refactoring in detail and use it to help you. The catalog is a reference section, so you probably won’t want to read it in one go.

An important part of writing this book was naming the various refactorings. Terminology helps us communicate, so that when one developer advises another to extract some code into a function, or to split some computation into separate phases, both understand the references to Extract Function (106) and Split Phase (154). This vocabulary also helps in selecting automated refactorings.
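
As a rough illustration of what "splitting some computation into separate phases" can mean, consider the sketch below. The "name-quantity" order format, the price table, and every function name here are assumptions made for this example; it is not taken from the catalog entry itself.

```javascript
// Hypothetical example: the "name-quantity" order format, the price table,
// and all function names are assumptions made for illustration.

// Before: parsing the input and computing the price are tangled together.
function orderTotalBefore(orderString, prices) {
  const parts = orderString.split("-");
  return Number(parts[1]) * prices[parts[0]];
}

// After, phase 1: turn the raw string into an intermediate data structure.
function parseOrder(orderString) {
  const [product, quantity] = orderString.split("-");
  return { product, quantity: Number(quantity) };
}

// After, phase 2: compute the price from the intermediate structure only.
function priceOrder(order, prices) {
  return order.quantity * prices[order.product];
}

function orderTotalAfter(orderString, prices) {
  return priceOrder(parseOrder(orderString), prices);
}

const priceList = { widget: 5, gadget: 8 };

// Both versions give the same answer for the same input.
console.log(orderTotalBefore("widget-3", priceList)); // 15
console.log(orderTotalAfter("widget-3", priceList));  // 15
```

Once the two phases are separate, each can be changed on its own: a new input format touches only the parsing function, and a new pricing rule touches only the pricing function.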

Building on a Foundation Laid by Others

I need to say right at the beginning that I owe a big debt with this book—a debt to those whose work in the 1990s developed the field of refactoring. It was learning from their experience that inspired and informed me to write the first edition of this book, and although many years have passed, it’s important that I continue to acknowledge the foundation that they laid. Ideally, one of them should have written that first edition, but I ended up being the one with the time and energy.

Two of the leading early proponents of refactoring were Ward Cunningham and Kent Beck. They used it as a foundation of development in the early days and adapted their development processes to take advantage of it. In particular, it was my collaboration with Kent that showed me the importance of refactoring—an inspiration that led directly to this book.

Ralph Johnson leads a group at the University of Illinois at Urbana-Champaign that is notable for its practical contributions to object technology. Ralph has long been a champion of refactoring, and several of his students did vital early work in this field. Bill Opdyke developed the first detailed written work on refactoring in his doctoral thesis. John Brant and Don Roberts went beyond writing words—they created the first automated refactoring tool, the Refactoring Browser, for refactoring Smalltalk programs.

Many people have advanced the field of refactoring since the first edition of this book. In particular, the work of those who have added automated refactorings to development tools has contributed enormously to making programmers’ lives easier. It’s easy for me to take it for granted that I can rename a widely used function with a simple key sequence, but that ease relies on the efforts of IDE teams whose work helps us all.

Acknowledgments

Even with all that research to draw on, I still needed a lot of help to write this book. The first edition drew greatly on experience and encouragement from Kent Beck. He first introduced me to refactoring, inspired me to start writing notes to record refactorings, and helped form them into finished prose. He came up with the idea of Code Smells. I often feel he would have written the first edition better than I did, had he not been writing the foundation book for Extreme Programming instead.

All the technical book authors I know mention the big debt they owe to technical reviewers. We’ve all written works with big flaws that were only caught by our peers acting as reviewers. I don’t do a lot of technical review work myself, partly because I don’t think I’m very good at it, so I have a lot of admiration for those who take it on. There’s not even a pittance to be made by reviewing someone else’s book, so doing it is a great act of generosity.

When I started serious work on the book, I formed a mailing list of advisors to give me feedback. As I made progress, I sent drafts of new material to this group and asked them for their feedback. I want to thank the following for posting their feedback on the mailing list: Arlo Belshee, Avdi Grimm, Beth Anders-Beck, Bill Wake, Brian Guthrie, Brian Marick, Chad Wathington, Dave Farley, David Rice, Don Roberts, Fred George, Giles Alexander, Greg Doench, Hugo Corbucci, Ivan Moore, James Shore, Jay Fields, Jessica Kerr, Joshua Kerievsky, Kevlin Henney, Luciano Ramalho, Marcos Brizeno, Michael Feathers, Patrick Kua, Pete Hodgson, Rebecca Parsons, and Trisha Gee.

Of this group, I’d particularly like to highlight the special help I got on JavaScript from Beth Anders-Beck, James Shore, and Pete Hodgson.

Once I had a pretty complete first draft, I sent it out for further review, because I wanted to have some fresh eyes look at the draft as a whole. William Chargin and Michael Hunger both delivered incredibly detailed review comments. I also got many useful comments from Bob Martin and Scott Davis. Bill Wake added to his contributions on the mailing list by doing a full review of the first draft.

My colleagues at ThoughtWorks are a constant source of ideas and feedback on my writing. There are innumerable questions, comments, and observations that have fueled the thinking and writing of this book. One of the great things about being an employee at ThoughtWorks is that they allow me to spend considerable time on writing. In particular, I appreciate the regular conversations and ideas I get from Rebecca Parsons, our CTO.

At Pearson, Greg Doench is my acquisition editor, navigating many issues in getting a book to publication. Julie Nahil is my production editor. I was glad to again work with Dmitry Kirsanov for copyediting and Alina Kirsanova for composition and indexing.