2 Test-Driven Development

Our discussion of test-driven development (TDD) spans two chapters. We first cover the basics of TDD in a very technical and detailed manner. In this chapter, you will learn about the discipline in a step-by-step fashion. The chapter provides a great deal of code to read and several videos to watch as well.

In Chapter 3, “Advanced TDD,” we cover many of the traps and conundrums that novice TDDers face, such as databases and graphical user interfaces. We also explore the design principles that drive good test design and the design patterns of testing. Finally, we investigate some interesting and profound theoretical possibilities.

Overview

Zero. It’s an important number. It’s the number of balance. When the two sides of a scale are in balance, the pointer on the scale reads zero. A neutral atom, with equal numbers of electrons and protons, has a charge of zero. The sum of forces on a bridge balances to zero. Zero is the number of balance.

Did you ever wonder why the amount of money in your checking account is called its balance? That’s because the balance in your account is the sum of all the transactions that have either deposited or withdrawn money from that account. But transactions always have two sides because transactions move money between accounts.

The near side of a transaction affects your account. The far side affects some other account. Every transaction whose near side deposits money into your account has a far side that withdraws that amount from some other account. Every time you write a check, the near side of the transaction withdraws money from your account, and the far side deposits that money into some other account. So, the balance in your account is the sum of the near sides of the transactions. The sum of the far sides should be equal and opposite to the balance of your account. The sum of all the near and far sides should be zero.
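The arithmetic in that paragraph can be sketched in a few lines of Python. This is only an illustration: the `Ledger` class, the account names, and the amounts are hypothetical, and real accounting systems are far richer. Each transfer posts one entry to each of two accounts, so the grand total never leaves zero.

```python
from collections import defaultdict


class Ledger:
    """Hypothetical toy ledger: every transfer posts two equal and opposite entries."""

    def __init__(self):
        self.balances = defaultdict(int)  # account name -> balance in cents

    def transfer(self, source, destination, amount):
        # One side of the transaction withdraws from the source account ...
        self.balances[source] -= amount
        # ... and the other side deposits the same amount into the destination.
        self.balances[destination] += amount

    def total(self):
        # The sum over all accounts, which double entry keeps at zero.
        return sum(self.balances.values())


ledger = Ledger()
ledger.transfer("employer", "you", 5000)   # a paycheck is deposited
ledger.transfer("you", "landlord", 1200)   # you write a rent check
```

After these two transfers your balance is 3800 cents, the other two accounts sum to -3800 cents, and the total over all accounts is zero.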

Two thousand years ago, Gaius Plinius Secundus, known as Pliny the Elder, realized this law of accounting and invented the discipline of double-entry bookkeeping. Over the centuries, this discipline was refined by the bankers in Cairo and then by the merchants of Venice. In 1494, Luca Pacioli, a Franciscan friar and friend of Leonardo da Vinci, wrote the first definitive description of the discipline. It was published in book form on the newly invented printing press, and the technique spread.

In 1772, as the industrial revolution gained momentum, Josiah Wedgwood was struggling with success. He was the founder of a pottery factory, and his product was in such high demand that he was nearly bankrupting himself trying to meet that demand. He adopted double-entry bookkeeping and was thereby able to see how money was flowing in and out of his business with a resolution that had previously escaped him. And by tuning those flows, he staved off the looming bankruptcy and built a business that exists to this day.

Wedgwood was not alone. Industrialization drove the vast growth of the economies of Europe and America. In order to manage all the money flows resulting from that growth, increasing numbers of firms adopted the discipline.

In 1795, Johann Wolfgang von Goethe wrote the following in Wilhelm Meister’s Apprenticeship. Pay close attention, for we will return to this quote soon.

“Away with it, to the fire with it!” cried Werner. “The invention does not deserve the smallest praise: that affair has plagued me enough already, and drawn upon yourself your father’s wrath. The verses may be altogether beautiful; but the meaning of them is fundamentally false. I still recollect your Commerce personified; a shrivelled, wretched-looking sibyl she was. I suppose you picked up the image of her from some miserable huckster’s shop. At that time, you had no true idea at all of trade; whilst I could not think of any man whose spirit was, or needed to be, more enlarged than the spirit of a genuine merchant. What a thing it is to see the order which prevails throughout his business! By means of this he can at any time survey the general whole, without needing to perplex himself in the details. What advantages does he derive from the system of book-keeping by double entry! It is among the finest inventions of the human mind; every prudent master of a house should introduce it into his economy.”

Today, double-entry bookkeeping carries the force of law in almost every country on the planet. To a large degree, the discipline defines the accounting profession.

But let’s return to Goethe’s quote. Note the words that Goethe used to describe the means of “Commerce” that he so detested:

A shrivelled, wretched-looking sibyl she was. I suppose you picked up the image of her from some miserable huckster’s shop.

Have you seen any code that matches that description? I’m sure you have. So have I. Indeed, if you are like me, then you have seen far, far too much of it. If you are like me, you have written far, far too much of it.

Now, one last look at Goethe’s words:

What a thing it is to see the order which prevails throughout his business! By means of this he can at any time survey the general whole, without needing to perplex himself in the details.

It is significant that Goethe ascribes this powerful benefit to the simple discipline of double-entry bookkeeping.

Software

The maintenance of proper accounts is utterly essential for running a modern business, and the discipline of double-entry bookkeeping is essential for the maintenance of proper accounts. But is the proper maintenance of software any less essential to the running of a business? By no means! In the twenty-first century, software is at the heart of every business.

What, then, can software developers use as a discipline that gives them the control and vision over their software that double-entry bookkeeping gives to accountants and managers? Perhaps you think that software and accounting are such different concepts that no correspondence is required or even possible. I beg to differ.

Consider that accounting is something of a mage’s art. Those of us not versed in its rituals and arcanities understand but little of the depth of the accounting profession. And what is the work product of that profession? It is a set of documents that are organized in a complex and, for the layperson, bewildering fashion. Upon those documents is strewn a set of symbols that few but the accountants themselves can truly understand. And yet if even one of those symbols were to be in error, terrible consequences could ensue. Businesses could founder and executives could be jailed.

Now consider how similar accounting is to software development. Software is a mage’s art indeed. Those not versed in the rituals and arcanities of software development have no true idea of what goes on under the surface. And the product? Again, a set of documents: the source code—documents organized in a deeply complex and bewildering manner, littered with symbols that only the programmers themselves can divine. And if even one of those symbols is in error, terrible consequences may ensue.

The two professions are deeply similar. They both concern themselves with the intense and fastidious management of intricate detail. They both require significant training and experience to do well. They both are engaged in the production of complex documents whose accuracy, at the level of individual symbols, is critical.

Accountants and programmers may not want to admit it, but they are of a kind. And the discipline of the older profession should be well observed by the younger.

As you will see in what follows, TDD is double-entry bookkeeping. It is the same discipline, executed for the same purpose, and delivering the same results. Everything is said twice, in complementary accounts that must be kept in balance by keeping the tests passing.

The Three Laws of TDD

Before we get to the three laws, we have some preliminaries to cover.

The essence of TDD entails the discipline to do the following:

Create a test suite that enables refactoring and is trusted to the extent that passage implies deployability. That is, if the test suite passes, the system can be deployed.
Create production code that is decoupled enough to be testable and refactorable.
Create an extremely short-cycle feedback loop that keeps the task of writing programs at a stable rhythm and productivity.
Create tests and production code that are sufficiently decoupled from each other so as to allow convenient maintenance of both, without the impediment of replicating changes between the two.

The discipline of TDD is embodied within three entirely arbitrary laws. The proof that these laws are arbitrary is that the essence can be achieved by very different means. A case in point is Kent Beck’s test && commit || revert (TCR) discipline. Although TCR is entirely different from TDD, it achieves precisely the same essential goals.
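The name of TCR reads like a shell expression, and its control flow can be sketched as follows. This is a minimal sketch, not Beck’s tooling: `run_tests`, `commit`, and `revert` are hypothetical callables standing in for whatever test runner and version-control commands a team actually uses, and the sketch ignores the case where the commit itself fails.

```python
def tcr(run_tests, commit, revert):
    """Sketch of test && commit || revert using hypothetical callables."""
    if run_tests():
        commit()   # tests pass: keep the change
        return "committed"
    revert()       # tests fail: throw the change away
    return "reverted"


# Illustrative stand-ins that only record what happened.
log = []
passing = tcr(lambda: True, lambda: log.append("commit"), lambda: log.append("revert"))
failing = tcr(lambda: False, lambda: log.append("commit"), lambda: log.append("revert"))
```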

The three laws of TDD are the basic foundation of the discipline. Following them is very hard, especially at first. Following them also requires some skill and knowledge that is hard to come by. If you try to follow these laws without that skill and knowledge, you will almost certainly become frustrated and abandon the discipline. We address that skill and knowledge in subsequent chapters. For the moment, be warned. Following these laws without proper preparation will be very difficult.

The First Law

Write no production code until you have first written a test that fails due to the lack of that production code.

If you are a programmer of any years’ experience, this law may seem foolish. You might wonder what test you are supposed to write if there’s no code to test. This question comes from the common expectation that tests are written after code. But if you think about it, you’ll realize that if you can write the production code, you can also write the code that tests the production code. It may seem out of order, but there’s no lack of information preventing you from writing the test first.

The Second Law

Write no more of a test than is sufficient to fail or fail to compile. Resolve the failure by writing some production code.

Again, if you are an experienced programmer, then you likely realize that the very first line of the test will fail to compile because that first line will be written to interact with code that does not yet exist. And that means, of course, that you will not be able to write more than one line of a test before having to switch over to writing production code.

The Third Law

Write no more production code than will resolve the currently failing test. Once the test passes, write more test code.

And now the cycle is complete. It should be obvious to you that these three laws lock you into a cycle that is just a few seconds long. It looks like this:

You write a line of test code, but it doesn’t compile (of course).
You write a line of production code that makes the test compile.
You write another line of test code that doesn’t compile.
You write another line or two of production code that makes the test compile.
You write another line or two of test code that compiles but fails an assertion.
You write another line or two of production code that passes the assertion.

And this is going to be your life from now on.
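Here is a sketch of where a few turns of that cycle might leave you, written in Python rather than a compiled language. Python has no separate compile step, so the analog of a compile failure here is a `NameError` raised when the test refers to a name that does not exist yet. The `order_total` function and its tests are hypothetical; the comments record the order in which each line would have been written.

```python
import unittest


def order_total(prices):
    # Turn 2: created as an empty stub so the test could find the name.
    # Turn 4: changed to `return 0` to pass the empty-order assertion.
    # Turn 6: generalized to pass the two-item assertion.
    return sum(prices)


class OrderTotalTest(unittest.TestCase):
    def test_empty_order_totals_zero(self):
        total = order_total([])      # Turn 1: NameError until the stub existed
        self.assertEqual(0, total)   # Turn 3: failed while the stub returned None

    def test_total_is_sum_of_prices(self):
        # Turn 5: failed while order_total still returned a constant 0.
        self.assertEqual(30, order_total([10, 20]))


# Run the suite in-process so the sketch can be executed as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(OrderTotalTest)
result = unittest.TestResult()
suite.run(result)
```

Only the final state of the code is visible here. At every earlier turn there was exactly one failing line, and the next edit did no more than make it pass.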

Once again, the experienced programmer will likely consider this to be absurd. The three laws lock you into a cycle that is just a few seconds long. Each time around that cycle, you are switching between test code and production code. You’ll never be able to just write an if statement or a while loop. You’ll never be able to just write a function. You will be forever trapped in this tiny little loop of switching contexts between test code and production code.

You may think that this will be tedious, boring, and slow. You might think that it will impede your progress and interrupt your chain of thought. You might even think that it’s just plain silly. You may think that this approach will lead you to produce spaghetti code or code with little or no design—a haphazard conglomeration of tests and the code that makes those tests pass.

Hold all those thoughts and consider what follows.

Losing the Debug-foo

I want you to imagine a room full of people following these three laws—a team of developers all working toward the deployment of a major system. Pick any one of those programmers you like, at any time you like. Everything that programmer is working on executed and passed all its tests within the last minute or so. And this is always true. It doesn’t matter who you pick. It doesn’t matter when you pick them. Everything worked a minute or so ago.

What would your life be like if everything worked a minute or so ago? How much debugging do you think you would do? The fact is that there’s not likely much to debug if everything worked a minute or so ago.

Are you good at the debugger? Do you have the debug-foo in your fingers? Do you have all the hot keys primed and ready to go? Is it second nature for you to efficiently set breakpoints and watchpoints and to dive headlong into a deep debugging session?

This is not a skill to be desired!

You don’t want to be good at the debugger. The only way you get good at the debugger is by spending a lot of time debugging. And I don’t want you spending a lot of time debugging. You shouldn’t want that either. I want you spending as much time as possible writing code that works and as little time as possible fixing code that doesn’t.

I want your use of the debugger to be so infrequent that you forget the hot keys and lose the debug-foo in your fingers. I want you puzzling over the obscure step-into and step-over icons. I want you to be so unpracticed at the debugger that the debugger feels awkward and slow. And you should want that too. The more comfortable you feel with a debugger, the more you know you are doing something wrong.

Now, I can’t promise you that these three laws will eliminate the need for the debugger. You will still have to debug from time to time. This is still software, and it’s still hard. But the frequency and duration of your debugging sessions will undergo a drastic decline. You will spend far more time writing code that works and far less time fixing code that doesn’t.

Documentation

If you’ve ever integrated a third-party package, you know that included in the bundle of software you receive is a PDF written by a tech writer. This document purports to describe how to integrate the third-party package. At the end of this document is almost always an ugly appendix that contains all the code examples for integrating the package.

Of course, that appendix is the first place you look. You don’t want to read what a tech writer wrote about the code; you want to read the code. And that code will tell you much more than the words written by the tech writer. If you are lucky, you might even be able to use copy/paste to move the code into your application where you can fiddle it into working.

When you follow the three laws, you are writing the code examples for the whole system. Those tests you are writing explain every little detail about how the system works. If you want to know how to create a certain business object, there are tests that show you how to create it every way that it can be created. If you want to know how to call a certain API function, there are tests that demonstrate that API function and all its potential error conditions and exceptions. There are tests in the test suite that will tell you anything you want to know about the details of the system.

Those tests are documents that describe the entire system at its lowest level. These documents are written in a language you intimately understand. They are utterly unambiguous. They are so formal that they execute. And they cannot get out of sync with the system.
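As a hedged illustration of tests serving as low-level documentation, consider a hypothetical `Percentage` business object. The class, its `from_percent` constructor, and the validation rule are all assumptions made up for this sketch; the point is only that the tests show each way the object can be created and each input it refuses.

```python
import unittest


class Percentage:
    """Hypothetical business object used only for this illustration."""

    def __init__(self, fraction):
        if not 0 <= fraction <= 1:
            raise ValueError("fraction must be between 0 and 1")
        self.fraction = fraction

    @classmethod
    def from_percent(cls, percent):
        return cls(percent / 100)


class PercentageTest(unittest.TestCase):
    def test_can_be_created_from_a_fraction(self):
        self.assertEqual(0.25, Percentage(0.25).fraction)

    def test_can_be_created_from_a_percent_value(self):
        self.assertEqual(0.25, Percentage.from_percent(25).fraction)

    def test_rejects_values_above_one_hundred_percent(self):
        with self.assertRaises(ValueError):
            Percentage.from_percent(150)

    def test_rejects_negative_values(self):
        with self.assertRaises(ValueError):
            Percentage(-0.1)


# Run the suite in-process so the sketch can be executed as a plain script.
result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(PercentageTest).run(result)
```

Reading nothing but the test names tells a newcomer the two ways to construct the object and which inputs it rejects.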

As documents go, they are almost perfect.

I don’t want to oversell this. The tests are not particularly good at describing the motivation for a system. They are not high-level documents. But at the lowest level, they are better than any other kind of document that could be written. They are code. And code is something you know will tell you the truth.

You might be concerned that the tests will be as hard to understand as the system as a whole. But this is not the case. Each test is a small snippet of code that is focused on one very narrow part of the system as a whole. The tests do not form a system by themselves. The tests do not know about each other, and so there is no rat’s nest of dependency in the tests. Each test stands alone. Each test is understandable on its own. Each test tells you exactly what you need to understand within a very narrow part of the system.

Again, I don’t want to oversell this point. It is possible to write opaque and complex tests that are hard to read and understand, but it is not necessary. Indeed, it is one of the goals of this book to teach you how to write tests that are clear and clean documents that describe the underlying system.

Holes in the Design

Have you ever written tests after the fact? Most of us have. Writing tests after writing code is the most common way that tests are written. But it’s not a lot of fun, is it?

It’s not fun because by the time we start writing after-the-fact tests, we already know the system works. We’ve tested it manually. We are only writing the tests out of some sense of obligation or guilt or, perhaps, because our management has mandated some level of test coverage. So, we begrudgingly bend into the grind of writing one test after another, knowing that each test we write will pass. Boring, boring, boring.

Inevitably, we come to the test that’s hard to write. It is hard to write because we did not design the code to be testable; we were focused instead on making it work. Now, in order to test the code, we’re going to have to change the design.

But that’s a pain. It’s going to take a lot of time. It might break something else. And we already know the code works because we tested it manually. Consequently, we walk away from that test, leaving a hole in the test suite. Don’t tell me you’ve never done this. You know you have.

You also know that if you’ve left a hole in the test suite, everybody else on the team has too, so you know that the test suite is full of holes.

The number of holes in the test suite can be determined by measuring the volume and duration of the laughter of the programmers when the test suite passes. If the programmers laugh a lot, then the test suite has a lot of holes in it.

A test suite that inspires laughter when it passes is not a particularly useful test suite. It may tell you when certain things break, but there is no decision you can make when it passes. When it passes, all you know is that some stuff works.

A good test suite has no holes. A good test suite allows you to make a decision when it passes. That decision is to deploy.

If the test suite passes, you should feel confident in recommending that the system be deployed. If your test suite doesn’t inspire that level of confidence, of what use is it?

Fun

When you follow the three laws, something very different happens. First of all, it’s fun. One more time, I don’t want to oversell this. TDD is not as much fun as winning the jackpot in Vegas. It’s not as much fun as going to a party or even playing Chutes and Ladders with your four-year-old. Indeed, fun might not be the perfect word to use.

Do you remember when you got your very first program to work? Remember that feeling? Perhaps it was in a local department store that had a TRS-80 or a Commodore 64. Perhaps you wrote a silly little infinite loop that printed your name on the screen forever and ever. Perhaps you walked away from that screen with a little smile on your face, knowing that you were the master of the universe and that all computers would bow down to you forever.

A tiny echo of that feeling is what you get every time you go around the TDD loop. Every test that fails just the way you expected it to fail makes you nod and smile just a little bit. Every time you write the code that makes that failing test pass, you remember that once you were master of the universe and that you still have the power.

Every time around the TDD loop, there’s a tiny little shot of endorphins released into your reptile brain, making you feel just a little more competent and confident and ready to meet the next challenge. And though that feeling is small, it is nonetheless kinda fun.

Design

But never mind the fun. Something much more important happens when you write the tests first. It turns out that you cannot write code that’s hard to test if you write the tests first. The act of writing the test first forces you to design the code to be easy to test. There’s no escape from this. If you follow the three laws, your code will be easy to test.

What makes code hard to test? Coupling and dependencies. Code that is easy to test does not have those couplings and dependencies. Code that is easy to test is decoupled!

Following the three laws forces you to write decoupled code. Again, there is no escape from this. If you write the tests first, the code that passes those tests will be decoupled in ways that you’d never have imagined.
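One way this pressure can show up, sketched under assumptions: a hypothetical `greeting` function needs the current hour. A test written first cannot wait for a particular time of day, so the clock is passed in rather than read directly, and the function ends up decoupled from the system clock. The names `greeting`, `system_hour`, and `current_hour` are illustrative only.

```python
from datetime import datetime


def system_hour():
    # The real dependency, kept at the edge of the program.
    return datetime.now().hour


def greeting(current_hour=system_hour):
    """Return a greeting; the clock is a parameter so tests can replace it."""
    return "Good morning" if current_hour() < 12 else "Good afternoon"


# In a test, a fixed clock replaces the real one.
morning = greeting(current_hour=lambda: 9)
afternoon = greeting(current_hour=lambda: 15)
```

Production callers simply write `greeting()` and get the real clock; the seam exists because the test needed it.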

And that’s a very good thing.

The Pretty Little Bow on Top

It turns out that applying the three laws of TDD has the following set of benefits:

You will spend more time writing code that works and less time debugging code that doesn’t.
You will produce a set of nearly perfect low-level documentation.
It is fun—or at least motivating.
You will produce a test suite that will give you the confidence to deploy.
You will create less-coupled designs.

These reasons might convince you that TDD is a good thing. They might be enough to get you to ignore your initial reaction, even repulsion. Maybe.

But there is a far more overriding reason why the discipline of TDD is important.

Fear

Programming is hard. It may be the most difficult thing that humans have attempted to master. Our civilization now depends upon hundreds of thousands of interconnected software applications, each of which involves hundreds of thousands if not tens of millions of lines of code. There is no other apparatus constructed by humans that has so many moving parts.

Each of those applications is supported by teams of developers who are scared to death of change. This is ironic because the whole reason software exists is to allow us to easily change the behavior of our machines.

But software developers know that every change introduces the risk of breakage and that breakage can be devilishly hard to detect and repair.

Imagine that you are looking at your screen and you see some nasty tangled code there. You probably don’t have to work very hard to conjure that image because, for most of us, this is an everyday experience.

Now let’s say that as you glance at that code, for one very brief moment, the thought occurs to you that you ought to clean it up a bit. But your very next thought slams down like Thor’s hammer: “I’M NOT TOUCHING IT!” Because you know that if you touch it, you will break it; and if you break it, it becomes yours forever.

This is a fear reaction. You fear the code you maintain. You fear the consequences of breaking it.

The result of this fear is that the code must rot. No one will clean it. No one will improve it. When forced to make changes, those changes will be made in the manner that is safest for the programmer, not best for the system. The design will degrade, and the code will rot, and the productivity of the team will decline, and that decline will continue until productivity is near zero.

Ask yourself if you have ever been significantly slowed down by the bad code in your system. Of course you have. And now you know why that bad code exists. It exists because nobody has the courage to do the one thing that could improve it. No one dares risk cleaning it.

Courage

But what if you had a suite of tests that you trusted so much that you were confident in recommending deployment every time that suite of tests passed? And what if that suite of tests executed in seconds? How much would you then fear to engage in a gentle cleaning of the system?

Imagine that code on your screen again. Imagine the stray thought that you might clean it up a little. What would stop you? You have the tests. Those tests will tell you the instant you break something.

With that suite of tests, you can safely clean the code. With that suite of tests, you can safely clean the code. With that suite of tests, you can safely clean the code.

No, that wasn’t a typo. I wanted to drive the point home very, very hard. With that suite of tests, you can safely clean the code!

And if you can safely clean the code, you will clean the code. And so will everyone else on the team. Because nobody likes a mess.

The Boy Scout Rule

If you have that suite of tests that you trust with your professional life, then you can safely follow this simple guideline:

Check the code in cleaner than you checked it out.

Imagine if everyone did that. Before checking the code in, they made one small act of kindness to the code. They cleaned up one little bit.

Imagine if every check-in made the code cleaner. Imagine that nobody ever checked the code in worse than it was but always better than it was.

What would it be like to maintain such a system? What would happen to estimates and schedules if the system got cleaner and cleaner with time? How long would your bug lists be? Would you need an automated database to maintain those bug lists?

That’s the Reason

Keeping the code clean. Continuously cleaning the code. That’s why we practice TDD. We practice TDD so that we can be proud of the work we do. So that we can look at the code and know it is clean. So that we know that every time we touch that code, it gets better than it was before. And so that we go home at night and look in the mirror and smile, knowing we did a good job today.

The Fourth Law

I will have much more to say about refactoring in later chapters. For now, I want to assert that refactoring is the fourth law of TDD.

From the first three laws, it is easy to see that the TDD cycle involves writing a very small amount of test code that fails, and then writing a very small amount of production code that passes the failing test. We could imagine a traffic light that alternates between red and green every few seconds.

But if we were to allow that cycle to continue in that form, then the test code and the production code would rapidly degrade. Why? Because humans are not good at doing two things at once. If we focus on writing a failing test, it’s not likely to be a well-written test. If we focus on writing production code that passes the test, it is not likely to be good production code. If we focus on the behavior we want, we will not be focusing on the structure we want.

Don’t fool yourself. You cannot do both at once. It is hard enough to get code to behave the way you want it to. It is too hard to write it to behave and have the right structure. Thus, we follow Kent Beck’s advice:

First make it work. Then make it right.

Therefore, we add a new law to the three laws of TDD: the law of refactoring. First you write a small amount of failing test code. Then you write a small amount of passing production code. Then you clean up the mess you just made.

The traffic light gets a new color: red → green → refactor (Figure 2.1).

You’ve likely heard of refactoring, and as I said earlier, we’ll be spending a great deal of time on it in coming chapters. For now, let me dispel a few myths and misconceptions:

Refactoring is a constant activity. Every time around the TDD cycle, you clean things up.
Refactoring does not change behavior. You only refactor when the tests are passing, and the tests continue to pass while you refactor.
Refactoring never appears on a schedule or a plan. You do not reserve time for refactoring. You do not ask permission to refactor. You simply refactor all the time.
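The claim that refactoring does not change behavior can be made concrete with a small sketch. Below are two versions of a hypothetical `shipping_cost` function: one as it might look right after the test passes, and one after cleanup. The function, the weight thresholds, and the prices are invented for illustration. The same checks pass against both versions.

```python
def shipping_cost_before(weight_kg):
    # Green, but messy: magic numbers and nested conditionals.
    if weight_kg <= 1:
        return 5
    else:
        if weight_kg <= 10:
            return 5 + (weight_kg - 1) * 2
        else:
            return 5 + 9 * 2 + (weight_kg - 10) * 1


BASE_FEE = 5
MID_RATE = 2      # per kg between 1 and 10 kg
HEAVY_RATE = 1    # per kg above 10 kg


def shipping_cost_after(weight_kg):
    # Same behavior, clearer structure.
    mid_kg = min(max(weight_kg - 1, 0), 9)
    heavy_kg = max(weight_kg - 10, 0)
    return BASE_FEE + mid_kg * MID_RATE + heavy_kg * HEAVY_RATE


def check(shipping_cost):
    # The tests that guard the refactoring; they know nothing about structure.
    assert shipping_cost(1) == 5
    assert shipping_cost(4) == 11
    assert shipping_cost(10) == 23
    assert shipping_cost(12) == 25


check(shipping_cost_before)
check(shipping_cost_after)
```

Because the checks exercise only inputs and outputs, they stay untouched while the structure underneath them changes.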

Think of refactoring as the equivalent of washing your hands after using the restroom. It’s just something you always do as a matter of common decency.

The Basics

It is very hard to create effective examples of TDD in text. The rhythm of TDD just doesn’t come through very well. In the pages that follow, I try to convey that rhythm with appropriate timestamps and callouts. But to actually understand the true frequency of this rhythm, you just have to see it.

Therefore, each of the examples to follow has a corresponding online video that will help you see the rhythm firsthand. Please watch each video in its entirety, and then make sure you go back to the text and read the explanation with the timestamps. If you don’t have access to the videos, then pay special attention to the timestamps in the examples so you can infer the rhythm.

Simple Examples

As you review these examples, you are likely to discount them because they are all small and simple problems. You might conclude that TDD may be effective for such “toy examples” but cannot possibly work for complex systems. This would be a grave mistake.

The primary goal of any good software designer is to break down large and complex systems into a set of small, simple problems. The job of a programmer is to break those systems down into individual lines of code. Thus, the examples that follow are absolutely representative of TDD regardless of the size of the project.

This is something I can personally affirm. I have worked on large systems that were built with TDD, and I can tell you from experience that the rhythm and techniques of TDD are independent of scope. Size does not matter.

Or, rather, size does not matter to the procedure and the rhythm. However, size has a profound effect on the speed and coupling of the tests. But those are topics for the advanced chapters.

Stack

Watch related video: Stack

Access video by registering at informit.com/register

We start with a very simple problem: create a stack of integers. As we walk through this problem, note that the tests will answer any questions you have about the behavior of the stack. This is an example of the documentation value of tests. Note also that we appear to cheat by making the tests pass by plugging in absolute values. This is a common strategy in TDD and has a very important function. I’ll describe that as we proceed.

We begin: