Some thoughts about testing
I well remember when this realization first came on me with full force. The
EDSAC was on the top floor of the building and the tape-punching and editing
equipment one floor below. [...] It was on one of my journeys between the EDSAC
room and the punching equipment that, hesitating at the angles of [the] stairs,
the realization came over me with full force that a good part of the remainder
of my life was going to be spent in finding errors in my own programs.
- from Memoirs
of a Computer Pioneer, by Maurice Wilkes
Testing is one of the great realities of software development. Ever since the
first computers were built, programmers have been finding bugs in their
code.
What can I say about a subject as old, and presumably well-explored, as
testing? Well... kind of a lot, actually.
The Importance of Testing
Tests are the lifeblood of a big software project. Parts of the body that are
denied circulation cannot survive. In the same way, parts of a software
project that are cut off from testing will eventually rot away. The bigger the
project is, the more testing you will need to keep it alive.
Tests need to be first-class citizens in your project. You wouldn't consider
starting a project without selecting a build system or a runtime environment.
So how can you consider starting a project without choosing what test
frameworks you will use?
A lot of people keep their tests in a separate source control repository from
the rest of their code. To me, this makes no sense. As the project changes
and matures, the tests will have to change and mature. New features will be
added that require new tests. In some cases, features will be altered or
removed. Why would you want to inflict all the hassles of version skew on
yourself?
It may be reasonable to put the test framework itself in a different repository
than the project, if you believe that the framework will be useful for other
projects as well. But it certainly doesn't make sense to put the tests
themselves in any repository but the project repository.
Running Tests should be Easy
Anything that stops you from testing your code quickly is a big problem.
Does your code take hours to compile? That's a problem. When Rob
Pike and the other designers announced the
Google Go language, one of the things
they were proud of was its rapid compilation time. Rapid compilation
is a force multiplier. It will allow you to get much more work done in
a shorter amount of time.
Does your test framework require an "expert" to set up? That's a problem.
Bugs that could have been discovered and fixed in a day might require a week or
more of back-and-forth between programmers and "test experts."
It often makes sense to have a division of labor between test writers and other
developers. The former are often called SDETs, or "Software Development
Engineers in Test." However, it doesn't make sense to create a system where
tests can only be run by certain people. All of your developers should be able
to run all of your tests. No exceptions!
For a lot of projects, it makes sense for developers to run a certain set of
tests before submitting any change. These are often referred to as "smoke
tests."
Unit tests
Unit tests really are important. They're important for two reasons: because of
the test coverage they provide, and because they encourage you to write code
that is modular and testable.
If you think your project doesn't need unit tests, you are almost certainly
wrong.
For Java, JUnit is a pretty good unit test framework. There are also
frameworks available for C and C++, but I usually roll my own. One
approach that I've used in the past for C code is simply to make each
unit test a separate executable that returns success or failure.
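To make that concrete, here is a minimal sketch of the kind of standalone C
test I mean. The vector.h header and the vector functions it exercises are
hypothetical; the point is the shape of the test: a small main() that exits
with 0 on success and nonzero on failure.

    /* test_vector.c: a standalone unit test.  The test binary exits
     * with 0 on success and nonzero on failure, so the runner (often
     * just a Makefile rule or a shell loop) only has to check exit
     * codes.  vector.h and the vector functions are hypothetical. */
    #include <stdio.h>
    #include <stdlib.h>
    #include "vector.h"

    #define EXPECT_EQ(actual, expected) do { \
        if ((actual) != (expected)) { \
            fprintf(stderr, "%s:%d: %s != %s\n", \
                    __FILE__, __LINE__, #actual, #expected); \
            exit(EXIT_FAILURE); \
        } \
    } while (0)

    int main(void)
    {
        struct vector v;
        vector_init(&v);
        EXPECT_EQ(vector_size(&v), 0);
        vector_push(&v, 42);
        EXPECT_EQ(vector_size(&v), 1);
        EXPECT_EQ(vector_get(&v, 0), 42);
        return EXIT_SUCCESS;  /* exit code 0 means the test passed */
    }

A Makefile rule or a short shell loop can then build and run every test
binary and report the ones that returned nonzero.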
Long-running tests
Not everything in the world can be unit tested. A lot of bugs show themselves
only when you are running a full stack. To catch these bugs, you are going to
need long-running tests. Some people call these system tests, or end-to-end
tests, or integration tests. Whatever you call them, you are probably going to
need them too.
Some of the system tests I've written in the past have simply been shell scripts
that ran an executable with various options. You can get a lot done
in a shell script without writing a lot of code. For some projects, this
simply isn't enough, and you are going to need something more heavyweight--
like a set of Python scripts.
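To show the shape of such a driver, here is a sketch, written in C to keep
all the examples in this essay in one language. The program under test
(./myprog) and its flags are made up; the real work is just running the
binary with different options and checking the exit status.

    /* run_system_tests.c: a tiny end-to-end test driver.  It runs the
     * program under test with various options and checks the exit
     * status, much like the shell scripts described above.  The
     * ./myprog binary and its flags are hypothetical. */
    #include <stdio.h>
    #include <stdlib.h>

    static int run(const char *cmd)
    {
        int ret = system(cmd);
        if (ret != 0) {
            fprintf(stderr, "FAILED (status %d): %s\n", ret, cmd);
            return 1;
        }
        printf("ok: %s\n", cmd);
        return 0;
    }

    int main(void)
    {
        int failures = 0;
        failures += run("./myprog --help");
        failures += run("./myprog --input test1.dat --verify");
        failures += run("./myprog --input test2.dat --threads 4 --verify");
        return failures ? EXIT_FAILURE : EXIT_SUCCESS;
    }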
You Get what you Measure
To paraphrase management consultant
H.
Thomas Johnson, you get what you measure. This is as true in the software
development business as it is in other walks of life.
One big mistake I've seen projects make in the past is to create test
frameworks that were inadequate. For example, one project was creating
software that was designed to run on a huge network of computers. But the test
framework that all the developers used was just a shell script which started a
few processes on their local computer. Needless to say, this was not a very
good test.
In a lot of ways, a bad test framework is worse than no test framework at all.
It encourages you to think that you are doing fine-- when in fact your project
has major problems. You may add lots of features, confident that you can
handle it-- when in fact, your existing codebase is almost untested. Like a
fuel gauge that always points to "full," a bad test framework can do you a lot
of harm.
If your project involves multiple processes, your test framework needs to be
able to create multiple processes. If your project involves multiple
computers, your test framework needs to manage multiple computers. And so
forth. Absolutely do
not compromise on the quality of the test
framework, no matter how tempting it may be.
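For a multi-process project, the bare minimum is something that can fork and
exec a handful of cooperating processes, wait for them, and check their exit
statuses. Here is a sketch of that minimum; the test_server and test_client
binaries and their flags are hypothetical.

    /* spawn_cluster.c: the bare minimum for a multi-process test
     * framework: start several cooperating processes, reap them, and
     * check that each one exited cleanly.  The test_server and
     * test_client binaries and their flags are hypothetical. */
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static pid_t spawn(char *const argv[])
    {
        pid_t pid = fork();
        if (pid == 0) {
            execv(argv[0], argv);
            perror("execv");
            _exit(127);
        }
        return pid;
    }

    /* Wait for a child and return 1 if it exited with status 0. */
    static int child_ok(pid_t pid)
    {
        int status;
        waitpid(pid, &status, 0);
        return WIFEXITED(status) && WEXITSTATUS(status) == 0;
    }

    int main(void)
    {
        char *server[] = { "./test_server", "--port", "9000", NULL };
        char *client[] = { "./test_client", "--port", "9000", NULL };

        pid_t server_pid = spawn(server);
        sleep(1);  /* crude, but enough for a sketch: let the server start */

        pid_t c1 = spawn(client);
        pid_t c2 = spawn(client);
        int failures = !child_ok(c1) + !child_ok(c2);

        kill(server_pid, SIGTERM);    /* shut the server back down */
        waitpid(server_pid, NULL, 0);
        return failures ? EXIT_FAILURE : EXIT_SUCCESS;
    }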
Strategies for Testing
Testing can be a costly proposition. Isn't there some way we can cut down on
the overall cost?
How about those pesky users? They're always demanding more out of developers.
Perhaps they should be made to bear some of the burden of testing.
A lot of open source projects rely heavily on user testing. In an open source
project, users can engage with the developers directly on mailing lists and
other forums, and have a dialog about bugs. Similarly, a lot of companies like
Google and Facebook have started making products available to consumers while
they're still "in beta"-- meaning unfinished.
Is this a good idea? As with most interesting questions, the answer is "it
depends." What's the worst-case scenario if your software fails? Does the user
lose his progress through the Mushroom Kingdom, or does the nuclear reactor
create a mushroom cloud? If the answer is the latter, you probably want to
avoid sending users out to do your testing.
Certain types of software are notorious for requiring a higher standard of
quality. Users will accept a few crashes out of a game, but the first time a
filesystem loses their data, they will probably decide to avoid it in the
future. If your filesystem or database gets a bad reputation for reliability,
it might take years to dispel. However, if you are developing something like a
game or a paint program, you are probably better off shoving the software out
the door as soon as you can-- before other projects grab the market share or
mindshare.
The Limits of Testing
I've spent this whole essay talking about how great tests are, and how
essential they are to a well-run software project. So it may seem surprising
that I am including a section on the limits of testing.
Well, it's true. Testing can only do so much for your project. There are
always going to be nooks and crannies where bugs can hide.
Ideally, testing should be like the safety net below the flying trapeze at the
circus. It's good to know that it's there, but you should not rely on it.
A good developer will do everything he can to reduce the testing burden. By
using libraries rather than re-inventing the wheel, you can use code that has
already been tested for you. In languages fortunate enough to have a static
type system, you can use the type system to your advantage. For C code,
Valgrind is essential. For both Java and C, Coverity and other static analysis
tools are great.
The hardest bugs to analyze are always race conditions. Concurrency bugs are
rarely deterministic, and they are always a huge burden to debug. You really need to get
all your ducks in a row with respect to concurrency. Know what threads you are
using and why. Know what data these threads are allowed to touch, and what
data they are not. Unfortunately, this is an area where the tooling is still
relatively primitive. You are going to have to
use the force here.
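One thing that does help is writing the ownership rules down next to the data,
so they can be checked in review. A small sketch of what I mean, using POSIX
threads; the struct and its fields are invented for illustration.

    /* stats.c: making thread/data ownership explicit.  The comment on
     * each field states which threads may touch it and under what
     * lock -- exactly the kind of rule you need to be able to state.
     * The struct is invented for illustration. */
    #include <pthread.h>
    #include <stdio.h>

    struct stats {
        pthread_mutex_t lock;  /* protects every field below */
        long requests;         /* worker threads: increment under lock */
        long errors;           /* worker threads: increment under lock */
    };

    static struct stats g_stats = { PTHREAD_MUTEX_INITIALIZER, 0, 0 };

    static void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&g_stats.lock);
            g_stats.requests++;  /* safe: we hold the lock */
            pthread_mutex_unlock(&g_stats.lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[4];
        for (int i = 0; i < 4; i++)
            pthread_create(&threads[i], NULL, worker, NULL);
        for (int i = 0; i < 4; i++)
            pthread_join(threads[i], NULL);
        /* main thread: safe to read without the lock, but only
         * because every worker has been joined */
        printf("requests = %ld\n", g_stats.requests);
        return 0;
    }

Compile with -pthread. The comments are the contract: any thread that touches
requests or errors must hold the lock, and a reviewer can check that rule.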
Conclusion
Anyway, to those of you who made it this far-- good luck. Keep in mind the
lessons of the past, and you should be able to build the software of the
future.