Test Driven Development
Overview
Teaching: 15 min
Exercises: 15 minQuestions
Why should you write tests first?
What are the 3 phases of TDD?
Objectives
Start running tests with pytest.
Replace the end-to-end test with a pytest version.
Test Driven Development
Test Driven Development (or TDD) has a history in agile development but put simply you’re going to write tests before you write code (or nearly the same time). This is in contrast to the historical “waterfall” development cycle where tests were developed after code was written e.g. on a separate day or business quarter.
You don’t have to be a TDD purist as long as your code ends up having tests. Some say TDD makes code less buggy, though that may not be strictly true. Still, TDD provides at least three major benefits:
-
It greatly informs the design, making for more maintainable, legible and generally pretty code. Writing code is much faster since you’ve already made the difficult decisions about its interface.
-
It ensures that tests will exist in the first place (b/c you are writing them first). Remember, everyone dislikes writing tests, so you should hack yourself to get them done.
-
It makes writing tests A LOT more fun and enjoyable. That’s because writing tests when there’s no code yet is a huge puzzle that feels more like problem-solving (which is what programmers like) than clerical work or bookkeeping (which programmers generally despise).
Often times when you write tests with existing code, you anchor your expectations based on what the code does, instead of brain storming on what a user could do to break your system.
Red, green, refactor
The basic cycle of work with TDD is called red, green, refactor:
- Start by writing a failing test. This is the red phase since your test runner will show a failure, usually with a red color. Only when you having a failing test can you add a feature.
- Add only as much code to make the test pass (green). You want to work on the code base only until your tests are passing, then stop! This is often difficult if you are struck by inspiration but try to slow down and add a note to come back to. Maybe you identified the next test to write!
- Look over your code and tests and see if anything should be refactored. Refactoring is the process of restructuring or improving the internal structure of existing source code without changing its external behavior. It can involve modifications to the code to make it more readable, maintainable, efficient, and adhering to best coding practices. At this point your test suite is passing and you can really get creative. If a change causes something to break you can easily undo it and get back to safety. Also, having this as a separate step allows you to focus on testing and writing code in the other phases. Remember, tests are code too and benefit from the same design considerations.
You will be repeating these red, green, refactor steps multiple times until you are done with developing code. Therefore we need to automate the execution of tests using testing frameworks. Here we will be using pytest, which is a widely used testing framework for Python. It provides a comprehensive and flexible set of tools and features for writing and executing tests.
Importing overlap.py, red-green-refactor
Getting back to the overlap script, let’s start with a failing test.
Red
You may be surprised how easy it is to fail:
# test_overlap.py
import overlap_v0 as overlap # change version as needed
def test_import():
pass
When you run pytest, you get an error due to the line opening sys.argv[1]
when
we aren’t providing an input file. Note how brittle the original design is,
if we wanted to use part of the file somewhere else we aren’t able to since
the file expects to be called as the main method. If you wanted to change
command line parsers or use some part in a notebook, you would quickly run into
problems.
Green
Maybe you are thinking about how to pull file IO out of this function, but to start we do the least amount possible to pass our test (which is really just importing):
# overlap.py
import sys
def main():
# read input file
infile = open(sys.argv[1], 'r')
outfile = open(sys.argv[2], 'w')
# ...
if __name__ == "__main__":
main()
Now execute test_overlap.py with pytest:
$ pytest test_overlap.py
========================================== test session starts ===========================================
platform linux -- Python 3.9.16, pytest-7.3.1, pluggy-1.0.0
rootdir: ~/projects/testing-lesson/files
plugins: anyio-3.6.2
collected 1 item
test_overlap.py . [100%]
=========================================== 1 passed in 0.01s ============================================
You can be extra thorough and see if end-to-end.sh
still passes.
Refactor
At this point, our pytest function is just making sure we can import the code. But our end-to-end test makes sure the entire function is working as expected (albeit for a small, simple input). How about we move the file IO into the main guard?
Refactor
Change file IO to occur in the guard clause. Your new main method should look like
def main(infile, outfile)
.Solution
def main(infile, outfile): ... if __name__ == "__main__": with open(sys.argv[1], encoding='utf-8') as infile, \ open(sys.argv[2], 'w', encoding='utf-8') as outfile: main(infile, outfile)
That addresses the context manager and encoding of files. Continue to run your tests to see your changes are ok!
After refactoring save the code as overlap_v1.py.
End-to-end testing with pytest
We have a lot to address, but it would be nice to integrate our end to end test with pytest so we can just run one command. The problem is our main method deals with files. Reading and writing to a file system can be several orders of magnitude slower than working in main memory. We want our tests to be fast so we can run them all the time. The nice thing about python’s duck typing is the infile and outfile variables don’t have to be open files on disk, they can be in memory files. In python, this is accomplished with a StringIO object from the io library.
A StringIO object acts just like an open file. You can read and iterate from
it and you can write to it just like an opened text file. When you want to
read the contents, use the function getvalue
to get a string representation.
Red
Let’s write a test that will pass in two StringIO objects, one with the file to read and one for the output.
# test_overlap.py
import overlap_v1 as overlap
from io import StringIO
def test_end_to_end():
## ARRANGE
# initialize a StringIO with a string to read from
infile = StringIO(
'a 0 0 2 2\n'
'b 1 1 3 3\n'
'c 10 10 11 11'
)
# this holds our output
outfile = StringIO()
## ACT
# call the function
overlap.main(infile, outfile)
## ASSERT
output = outfile.getvalue().split('\n')
assert output[0].split() == '1 1 0'.split()
assert output[1].split() == '1 1 0'.split()
assert output[2].split() == '0 0 1'.split()
Since this is the first non-trivial test, let’s spend a moment going over it.
Like before we import our module and make our test function start with test_
.
The actual name of the function is for your benefit only so don’t be worried if
it is “too long”. You won’t have to type it so be descriptive! Next we have the
three steps in all tests: Arrange-Act-Assert (aka Given-When-Then)
- Arrange: Set up the state of the program in a particular way to test the feature you want. Consider edge cases, mocking databases or files, building helper objects, etc.
- Act: Call the code you want to test
- Assert: Confirm the observable outputs are what you expect (i.e. compare the output with the expected output). Notice that the outputs here read like the actual output file.
Again, pytest uses plain assert statements to test results.
But we have a problem, our test passes! This is partially because we are replacing an existing test, but the problem is how can you be sure you are testing what you think? Maybe you are importing a different version of code, maybe pytest isn’t finding your test, etc. Often a test that passes which should fail is more worrisome than a test which fails that should be passing.
Notice the splits in the assert section, they normalize the output so any whitespace will be accepted. Let’s replace those with explicit white space (tabs) and see if we can go red:
assert output[0] == '1\t1\t0'
assert output[1] == '1\t1\t0'
assert output[2] == '0\t0\t1'
Since split('\n')
strips off the newline at the end of each line. Now we
fail (yay!) and it’s due to something we have wanted to change anyways. The
pytest output says
E AssertionError: assert '1\t1\t0\t' == '1\t1\t0'
that is, we have an extra tab at the end of our lines.
Green
Fix the code
Change the overlap file to make the above test pass. Hint, change when and where you write a tab to the file.
Solution
There are a few options here. I’m opting for building a list and using
join
for the heavy lifting.for red_name, red_coords in dict.items(): output_line = [] for blue_name, blue_coords in dict.items(): # check if rects overlap result = '1' red_lo_x, red_lo_y, red_hi_x, red_hi_y = red_coords blue_lo_x, blue_lo_y, blue_hi_x, blue_hi_y = blue_coords if (red_lo_x >= blue_hi_x) or (red_hi_x <= blue_lo_x) or \ (red_lo_y >= blue_hi_x) or (red_hi_y <= blue_lo_y): result = '0' output_line.append(result) outfile.write('\t'.join(output_line) + '\n')
Once you get the test passing you will notice that the end-to-end.sh
will no
longer pass! One of the reasons to test code is to find (and prevent) bugs.
When you work on legacy code, sometimes you will find bugs in published code and
the questions can get more problematic. Is the change something that will alter
published results? Was the previous version wrong or just not what you prefer?
You may decide to keep a bug if it was a valid answer! In this case we will keep
the change and not have a tab followed by a newline.
Refactor
Let’s focus this round of refactoring on our test code and introduce some nice pytest features. First, it seems like we will want to use our simple input file in other tests. Instead of copying and pasting it around, let’s extract it as a fixture. Pytest fixtures can do a lot, but for now we just need a StringIO object built for some tests:
# test_overlap.py
import overlap_v1 as overlap
from io import StringIO
import pytest
@pytest.fixture()
def simple_input():
return StringIO(
'a 0 0 2 2\n'
'b 1 1 3 3\n'
'c 10 10 11 11'
)
def test_end_to_end(simple_input):
# initialize a StringIO with a string to read from
infile = simple_input
...
We start by importing pytest so we can use the @pytest.fixture()
decorator.
This decorates a function and makes its return value available to any function
that uses it as an input argument. Next we can replace our infile
definition
by assigning it to simple_input
in test_end_to_end
.
For this code, the end-to-end test runs quickly, but what if it took several
minutes? Ideally we could run our fast tests all the time and occasionally run
everything, including slow tests. The simplest way to achieve this with pytest
is to mark the function as slow and invoke pytest with pytest -m "not slow"
:
# test_overlap.py
@pytest.mark.slow()
def test_end_to_end(simple_input):
...
pytest -m "not slow" test_overlap.py
========================================== test session starts ===========================================
platform linux -- Python 3.9.16, pytest-7.3.1, pluggy-1.0.0
rootdir: ~/projects/testing-lesson/files
plugins: anyio-3.6.2
collected 1 item / 1 deselected / 0 selected
Note that the one test was deselected and not run. You can (and should) formalize this mark as described in the warning that will pop up!
Key Points
TDD cycles between the phases red, green, and refactor
TDD cycles should be fast, run tests on every write
Writing tests first ensures you have tests and that they are working
Making code testable forces better style
It is much faster to work with code under test