
Interface and Implementation

slide 01 Hello, and welcome to the fourth episode of the Software Carpentry lecture on testing. In this episode, we’ll look at why it’s important to test interfaces, rather than implementations, and why and how to design software so that it’s easy to test.
slide 02 One of the most important ideas in computing is the difference between interface and implementation.
slide 03 Something’s interface specifies how it interacts with the world: what it will accept as input, and what output it produces. It’s like a contract in business: if Party A does X, then Party B guarantees Y.
slide 04 Something’s implementation is how it accomplishes whatever it does. This might involve calculation, database lookups, or anything else. The key is, it’s hidden inside the thing: how it does what it does is nobody else’s business.
slide 05 For example, here’s a function in Python that integrates a function of one variable over a certain interval.
slide 06 Its interface is simple: given a function and the low and high bounds on the interval, it returns the appropriate integral. A fuller definition of its interface would also specify how it behaves when it’s given bad parameters, error bounds on the result, and so on.
slide 07 Its implementation could use any of a dozen algorithms. In fact, its implementation could change over time as new algorithms are developed. As long as its contract with the outside world stays the same, none of the programs that use it should need to change. This allows users to concentrate on their tasks, while giving whoever wrote this function the freedom to tweak it without making work for other people.
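As a rough illustration of this point, here is a sketch of two interchangeable implementations behind the same interface. The function names, the `steps` parameter, and the choice of the midpoint and trapezoid rules are assumptions made for this example; the slides do not say which algorithm the lecture's function uses.

```python
def integrate_midpoint(func, low, high, steps=1000):
    """Approximate the integral of func over [low, high] with the midpoint rule."""
    width = (high - low) / steps
    return sum(func(low + (i + 0.5) * width) for i in range(steps)) * width


def integrate_trapezoid(func, low, high, steps=1000):
    """Approximate the same integral with the trapezoid rule."""
    width = (high - low) / steps
    total = 0.5 * (func(low) + func(high))
    for i in range(1, steps):
        total += func(low + i * width)
    return total * width


# Callers use only this name and its contract: (func, low, high) -> number.
# Swapping the implementation behind it should not require them to change.
integrate = integrate_midpoint
```

Code that calls `integrate(f, 0.0, 3.0)` keeps working if the last line is changed to point at `integrate_trapezoid`, provided both versions honor the same contract.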
slide 08 We often use this idea—the separation between interface and implementation—to simplify unit testing.
slide 09 The goal of unit testing is to test the components of a program one by one—that’s why it’s called “unit” testing.
slide 10 But the components in real programs almost always depend on each other: this function calls that one, this data structure refers to the one over there, and so on.
slide 11 How can we isolate the component under test from the rest of the program so that we can test it on its own?
slide 12 One technique is to replace the components we’re not currently testing with simplified versions that have the same interfaces, but much simpler implementations, just as a director would use a stand-in rather than a star when fiddling with the lighting for a show.
slide 13 Doing this for programs that have already been written sometimes requires some reorganization, or refactoring.
slide 14 But once you understand the technique, you can build programs with it in mind to make testing easier.
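A minimal sketch of the stand-in idea, using hypothetical names that do not appear in the lecture: `mean_value` is the component under test, and it receives its integrator as a parameter so that a test can hand it a trivial replacement with the same interface.

```python
def mean_value(func, low, high, integrator):
    """Average of func over [low, high], using whatever integrator is supplied."""
    return integrator(func, low, high) / (high - low)


def fake_integrator(func, low, high):
    """Stand-in: same interface as a real integrator, but always returns 10.0."""
    return 10.0


def test_mean_value_divides_by_width():
    # The stand-in ignores func, so None is enough here. The test checks only
    # mean_value's own logic: dividing the integral by the interval's width.
    assert mean_value(None, 0.0, 4.0, fake_integrator) == 2.5
```

Because the stand-in's answer is known in advance, a failure in this test can only come from `mean_value` itself, not from the integration code.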
slide 15 Let’s go back to our photographs of fields in Saskatchewan.
slide 16 We want to test a function that reads a photo from a file. (Remember that a photo is just a set of rectangles.)
slide 17 Here’s a plausible outline of the function. It creates a set to hold the rectangles making up the photo, opens a file, and then reads rectangles from the file and puts them in the set. When the input is exhausted, the function closes the file and returns the set.
slide 18 And here’s a unit test for that function. It reads data from a file called unit.pht, then checks that the result is a set containing exactly one rectangle.
slide 19 This is pretty straightforward, but experience teaches us that it’s a bad way to organize things.
slide 20 First, this test depends on an external file, and on that file being in exactly the right place. Over time, files can be lost, or moved around, which makes tests that depend on them break.
slide 21 Second, it’s hard to understand a test if the fixture it depends on isn’t right there with it. Yes, it’s easy to open the file and read it, but every bit of extra effort is a bit less testing people will actually do.
slide 22 Third, file I/O is slower than doing things in memory—tens or hundreds of thousands of times slower.
slide 23 If your program has hundreds of tests, and each one takes a second to run, developers will have to wait several minutes to find out whether their latest change has broken anything that used to work. The most likely result is that they’ll run the tests much less frequently…
slide 24 …which means they’ll waste more time backtracking to find and fix bugs that could have been caught when they were fresh if the tests only took seconds to run.
slide 25 Here’s how to fix this. Imagine that instead of reading rectangles, we’re just counting them.
slide 26 This simple function assumes the file contains one rectangle per line, with no blank lines or comments.
slide 27 Of course, a real rectangle counting function would probably be more sophisticated, but this is enough to illustrate our point.
slide 28 Here’s the function after refactoring.
slide 29 We’ve taken the inner core of the original function and made it a function in its own right. This new function does the actual work—i.e., it counts rectangles—but it does not open the file that the rectangles are read from.
slide 30 That is still done by the original function. It opens the input file, calls the new function that we extracted, then closes the file and returns the result.
slide 31 Notice that this function keeps the name of the original function, so that any program that used to call count_rect can still do so.
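A sketch of the refactoring just described, assuming one rectangle per line as stated earlier. The function names come from the narration; the parameter names are guesses.

```python
def count_rect_in(reader):
    """Count rectangles in an already-open file-like object (one per line)."""
    count = 0
    for line in reader:
        count += 1
    return count


def count_rect(filename):
    """Open the named file, count its rectangles, close it, return the count."""
    reader = open(filename, 'r')
    count = count_rect_in(reader)
    reader.close()
    return count
```

Only `count_rect` touches the file system. `count_rect_in` works with anything that yields lines when looped over, which is what makes it testable without a file on disk.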
slide 32 Now let’s write some tests.
slide 33 This piece of code checks that count_rect_in—the function that actually does the hard work—handles the three-rectangle case properly.
slide 34 Instead of an external file, we’re using a string in the test program as a fixture.
slide 35 To make this string look like a file, we’re relying on a Python class called StringIO. As the name suggests, this acts like a file, but uses a string instead of the disk for storing data. StringIO has all the same methods as a file, like readline…
slide 36 …so count_rect_in doesn’t know that it isn’t reading from a real file on disk.
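The test described above could be sketched as follows. In Python 3, `StringIO` is imported from the standard `io` module. The fixture's rectangle format is an assumption, and `count_rect_in` is repeated here only so that the example runs on its own.

```python
from io import StringIO


def count_rect_in(reader):
    """Count rectangles in an already-open file-like object (one per line)."""
    count = 0
    for line in reader:
        count += 1
    return count


def test_count_three_rectangles():
    # The fixture is right here in the test, not in a separate file.
    fixture = '0 0 1 1\n1 0 2 1\n2 0 3 1\n'
    reader = StringIO(fixture)
    assert count_rect_in(reader) == 3
```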
slide 37 We can use this same trick to test functions that are supposed to write to files as well.
slide 38 Instead of opening a file, filling it, and closing it, we create a StringIO object and “write” to that.
slide 39 We then use StringIO’s getvalue method (one of the few things it has that real files don’t) to get back the text we’ve written and check that it’s correct.
slide 40 For example, here’s a unit test to check that another function, photo_write_to, can correctly write out a photo containing only the unit square. Once again, we create a StringIO and pass that to the function instead of an actual open file.
slide 41 If photo_write_to only writes to the file using the methods that real files provide, it won’t know that it’s been passed something else.
slide 42 Once we’re finished writing, we can call getvalue to get the text that we wrote, and check it to make sure it’s what it’s supposed to be.
slide 43 In order to make output testable, though, there’s one more thing we have to do.
slide 44 Here’s a possible implementation of photo_write_to. It puts the rectangles in the photo into a list, sorts that list, then writes the rectangles one by one.
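Put together, the implementation and the unit-square test from the last few slides might look something like this. The output format and the corner-pair representation of a rectangle are assumptions, not taken from the slides.

```python
from io import StringIO


def photo_write_to(photo, writer):
    """Write the rectangles in a photo to a file-like object, in sorted order."""
    contents = list(photo)
    contents.sort()
    for ((x0, y0), (x1, y1)) in contents:
        writer.write('%d %d %d %d\n' % (x0, y0, x1, y1))


def test_write_unit_square():
    fixture = {((0, 0), (1, 1))}
    writer = StringIO()
    photo_write_to(fixture, writer)
    # getvalue returns everything that has been written so far.
    assert writer.getvalue() == '0 0 1 1\n'
```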
slide 45 This is simple enough, but why do the extra work of sorting? Why not just loop over the set and write the rectangles out directly?
slide 46 Please take a moment and see if you can think of the reason.
slide 47 Let’s work backwards to the answer. This version of photo_write_to is shorter and faster than the previous one.
slide 48 But there is no way to predict its output for any photo that contains two or more rectangles.
slide 49 For example, here’s a simple photo showing two fields of corn ready for harvest.
slide 50 And here are two lines of Python that we might put in a unit test to represent the photo, and write it to a file or a StringIO.
slide 51 You probably think the function’s output will look like this…
slide 52 …but it could equally well look like this, with the rectangles in reverse order.
slide 53 These two representations are conceptually the same, but they’re very different as text.
slide 54 The problem, of course, is that sets are unordered.
slide 55 Or rather, the elements in a set are stored in an arbitrary order that’s under the computer’s control.
slide 56 Since we don’t know what that order is, we can’t predict the output if we loop over the set directly, which means we don’t know what to compare the output to. If we sort the rectangles, on the other hand, they’ll always be in the same order, and to sort them, we have to put them in a list first.
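The difference can be sketched with two hypothetical writers, one that loops over the set directly and one that sorts first. The rectangle coordinates are made up for this example. Python's built-in `sorted` accepts a set and returns a new list, so it combines the "put them in a list" and "sort" steps.

```python
from io import StringIO


def format_rect(rect):
    """Turn ((x0, y0), (x1, y1)) into one line of text."""
    (x0, y0), (x1, y1) = rect
    return '%d %d %d %d\n' % (x0, y0, x1, y1)


def photo_write_unsorted(photo, writer):
    # Order depends on how the set happens to store its elements.
    for rect in photo:
        writer.write(format_rect(rect))


def photo_write_sorted(photo, writer):
    # sorted() returns a list, so the order is the same on every run.
    for rect in sorted(photo):
        writer.write(format_rect(rect))


photo = {((0, 0), (1, 1)), ((2, 0), (3, 1))}

writer = StringIO()
photo_write_sorted(photo, writer)
assert writer.getvalue() == '0 0 1 1\n2 0 3 1\n'

writer = StringIO()
photo_write_unsorted(photo, writer)
# The most a test can safely claim is that one of the two orders came out.
assert writer.getvalue() in ('0 0 1 1\n2 0 3 1\n', '2 0 3 1\n0 0 1 1\n')
```

With more rectangles the number of possible orders grows factorially, so listing every acceptable output quickly becomes impractical. Sorting leaves exactly one.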
slide 57 One final lesson for this lecture: you probably haven’t noticed, but the tests we’ve written in this episode are inconsistent.
slide 58 Here’s the fake “file” we created for testing the photo-reading function.
slide 59 And here’s the string we used to check the output of our photo-writing function.
slide 60 Please take a moment and see if you can see the inconsistency.
slide 61 That’s right: one string has a newline at the end, and the other doesn’t.
slide 62 It doesn’t matter whether we require a trailing newline or not. Either convention is better than saying “maybe”, because if we allow both, our code becomes more complicated, and more testing will be required.
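One way to hold ourselves to a single convention is a round-trip test. The sketch below picks "every line, including the last, ends with a newline". The helper names are hypothetical, and the opposite convention would work just as well as long as reader and writer agree.

```python
from io import StringIO


def write_lines(lines, writer):
    """Write each line followed by a newline, so output always ends with one."""
    for line in lines:
        writer.write(line + '\n')


def read_lines(reader):
    """Read lines back, stripping the newline that each is required to end with."""
    return [line.rstrip('\n') for line in reader]


def test_round_trip():
    original = ['0 0 1 1', '2 0 3 1']
    writer = StringIO()
    write_lines(original, writer)
    text = writer.getvalue()
    assert text.endswith('\n')                       # the convention we chose
    assert read_lines(StringIO(text)) == original    # reader and writer agree
```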
slide 63 Stepping back, the most important lesson in this episode isn’t how to test functions that do I/O. The most important idea is that you should design your programs so that their components can be tested.
slide 64 To do this, you should depend on interfaces, not implementations: on the contracts that functions provide, not on the details of how they accomplish whatever they do.
slide 65 Following this rule will make it easy for you to replace components that you’re not currently testing with simplified versions to make it easier to test the ones you are interested in.
slide 66 It will also save you from writing your tests over and over as the internals of the functions you are testing are changed. Empirical studies have shown that interfaces are longer-lived than implementations: if you rely on the former rather than the latter, you’ll spend less time rewriting tests, and more time figuring out what effect climate change is having on fields in Saskatchewan.
slide 67 The other rule when you’re designing programs to be testable is to isolate interactions with the outside world.
slide 68 For example, code that opens a file should be separated from code that reads data, so that you can test the latter without needing to do the former.
slide 69 Finally, whatever you examine to check the result of a test should be deterministic, i.e., a particular function call should always produce exactly the same value, so that you can compare it directly to the expected result.
slide 70 Unfortunately, this last rule can sometimes be hard to follow in scientific programs. Our next episode will explain why.

  1. Ross Dickson
    September 4th, 2010 at 03:06 | #1

    The last line on slide 07, which says “Interface: We don’t (have to) care” — I think you meant that to be “Implementation: We don’t (have to) care”? Perhaps?

  2. September 8th, 2010 at 15:39 | #2

    There is a typo in slide 7 (http://software-carpentry.org/lectures/test/test-interface/slide-07.png): it says the word ‘interface’ twice, the second should be ‘implementation’

  3. klahnb
    January 18th, 2011 at 21:10 | #3

    Are the “Now write tests” slides (33-36) supposed to show testing of the “count_rect_in” function, instead of “count_rect”? (cut and paste error?)

  4. Davi Post
    April 28th, 2011 at 05:46 | #4

    “One technique is to replace the components we’re not currently testing with simplified versions that have the same interfaces, but much simpler implementations, just as a director would use a stand-in rather than a star when fiddling with the lighting for a show.”

    Not an accurate analogy. This technique is like rehearsing with one actor, using stand-ins for all the other actors in the scene.

  5. Davi Post
    April 28th, 2011 at 06:08 | #5

    In slide 34 and following, the statement:
    assert count_rect(reader) == 3
    should use count_rect_in, not count_rect.

  6. Davi Post
    April 28th, 2011 at 06:15 | #6

    Similarly, in slide 40 and following:
    “photo_write(fixture, writer)” should be photo_write_to, as referred to in the narration text.
    And the same in slide 44 and following:
    “def photo_write(photo, writer):”.
