Teaching basic lab skills
for research computing

Patterns Wanted

At some point or other, most programmers have encountered the idea of design patterns in software, and many (including myself) have been zealous about them, at least for a while. They haven't actually revolutionized either the practice of software development or the way we teach it, but becoming familiar with them is to programmers what learning the Beatles' greatest hits is to musicians.

That presents us with a problem. We have deliberately chosen not to include object-oriented programming in the core of Software Carpentry: it's too big to fit into the time we have, and too far beyond what our learners bring with them. However, almost all discussion of design patterns is phrased in terms of classes and objects. It doesn't have to be—the ideas behind Proxy, Singleton, and Iterator are frequently used in procedural languages like C—but:

  1. patterns rose to prominence in the early 1990s partly because they helped procedural programmers make sense of OOP;
  2. most professional programmers use OOP, so that's the right way to talk to them; and
  3. some patterns really do only make sense in OO languages.

The biggest problem, though, is that most discussion of patterns is over our learners' heads, i.e., it addresses problems they haven't reached yet. The scientists we're helping are still trying to figure out what aliasing is, or why it's usually better for a function to take an open stream as an argument rather than a filename. The patterns they need are so simple that most programmers have forgotten that they need to be learned.
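To make those two examples concrete, here is a minimal sketch in Python. The function and variable names are hypothetical, chosen only for illustration, and treating `#` as a comment marker is an assumption. The first part shows aliasing; the second shows a function that takes an open stream rather than a filename.

```python
import io

# Aliasing: `backup` is a second name for the same list, not a copy,
# so a change made through one name is visible through the other.
readings = [1.5, 2.5]
backup = readings
readings.append(3.5)
print(backup)            # the appended value shows up here too

# To get an independent copy, ask for one explicitly.
snapshot = list(readings)
readings.append(4.5)
print(snapshot)          # unchanged by the second append


def count_data_lines(stream):
    """Count lines that are neither blank nor comments in an open text stream.

    Hypothetical helper: '#' as the comment marker is an assumption.
    """
    count = 0
    for line in stream:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


# Because the function takes a stream rather than a filename, it can be
# exercised with an in-memory buffer instead of a file on disk.
sample = io.StringIO("# header\n1.5\n\n2.5\n")
print(count_data_lines(sample))
```

With a real file the call would look like `with open(path) as stream: count_data_lines(stream)`, and the same function accepts `sys.stdin` unchanged, which is the main argument for passing streams.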

There are a couple of exceptions, though. One is the "Roles of Variables" work that Sajaniemi and others did a few years ago. By looking at the kinds of programs people write in introductory courses, they classified variables as follows:

Fixed value and organizer contain the same data throughout the program; only the order of data elements may be changed. Most-recent holder and stepper record data flow sources; either coming from outside or generated internally. The net effect of all items in a data flow is represented by a one-way flag, most-wanted holder, or gatherer; while a manipulation of a single element is recorded in a follower or temporary. Data may be stored in a container which can be traversed with a walker. Finally, a data entity not covered by any of the previous roles is considered to have the role other.

Their classification scheme is not unambiguous (i.e., different experts can label a particular variable in different ways), but the same is true of design patterns, and that's OK: defensible differences are informative. The real benefit of this scheme is that it gives novices a way to organize and plan programs. Once they've learned to recognize roles, they can start to create variables with roles in mind, which saves them from having to reinvent or rediscover the idioms that distinguish experts from novices.
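As a rough illustration, here is a small Python function whose variables are annotated with role names from the passage quoted above. The function itself is hypothetical, and the labels are only one reading of the scheme; other readers could defensibly label some of these variables differently.

```python
def summarize(readings):
    """Summarize a sequence of numbers (hypothetical example for role labels)."""
    total = 0.0        # gatherer: accumulates the effect of every item
    largest = None     # most-wanted holder: best value seen so far
    in_order = True    # one-way flag: may flip to False once, never back
    previous = None    # follower: always takes the old value of `value`
    count = 0          # stepper: moves through 0, 1, 2, ... predictably
    for value in readings:    # value: most-recent holder
        total += value
        if largest is None or value > largest:
            largest = value
        if previous is not None and value < previous:
            in_order = False
        previous = value
        count += 1
    return {"count": count, "total": total,
            "largest": largest, "in_order": in_order}


print(summarize([1.0, 3.0, 2.0]))
```

Writing the role next to each initialization is one way a novice might plan a loop before filling in its body.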

Another piece of work aimed at the same level is Michael de Raadt's dissertation on novice-level programming plans. He described 18 of them:

  • Average
  • Divisibility
  • Cycle Position
  • Number Decomposition
  • Initialisation
  • Triangular Swap
  • Guarded Exceptions
  • Counter Controlled Loop
  • Primed Sentinel Controlled Loop
  • Sum and Count
  • Validation
  • Min/Max
  • Tallying
  • Search Algorithm
  • Bubble Sort Algorithm
  • Command Line Arguments
  • File Use
  • Recursion (single- and multi-branching)
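To give a feel for the level these plans operate at, here is a sketch in Python that combines a few of them. It reflects my reading of the plan names rather than de Raadt's exact definitions, and the function names and the choice of -1 as the sentinel are assumptions made for illustration.

```python
def average_until_sentinel(values, sentinel=-1):
    """Average the numbers in `values` up to, but not including, the sentinel."""
    items = iter(values)
    total = 0     # Sum and Count: both are accumulated in the same loop
    count = 0
    # Primed Sentinel Controlled Loop: read once before the loop starts...
    item = next(items, sentinel)
    while item != sentinel:
        total += item
        count += 1
        # ...and read again at the bottom of each pass.
        item = next(items, sentinel)
    # Guard against the exceptional case of no data (division by zero).
    if count == 0:
        return None
    return total / count    # Average


def swap_first_two(items):
    """Triangular Swap: exchange two values using a temporary variable."""
    temp = items[0]
    items[0] = items[1]
    items[1] = temp


print(average_until_sentinel([4, 6, 8, -1, 100]))

pair = ["left", "right"]
swap_first_two(pair)
print(pair)
```

Python programmers would normally write the swap as `items[0], items[1] = items[1], items[0]`; the three-step version is shown because it is the form the plan's name describes.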

Both pieces of work are a great start, but if we want to teach people the craft of (scientific) programming, we need more. That's where you come in: what patterns have you used in your programs? When do you use them? When don't you (i.e., what are their boundary or limiting cases)? And do you know of any other catalogs or summaries that we could link to?
