Teaching basic lab skills
for research computing

Not Really Disjoint

The twinned discussions in bioinformatics about openness and software quality are heating up. A recent salvo on Gas Stations Without Pumps is titled "Accountable research software", and one statement in particular caught my eye:

The rapid prototyping skills needed for research programming and the careful specification, error checking, and testing needed for software engineering are almost completely disjoint.

I might agree that careful specification isn't needed for research programming, but error checking and testing definitely are. In fact, if we've learned anything from the agile movement in the last 15 years, it's that the more improvisatory your development process is, the more important careful craftsmanship becomes (unless, of course, you don't care whether your programs produce correct answers).

The sentence quoted above is commentary on a post by a different writer, which is summarized as follows:

...most research software is built by rapid prototyping methods, rather than careful software development methods, because we usually have no idea what algorithms and data structures are going to work when we start writing the code. The point of the research is often to discover new methods for analyzing data, which means that there are a lot of false starts and dead ends in the process... The result, however, is that research code is often incredibly difficult to distribute or maintain.

The first part is equally true of software developed by agile teams. What saves them from the second part is developers' willingness to refactor relentlessly, which depends in turn on management's willingness to allow time for that. Developers also have to have some idea of what good software looks like, i.e., of what they ought to be refactoring to. Given those things, I think reusability and reproducibility would be a lot more tractable.
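
As a minimal sketch of what that craftsmanship can look like in a quick prototype, consider a throwaway analysis function that still checks its input and ships with a few tests. The function name gc_content and the choice of example are assumptions made purely for illustration; nothing here comes from either of the posts discussed above.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a DNA sequence.

    Hypothetical prototype: ambiguous 'N' bases are allowed and count
    toward the sequence length but not toward the G/C total.
    """
    # Error checking: fail loudly rather than return a misleading number.
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    unknown = set(seq) - set("ACGTN")
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    return (seq.count("G") + seq.count("C")) / len(seq)


def test_gc_content():
    # A handful of cases with answers that can be checked by hand.
    assert gc_content("GGCC") == 1.0
    assert gc_content("ATAT") == 0.0
    assert gc_content("acgt") == 0.5  # lower-case input is accepted
    # Non-DNA input should be rejected, not silently scored.
    try:
        gc_content("ACGU")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for non-DNA input")


test_gc_content()
```

With even this much in place, the body of the function can be rewritten freely during the next false start, and rerunning the tests shows whether the answers changed.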
