Teaching basic lab skills
for research computing

Interview: Andrew Lumsdaine of Indiana University

Today's interview is with Indiana University's Professor Andrew Lumsdaine.

Tell us a bit about your organization and its goals.

One of the reasons the School of Informatics and Computing exists is that computing of various kinds has become part of almost all academic disciplines. The school's goal is to educate students in computer-related areas as well as traditional computer science, and it's important that the principles of software development be taught. The same holds for my research group, which works in HPC (where the 'P' means both "performance" and "productivity"). There's a huge need for reproducibility in CS research, both for the sake of sound science and so that people can actually build on each other's work. For that to happen, we need some guarantees about quality and reusability, and improving basic skills is a prerequisite.

Tell us a bit about the software your group uses.

The first group (students in the school) uses every kind of off-the-shelf application you can think of. It's mostly closed source; they do relatively little development. My research group mostly builds its own tools, and tends to be pretty zealous about open source.

Tell us a bit about what software your group develops.

We started by building a version of MPI, and are now part of the Open MPI collaboration. Working with dozens of collaborators around the world requires the same skill set as other open source projects: having a software repository, nightly builds, regression tests, and proper licensing protocols is essential. We also contribute to the Boost C++ Libraries, in particular a parallel version of the Boost Graph Library.

How do you hope the course will help them?

Bill Gropp once said, "Computers should be a labor-saving device," but it often doesn't feel that way. We think that adopting basic development practices will allow people to do more and better science. We also think that organizing this material, instead of having grad students tutor each other erratically, will give us a common base of knowledge that we can then rely on.

It's important for us institutionally that the course is self-contained. Introducing a bit of computing here and there across the curriculum is an idea that comes up a lot in faculty meetings, but I don't know of any successful across-the-curriculum efforts. Putting this training in one place is more efficient, and makes it someone's job to ensure success.

How do you plan to evaluate the impact the course has had?

(laughs) That's a forbidden question in Computer Science.