Teaching basic lab skills
for research computing

Using a Package Manager for Lessons and Papers

I've been musing for a couple of years now about ways in which we could re-purpose off-the-shelf software engineering tools and techniques to serve the needs of teachers. One theme, which I touched on in my SciPy 2014 talk, is to get people to patch shared learning materials in the way they patch Wikipedia articles and open source code. Another is to use package managers like RPM, Homebrew, and Conda to track dependencies between lessons, so that I could say something like conda install suffragette_movement and get a lesson on the struggle for women's voting rights, along with the other lessons and materials it depends on (or updates and links to those other lessons if I already have some of them installed).
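To make the dependency idea concrete, here is a minimal sketch of how a lesson's install order might be worked out. Everything in it is an assumption for illustration: the `LESSON_INDEX` table, the lesson names other than `suffragette_movement`, and the `install_plan` function are hypothetical, and this is not how Conda, RPM, or Homebrew resolve dependencies.

```python
# Hypothetical lesson index: each lesson maps to the lessons it depends on.
# Only "suffragette_movement" comes from the post; the rest are made up.
LESSON_INDEX = {
    "suffragette_movement": ["industrial_revolution", "parliamentary_reform"],
    "parliamentary_reform": ["industrial_revolution"],
    "industrial_revolution": [],
}


def install_plan(lesson, index, installed=frozenset()):
    """Return the lessons to install, dependencies first.

    Lessons already in `installed` are skipped, along with anything
    reachable only through them.
    """
    plan = []
    visiting = set()  # lessons on the current path, used to detect cycles

    def visit(name):
        if name in installed or name in plan:
            return
        if name in visiting:
            raise ValueError(f"dependency cycle involving {name!r}")
        if name not in index:
            raise KeyError(f"unknown lesson {name!r}")
        visiting.add(name)
        for dep in index[name]:
            visit(dep)
        visiting.discard(name)
        plan.append(name)

    visit(lesson)
    return plan


if __name__ == "__main__":
    print(install_plan("suffragette_movement", LESSON_INDEX))
    print(install_plan("suffragette_movement", LESSON_INDEX,
                       installed={"industrial_revolution"}))
```

A real package manager would also have to handle versions and conflicting requirements, which this sketch ignores.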

Setting aside for the moment the risk that this would be a technical "solution" to a social problem (because we all know how well those work), I've started wondering: should we be using package managers to aid reproducible research? In particular, if I know the DOI that identifies a particular paper, should I be able to type:

  $ conda install doi://10.1109/SECSE.2009.5069155

and get the LaTeX source for the article, the style files it depends on, its data (or links to such), and the code used to generate the results (including updates to third-party libraries that code depends on)? This raises the question of how results are regenerated, of course, and mutter mutter virtual environments to avoid conflicts between papers mutter mutter, but it seems like a way to manage the kinds of things that some of our colleagues are already thinking about.
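As a rough sketch of what sits behind such a command, the snippet below turns the `doi://` argument into a package name and an empty manifest with one slot per kind of artifact listed above. The `doi://` scheme is the wished-for syntax from this post, not something Conda accepts today, and `parse_doi_uri`, `package_name`, `empty_manifest`, and the naming convention are all hypothetical.

```python
import re

# Kinds of artifact the post says a paper package should carry.
COMPONENT_KINDS = ("latex_source", "style_files", "data", "code")


def parse_doi_uri(uri):
    """Extract the DOI from a hypothetical 'doi://<prefix>/<suffix>' argument."""
    scheme = "doi://"
    if not uri.startswith(scheme):
        raise ValueError(f"not a doi:// URI: {uri!r}")
    doi = uri[len(scheme):]
    prefix, _, suffix = doi.partition("/")
    # DOI prefixes start with "10." and must be followed by a non-empty suffix.
    if not prefix.startswith("10.") or not suffix:
        raise ValueError(f"malformed DOI: {doi!r}")
    return doi


def package_name(doi):
    """Map a DOI to a package-style name (an assumed convention)."""
    return "doi-" + re.sub(r"[^a-z0-9.]+", "-", doi.lower())


def empty_manifest(uri):
    """Build a manifest with one empty list per component kind."""
    doi = parse_doi_uri(uri)
    return {
        "name": package_name(doi),
        "doi": doi,
        "components": {kind: [] for kind in COMPONENT_KINDS},
    }


if __name__ == "__main__":
    print(empty_manifest("doi://10.1109/SECSE.2009.5069155"))
```

Filling in the component lists, and deciding who is allowed to publish them, is the hard part that this sketch leaves open.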

Several people are taking a run at this idea as part of our two-day sprint this week: see this Etherpad for details, and this GitHub repo for further discussion. We'd be grateful for your thoughts and your help.
