Software Carpentry at TGAC

In February, The Genome Analysis Centre (TGAC) in Norwich, UK, hosted its first Software Carpentry bootcamp. TGAC is keen to develop its training programme and facilities intensively, and Vicky Schneider, who leads the Scientific Training, Education & Learning Programme, strongly supports Software Carpentry, so we are looking forward to more bootcamps in Norwich.

The bootcamp at TGAC was open to biologists, and 25 of them attended. We covered the core Software Carpentry topics: using the command line to automate tasks, good programming practice with Python, debugging and testing, and version control. We also added an introduction to data analysis with pandas, the Python Data Analysis Library.

The audience turned out to be quite mixed. Several participants had never programmed or used the command line before, whereas others regularly write code to conduct their research. It is always a challenge to accommodate the needs of such a polarised audience. However, the post-bootcamp feedback questionnaire showed that both beginners and experienced coders found the two days useful. One beginner said that she doesn't program at all and doesn't expect that to change much in the future, but that her main goal was to understand what programming is, as she works a lot with developers. The more advanced attendees were interested in learning about the pandas library and unit testing.

Rob Davey from TGAC joined me to teach at the bootcamp. Rob has run computing training for scientists in the past, but this was the first time he had taught at a Software Carpentry bootcamp. It looks like Rob really enjoyed it, as he is going to attend the Instructor Training in Three Days in Toronto.

The bootcamp started with Rob teaching the command line, with the attendees working on computers in the TGAC Training Lab that had the Software Carpentry virtual machine installed. The VM avoided the common problems with setting up the attendees' own machines and also gave some participants an opportunity to use Linux for the first time. The shell module was followed by a lesson on version control. At the end of this module the participants were asked to work in pairs on a repository they hosted on GitHub. Rob's observation on Twitter was "Pushing to someone else's github repository == a lot of excited talk, and some worryingly evil grins".

At the end of Day 1, we introduced the attendees to programming with Python. We used IPython Notebook for teaching, which most attendees really enjoyed as a tool. It allowed them to interact with the code and see the lecture notes at the same time. However, things got a bit more confusing, especially for the beginners, when we ran nosetests from the command line.

The bootcamp was very intense and we only managed to give a short introduction to the pandas library. Nonetheless, all bootcamp material was, as usual, made freely available to all attendees both online and in print. The TGAC Events Team did a great job of preparing all the materials in printed binders. The bootcamp finished with an hour-long exercise in which the attendees needed to use the skills they had been taught to fork and clone a repository, fix some broken Python code and process some files using shell commands. Pull requests with solutions kept coming in until the day after the bootcamp, so we definitely managed to get the participants really engaged with the material!

Part of this post originally appeared on the Software Sustainability blog.
