Summary of May 2014 Lab Meeting

Our monthly online lab meeting took place this past Thursday (May 22), and for the first time it included voting on pull requests and other issues. All the notes from the Etherpad are included below, but the high points are:

  1. The Mozilla Science Lab is hiring a developer and a community manager.
  2. People and projects that would like to take part in our global sprint on July 22-23 are invited to sign up on this Etherpad. We won't just work on Software Carpentry curriculum and tooling: related projects are welcome to use this opportunity to bring their communities together as well.
  3. Arliss Collins and others will try to simplify the workflow for creating and managing bootcamps. (Right now, we rely on five different online systems, and most of the administrators we work with at host institutions can't make heads or tails of them.) If you would like to help, please let us know.
  4. We will add people who want to be helpers at bootcamps to our instructors mailing list rather than creating yet another list for reaching them.
  5. We voted on the following pull requests:
    • Extra lessons will go under existing directories (e.g., novice/git/) rather than in a top-level extras directory.
    • We won't try to standardize usage of "parameter" and "argument", since most instructors use them idiosyncratically and/or interchangeably.
    • We will incorporate the new lessons on Mercurial as soon as they're done.
    • We will merge the lessons on scikit-learn, Python string formatting, common Python error messages, and setting up SSH keys for GitHub.
    • We won't include the lesson on tmux—people felt it was too specialized—and will ask that the lesson on text data mining in the shell be re-worked.
  6. We also voted on the following proposals for new lessons:
    • Using Excel properly, using Make to manage data pipelines, and regular expressions were all approved.
    • The draft lesson on creating and syndicating data on the web was deferred (only a few people had looked at it).
    • People liked the idea of lessons on statistics with Pandas and managing geospatial data, but we will need volunteers to take the lead.

Our next lab meeting, on June 26, will primarily be devoted to planning for our July 22-23 sprint. We look forward to seeing lots of you at both.

Agenda

  • Announcements
  • Votes on pull requests
  • Votes on possible additions
  • Plans for the July 22-23 sprint (https://etherpad.mozilla.org/swc-sprint-2014 )
  • Tracking bootcamp tasks
  • Adding helpers to the 'instructors' list
  • How to engage departments?
  • SWC Administration Tooling and Processes (see the section of the same name below)

Attendees

  • Greg Wilson (Mozilla, Toronto)
  • Xu Fei (New York, NY)
  • Paul Wilson (U. Wisconsin-Madison)
  • Mike Jones (Pittsburgh, PA)
  • Devasena Inupakutika (Southampton, UK)
  • Trevor King (Olympia, WA)
  • Jason Williams (Cold Spring Harbor Laboratory)
  • Damien Irving (University of Melbourne)
  • Hamid Mokhtarzadeh (University of Minnesota, MN)
  • Dan Warren (Macquarie University, Sydney, Australia)
  • Ana Malagon (Yale University, CT)
  • Cliburn Chan (Duke University, NC)
  • Emily Davenport (UChicago, IL)
  • Christian Jacobs (Imperial College London, UK)
  • Matthias Bussonnier (Institut Curie, Paris)
  • Ethan White (A car somewhere in Indiana)
  • Cam Macdonell (Edmonton, Alberta)
  • Rob Beagrie (Imperial College London)
  • John Blischak (University of Chicago)
  • Paul Ivanov (UC Berkeley)
  • Gabriel Devenyi (McMaster)
  • Jessica Ruyle (University of Oklahoma)
  • Doug Latornell (UBC, Vancouver)
  • Marian Petre (Open University, UK)
  • Francois Michonneau (University of Florida)
  • Jeramia Ory (King's College)
  • Brad Taber-Thomas (Penn State)
  • Brian Glanz (Honolulu)
  • Pauline Barmby (Western U, London Ont)
  • Tracy Teal (MSU)
  • Gavin Simpson (URegina, Canada)
  • Joshua Adelman (University of Pittsburgh)
  • Kaitlin Thaney (Mozilla Science Lab)
  • Arliss Collins (Mozilla, TO)
  • Chris Friedline (VCU, Richmond, VA)
  • Will Trimble (Argonne National Laboratory, Chicago)
  • Denis Haine (U of Montreal, QC)
  • JC Leyder (ESA, Spain)
  • Christina Koch (Vancouver)
  • April Wright (University of Texas, Austin)
  • Massimiliano Picone (Italy)
  • Raniere Silva (University of Campinas, Campinas, Sao Paulo, Brazil)
  • Remi Emonet (Univ Saint Etienne, France)
  • Sarah Simpkin (University of Ottawa, Canada)
  • Abigail Cabunoc (OICR)
  • Amanda Harlin (University of Oklahoma)
  • Matt Gee (University of Chicago)
  • Dmitri Novikov (Concordia University, Montreal, Canada)
  • Marcello Barisonzi (Montreal)
  • Daisie Huang (UBC, Vancouver, Canada)
  • Philipp Bayer (University of Queensland, Australia)
  • Matt Davis (Synthicity)

Announcements

July Sprint - July 22-23

Votes on Pull Requests

Votes on New Topics

  • using Excel properly
    • Morning voting: +1, +1 (good for Data Carpentry too), -1, 0, +1, +1, +1, +1, +1, +1, +1, +1, +1, +1, 0
    • Evening voting: +1, +1, +1 (openoffice), +1, +1, +1, +1, +1, 0
  • using Make to manage pipelines
    • Morning voting: +1, +1, +1, +1, 1, 0, +1, +1, 0, +1, 0, 0, +1, +1, +1
    • Evening voting: +1, +1, +1, +1, 0, +1, 0, 0, +1
  • regular expressions
    • Morning voting: +1, +1, +1, +1, +1, +1, +1, +1, +1, +1, +1, +1, +1, +1, +1
    • Evening voting: +1, +1, +1, +1, +1, 0
  • creating web-accessible content/data (?) (see https://github.com/swcarpentry/bc/pull/502 )
    • Morning voting: +1, +1, 0, +1, +1, +1, +1 (intermediate level), +1
    • Evening voting: +1, +1, 0
  • Statistics and Pandas (see https://github.com/swcarpentry/bc/pull/432 and https://github.com/swcarpentry/bc/pull/266 )
    • Morning voting: we need someone to take the lead
      • keen to contribute but a little too busy to lead at the moment (Rob Beagrie)
      • ditto to what Rob said (Pauline)
      • Chris Friedline can help too
      • Happy to help with aligning Pandas material with the R lessons (Gavin Simpson)
      • Willing to take lead on Excel --> Pandas (April Wright + Christina Koch and Pauline Barmby?)
    • Evening voting:
      • Happy to contribute (Cliburn) - e.g. can write lesson on split-apply-combine
  • Geospatial data (see https://github.com/swcarpentry/bc/pull/387 )
    • I'd be happy to do a very basic lesson on handling GIS data in R (Dan Warren)
    • Matt Davis can help with Python map stuff.

Tracking Bootcamp Tasks

  • See also Administration Tooling and Process below
  • Checklists are useful, but not actionable
  • Conclusion: we'll use a Google Doc spreadsheet per bootcamp for the next couple of months and see how that goes
    • Easy to set up
    • Non-technical university admins and others can drive it
  • Other options:
    • Automatically populate each bootcamp repo with a bunch of tickets?
      • Rob Beagrie: is it easy to populate a new repo with a whole bunch of issues?
      • Morning voting: +1, +1, +1, +1, -1 (if university admins have to use it), -1 (same), -1 (for admins)
    • Github markdown supports to-do lists (see https://github.com/blog/1375-task-lists-in-gfm-issues-pulls-comments) - could we put a default to-do list in the main bc repo?
      • +1, +1 (checklist in issue), -1 (if university admins have to use it), -1 (same)
      • Updates in comments don't send watchers an email notification (Raniere has already requested this feature)
        • If you put the list in the main repo it needs a new commit to check things off (which would presumably send an email)
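
As a partial answer to Rob's question, here is a minimal sketch of both GitHub options discussed above: one issue per task, or a single issue holding a task list. It assumes the GitHub REST API endpoint for creating issues and a personal access token; the task names, the function names, and the owner/repo values are hypothetical and only there for illustration. Nothing is sent to GitHub until create_issue is called.

```python
import json
import urllib.request

# Hypothetical checklist: the real bootcamp tasks are not listed in these notes.
BOOTCAMP_TASKS = [
    "Confirm dates with host",
    "Recruit instructors and helpers",
    "Send pre-bootcamp email with install instructions",
]


def task_list_body(tasks):
    """Render tasks as a GitHub-flavored Markdown task list."""
    return "\n".join("- [ ] " + task for task in tasks)


def issue_payloads(tasks):
    """Build one issue payload per task (the 'bunch of tickets' option)."""
    return [{"title": task, "body": ""} for task in tasks]


def checklist_payload(title, tasks):
    """Build a single issue payload holding the whole checklist."""
    return {"title": title, "body": task_list_body(tasks)}


def create_issue(owner, repo, token, payload):
    """POST one issue to the GitHub REST API and return the decoded reply.

    Needs network access and a token that is allowed to write to the repo.
    """
    url = "https://api.github.com/repos/{}/{}/issues".format(owner, repo)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "token " + token,
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Populating a new bootcamp repo would then be a loop over issue_payloads(BOOTCAMP_TASKS) that calls create_issue once per payload, or a single call with checklist_payload("Bootcamp checklist", BOOTCAMP_TASKS). Either way, it does not address the concern that university admins would still have to use GitHub to tick things off.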

Adding Helpers to the Instructors List

  • The 'instructors' mailing list currently includes exactly and only certified instructors
  • We don't have any systematic way to track would-be helpers
  • Proposal: add them to the 'instructors' list: +1, +1, +1, +1

SWC Administration Tooling and Processes

  • See also Tracking Bootcamp Tasks above
  • Neil Chue Hong: I'd like us to revisit the issue of improving and streamlining our tooling and processes for administering Software Carpentry across the world [unfortunately I can't join the call, but I hope that Aleksandra can]
  • As Software Carpentry grows, we will need to get more people contributing to the administrative effort to support the growth +1
  • The current administration process requires the use of five separate pieces of infrastructure, and some parts are not easy for people without a software development background +1
  • It also doesn't necessarily scale well (*empirical evidence required*) <- trust me, this is an understatement... +1
  • Whilst we have very good guides for hosts, instructors and helpers, we don't yet have one for administrators
  • Proposal:
    • we get the current SWC administrators (Amy, Arliss, Aleksandra, Giacomo, others?) to note the current benefits and drawbacks of the administration process as it stands
    • we set up a small group to look at how we might improve the administration infrastructure
    • we set up a small group to develop a better guide for administrators (with added flowcharts!)
  • A Mozilla colleague is currently pulling together a prototype that should help with instructor matching. We are also looking to revamp some of the pre- and post-bootcamp emails and follow-ups, and to discuss how to roll them out comprehensively (possibly as pre-set templates that instructors can fine-tune and send through Eventbrite, for example).
    • Here's the pre-bootcamp email that we'd like to have sent with the install instructions for all events
    • Would welcome comments and thoughts. Will be working on post-event mailings, as well. (Feel free to ping me directly at kaitlin@mozillafoundation.org)