Teaching basic lab skills
for research computing

The Big Picture

I'm trying to be systematic about re-designing the core curriculum of Software Carpentry. So far, I've identified 11 common questions:

Q01: How can I write a simple program?
Q02: How can I make the program I've written easier to reuse?
Q03: How can I reuse code that other people have written?
Q04: How can I share my work with other people?
Q05: How can I keep track of what I've done?
Q06: How can I tell if my program is working correctly?
Q07: How can I find and fix bugs when it isn't?
Q08: How can I get data into my program?
Q09: How can I manage my data?
Q10: How can I automate this task?
Q11: How can I make my program faster?

whose answers depend on three fundamental principles:

F01: It's all just data.
F02: Programming is a human activity.
F03: Better algorithms are better than better hardware.

These break down into 11 more specific principles:

P01: Code is just a kind of data.
P02: Metadata makes data easier to work with.
P03: Separate models and views.
P04: Trade human time for machine time and vice versa.
P05: Anything that's repeated will eventually be wrong somewhere.
P06: Programming is about creating and composing abstractions.
P07: Programming is about feedback loops at different timescales.
P08: Good programs are the result of making good techniques a habit.
P09: Let the computer decide what to do and when.
P10: Sometimes you copy, sometimes you share.
P11: Paranoia makes us productive.

which in turn translate into 11 recommendations:

R01: Use the right algorithms and data structures.
R02: Use a version control system.
R03: Automate repetitive tasks.
R04: Use a command shell.
R05: Use tests to define correctness.
R06: Reuse existing code.
R07: Design code to be testable.
R08: Use structured data and machine-readable metadata.
R09: Separate interfaces from implementations.
R10: Use a debugger.
R11: Design code for people to read.
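
To make two of these recommendations concrete, here is a minimal sketch of R05 and R07 in Python, the language used later in the curriculum. The function `running_total` and its test are hypothetical illustrations, not part of the Software Carpentry material. The function takes its input as an argument and returns a new value, which makes it easy to test, and the assertions spell out what "correct" is taken to mean.

```python
def running_total(values):
    """Return the cumulative sums of values as a new list."""
    totals = []
    current = 0
    for v in values:
        current += v
        totals.append(current)
    return totals


def test_running_total():
    # Each assertion records one expectation about correct behavior.
    assert running_total([]) == []
    assert running_total([5]) == [5]
    assert running_total([1, 2, 3]) == [1, 3, 6]
    # The caller's list must not be modified (compare P10 on copying vs. sharing).
    data = [1, 2]
    running_total(data)
    assert data == [1, 2]


test_running_total()
```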

Here's how I see all this mapping onto the curriculum (assuming we replace agile development with number crunching):

  • The Shell: files and directories; creating things; pipes and filters; permissions; shell scripts; finding things; variables; loops
    • Q03: How can I reuse code that other people have written?
    • Q10: How can I automate this task?
    • P04: Trade human time for machine time and vice versa.
    • P06: Programming is about creating and composing abstractions.
    • R03: Automate repetitive tasks.
    • R04: Use a command shell.
    • R06: Reuse existing code.
  • Version control: update, edit, commit, and history; merging conflicts; recovering old versions; setting up a repository
    • Q04: How can I share my work with other people?
    • Q05: How can I keep track of what I've done?
    • Q09: How can I manage my data?
    • F01: It's all just data.
    • F02: Programming is a human activity.
    • P01: Code is just a kind of data.
    • P02: Metadata makes data easier to work with.
    • P05: Anything that's repeated will eventually be wrong somewhere.
    • P07: Programming is about feedback loops at different timescales.
    • P11: Paranoia makes us productive.
    • R02: Use a version control system.
    • R03: Automate repetitive tasks.
    • R08: Use structured data and machine-readable metadata.
  • Basic Programming in Python: variables and assignment; repeating things; lists; reading and writing; conditionals; nesting control structures; design patterns
    • Q01: How can I write a simple program?
    • Q02: How can I tell if my program is designed well?
    • Q08: How can I get data into my program?
    • P04: Trade human time for machine time and vice versa.
    • P05: Anything that's repeated will eventually be wrong somewhere.
    • P06: Programming is about creating and composing abstractions.
    • R01: Use the right algorithms and data structures.
    • R11: Design code for people to read.
  • Interlude: aliasing
    • P10: Sometimes you copy, sometimes you share.
  • Interlude: text
    • F01: It's all just data.
  • Interlude: Booleans and while loops
    • R11: Design code for people to read.
  • Interlude: Using a debugger
    • Q01: How can I write a simple program?
    • Q07: How can I find and fix bugs when it isn't?
    • F01: It's all just data.
    • R10: Use a debugger.
  • Functions and Libraries in Python: how functions work; aliasing (again); multiple arguments; returning values; libraries; standard libraries; functions as objects
    • Q01: How can I write a simple program?
    • Q02: How can I tell if my program is designed well?
    • Q02: How can I make the program I've written easier to reuse?
    • F01: It's all just data.
    • P05: Anything that's repeated will eventually be wrong somewhere.
    • P06: Programming is about creating and composing abstractions.
    • P10: Sometimes you copy, sometimes you share.
    • R06: Reuse existing code.
    • R09: Separate interfaces from implementations.
    • R11: Design code for people to read.
  • Interlude: provenance
    • Q05: How can I keep track of what I've done?
    • Q09: How can I manage my data?
    • Q10: How can I automate this task?
    • F01: It's all just data.
    • P09: Let the computer decide what to do and when.
    • R03: Automate repetitive tasks.
    • R08: Use structured data and machine-readable metadata.
  • Program Development: creating a grid; randomness; neighbors; handling ties; putting it all together; fixing bugs; refactoring
    • Q01: How can I write a simple program?
    • Q02: How can I tell if my program is designed well?
    • Q11: How can I make my program faster?
    • F02: Programming is a human activity.
    • P06: Programming is about creating and composing abstractions.
    • P07: Programming is about feedback loops at different timescales.
    • P08: Good programs are the result of making good techniques a habit.
    • R01: Use the right algorithms and data structures.
    • R06: Reuse existing code.
    • R07: Design code to be testable.
    • R09: Separate interfaces from implementations.
    • R11: Design code for people to read.
  • Interlude: configuring programs
    • F01: It's all just data.
  • Interlude: assertions; exceptions
    • P11: Paranoia makes us productive.
  • Testing: goals; tests as specifications; structuring unit tests; using a unit testing framework; design for test
    • Q02: How can I tell if my program is designed well?
    • Q06: How can I tell if my program is working correctly?
    • Q07: How can I find and fix bugs when it isn't?
    • Q10: How can I automate this task?
    • F02: Programming is a human activity.
    • P01: Code is just a kind of data.
    • P07: Programming is about feedback loops at different timescales.
    • P08: Good programs are the result of making good techniques a habit.
    • P09: Let the computer decide what to do and when.
    • P11: Paranoia makes us productive.
    • R03: Automate repetitive tasks.
    • R05: Use tests to define correctness.
    • R07: Design code to be testable.
  • Sets and Dictionaries: sets; storage; dictionaries; simple examples; longer examples
    • Q11: How can I make my program faster?
    • F03: Better algorithms are better than better hardware.
    • R01: Use the right algorithms and data structures.
  • Interlude: numbers
    • F01: It's all just data.
  • Number Crunching: basics; indexing; linear algebra; making recommendations; statistics
    • Q03: How can I reuse code that other people have written?
    • Q11: How can I make my program faster?
    • F03: Better algorithms are better than better hardware.
    • P04: Trade human time for machine time and vice versa.
    • P09: Let the computer decide what to do and when.
    • R01: Use the right algorithms and data structures.
    • R06: Reuse existing code.
  • Databases: selecting; removing duplicates; calculating new values; filtering; sorting; aggregation; joins; missing data; nested queries; transactions; programming with databases
    • Q08: How can I get data into my program?
    • Q09: How can I manage my data?
    • F01: It's all just data.
    • P02: Metadata makes data easier to work with.
    • P03: Separate models and views.
    • P05: Anything that's repeated will eventually be wrong somewhere.
    • P09: Let the computer decide what to do and when.
    • R08: Use structured data and machine-readable metadata.
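
Since the mapping above is itself just data (F01), one way to check it systematically is to store it in a structured form (R08) and let the computer invert it. The following is a rough sketch under stated assumptions: the dictionary holds only a three-topic excerpt of the mapping above, and the helper names `topics_by_code` and `uncovered` are made up for illustration.

```python
from collections import defaultdict

# Excerpt of the mapping above: topic -> codes attached to it.
# Only three topics are included here to keep the sketch short.
curriculum = {
    "The Shell": ["Q03", "Q10", "P04", "P06", "R03", "R04", "R06"],
    "Interlude: aliasing": ["P10"],
    "Sets and Dictionaries": ["Q11", "F03", "R01"],
}


def topics_by_code(mapping):
    """Invert topic -> codes into code -> sorted list of topics."""
    inverted = defaultdict(list)
    for topic, codes in mapping.items():
        for code in codes:
            inverted[code].append(topic)
    return {code: sorted(topics) for code, topics in inverted.items()}


def uncovered(mapping, all_codes):
    """Return, in sorted order, the codes that no topic mentions."""
    seen = {code for codes in mapping.values() for code in codes}
    return sorted(set(all_codes) - seen)


index = topics_by_code(curriculum)
questions = [f"Q{n:02d}" for n in range(1, 12)]  # Q01 through Q11

print(index["Q10"])                      # topics that address Q10
print(uncovered(curriculum, questions))  # questions the excerpt never addresses
```

With the full mapping loaded instead of the excerpt, the same two helpers would show which questions, principles, or recommendations are covered by only one topic or by none.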

Comments and suggestions would be very welcome.
