I was first introduced to William Stafford Noble's paper "A Quick Guide to Organizing Computational Biology Projects" when Ivan Gonzalez and I taught at Harvard last November. Noble describes how scientists in computational biology should set up their project folders so that code, results, outputs, figures, and papers all live in easily understandable locations. He also writes about how to run experiments using driver scripts, which makes workflows reproducible, readable, and understandable to others (and to your future self).
I recently reorganized a project I was working on to follow Noble's folder structure, and when I started a new computational project, I realized I would have to create the same folder structure again. Since we're in the business of not repeating ourselves, I wrote a short Bash script that creates the basic folder structure outlined by Noble. When I brought up the project on our discussion board, Pat Schloss chimed in with a similar project motivated by the same paper, and my project even got a pull request!
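To give a feel for how little code this takes, here is a minimal sketch of such a script. It is not the script described above: the function name `make_project` is hypothetical, and the directory list assumes the top-level names used in Noble's paper (`doc`, `data`, `src`, `bin`, `results`).

```shell
#!/usr/bin/env bash
# Illustrative sketch only; not the script discussed in this post.
# Assumes the top-level directory names from Noble's paper.
set -eu

# Create the project skeleton under the directory given as $1.
make_project() {
    project_root="$1"
    for dir in doc data src bin results; do
        # -p creates missing parents and does not fail if the directory exists,
        # so running the script twice on the same project is harmless.
        mkdir -p "$project_root/$dir"
    done
}

# Only build a project when a name is supplied on the command line.
if [ "$#" -ge 1 ]; then
    make_project "$1"
fi
```

Saved as, say, `new_project.sh`, it could be run as `bash new_project.sh my_analysis` to create `my_analysis/` with the five subdirectories inside it.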
For Software Carpentry, the script gives us a way to teach our attendees some best practices while showing them the importance of reproducibility and scripting. It also lowers the barrier to actually setting up a project folder structure, since reorganizing folders once a project is fully developed will most likely never happen.