Camille Avestruz, Ivan Gonzalez, Timothy Cerino, and Daniel Chen all had the great opportunity to teach Software Carpentry's first zero-entry workshop to high school students. We were able to teach at Rockefeller thanks to the scientific foresight of Jeanne Garbarino and the rest of the Rockefeller team along with Arliss Collins, Greg Wilson and the SWC team. Lastly, thanks to Gabriel Perez-Giz for volunteering his time to help during the workshop.
The main goal of this workshop was to expose tomorrow's scientists to scientific computing as early as possible. For example, as genomics data in biology continues to grow, we are beginning to see biologists shift from pipetter to data scientist. Our goal was not to teach everyone all the skills needed to dive into retrieving, cleaning, and analyzing genomics or astronomy data the next day. Rather, we wanted to show them what is possible with computers, expose them for the first time to the idea that the GUI may not always be the best tool for the job, and give them a foundation of knowledge and concepts for perpetual self-learning.
We followed the traditional SWC workshop materials, adapting the pace as needed. Bash, Python, and Git were covered.
What Went Well, and What Didn't
High School Students
Since there was a lot of material to cover in two days, the lessons were given in 90 minute blocks. For the typical NYC high school (I'm now speaking from personal experience), there are 10 blocks of 45 minutes each throughout each day. Some classes may take a 2 period block (90 minutes), but almost never is the entire day solely taught in 90 minute blocks. In higher education a 90 minute class is almost considered the baseline and is very common with many other classes lasting much longer. This difference was echoed to us when we finished our workshop and had a 15 minute open discussion of what worked for the students, and what did not. Many suggested having 4 half-day workshops, each covering 1 topic with many small in-class practice problems and a 'homework'. Reflecting on this point, this makes sense and should be tested the next time we teach high school students.
These workshops usually have a bimodal distribution of students in terms of proficiency, experience, and familiarity with the materials being covered. In our particular case, we pre-screened the class, but only about 2 students had prior experience with the material.
Sending these students to a site with short challenge exercises so they don't get bored (e.g. Rosalind or Project Euler for Python; Learn Git Branching for Git) would have helped those few who thought the pace was too slow.
The World Cup probably didn't help either :D
Sticky notes, the Etherpad, and raising your hand
The sticky notes (red up for a problem, green up for no problem), Etherpad document, Etherpad chat, and the basic raising of the hand to ask for help were extremely effective. Students were using all means of getting help, although I would have to say most were doing what they were used to in the classroom, which is raising their hands. For those that were shy, the Etherpad chat was a way for the instructor to clarify an idea or answer. Once again, the sticky notes also provided a great means for honest feedback.
We began in Bash with the basic ls commands, eventually moving to
The helpers realized very quickly that the students were struggling to
keep up. Common questions were:
- Did I do this right?
- It doesn't look like what's up there
- I don't understand what is going on
Those of us who live in Bash, or use the terminal regularly, customize the look of the prompt by removing extraneous information, adding colors, etc. The students, by contrast, had never seen a blinking cursor in this context. They were trying to follow along by blindly typing what they saw on the presentation screen, but it did not match what they saw on their own monitors. Thus confusion arose, even though many of them had created and moved into the correct directory. A possible workaround is to have all students copy/paste an alias that makes ls color directories (this is not a default in OSX), and to have everyone export the same PS1. The former can be done right after the ls command is introduced, while searching through the man pages for the --color flag. Some experimentation would be needed to find the optimal time to export the PS1 variable, but a black box 'please copy/paste this so your terminal looks like mine' may be sufficient.
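As a rough sketch of what such a copy/paste snippet might look like: the prompt string below is only an assumed example (any short shared prompt would do), and the flag check is our own addition, since GNU ls accepts --color while the BSD ls shipped with OSX uses -G for the same purpose.

```shell
# Assumption: the instructor picks one shared prompt; '\$ ' is only an example.
export PS1='\$ '

# Color directories in ls output. The flag differs by platform:
# GNU ls (most Linux) accepts --color=auto, BSD ls (OSX) uses -G instead.
if ls --color=auto . >/dev/null 2>&1; then
    alias ls='ls --color=auto'
else
    alias ls='ls -G'
fi
```

Pasted into an interactive terminal, this should leave every student with the same minimal prompt and colored directory listings, without requiring them to understand the details yet.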
This brings me to my next point: when using the Etherpad, it is useful not to include the $ before the actual Bash command. Many of the students were still trying to get some idea of what was going on, so they would copy the entire line, and when they finally remembered the correct terminal copy/paste commands, the command would error out and they would not understand why. This could be a failure on our part in not properly explaining the components of a Bash command (e.g., the first 'token' is always a 'verb'), but at this level, navigating the file system and learning the nitty-gritty of Bash may be too much information all at once.
The students were very literal, as they should be when exposed to a new concept. However, as literal as they were, spelling mistakes were common, and learning to tab complete was a lesson they were learning the hard way.
When using Bash to navigate and create files, zero-entry bootcampers do not have a mental image of the directory structure/hierarchy. Showing a GUI file system window alongside the basic navigation and mkdir commands solidifies and contextualizes what exactly is going on in the file system, and shows them that commands such as touch actually create files that appear in the GUI as well. This can be accomplished by running a terminal command on each of the Windows, OSX, and Linux operating systems (e.g. nautilus . in Ubuntu).
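One way to hand students a single instruction regardless of operating system is a small helper that picks the file browser command from the output of uname -s. This is a sketch under our own assumptions: the function name file_browser_cmd is hypothetical, and the commands chosen (open on OSX, xdg-open on Linux, explorer on Windows shells such as Git Bash) are the ones we know to exist, not necessarily the ones used in the workshop.

```shell
# Print the command that opens a GUI file browser,
# given the operating system name reported by `uname -s`.
file_browser_cmd() {
    case "$1" in
        Darwin)               echo "open" ;;      # OSX
        Linux)                echo "xdg-open" ;;  # most Linux desktops
        MINGW*|MSYS*|CYGWIN*) echo "explorer" ;;  # Windows (Git Bash, Cygwin)
        *)                    return 1 ;;         # unknown system
    esac
}
```

A student would then run `"$(file_browser_cmd "$(uname -s)")" .` to open a window on the current directory and watch files appear as they are created.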
From the first series of feedback, we learned that repetition is extremely important. For example, when teaching what cd is, instead of just saying 'cd into a new directory' it was important for the instructors to say 'cd, change directory, into your SWC folder'. This applied to the other topics covered as well, and is especially important since these are new terms for the students and it naturally slows down the lesson (having tons of spelling mistakes works just as well as a
After reading the first round of feedback, we put up a small exercise on the screen during lunch. This gave the students time to read and see what the task was when they returned. Additionally, with Gabe's help, we came up with a more practical exercise that showcases the necessity of learning Bash. We created a directory of 48,000 files of mixed names and file types and asked the students how they would move files matching a certain pattern (e.g. I want my 2013 pictures that are some form of .jpg in a pics/2013 subfolder). This was a powerful example since the 'usual' way they would have done the task is through the GUI. However, with that many files in the directory, the GUI actually had problems drawing all the icons. By the time the GUI had loaded the icons, we were almost finished with the entire task, and that included explaining the problem in Bash, giving them time to come up with the wildcard expression to move the files, and watching the GUI crash a few times. This example gave them a real-world context for where the GUI falls short, and we reiterated that a massive file dump like the one shown is not uncommon in science.
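A scaled-down sketch of that exercise is shown here. The file names and the wildcard are our own assumptions for illustration (the actual names in the workshop directory are not recorded); the pattern 2013*.jp*g is meant to catch both .jpg and .jpeg.

```shell
# Build a small scratch directory standing in for the 48,000-file dump.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch 2013_beach.jpg 2013_lab.jpeg 2013_notes.txt 2012_trip.jpg data_2013.csv

# Move every file that starts with 2013 and ends in some form of .jpg.
mkdir -p pics/2013
mv 2013*.jp*g pics/2013/

ls pics/2013
# prints:
# 2013_beach.jpg
# 2013_lab.jpeg
```

The 2012 picture and the non-image 2013 files stay where they were, which is a useful check that the wildcard is no broader than intended.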
We took the feedback regarding pacing in Bash and paced the Python lessons in a much more digestible fashion. We finished off the day covering conditionals, with some feedback that the section on dictionaries was too long. By the second day I overheard students being excited about their newfound knowledge, as if they were now thinking in terms of a logical set of commands to give a computer to accomplish a task. It was inspiring that they also recognized 'it is just the tip of the iceberg and there's a lot more out there.' We began the second day by reviewing everything from the previous day: go to the SWC work directory, create a folder for today's work, open up an IPython notebook, and review some questions that were brought up in the previous day's feedback. It was fascinating to see that, even with all the confusion regarding Bash the previous day, filesystem navigation was not an issue on the next day.
We finished the session off with loops, functions, basic file I/O, and we gave them a practical exercise to read in a dataset of animals and their brain mass and body mass. The task was to calculate each animal's brain:body mass ratio and save the information so it can be used later.
Some things we learned teaching Python:
- Students had trouble understanding the value of toy exercises. When I was helping during the Python session, some seemed very worried about getting the example or exercise "right", as if that were the whole value of the task.
- Exercises/examples should be tuned to connect to their previous experiences or otherwise have some sense of completion (e.g. ipythonblocks).
- During the last capstone exercise, many of the students were simply stumped, overwhelmed, and had 'no idea where to start', even though the basic components were covered. This became a very powerful example of breaking down the problem (engineering), and the use of comments. Showing them comments and how they can be used to pseudo-write code and get some of the process down without being burdened by actual code:
  - forces them to actually break down a problem into manageable pieces and makes the problem less daunting.
  - shows a real practical use case of comments and how they can be used before any code is written, not just documenting what you already have done.
I think we did a great job introducing good practices in programming. Hopefully the next time they are asked to implement their knowledge it becomes less daunting.
%%bash to run Bash commands such as
Version control is probably one of the more difficult topics to cover, to preach, and to convert non-users into using. We began by mentioning 2 common problems, the first of which is nicely depicted in this PhD Comic on final documents.
The second was an example of backing up data while working on a thesis. The biggest barrier to entry is showing them why it is better than their current workflow: Dropbox, track changes, 'save as' with a number or date, collaboration, etc. We began by diagramming Git on an oversize notepad. However, due to the nature of the room, it was difficult to see the drawn figures from the back. We went back and forth between the Git diagram and the actual commits and checkouts in Git to give a sense of how things are being tracked, using 2 files: one listing guacamole ingredients and one with instructions on how to make guacamole. We mostly went over version control on a local and individual level. During the final section of the workshop we went over collaboration using the guacamole recipe on GitHub.
Here we also ran into many technical problems with older versions of OSX, namely Snow Leopard. The problem arises when one runs git init in the directory to be tracked. Simply checking the Git installation with which git does not accurately diagnose the problem. The fix was to use an older version of Git than the one posted on the main website. However, it seems that an even more universal and simpler solution is to install the GitHub application and, within its preferences, have the program install the command line tools.
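Since which git only confirms that an executable is on the path, a pre-workshop check could instead exercise git init itself in a throwaway directory. The helper below is a sketch of that idea; the function name check_git and its messages are hypothetical, and it has not been tried against the Snow Leopard failure described above.

```shell
# Report whether git is present AND whether `git init` actually succeeds.
check_git() {
    if ! command -v git >/dev/null 2>&1; then
        echo "git not found"
        return 1
    fi
    # Try a real init in a scratch directory, then clean up.
    scratch=$(mktemp -d) || return 1
    if git init "$scratch" >/dev/null 2>&1; then
        echo "git init works"
        rc=0
    else
        echo "git is installed but git init failed"
        rc=1
    fi
    rm -rf "$scratch"
    return "$rc"
}
```

Running check_git on each laptop before the Git session would separate "Git is missing" from "Git is present but broken", which is the distinction which git cannot make.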
From the student feedback and instructor observations, the workshop was a success. The students asked very good questions about practical use cases for each of the topics covered. The improvement in Bash proficiency between the 2 days was astounding. We also spent some time throughout the workshop pointing to a few good links for practicing their newfound skills. We explained that the problems do not have to seem 100% applicable to your current work (although that may help): practicing various (unrelated) tasks trains the mind to see how a solution can be applied to other problems because of an underlying pattern. Thus, we directed them to a few Python and Git websites to give them practice.
Our goal was to give them enough instruction to get over the big initial learning hurdle into Bash, Python, and Git so that they have the foundation to explore and learn on their own.