Archive for March, 2011

Harder Than It Should Be

March 31st, 2011 1 comment

Someone once said, “Chemistry is basically anything chemists will give each other awards for doing.” Or something like that; Google doesn’t find matches for that exact quote. Even if I’ve mangled it, the idea is sound, in the same way that art is no more and no less than what great artists accept as being art.

So what is computer science? More particularly, what constitutes the core of computer science? What’s the stuff that everyone who calls themselves a “computer scientist” should know, or at least have seen? One way to answer the question would be to look at what people are given prizes for, but that’s turning out to be harder than I expected, and the reason highlights a gap in this course.

Let’s start with the two biggest academic prizes open to the whole spectrum of CS: the ACM Doctoral Dissertation Award, and the A. M. Turing Award, which is often called “the Nobel Prize of computing”. The page I linked to lists the names of the Dissertation Award winners from 1978 to the present, but those links take you to pages that have nothing more on them than the name of the prizewinning thesis (and in some cases, a press release or a photo of the winner accepting a check). There’s no useful metadata anywhere to be seen: not keywords (which is what I’m after), not links to scholarly databases (so that I could write a script to harvest keywords), nothing. I could write a script to googlewhack the author’s name and thesis title, but the half-dozen pages I looked at were formatted in three different ways, so that smells like a lot more effort than I’m willing to put in to do something that my local second-hand stereo parts store has supported since 2005 (or maybe even earlier).

The Turing Award site is a bit better: once you figure out that you have to select a sorting order to get the landing page to display more than the most recent winner, the sub-pages that the main page links to do contain a few sentences explaining why each winner won. There’s still no structured metadata, though, so something that I know could be done in 10 minutes looks like it would take half a day, which means I’m not going to do it.
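To make the “10 minutes” claim concrete, here is a minimal sketch of what the job would look like if the award pages did expose structured metadata. Everything about the input is an assumption: neither site publishes a feed like this, and the record layout, placeholder titles, and keywords below are invented purely for illustration.

```python
import json
from collections import Counter

# Hypothetical records: the award sites do NOT publish anything like this.
# Titles and keywords are placeholders, not real theses.
SAMPLE = """
[
  {"year": 2001, "title": "Placeholder thesis A", "keywords": ["Graphics", "geometry"]},
  {"year": 2002, "title": "Placeholder thesis B", "keywords": ["databases", "graphics"]},
  {"year": 2003, "title": "Placeholder thesis C", "keywords": ["databases", "Graphics", "security"]}
]
"""


def tally_keywords(records):
    """Count how often each keyword appears, ignoring case."""
    counts = Counter()
    for record in records:
        # Tolerate records that have no "keywords" field at all.
        counts.update(k.lower() for k in record.get("keywords", []))
    return counts


records = json.loads(SAMPLE)
counts = tally_keywords(records)
print(counts.most_common(2))  # [('graphics', 3), ('databases', 2)]
```

With input in that shape, the tally is a dozen lines; all the real effort goes into scraping pages that are formatted three different ways.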

Software Carpentry doesn’t really talk about this issue anywhere. It shows you how to use a database, and the essay on provenance nods to the value of structured metadata without going over to say hello, but that’s about it. I’m constantly taken aback by how much time real scientists spend looking things up and chasing things down (journal editors are unlikely to take “or something like that” as sufficient citation for a quote like the one that started this post). We really should include something about the computational side of knowledge management and discovery in this course, but for the life of me, I don’t know what—if you do, please tell me. And if you have any clout with the ACM, please point out that since they require people to specify topic keywords when submitting papers for publication, it would be only fair of them to give us back a few keywords when we need them…

Categories: Opinion Tags:

Using Bein

March 31st, 2011 No comments

Many thanks to Frederick Ross for putting together a short screencast on Bein, a workflow manager and miniature laboratory information management system (LIMS) built in Python. For the working scientist, it fills the gap between the classical shell on one side and big workflow managers like Galaxy and full-scale LIMSs like openBIS on the other. Please have a look and see if it could help make your computational life easier.

Categories: Content, Version 4 Tags:

Practical Computing for Scientists at Stanford

March 30th, 2011 No comments

Prof. Risa Wechsler writes from Stanford:

We are teaching a course at Stanford this term inspired by Software Carpentry. The course, “Practical Computing for Scientists”, will be a Student Initiated Course, led by two physics undergrads (former summer research students with my group)—the course will be targeted at physics undergrads with a bit of programming experience to prepare them for summer research, but we expect participation from undergrads from other fields and some grad students and postdocs as well. I am hopeful that we will be able to keep this going as a regular student-led course with richer material as it develops. We expect that we will use some of your materials as well as some materials developed here.

The website for the course is http://physics91si.stanford.edu/, where you can find the syllabus and the preliminary handouts. We start tomorrow—it’s a 10 week course. We’ll be happy to keep you posted on how it all goes if you are interested.

Categories: Stanford University Tags:

Spring 2011 Course Over

March 30th, 2011 No comments

The Spring 2011 course is now over! We had a wonderful time, learned a lot, and hope the same is true for everyone who participated in the course.

We would love it if you would take a minute or two and give us any comments, feedback, and suggestions you have regarding the course and your experience in it.

To start the ball rolling, Erin Osborne gave us some awesome feedback, posted with permission:

Orion,

I’m really glad I took this class. I often had experiences where something that we covered in the Software Carpentry course was brought up in a lecture or in lab the following week. Lucky me! So I think I was at the perfect level to benefit from the class.

I think the most difficult lectures for me were:
— Python
— Testing
— Objects and Classes — This one was challenging, but I was used to the pace by the time we got to this part of the course.

The easiest lectures for me were:
— The shell command
— Regex
— mysql

I think the biggest determining factor as to whether a module was easy or hard for me was based on previous experience. This must make it challenging for you guys to teach this class since everyone has some different set of previous experience.

I don’t think I would have been able to get through the course without 1) the TA’s and 2) outside reading materials. Once I realized I was in for more than I expected, I went to the library and rented a lot of books listed on the web pages… especially python books. These were really helpful.

The sections I will make use of most from here on out are:
– svn — A lot of my programs were already set up by labmates using SVN, but I wasn’t taking full advantage of SVN’s capabilities.
– regex — I use regex all the time, but some of the themes covered in the regex lecture helped me to branch out of my typical searches.
– piping in the shell — My shell commands are much more streamlined and efficient now
– testing — I have incorporated tests into some of my existing programs
– sql — I would really like to use this more, and there are some existing sql databases available in my field!

Though I really don’t think I’ll be using python too much after this class, I’m glad I was exposed to it and I wish I had learned it earlier. My lab is deeply entrenched in perl, so I’ll probably stick with that. However, I really found it fascinating to understand the python way of thinking. Very elegant and nice! Thanks for sharing.

I tried to think of things that could make the class more effective. Some of these issues may purely have been me missing something obvious, but it may help.

— A clear syllabus with dates on it. I didn’t know until the end what the syllabus was. Maybe it was just somewhere obvious but I couldn’t find it.
— I could have really used the answers to the previous week’s homework at some point. I think I learn a lot from reading other people’s scripts and from deciphering how the codes are different from my own. In previous courses I have taken, the TA’s just dumped a selection of different students’ work into a folder for perusal. It was nice to read the different strategies.
— I could have used a little more intro and instruction into python and in the testing lecture. At the very least, some links to web pages or materials with background would have been helpful.

You guys did a great job. Thanks so much!
Erin.

This is exactly the sort of information that is crucial to making this course as beneficial as it can be. What worked? What didn’t? We intended to highlight a variety of student solutions on the forum each week, but it never ended up happening. What else did we miss that would have helped you?

Thank you all again. It has been a pleasure,
Orion

Categories: Education, Evaluation, Version 4 Tags:

And I’m on a Horse

March 26th, 2011 No comments

Patrick McKenzie (whom I’ve never met) gave a good lightning talk at the Business of Software conference that sums up a lot of what we haven’t done for Software Carpentry. Any reworking of the material really (really) has to be built around what you (the researchers) need, rather than what we (the programmers) know.

Categories: Content, Noticed Tags:

A Better Way to Teach Programming to Scientists

March 24th, 2011 No comments

This year’s SIGCSE conference on computer science education featured a very cool paper by Robbins, Senseman, and Pate, of the University of Texas at San Antonio, called “Teaching Biologists to Compute using Data Visualization”.  In it, they describe CS 1173: Data Analysis and Visualization using MATLAB, which introduces students in the life sciences to programming using a problem-first approach. Like the media-first approach that Mark Guzdial and his colleagues introduced at Georgia Tech, this rewards students right away—they don’t have to wade through a “CS first” morass of data types and arcane rules about Boolean expressions for three or four weeks in order to get to the useful bits. Given another year, I’d have rewritten our intro to Python this way…

Categories: Content, Education, Noticed Tags:

Our First Episode on Microsoft Access

March 23rd, 2011 No comments

We have just posted our first episode on using a database with Microsoft Access — many thanks to Utah State’s Ethan White for creating it.

Categories: Content, Version 4 Tags:

I’d Settle for 0.1%

March 22nd, 2011 1 comment

In a recent article about computational thinking, Carnegie Mellon’s Jeannette Wing says:

…every scientific directorate and office at the National Science Foundation participates in the Cyber-enabled Discovery and Innovation, or CDI, program, an initiative started four years ago with a fiscal year 2011 budget request of $100 million. CDI is in a nutshell “computational thinking for science and engineering.”

0.1% of that would keep Software Carpentry going for another year…  Or if I’m allowed to be a curmudgeon for a moment, hands up those people who believe that $100 million is going to do 1000 times more for science and engineering than another year of work on these materials? *sigh*

Categories: Noticed, Opinion Tags:

You’ll Need a Large Screen

March 22nd, 2011 1 comment

I’ve been working on a graph showing

  • the connections between the questions this course tries to address,
  • our answers to them,
  • the knowledge and skills needed to understand and apply those answers, and
  • the big concepts behind that knowledge and those skills.

The latest version is linked from the thumbnail below.  I know it’s tangled, but I hope it’s the first step toward a roadmap for redesigning this course to serve your needs better.  Feedback and assistance would both be very welcome.

Categories: Content, Version 4.1 Tags:

Using a Debugger

March 21st, 2011 1 comment

We’ve just uploaded a new video showing how to use a debugger to track down a problem. Using a debugger instead of ‘print’ statements will save you a lot of time, and the skill transfers directly to pretty much any language. Please let us know what you think.
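If you would like to try the idea before watching, here is a minimal sketch using pdb, the debugger in Python’s standard library. It is not taken from the video: the function `mean_of_positives` and its `debug` flag are hypothetical names made up for this example.

```python
import pdb


def mean_of_positives(values, debug=False):
    """Average the positive entries of `values` (hypothetical example)."""
    total = 0.0
    count = 0
    for v in values:
        if debug:
            # Execution pauses here on every pass through the loop.
            # At the (Pdb) prompt: `p v` prints a variable, `n` runs the
            # next line, `c` continues to the next pause, `q` quits.
            pdb.set_trace()
        if v > 0:
            total += v
            count += 1
    # Avoid dividing by zero when there are no positive entries.
    return total / count if count else 0.0


print(mean_of_positives([1, -2, 3]))  # 2.0
```

Calling `mean_of_positives([1, -2, 3], debug=True)` drops you into the debugger inside the loop, where you can inspect `v`, `total`, and `count` as they change instead of adding and removing ‘print’ statements.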

Categories: Content, Version 4 Tags: