Most musicians can play along with a twelve-bar blues once they know what the key and tempo are. Many kinds of scientific work are equally well structured: the results aren't predictable—it wouldn't be research if they were—but the equipment setup, sample preparation, note-taking, statistics, and write-up are (mostly) structured in ways that other scientists are familiar with. This lets them pick up each other's projects more quickly; it also gives scientists more time to do what's unique about a particular project because they don't have to spend time thinking about the things that aren't.
Open science—by which I mean all the new ways of doing science that the web has inspired—isn't there yet. Many people have tools and techniques that work well for them, but every setup is one-of-a-kind, and everyone has had to assemble the pieces for themselves. I think we could do better, and the thing I think about when I think that is Ruby on Rails.
A lot of Ruby on Rails' original simplicity is now hidden under a mass of extensions and auxiliary tools, but when it first appeared in 2004, it was a minor revolution. One reason was the "create a blog in 15 minutes" screencast that showed people just how easy simple things could be. Another was its emphasis on convention over configuration: instead of letting web developers do things however they wanted, it said, "Here's where stuff is and here's how it works." Once a competent developer knew what the application was supposed to do, she could (almost) immediately start building the things that were specific to it, and other competent developers could join in (more) easily. What's more, Rails' predictable structure and workflow made it easier for newcomers to adopt, certainly easier than contemporary competitors, which all too often required novices to make key decisions before they had the knowledge or experience to do so well.
So what's the equivalent for science? What would be in a combined template/tool/framework/worldview for small- to medium-sized scientific projects that would let scientists do science our way with near-zero startup overhead? (Note that I'm not asking, "What web programming framework should they use?") A few things I want are:
rails new project should automatically generate the wrappers needed to pull a project's data and metadata. Everything else should automatically be accessible too: every paper, figure, or table should have a DOI and be searchable and fetchable. And note that "accessible" doesn't just mean "the bits are available": without correct documentation of formats and semantics (which most scientists never quite get around to writing by hand), data rusts almost as quickly as code.
I mentioned rails new project a moment ago; Rails, Django, and other frameworks of the same ilk create a command application for each project so that people can add new features in a reliable, findable way with a few keystrokes. Science projects should have something similar for adding new data sets, producing new results, etc.
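To make those two wishes a little more concrete, here is a minimal sketch in Python of what a per-project command might look like. Everything in it is an assumption made up for illustration: the command name sci, the data/results/docs layout, the project.json manifest, and the add-dataset subcommand do not come from Rails or from any existing tool. The only point is that a fixed layout plus a small command lets the format and meaning of each data set get recorded at the moment it is added, rather than written up by hand later.

```python
import argparse
import json
from pathlib import Path

# Hypothetical convention: every project has the same top-level layout
# and a single manifest file describing what is in it.
LAYOUT = ("data", "results", "docs")
MANIFEST = "project.json"


def new_project(root):
    """Create the conventional directories and an empty manifest."""
    root = Path(root)
    path = root / MANIFEST
    if path.exists():
        # Refuse to clobber the manifest of an existing project.
        raise FileExistsError(f"{path} already exists")
    for name in LAYOUT:
        (root / name).mkdir(parents=True, exist_ok=True)
    manifest = {"name": root.name, "datasets": []}
    path.write_text(json.dumps(manifest, indent=2))
    return manifest


def add_dataset(root, name, fmt, description):
    """Record a data set, with its format and meaning, in the manifest."""
    path = Path(root) / MANIFEST
    manifest = json.loads(path.read_text())
    manifest["datasets"].append(
        {"name": name, "format": fmt, "description": description}
    )
    path.write_text(json.dumps(manifest, indent=2))
    return manifest


def main(argv=None):
    """Dispatch 'new' and 'add-dataset' and return the updated manifest."""
    parser = argparse.ArgumentParser(prog="sci")
    commands = parser.add_subparsers(dest="command", required=True)

    new = commands.add_parser("new", help="create a project skeleton")
    new.add_argument("root")

    add = commands.add_parser("add-dataset", help="register a data set")
    add.add_argument("root")
    add.add_argument("name")
    add.add_argument("--format", dest="fmt", required=True)
    add.add_argument("--description", required=True)

    args = parser.parse_args(argv)
    if args.command == "new":
        return new_project(args.root)
    return add_dataset(args.root, args.name, args.fmt, args.description)


if __name__ == "__main__":
    main()
```

Saved as a script named sci.py (again, a made-up name), this would be run as "python sci.py new myproject" and then "python sci.py add-dataset myproject survey --format csv --description 'field counts'". A real tool would need much more, such as DOIs and richer metadata, but because the manifest always lives in the same place, other tools and other people would know where to look.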
What else would you add to this list? What would you want to see in a framework for 4×8 projects (i.e., ones where four people work together for eight months to produce a result)? What pieces of the solution are already out there, waiting to be integrated?
Originally posted 2013-06-19 by Greg Wilson in Community.