Teaching basic lab skills
for research computing

Checking the Balance

Added 2016-02-22: this strong critique of the Terrell et al. preprint mentioned in the opening paragraph of this post is worth a careful read.

It's been a depressing couple of weeks. On top of yet more reports of universities turning a blind eye to sexual harassment for years, a new paper on gender bias in open source shows that, "...women's contributions tend to be accepted more often than men's. However, when a woman's gender is identifiable, they are rejected more often." This comes on top of earlier studies (like this one) showing that women are substantially under-represented in online forums like Stack Overflow, even when compared to computing as a whole.

This prompted me to take another look at how Software Carpentry is doing. To start, here are the number and percentage of qualified Software Carpentry instructors broken down by gender:

Qualified Instructors by Gender
| | Female | Male | Other | Unknown |
|---|---|---|---|---|
| Number | 136 | 355 | 3 | 26 |
| %age | 26.2% | 68.3% | 0.6% | 5.0% |
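The percentages in this table (and in the ones that follow) are each category's share of the row total. As a minimal sketch of that arithmetic, assuming the counts are simply typed in by hand, the hypothetical helper `percentage_shares` below reproduces the figures for qualified instructors. It is an illustration only, not the script used to produce the numbers in this post.

```python
def percentage_shares(counts):
    """Return each category's share of the total as a percentage (one decimal place)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Counts copied by hand from the "Qualified Instructors by Gender" table.
instructors = {"Female": 136, "Male": 355, "Other": 3, "Unknown": 26}

shares = percentage_shares(instructors)
for label, share in shares.items():
    print(f"{label}: {share}%")
```

The same function can be applied to the contributor, workshop, and email counts to check the other percentage rows.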

Let's compare that to the number of people contributing to our core lesson repositories in January 2016 by gender. (I'd like to show numbers for a whole year, but my script won't fetch stuff from that far back.)

Repository Contributors by Gender
| | Female | Male | Other | Unknown |
|---|---|---|---|---|
| Number | 19 | 79 | - | 14 |
| %age | 17.0% | 70.5% | - | 12.5% |

17% female is better than average for GitHub and Stack Overflow, but still pretty poor. The proportion is even worse when we count number of contributions rather than number of contributors:

Repository Contributions by Gender
| | Female | Male | Other | Unknown |
|---|---|---|---|---|
| Number | 69 | 532 | - | 28 |
| %age | 11.0% | 84.6% | - | 4.5% |

This allows us to calculate contributions per person:

Repository Contributions per Person by Gender
| | Female | Male | Other | Unknown |
|---|---|---|---|---|
| Number | 3.6 | 6.7 | - | 2.0 |
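The per-person figures are the contribution counts divided by the contributor counts from the two preceding tables. A minimal sketch of that division follows, with the counts typed in by hand. It assumes that a "-" in those tables means zero people in that category, and it returns `None` rather than dividing by zero. The helper name `per_person` is hypothetical.

```python
def per_person(contributions, contributors):
    """Divide contributions by contributors for each category.

    Categories with no contributors get None instead of a ratio.
    """
    ratios = {}
    for label, people in contributors.items():
        if people:
            ratios[label] = round(contributions[label] / people, 1)
        else:
            ratios[label] = None
    return ratios


# Counts copied by hand from the two repository tables; "-" is treated as 0 (an assumption).
contributors = {"Female": 19, "Male": 79, "Other": 0, "Unknown": 14}
contributions = {"Female": 69, "Male": 532, "Other": 0, "Unknown": 28}

ratios = per_person(contributions, contributors)
for label, ratio in ratios.items():
    print(f"{label}: {ratio}")
```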

But now let's compare this to the stats for Software Carpentry's core mission: delivering workshops. Our next table shows the number of people who taught workshops in 2015:

Workshop Instructors by Gender
| | Female | Male | Other | Unknown |
|---|---|---|---|---|
| Number | 89 | 193 | 121 | 3 |
| %age | 21.9% | 47.5% | 29.8% | 0.7% |

while this one shows the actual number of workshops taught (e.g., if I taught three times, I count as three points in the male column):

Workshops Taught by Gender
| | Female | Male | Other | Unknown |
|---|---|---|---|---|
| Number | 182 | 406 | 163 | 4 |
| %age | 24.1% | 53.8% | 21.6% | 0.5% |

and this one shows the average number of workshops taught by each instructor:

Workshops Taught per Instructor by Gender
| | Female | Male | Other | Unknown |
|---|---|---|---|---|
| Number | 2.0 | 2.1 | 1.3 | 1.3 |

Finally, here's the breakdown of contributors to our discussion mailing list for the three months Nov 2015 - Jan 2016:

Email Contributors by Gender
| | Female | Male | Other | Unknown |
|---|---|---|---|---|
| Number | 22 | 95 | - | - |
| %age | 18.8% | 81.2% | - | - |

of messages sent:

Email Messages by Gender
| | Female | Male | Other | Unknown |
|---|---|---|---|---|
| Number | 73 | 277 | - | - |
| %age | 20.9% | 79.1% | - | - |

and of messages per person:

Email Messages per Person by Gender
| | Female | Male | Other | Unknown |
|---|---|---|---|---|
| Number | 3.3 | 2.9 | - | - |

I draw some comfort from the fact that our online balance isn't dramatically different from our in-person balance, and that both are much better than GitHub's or Stack Overflow's (though it would be hard for us to do worse). It's still clear, though, that women and other people who do not identify as male are under-represented both online and in person. What's worse is that as we grow, we're regressing to computing's unbalanced mean: over a third of our instructors were women in the summer of 2013, compared with 26.2% of qualified instructors now. While I worry about the number of people who complete instructor training but never teach for us, I worry a lot more about this regression, and if we're going to try to fix something this year, that's what I'd like it to be.