What if it’s never decorative gourd season?

If it rains, now we’ll change
We’ll hold and save all of what came
We won’t let it run away
If it rains — Robert Forster

I’ve been working recently as part of a team of statisticians based in Toronto on a big, complicated applied problem. One of the things about working on this project is that, in a first for me, we know that we need to release all code and data once the project is done. And, I mean, I’ve bolted on open practices to the end of an analysis, or just released a git repo at the end (sometimes the wrong one!). But this has been my first real opportunity to be part of a team that is weaving open practices all the way through an analysis. And it is certainly a challenge.

It’s worth saying that, notoriously “science adjacent” as I am, the project is not really a science project. It is a descriptive, explorative, and predictive study, rather than one that is focussed on discovery or confirmation. So I get to work my way through open and reproducible science practices without, say, trying desperately to make Neyman-Pearson theory work.

A slight opening

Elisabeth Kübler-Ross taught us that there are five stages in the transition to more open and reproducible scientific practices: 

  • Denial (I don’t need to do that!)
  • Anger (How dare they not do it!)
  • Bargaining (A double whammy of “Please let this be good enough” and “Please let other people do this as well”)
  • Depression (Open and reproducible practices are so hard and no one wants to do them properly)
  • Acceptance (Open and reproducible science is not a single destination, but a journey and an exercise in reflective practice)

And, really, we’re often on many parts of the journey simultaneously. (Although, like, we could probably stop spending so long on Anger, because it’s not that much fun for anyone.)  

And a part of this journey is to carefully and critically consider the shibboleths and touchstones of open and reproducible practice. Not because everyone else is wrong, but more because these things are complex and subtle and we each need to work out how to weave them into our idiosyncratic research practices.

So I’ve found myself asking the following question.

Should we release code with our papers?

Now to friends and family who are also working their way through the Kübler-Ross stages of Open Science, I’m very sorry but you’re probably not going to love where I land on this. Because I think most code that is released is next to useless. And that it would be better to release nothing than release something that is useless. Less digital pollution.

It’s decorative gourd season!

A fairly well known (and operatically sweary) piece in McSweeney’s Internet Tendency celebrates the Autumn every year by declaring It’s decorative gourd season, m**********rs! And that’s the piece. A catalogue of profane excitement at the chance to display decorative gourds. Why? Because displaying them is enough!

But is that really true for code? While I do have some sympathy for the sort of “it’s been a looonnng day and if you just take one bite of the broccoli we can go watch Frozen again”-school of getting reluctant people into open science, it’s a desperation move and at best a stop-gap measure. It’s the type of thing that just invites malicious compliance or, perhaps worse, indifferent compliance.

Moreover, making a policy (even informally) that “any code release is better than no code release” is in opposition to our usual insistence that manuscripts reach a certain level of (academic) clarity and that our analyses are reported clearly and conscientiously. It’s not enough that a manuscript or a results section or a graph simply exist. We have much higher standards than that.

So what should the standard for code be?

The gourd’s got potential! Even if it’s only decorative, it can still be useful.

One potential use of purely decorative code is that it can be read to help us understand what the paper is actually doing.

This is potentially true, but it definitely isn’t automatically true. Most code is too hard to read to be useful for this purpose. Just like most gourds aren’t the type of decorative gourd you’d write a rapturous essay about.

So unless code meets a certain standard, it’s going to need to do something more than just sit there and look pretty, which means we will need our code to be at least slightly functional. 

A minimally functional gourd?

This is actually really hard to work out. Why? Well there are just so many things we can look at. So let’s look at some possibilities. 

Good code “runs”. Why the scare quotes? Well because there are always some caveats here. Code can be good even if it takes some setup or a particular operating system to run. Or you might need a Matlab license. To some extent, the idea of whether “the code runs” is an ill-defined target that may vary from person to person. But in most fields there are common computing set ups and if your code runs on one of those systems it’s probably fine.

Good code takes meaningful input and produces meaningful output: It should be possible to, for example, run good code on similar-but-different data.  This means it shouldn’t require too much wrangling to get data into the code. There are some obvious questions here about what is “similar” data. 

Good code should be somewhat generalizable. A simple example of this: good code for a regression-type problem should not assume you have exactly 7 covariates, making it impossible to use when the data has 8 covariates. This is vital for dealing with, for instance, the reviewer who asks for an extra covariate to be added, or for a graph to change.
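To make the covariate example concrete, here is a minimal sketch in Python (my choice of language, not one used on the project). Every name in it (design_matrix, solve, fit_ols, the column names) is made up for illustration. It solves the normal equations directly to stay short and dependency-free, which is an assumption of the sketch rather than a recommendation; for real work you would use a proper regression library. The only point is that the number of covariates is an input, not something baked into the code.

```python
def design_matrix(records, covariates):
    """One row per record: an intercept followed by the named covariates."""
    return [[1.0] + [float(rec[name]) for name in covariates] for rec in records]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ols(records, response, covariates):
    """Least-squares coefficients, keyed by name, for any list of covariates."""
    x = design_matrix(records, covariates)
    y = [float(rec[response]) for rec in records]
    p = len(covariates) + 1  # the intercept plus however many covariates were asked for
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(p)]
    return dict(zip(["(intercept)"] + list(covariates), solve(xtx, xty)))


# Hypothetical toy data generated from y = 1 + 2a + 3b.
records = [
    {"a": 0, "b": 0, "y": 1},
    {"a": 1, "b": 0, "y": 3},
    {"a": 0, "b": 1, "y": 4},
    {"a": 1, "b": 1, "y": 6},
    {"a": 2, "b": 1, "y": 8},
]
fit_small = fit_ols(records, "y", ["a"])
fit_full = fit_ols(records, "y", ["a", "b"])
```

When the reviewer asks for the extra covariate, the change is one more name in the list passed to fit_ols, and nothing else has to move.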

How limited can code be while still being good? Well that depends on the justification. Good code should have justifiable limitations.

Code with these 4 properties is no longer decorative! It might not be good, but it at least does something.  Can we come up with some similar targets for the written code to make it more useful? It turns out that this is much harder because judging the quality of code is much more subjective.

Good gourd! What is that smell?

The chances that a stranger can pick up your code and, without running it, understand what the method is doing are greatly increased with good coding practice. Basically, if it’s code you can come back to a year later and modify as if you’d never put it down, then your code is possibly readable. 

This is not an easy skill to master. And there’s no agreed upon way to write this type of code. Like clearly written prose, there are any number of ways that code can be understandable. But like writing clear prose, there are a pile of methods, techniques, and procedures to help you write better code.

Simple things like consistent spacing, and doing whatever RStudio’s auto-format does (like adding spaces each side of “+”), can make your code much easier to read. But it’s basically impossible to list a set of rules that would guarantee good code. Kinda like it’s impossible to list a set of rules that would make good prose. 

So instead, let’s work out what is bad about code. Again, this is a subjective thing, but we are looking for code that smells.

If you want to really learn what this means (with a focus on R), you should listen to Jenny Bryan’s excellent keynote presentation on code smell (slides etc here). But let’s summarize.

How can you tell if code smells? Well if you open a file and are immediately moved to not just light a votive candle but realize in your soul that without intercessory prayer you will never be able to modify even a corner of the code, then the code smells.  If you can look at it and at a glance see basically what the code is supposed to do, then your code smells nice and clean.

If this sounds subjective, it’s because it is. Jenny’s talk gives some really good advice about how to make less whiffy code, but her most important piece of advice is not about a specific piece of bad code. It’s the following:

Your taste develops faster than your ability. 

To say it differently, as you code more you learn what works and what doesn’t. But a true frustration is that (just like with writing) you tend to know what you want to do before you necessarily have the skills to pull it off. 

The good thing is that code for academic work is iterative. We do all of our stuff, send it off for review, and then have to change things. So we have a strong incentive to make our code better and we have multiple opportunities to make it so.

Because what do you do when you have to add a multilevel component to a model? Can you do that by just changing your code in a single place? Or do you have to change the code in a pile of different places? Because good smelling code is often code that is modular and modifiable.
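One way to picture “changing your code in a single place” is to keep the model specification in one object that every later step reads from. The sketch below is in Python and is only an illustration: ModelSpec, formula, and required_columns are hypothetical names, the column names are invented, and the “(1 | group)” notation is borrowed from lme4-style formulas purely as a readable string. Nothing here fits a model.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ModelSpec:
    """The single place where the structure of the model is written down."""
    response: str
    covariates: Tuple[str, ...]
    group: Optional[str] = None  # set this to add a varying intercept


def formula(spec):
    """Render the specification as a formula string (lme4-style notation)."""
    terms = list(spec.covariates) or ["1"]
    if spec.group is not None:
        terms.append(f"(1 | {spec.group})")
    return f"{spec.response} ~ " + " + ".join(terms)


def required_columns(spec):
    """Columns the data must contain; cleaning and plotting code can reuse this."""
    columns = [spec.response, *spec.covariates]
    if spec.group is not None:
        columns.append(spec.group)
    return columns


# Hypothetical column names, for illustration only.
flat = ModelSpec(response="score", covariates=("age", "income"))
multilevel = ModelSpec(response="score", covariates=("age", "income"), group="school")

flat_formula = formula(flat)
multilevel_formula = formula(multilevel)
multilevel_columns = required_columns(multilevel)
```

Here the multilevel component is added by setting one field. The formula and the list of required columns both follow from it, so there is no pile of different places to hunt through.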

But because we build our code over the full lifecycle of a project (rather than just once after which it is never touched again), we can learn the types of structures we need to build into our code and we can share these insights with our friends, colleagues, and students.

A gourd supportive lab environment is vital to success

The frustration we feel when we want to be able to code better than our skills allow is awful. I think everyone has experienced a version of it. And this is where peers and colleagues and supervisors have their chance to shine. Because just as people need to learn how to write scientific reports and people need to learn how to build posters and people need to learn how to deliver talks, people need to learn how to write good code.

Really, the only teacher is experience. But you can help experience along. Work through good code with your group. Ask for draft code. Review it. Just like the way you’ll say “the intro needs more “Piff! Pop! Woo!” because right now I’m getting “*Sad trombone*” and you’ve done amazing work so this should reflect that”, you need to say the same thing about the code. Fix one smell at a time. Be kind. Be present. Be curious. And because you most likely were also not trained in programming, be open and humble.

Get members of your lab to swap code and explain it back to the author. This takes time. But this time is won back when reviews come or when follow up work happens and modifications need to be made. Clean, nice code is easy to modify, easy to change, and easy to use.

But trainees who are new at programming are nervous about programming.

They’re usually nervous about giving talks too. Or writing. Same type of strategy.

But none of us are professional programmers

Sadly, in the year of our lord two thousand and nineteen if you work in a vaguely quantitative field in science, social science, or the vast mire that surrounds them, you are probably being paid to program. That makes you a professional programmer.  You might just be less good at that aspect of your job than others.

I am a deeply mediocre part time professional programmer. I’ve been doing it long enough to learn how code smells, to have decent practices, and to have a bank of people I can learn from. But I’m not good at it. And it does not bring me joy. But neither does spending a day doing forensic accounting on the university’s bizarre finance system. But it’s a thing that needs to be done as part of my job and for the most part I’m a professional who tries to do my best even if I’m not naturally gifted at the task.

Lust for gourds that are more than just decorative

In Norwegian, the construct “to want” renders “I want a gourd” as “Jeg har lyst på en kalebas” and it’s really hard, as an English speaker, not to translate that to “I have lust for a gourd”. And like that’s the panicking Norwegian 101 answer (where we can’t talk about the past because it’s linguistically complex or the future because it’s hard, so our only verbs can be instantaneous. One of the first things I was taught was “Finn er sjalu.” (Finn is jealous.) I assume because jealousy has no past or future).

But it also really covers the aspect of desiring a better future. Learning to program is learning how to fail to program perfectly. Just like learning to write is learning to be clunky and inelegant. To some extent you just have to be ok with that. But you shouldn’t be ok with the place you are being the end of your journey.

So did I answer my question? Should we release code with our papers?

I think I have an answer that I’m happy with. No in general. Yes under certain circumstances.

We should absolutely release code that someone has tried to make good code. Even though they will have failed. We should carry each other forward even in our imperfection. Because the reality is that science doesn’t get more open by making arbitrary barriers. Arbitrary barriers just encourage malicious compliance. 

When I lived in Norway as a newly minted gay (so shiny) I remember once taking a side trip to Gay’s The Word, the LGBTQIA+ bookshop in London and buying (among many many others) a book called Queering Anarchism. And I can’t refer to it because it definitely got lost somewhere in the nine times I’ve moved house since then.

The thing I remember most about this book (other than being introduced to the basics of intersectional trans-feminism) was its idea of anarchism as a creative force. Because after tearing down existing structures, anarchists need to have a vision of a new reality that isn’t simply an inversion of the existing hierarchy (you know. Reducing the significance threshold. Using Bayes Factors instead of p-values. Pre-registering without substantive theory.) A true anarchist, the book suggested, needs to queer rather than invert the existing structures and build a more equitable version of the world.

So let’s build open and reproducible science as a queer reimagining of science and not a small perturbation of the world that is. Such a system will never be perfect. Just lusting to be better.
