There has been some discussion about whether to release computer code used in the data analysis process for published scientific papers. Nick Barnes has an op-ed in Nature that goes over just that. There are good arguments in both directions, but they can be distilled down to two principles:
- Releasing code is critical for replication of scientific findings and will promote faster growth and development of humanity's knowledge
- Releasing code is hard
It's hard for a number of reasons. First, many are reluctant to release their code because it is something of a trade secret - many scientists have built careers around their way of doing things, and releasing that is anathema to them. This is kinda antithetical to the whole philosophy of science, but it's life. Second, code is really messy. I've written only a little code myself and I've found that it quickly gets junky. If you look at your car (or bike), it represents a highly engineered and efficient machine. There aren't unnecessary parts, gears, or fluids running through it. Only what is needed to move is present on the vehicle. That is not the case for code. There are unnecessary wheels, gears, doors, etc. It is clunky, and it is hard to follow the silver thread of real information relevant to the paper. And the same piece of unwieldy code can be used on different projects - what if people get scooped?
For my part, I think it is essential that we release code. And not just release it, but make it easier to use. In some papers that I have submitted recently (fingers crossed that they get published!) I have used ###LABELs to document each step of the code used in the analysis, to make it easier for people to replicate (or falsify) my findings. Like many folks, I have a batch of unwieldy code where variables are wrapped around others - but I take the time to copy and paste the relevant parts into place for new publications.
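To give a rough idea of what labeling each step can look like, here is a minimal sketch in Python. Everything in it is made up for illustration: the inline data, the function names, and the step labels are assumptions and are not taken from any real analysis or paper. The point is only that each ###LABEL marks one step a reader would want to find and rerun.

```python
import statistics

### STEP 1: LOAD DATA
# Hypothetical inline data standing in for a real data file.
def load_data():
    return [
        {"group": "control", "score": 4.0},
        {"group": "control", "score": 6.0},
        {"group": "control", "score": None},  # missing observation
        {"group": "treatment", "score": 7.0},
        {"group": "treatment", "score": 9.0},
    ]

### STEP 2: CLEAN DATA
# Drop rows with a missing score so later steps only see complete cases.
def clean_data(rows):
    return [row for row in rows if row["score"] is not None]

### STEP 3: TABLE 1 - MEAN SCORE BY GROUP
# Collect scores per group, then average them.
def group_means(rows):
    scores_by_group = {}
    for row in rows:
        scores_by_group.setdefault(row["group"], []).append(row["score"])
    return {group: statistics.mean(scores)
            for group, scores in scores_by_group.items()}

if __name__ == "__main__":
    rows = clean_data(load_data())
    for group, mean in group_means(rows).items():
        print(group, mean)
```

Someone trying to replicate (or falsify) a particular table can search for its label and see exactly which lines produced it, without wading through the rest of the script.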
It took me a while to learn how to replicate what other folks were doing - and then only occasionally. If those folks had provided their code, I could have learned the ropes much more quickly. Scientists provide data - which is good! - but data is only part of the equation.
Science today gives you data (the ingredients) and presents the findings in a research paper (the cake). But any good cook will tell you that the recipe is an important part - and that is where code comes in. Only when all three elements are together do we move forward in knowledge. You can't have your cake and eat it too.
Anyway, you can find code from my publications on their individual pages and in the "Stuff" section. Feel free to email me if you have any questions - though I can't always guarantee a response.