I always enjoy your thoughts on that subject. Regarding (1), do you think it’s a good thing? Can we really reach a point where researchers make their results reproducible if there are no incentives for others to check those results? As you correctly point out, reproducing is more involved than merely repeating. Who will invest the time needed to do this?
Also, I’d love to hear your opinion on two different ways to conduct open research. One is to publish the final result as open source code. The other, which is even more appealing to me, is to perform your research in a glass box. For example, by using an online public repository to commit your code on a daily basis, or by continuously publishing your intermediate research notebooks, etc.
Actually, Jon Claerbout’s work on reproducibility predates Donoho’s: http://sepwww.stanford.edu/data/media/public/sep//jon/reproducible.html
SEP is mentioned with dates going back at least to 1990.
@Philippe
I think we can learn a lot by spending more time thoroughly reviewing and studying previous results. Uncovering hidden assumptions is often the source of great research.
So maybe that’s my answer. People should, and do, work to reproduce results because it is fun and fruitful.
As for working in a glass box, I have not dared to do this yet. I hope to spend more time thinking about it.
@Carlos
Thanks for the link. (I did not think Donoho was the first, but the idea of reproducible computational research is often informally attributed to him.)
Anonymous says:
I have reviewed papers with open data. It was great: I ran some stats and confirmed for myself that their stats were at least correct. I even found a mistake in one of their p-values, where they made an erroneous decision on the border (it didn’t affect the rest of the paper).
It probably helped the paper, as I could demonstrate that at least their statistical methods were reproducible.
I write to ask for pointers to “build systems” to support reproducible computation.
I believe that my first published paper is widely cited because I made the source code available on the internet. I distribute the source for my only book with the goal that one can simply type “make” to produce it in about a day (it usually requires some hacking).
Now, I’m working with colleagues who are too busy “doing real science” to learn make. Make is designed to support compilation, and forcing it to organize scientific computation can be ugly. I’ve recently looked at SCons and WAF, but they don’t seem ideal either. Is there a better system available?
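For readers unfamiliar with the approach Andrew describes, here is a minimal sketch of a Makefile that drives a computation pipeline rather than a compilation. All file and script names (analyze.py, plot.py, data/raw.dat, and so on) are hypothetical placeholders, not taken from any actual project:

```make
# Hypothetical pipeline: raw data -> results -> figure -> paper.
# Each rule reruns only when its prerequisites are newer than its target.

paper.pdf: paper.tex fig1.pdf
	pdflatex paper.tex

fig1.pdf: results.csv plot.py
	python plot.py results.csv fig1.pdf

results.csv: data/raw.dat analyze.py
	python analyze.py data/raw.dat results.csv

.PHONY: clean
clean:
	rm -f results.csv fig1.pdf paper.pdf
```

Typing “make” rebuilds only the stale stages, which is exactly the dependency tracking that makes the approach attractive for reproducibility. The ugliness Andrew mentions tends to appear when steps produce many outputs or take parameters that do not map naturally onto files.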
@Andrew
I believe that my first published paper is widely cited because I made source code available on the internet.
Yes, source code can be used to promote and disseminate ideas.
I’m working with colleagues who are too busy “doing real science” to learn make.
I think that producing software is “real science”. There are journals, like “Source Code for Biology and Medicine”, that publish software.
Is there a better system available?
My experience so far is that we live in a fast-evolving world. It seems very difficult to find an ideal solution that will last forever, so we inevitably end up using several tools over the years.