Daniel Lemire's blog

Taking scientific publishing to the next level

22 thoughts on “Taking scientific publishing to the next level”

  1. Chung-chieh Shan says:

    Amen.

    I don’t think the human need for time and space limits would particularly hamper this approach to writing. Releases can be scheduled, compilations can be collected, pages can be counted, etc. as usual.

  2. I agree that our works should have their own dynamic collaborative documentation, just like software. But, right now, papers are like software releases, not software projects. Since papers have no edition number, they represent the state of the research in a given time. Once you release the software, you cannot override the same version with changes. If you do it, users/readers will get lost with features appearing and disappearing.

    Following the book model, if a paper had edition numbers, you would provide bug fixes, but still not new features. An article in the Wikipedia, however, is always updated. This means that the article IS always the current version of the whole software and may change dramatically from one day to the other. Citing in the project level may be dangerous, because things that were there to support your citation may disappear in the next day.

    Right now, there is no distinction between citing releases and citing projects. Can we publish and cite scientific findings in a Wikipedia style? Probably. But we need to change the way we cite first.

  3. Lorenzo Grespan says:

    Daniel,

    interesting idea. It is becoming a recurrent subject lately, and I believe we already have the technology we need to make it happen.

    On a very basic level one could use a dropbox account (which comes with version control) or a google document/scribd to keep a “latest version of my research” of his/her papers. This might be a quick-and-dirty patch to get teachers to love the idea while a ‘standard’ infrastructure is put in place.

    This infrastructure could be something like: each university has its own bugzilla/trac system open for public/peer review, where each paper has its own ‘branch’ in the ‘research’ line, ‘bugs’ that have to be fixed, ‘patches’, etc. Something between a wiki and github.

    Also, having a ‘diff’ of each paper would be great for another reason: learning how to do it. I believe that seeing how a paper is written and becomes a finished work is a fundamental part of the learning process for a graduate student. At the beginning of our careers probably we assumed people wrote papers the way they appear in journals… In my case that would get me very depressed (I’d never be able to have such clarity of mind! How do you know your idea would work right away?! Etc.).

    There will be a few issues to overcome: (i) copyright and closed-minded journals (those that do not allow you to retain ownership of your paper, nor to put it on your personal web page); (ii) researchers’ ego; (iii) human laziness(*); and (iv) being afraid somebody would ‘steal’ our work if we publish too early.

    (*) What do I mean by human laziness? The fact that having a clear deadline and a strict number of words or pages often forces you to re-think your work and to find better, more effective wording. Maybe make two pictures into a more precise one. Take out some irrelevant material. This, sometimes, is what makes you crank out a good paper. Otherwise we’d be happy with our work once we finish writing the first draft.

    All those ‘problems’, however, will slowly disappear if such a versioning system takes place. I think it’s just a matter of time.

    One last question for you: what would you do to make this happen? How can I help? 🙂

  4. Philippe Beaudoin says:

    Daniel, I couldn’t agree with you more. You explain my own thoughts clearly and concisely! I really hope I live to see scientific research reach that point. It will open so many possibilities. (For one, enabling serious research contributions by people outside of academia.)

  5. Pål M. Lykkja says:

    “Developing Texts Like We Develop Software” by Ed Felten is very interesting.
    http://www.freedom-to-tinker.com/blog/felten/developing-texts-we-develop-software

  6. John Regehr says:

    Talk’s cheap. Where’s the prototype?

  7. Couldn’t agree more. I think there is even more that researchers could gain from adopting the development processes that come with the Open Source model. For example, by following the “Release Early, Release Often” scheme, you can get in touch with other researchers much faster than through traditional means, with the potential for self-organizing collaboration.

    See my blog post http://blog.mikiobraun.de/2010/01/open-source-process.html

  8. Paul says:

    “Citing in the project level may be dangerous, because things that were there to support your citation may disappear in the next day.”

    In a way, that sounds like a feature. Things you cite may turn out to be false. How much better would it be if some conflict resolution software popped up “paper X now concludes this was false”, rather than risk building huge citation trees based off an original mistake. All citations of knowledge “decay”, this makes the process explicit.

    There’s also an analogy to conflict resolution in source control (when somebody commits a change that doesn’t mesh with your own local changes). If you point your citations to specific paragraphs, and perhaps annotate them with a brief explanation of what you’re citing, then anyone committing a fix to that area could be prompted with “are these citations still valid?” If not, helpful participants could look into finding other citations, or reworking the paper given this new knowledge.

  9. Suresh says:

    The idea of a bugzilla for papers is great! But there’s one thing you have to worry about. Whether we like it or not, assigning credit for papers (and reputation) is the oily scum that greases the high-minded research enterprise. In this kind of open model, it’s not clear to me how credit gets assigned for work. I agree that citation models need to change, but if you’re citing the 7th patch of a paper after a series of bug fixes, are you citing any specific person? The authors? The union of authors and bug fixers? And how are these contributions weighted?

    I guess I’m wondering what an appropriate glide path would be to get to this future.

  10. Philippe Beaudoin says:

    Suresh hits the nail on the head. IMHO we are still using an inefficient research process mostly because our reward system is ready-made for that process.

    This is backwards: we should design a reward system that favors the emergence of more efficient research processes.

  11. Pradeep Padala says:

    There are already plenty of places where you can do this. One example is CiteULike. What’s stopping anyone from adding comments and reviews to a paper? CiteULike has an excellent interface to track the comments with proper attribution.

    Well, the biggest problem here is motivation. The authors clearly don’t have any extra motivation to fix minor bugs in the paper (if they did, it would be a journal paper or an extended tech report). Why would others take on this burden? We need some reward system to make this work.

  12. Aman Goel says:

    I think we need a website that allows discussion of any paper ever published. That would serve two purposes: 1) It would allow people to report bugs, and other people/authors to clarify what the correct version of the sentence/paragraph/figure is. 2) It would help develop a healthy discussion around every paper. Both newbies and experts in the field could quickly learn about the merits of a work, how it relates to other works, and what the pitfalls of the paper might be. This has another advantage: right now, authors only worry about getting their papers accepted. Once there is a website that can list the weaknesses of a paper, authors would be more concerned with writing a better paper that others cannot poke holes in.

  13. Srikanth M R says:

    What you are referring to is essentially like this – http://www.phdcomics.com/comics/archive.php?comicid=1178

    I like the idea. But I don’t think it is like the open source paradigm. It is more like a community-moderated research blog.

    P.S: the captcha-like verification is a bit too much in this blog.

  14. jld says:
  15. Francois Rivest says:

    I like this idea. It is certainly worth more discussion. I am also fully in agreement with installing a post-publication review process. That is, having a generalized way to give feedback and evaluate papers after they have been published, not only before.

  16. @Itman

    Right now, bugs are annoying to authors because they cannot do anything about them.

  17. @Itman

    Publishing an erratum is not very satisfactory. However, in some journals, it counts as an article (technically)… so a scientist could extend his publication list with errata.

  18. Mike Stiber says:

    Everything in life has advantages and disadvantages. Assessing quality under the current publication model is done heuristically, using number and order of authors (modified by disciplinary practices), forum impact factor and/or acceptance rate, citations, assessments by peers, etc. (And, yes, “heuristically” here is synonymous with “pulled out of the assessor(s) ass(es)”.)

    A publication model based on open-source repositories could substitute hard statistics for the above metrics: hits, comments, likes/dislikes, cites, patches, plus perhaps some that we haven’t considered yet (it’s a graph with multiple kinds of edges; what would be the meaning of each metric one could extract?).

  19. Itman says:

    I often find bugs in papers, sometimes serious ones. My impression is that, unfortunately, scientists do not always care about fixing bugs in their papers. Overall, I feel that finding bugs is not a popular business. It may be that instead of building a reputation one will get a lot of enemies. Maybe our researchers should care more about their own bugs in the first place. Only then will we see bug reports on their papers.

  20. Itman says:

    Daniel,
    This is a good point. Let’s hope things will change in the future. Already today, a few serious journals are truly on-line. How difficult should it be to fix a bug or, at least, publish a manual erratum?

  21. Steven says:

    I like this idea. I find that often when doing research, you run into papers that have really serious yet not immediately obvious flaws. In a way, these sort of “pollute” the stream of progress, making it harder to publish a real paper that implements the technique correctly/effectively. The new work is just not seen as novel.

  22. Mark says:

    Just had to post a comment saying…

    “Agreed!”