Daniel Lemire's blog

A decade of using text-mining for citation function classification

2 thoughts on “A decade of using text-mining for citation function classification”

  1. Richard Oberdieck says:

    Hi Daniel,

    thank you for this very informative post! Indeed, I could assume that the number of times I cite a paper indicates influence, and the abstract statement “rings true” as well. One other factor I would consider [at least from my experience] is whether the citation occurs in a block or as a standalone. For example, I often see statements like “Global warming is a serious issue [1-27]”. Such a block is obviously cited for “unscientific reasons” [as you put it], and thus I would assume that the references in there are marginal at best.

    However, what can be done to limit this citing war? Reviewers almost never check the citations (unless some are missing), so maybe there should be a charge per reference, or an explanation under each reference of how it influences the paper. Clearly this puts review papers, and fields prone to high reference counts, at a disadvantage, but at least it is a start.

    What do you think?

    1. “One other factor I would consider [at least from my experience] is whether the citation occurs in a block or as a standalone.”

      I think we considered this attribute and it was not significant.

      “However, what can be done to limit this citing war?”

      If you can reliably and very cheaply classify the references, then nothing needs to be done. The main problem right now is that tools just count citations without any attempt to identify the purpose of the citation.