Bibliometrics: the numbers game

In mid-December, British universities, their constituent units and departments, and most academics experienced the same kind of traumatic day familiar to 18-year-olds awaiting the examination results on which their advancement to higher education, or not, depends. December 18th, 2014, was REF-Day. In the eight years since its predecessor (RAE-Day), a vast – by university standards – effort had gone into preparing bids on a department-by-department basis, to rank departments nationally and to conflate individual assessments into a sort of institutional league table for research excellence; hence REF stands for Research Excellence Framework (the RAE was the less meritorious-sounding Research Assessment Exercise). It resembled the Guide Michelin or Automobile Association star system for restaurants, hotels and guest houses. The reason for the eight-year frenzy of activity was that the outcomes were intended to inform the selective allocation of governmental research funding. Unsurprisingly, this kind of competition stemmed from the Tory government of Margaret Thatcher, which in 1986 set the scene for ‘performance-related’ funding in place of the earlier system based on peer review of each individual bid for major grants.

To itemise each aspect of the way the REF worked could take the majority of Earth Pages readers to an early and ignoble grave. It centred on departmental selection, from among full-time researchers, of those deemed to be ‘research active’ and those who were not, the former having to select four recently published works or ‘outputs’. They had to self-assess each according to its ‘impact’, defined as ‘an effect on, change or benefit to the economy, society, culture, public policy or services, health, the environment or quality of life, beyond academia’. Institutions vetted and bundled individual submissions, collated them in the subject areas designated by the REF, then sent them off to ‘REF Central’, where they were reviewed by subject-specialist panels that gave out the stars for each submitted item of work: **** = world-leading (30% were deemed to be); *** = internationally excellent (46%); ** = internationally recognised (20%); * = nationally recognised (3%); unclassified = below the standard of national recognition (1% – presumably those obviously lacking star quality were weeded out at institution level). There were more than 190,000 ‘outputs’, which raises two questions: were all of them read by at least one specialist panel member, and against what standards were they judged?

On average, each of the roughly 1000 panellists would have had to consider about 190 outputs in greater depth than a casual skim, or more if some were read by several panellists. Outputs were rated ‘in terms of their “originality, significance and rigour”, with reference to international research quality standards’, ‘the “reach and significance” of impacts on the economy, society and/or culture’ and the part they played in their department’s contribution to ‘the vitality and sustainability… of the wider discipline or research base’. On paper – and believe me, REF Central produced plenty of wordy PDFs of guidance – this level of scrutiny makes the adjective ‘daunting’ seem a bit of an understatement. Entering into the spirit of things with the gleeful manner of a Michelin or AA assessor seems to me rather hard to imagine. I wonder whether in reality the panels just checked each submission for signs of an overly hubristic sense of self-worth.
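
For what it is worth, the reviewing-load arithmetic is easy to sketch. Here is a minimal back-of-the-envelope calculation in Python, using the approximate figures quoted above; the number of readers per output is my assumption, not an official REF statistic.

```python
# Back-of-the-envelope REF reviewing load, using the approximate figures quoted
# above (~190,000 outputs, ~1,000 panel members). The readers-per-output figure
# is an assumption for illustration only.
total_outputs = 190_000
panel_members = 1_000
readers_per_output = 1   # the load doubles to ~380 each if two panellists read every output

outputs_per_panellist = total_outputs * readers_per_output / panel_members
print(f"Outputs per panellist: {outputs_per_panellist:.0f}")   # ~190
```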

To some extent, the issue of each output’s citation count or other bibliometric measure must at some stage have come into the REF reckoning, and that is what spurred me to defy normal cautions about boredom as a contributor to general organ failure. Physicist Reinhard Werner of Leibniz University in Hanover, Germany, believes that decisions on funding and hiring, or firing, need to steer well clear of impact factors, citations and other kinds of bibliometrics (Werner, R. 2015. The focus on bibliometrics makes papers less useful. Nature, v. 517, p. 245). Scientists cite other works for many reasons, some worthy and some less so, but in doing so we rarely express any opinion on the overall significance of the work we choose to cite. Yet, conversely, a researcher can choose a field, phrase their findings and pick a journal in ways calculated to boost their citation frequency and impact. Writing about some mundane topic in a publicly accessible way, reviewing the work of lots of other people, or simply covering a topic as observed or measured in an especially populous country where science is booming does much the same thing. Werner makes a telling point: ‘When we believe that we will be judged by silly criteria, we will adapt and behave in silly ways’. Although he does not touch on the absurdities of the REF – why on Earth would he? – Werner comments on the distortion of the job market and of peer-reviewed journals. He also pleads for a return to proper scrutiny of scientific merit and, I suspect, for cutting hubris off at the roots.

Publishing: is it worth the effort?

A measure of the esteem in which a peer-reviewed paper is held is supposedly the number of times it is referred to in other papers. Of course, the older a paper is, the more chance that such citations will have built up; but the annual rate of citation is likely to fizzle out over time. Papers that create a frisson of initial excitement and command enduring citation are few and far between: they probably launched a new line of inquiry.

It is instructive to try to nail down Alfred Wegener’s influence on tectonics – which ought to have been pretty high – using the Web of Science. Superficially, he had none and is remembered through that arm of Thomson Reuters for six papers: four on atmospheric physics – his speciality; one on lunar craters; and a sixth on the patterns of cracking seen in rotten wood. These give him a mere 20 citations. Wegener’s posthumous problem was that Die Entstehung der Kontinente first appeared in the fourth issue of Geologische Rundschau in 1912, and seemingly the Web of Science doesn’t have that journal in its archives of a century ago. Later, extended editions appeared in book form, which was not peer reviewed (most geoscientists would not touch his ideas with a barge pole until long after his death in 1930) and therefore lies outside the academic pale. The key to a plausible mechanism for continental drift – symmetrical magnetic striping above ocean basins – was first described by Fred Vine and Drummond Matthews in an issue of Nature in 1963. In 50 years their work, which ranks with the discovery of the structure of DNA, has accumulated 709 citations; i.e. 38.5 citations per year on average, which is not much for fuelling a revolution.

Alfred Wegener, the unsung hero of continental drift (credit: Wikipedia)

Of course, citation is not the same as the frequency at which a paper is read. It is no secret that a not inconsiderable number of papers that appear in published reference lists haven’t been read by the authors who cite them. They are there by proxy, and you will probably find them in the bibliography of later papers that those same authors have cited. There is perhaps a certain kudos in such proxy citations, for it may be that the cited paper has achieved the equivalent of canonical status in the field.

Citation frequency is something of a lottery: language of publication; discipline (since 1953 Crick and Watson have achieved three times Vine and Matthews’s average citation rate); and date of publication (E. Komatsu of the University of Texas at Austin has already had 1939 citations for his February 2011 paper ‘Seven-Year Wilkinson Microwave Anisotropy Probe Observations: Cosmological Interpretation’, published in a supplement to the Astrophysical Journal – nine times the rate of Crick and Watson, but then the paper is about the origin of everything).
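
A minimal sketch of the per-year arithmetic behind these comparisons, taking the figures quoted above at face value; the roughly two-year window assumed for the Komatsu paper is my guess, so the final ratio comes out only approximately at the ‘nine times’ quoted.

```python
# Citations-per-year comparison using the figures quoted in the text; these are
# illustrative, not live Web of Science numbers.
vine_matthews_rate = 38.5                    # citations per year quoted for Vine & Matthews (1963)
crick_watson_rate = 3 * vine_matthews_rate   # quoted as roughly three times the V&M rate
komatsu_citations = 1939                     # WMAP7 paper, published February 2011
komatsu_years = 2                            # assumed elapsed time since publication

komatsu_rate = komatsu_citations / komatsu_years
print(f"Crick & Watson: ~{crick_watson_rate:.0f} citations per year")
print(f"Komatsu et al.: ~{komatsu_rate:.0f} citations per year, "
      f"roughly {komatsu_rate / crick_watson_rate:.1f} times the Crick & Watson rate")
# The last ratio lands a little under the 'nine times' quoted above because the
# elapsed time assumed here is only a round figure.
```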

Interestingly, the December 2012 issue of Geology presents statistics on the most cited papers it has published since 2000 (Cowie, P.A. 2012. Highly cited Geology papers (2000-2010) – what were they and who wrote them? Geology, v. 40, p. 1147-1148). Geology is among the highest-ranking journals in the geoscience field, with an impact factor of 4.8 over the last five years. A journal’s impact factor is the number of times all articles published in a two-year period were cited in all indexed journals in the year following them, divided by the total number of articles the assessed journal published in those two years. So, papers published in Geology between 2007 and 2011 were cited on average 4.8 times in the year following publication. The journal is a useful source of citation statistics as it covers the full range of geoscience and all papers are limited to four printed pages, forcing authors to be concise and clear in their writing and illustration. Consequently it is popular, which, incidentally, may explain its high impact factor.
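
The impact-factor arithmetic itself is simple enough. Here is a minimal illustration in Python, with invented counts chosen purely to reproduce the 4.8 figure quoted above; they are not Geology’s real numbers.

```python
# Illustrative two-year impact factor, following the definition in the text:
# citations received in one year to articles published in the preceding two
# years, divided by the number of articles published in those two years.
# The counts below are invented solely to reproduce the quoted value of 4.8.
citations_to_previous_two_years = 1_440   # hypothetical citation count
articles_in_previous_two_years = 300      # hypothetical article count

impact_factor = citations_to_previous_two_years / articles_in_previous_two_years
print(f"Impact factor: {impact_factor:.1f}")   # 4.8
```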

Of the 33 papers cited most between 2000 and 2010, 14 are on topics relating to Tibet and China. There are 3 on oceanography; 3 on palaeontology and extinctions; 6 on palaeoclimatology; 10 on tectonics; and 10 on magmatism (3 of which were about rare adakites formed by partial melting of subducted oceanic crust). I haven’t read all of the papers, and the statistics on topics may tell us very little, but I would bet that papers about the geology of high-population emerging countries – China, India and Brazil – are met gleefully by rapidly growing communities of eager young geoscientists. It may even be worth a flutter on adakites as the ‘next big thing’ in petrogenesis. Mind you, it looks like I am not likely to be the best punter for hot papers: of the 33 ‘top-three’ papers since 2000, only 6 made it into Earth Pages, and of those only one between 2004 and 2010.

The digest goes on to show that, year by year, as many as 10% of papers in Geology are not cited at all, up to 70% are cited between 1 and 5 times per year, while less than 10% get 10 or more citations in a year. Oddly, the author suggests that a dip in citations of Geology papers in recent years may reflect the launch of Nature Geoscience in 2008. Yet glossy as that new addition to the Nature stable might be, it has become something of a desert for papers on geology. Then there is evidence for both ‘vintage’ and ‘just-about-drinkable’ years in Geology citations: the ‘top ten’ papers in 2001, 2005, 2006, 2008 and 2010 ranged from 10-15 citations for the tenth to 20-25 for the ‘hottest’ paper, while in 2000, 2002, 2003, 2004, 2007 and 2009 the most cited papers stood well above the rest at 32 to 55 citations per year. But that may just reflect the uneven pace at which well-received and provocative work emerges.

So it begins to seem, from Geology at least, that for most geoscience authors publishing isn’t going to raise much hope as far as jobs or promotions are concerned. Yet if results are not published, funding agencies may become fractious about your next grant application, and of course university science departments puff themselves up with annual publication rates (though rarely citation records, which as far as the geosciences go could be a wise move). But it is a matter of academic duty to publish for the record; even if a paper fills just one tiny niche, the cumulative effect of publicly available knowledge does eventually result in breakthroughs – one never knows… It could be a salutary lesson should publishers release data on hits for on-line PDFs of papers, as that would give some indication of how many readers individual papers have; but as for a ‘like this’ button or a means of star rating, I think we have to venture into the deeper recesses of academic conservatism one small step at a time.