Thursday, February 28, 2013

The nature of publications

A paper in the journal Genome Biology and Evolution has been doing the rounds on the internet recently and was shown to me by a friend. It is titled "On the immortality of television sets: “function” in the human genome according to the evolution-free gospel of ENCODE", by Graur et al. The title is blunt enough, but the abstract is extraordinarily so. Let me quote the entire thing here:
A recent slew of ENCODE Consortium publications, specifically the article signed by all Consortium members, put forward the idea that more than 80% of the human genome is functional. This claim flies in the face of current estimates according to which the fraction of the genome that is evolutionarily conserved through purifying selection is under 10%. Thus, according to the ENCODE Consortium, a biological function can be maintained indefinitely without selection, which implies that at least 80 – 10 = 70% of the genome is perfectly invulnerable to deleterious mutations, either because no mutation can ever occur in these “functional” regions, or because no mutation in these regions can ever be deleterious. This absurd conclusion was reached through various means, chiefly (1) by employing the seldom used “causal role” definition of biological function and then applying it inconsistently to different biochemical properties, (2) by committing a logical fallacy known as “affirming the consequent,” (3) by failing to appreciate the crucial difference between “junk DNA” and “garbage DNA,” (4) by using analytical methods that yield biased errors and inflate estimates of functionality, (5) by favoring statistical sensitivity over specificity, and (6) by emphasizing statistical significance rather than the magnitude of the effect. Here, we detail the many logical and methodological transgressions involved in assigning functionality to almost every nucleotide in the human genome. The ENCODE results were predicted by one of its authors to necessitate the rewriting of textbooks. We agree, many textbooks dealing with marketing, mass-media hype, and public relations may well have to be rewritten.
Ouch.

The paper that Graur et al. implicitly deride as "marketing, mass-media hype and public relations" is one of a series of publications in Nature (link here for those interested) by the ENCODE consortium. I'm not going to claim any expertise in genetics, though the arguments put forward by Graur appear sensible and convincing.1 But I do think it is interesting that the ENCODE papers were published in Nature.

Nature is of course a very prestigious journal to publish in. In some fields, the presence or lack of a Nature article on a young researcher's CV can make or break their career chances. It is very selective in accepting articles: not only must contributions meet all the usual requirements of peer-review, they should also be judged to be in "the five most significant papers" published in that discipline that year. It has a very high Impact Factor rating, probably one of the highest of all science journals. In fact it is apparently one of the very few journals that does better on citation counts than the arXiv, which accepts everything.

But among some cosmologists, Nature has a reputation for often publishing claims that are exaggerated, that describe dramatic results which turn out to be less dramatic in subsequent experiments, or that are just plain wrong. One professor even once told me – and he was only half-joking – that he wouldn't believe a particular result because it had been published in Nature.

It is easy to see how such things can happen. The immense benefit of a high-profile Nature publication to a scientist's career creates pressure to find results that are dramatic enough to pass the "significance test" imposed by the journal, or to exaggerate the interpretation of results that are not quite dramatic enough. On the other hand, if a particular result does start to look interesting enough for Nature, the authors may be – perhaps unwittingly – less likely to subject it to the same level of close scrutiny they would otherwise give it. The journal is then more reliant on its referees to provide the scrutiny needed to separate the hype from the substance, but even with the most efficient refereeing system in the world, given enough submitted papers full of earth-shattering results, some amount of rubbish will always slip through.

I was thinking along these lines after seeing Graur et al.'s paper, and I was reminded of a post by Sabine Hossenfelder at the Backreaction blog, which linked to this recent pre-print on the arXiv titled "Deep Impact: Unintended Consequences of Journal Rank". As Sabine discusses, the authors point to quite a few undesirable aspects of the ranking of journals according to "impact factor", and the consequent rush to try to publish in the top-ranked journals. Publication bias (and, in some cases, the retractions that follow) appears to be influenced to a degree by the impact factor of the journal in which the study is published. Another thing that might be interesting (though probably hard to check) is the link between the likelihood of scientists holding a press conference or issuing a press release to announce a result, and the likelihood of that result being wrong. I'd guess the correlation is quite high!

Of course, the only real reason that the impact factor of the journal in which your paper is published matters is that it can be used as a proxy indication of the quality of your work for the benefit of people who can't be bothered, or are unable, to read the original work and judge it on merit.

The other yardstick by which researchers are often judged is the number of citations their papers receive, which at least has the (relative) merit of being based on those papers alone, rather than other people's papers. Combining impact factor and citation count is even sillier – unless they are counted in opposition, so that a paper that is highly cited despite being in a low-impact journal gets more credit, and a moderately cited one in a high-impact journal gets less!

Anyway, bear these things in mind if you ever find yourself making a reflexive judgement about the quality of a paper you haven't read based on where it was published.

1The paper includes a quote which pretty well sums up the problem for ENCODE:
"The onion test is a simple reality check for anyone who thinks they can assign a function to every nucleotide in the human genome. Whatever your proposed functions are, ask yourself this question: Why does an onion need a genome that is about five times larger than ours?"
2Cosmologists (the theoretical ones, at any rate) actually hardly ever publish in Nature. Even observational cosmology is rarely included. So you might regard this as a bit of a case of sour grapes. I don't think that is the case, simply because it isn't really relevant to us. Not having a Nature publication is not a career-defining gap for a cosmologist: it's just normal.
