There should never be heroes in science

Some scientists make their careers by criticising others’ research. But who watches the watchmen?

Many scientists failed to accurately predict the extent of the coronavirus disaster (Photo by MEHDI FEDOUACH/AFP via Getty Images)

June 29, 2020

Science, the cliché goes, is self-correcting. This is true in two senses – one lofty, one more mundane. The high-minded one is about the principle of science: the idea that we’re constantly updating our knowledge, finding out how our previous theories were wrong, and stumbling — sometimes only after a fashion — towards the truth. The second is that, for science to work, individual scientists need to do the grinding, boring work of correcting other scientists’ mistakes.

All scientists are part of this to some degree: most prominently, the peer-review system, at least in theory, involves scientists critiquing each other’s studies, throwing out the bad or mistaken ones, and suggesting improvements for those that are salvageable.

Some scientists, though, make their entire reputations as critics, loudly drawing attention to the flaws and failings in their fields. It’s hard not to respect these often-eccentric characters, who stand up against groupthink and intellectual inertia, telling entire fields of research what they don’t want to hear. But recent months have given us some cautionary tales about these scientific watchmen, and have shown in excruciating detail how even the most perceptive critics of science can end up bafflingly wrong.

In my own field of psychology, one of the most prominent examples of an uber-critic was Hans Eysenck. From the 1950s all the way to his death in 1997, Eysenck wrote blistering critiques of psychoanalysis and psychotherapy, noting the unscientific nature of Freudian theories and digging into the evidence base for therapy’s effects on mental health (I should note that Eysenck worked at the Institute of Psychiatry, now part of King’s College London, which is my employer).

In one typically acrimonious exchange in 1978, Eysenck criticised a study that had reviewed all the available evidence on psychotherapy. Eysenck argued that this kind of study — known as a “meta-analysis” because it tries to pool all the previous studies together and draw an overall conclusion — was futile, owing to the poor quality of all the original studies. The meta-analysis, he said, was “an exercise in mega-silliness”: a “mass of reports—good, bad, and indifferent—are fed into the computer”, he explained, “in the hope that people will cease caring about the quality of the material on which the conclusions are based.”

Whether or not this was a sound argument in the case of psychotherapy, Eysenck had put his finger on an important issue all scientists face when they try to zoom out to take an overall view of the evidence on some question: if you put garbage into a meta-analysis, you’ll get garbage out.

You would think that a scientist so concerned with the garbage in the scientific literature would do his best to avoid producing more garbage himself. But around the same time as he was excoriating meta-analysis, Eysenck began collaborating with one Ronald Grossarth-Maticek, a therapist working at Heidelberg University. Grossarth-Maticek had, he claimed, run large-scale studies of how personality questionnaires could predict fatal disease, and how his special kind of behavioural therapy could help people avoid cancer and heart disease. Eysenck worked with him to get these data into the scientific literature, eventually producing many dozens of scientific papers and reports.

The first indication that something might be a little off in this research comes if you look at the questions being asked. Usually a personality questionnaire would include questions like “Can you get a party going?” or “Do you enjoy sunbathing on the beach?” The Eysenck-Grossarth-Maticek questionnaire included questions like the following — to which a yes/no answer is required:

“Do you change your behaviour according to consequences of previous behaviour, i.e., do you repeat ways of acting which have in the past led to positive results, such as contentment, wellbeing, self-reliance, etc., and to stop acting in ways which lead to negative consequences, i.e., to feelings of anxiety, hopelessness, depression, excitement, annoyance, etc.? In other words, have you learned to give up ways of acting which have negative consequences, and to rely more and more on ways of acting which have positive consequences?”

Go on. Yes or no? Quickly, please.

The second issue is with the results. Frankly, they’re unbelievable. Answers to the kind of question quoted above could, the pair claimed, classify people into “Cancer-Prone”, “Heart Disease-Prone” or “Healthy” personalities — with massive implications for their lives. Cancer-Prone personalities were an astonishing 120 times more likely to die of cancer than Healthy personalities in the next ten years (the equivalent number for Heart-Disease Prone personalities dying of heart disease was 27 times, also amazingly high). And in a trial of Grossarth-Maticek’s therapy, not one treated patient died of cancer, whereas 32% of the control participants, who received no therapy, did.

As one of Eysenck’s critics, Anthony Pelosi, has argued, “such results are unheard of in the entire history of medical science”. There are only three possibilities: they’re the most important results ever discovered in medicine (a 100% chance of avoiding cancer after some talking therapy!), grossly mistaken in some way, or made up. I suspect the answer isn’t the first one.

These results were criticised in the early 1990s (including by a statistician who’d seen the raw data, and who more or less argued they were fraudulent), though they were vociferously defended by Eysenck. He wrote that the criticisms, “…however incorrect, full of errors and misunderstandings, and lacking in objectivity, may have been useful in drawing attention to a large body of work, of both scientific and social relevance, that has been overlooked for too long.”

It was only in 2020 that the self-correcting nature of science really started to bite: after a resurgence of criticism from Pelosi and others, King’s College London investigated Eysenck’s work with Grossarth-Maticek, listed many (though not all) of the articles that used the data as “unsafe”, and wrote to the relevant scientific journals advising them to retract the papers. So far, 14 have been pulled from the literature (with a further 64 given an editorial “expression of concern”) — and given how many papers the duo published on these bizarre studies, this may end up being the tip of a rather large iceberg.

How had such a strong advocate of rigour in science ended up presiding over one of the most ludicrous sets of scientific papers ever published? There are allegations that he received funding from tobacco companies, who would have stood to benefit if it was personality rather than cigarettes that caused cancer, and that this might have influenced his reasoning (a conflict of interest that was never fully declared).

But deeper explanations relate to Eysenck’s personality. When he wasn’t railing against psychotherapy, he was publishing and debating almost every other contentious issue in the book, including crime and violence, astrology, extra-sensory perception, and the genetics of race and intelligence. This was someone who loved argument, loved controversy — and most importantly, refused in almost any case to back down under criticism (see this cringeworthy video for further evidence). Once he found himself deeply involved in the Grossarth-Maticek studies, he felt beholden to defend them despite their transparent absurdity.

The strange story of Eysenck —  an arch-critic who conspicuously failed to see the flaws in his own work — has come to mind several times while I’ve been following a far more contemporary controversy: the case of John Ioannidis and COVID-19.

It’s fair to say that Stanford University’s John Ioannidis is a hero of mine. He’s the medical researcher who made waves in 2005 with a paper carrying the firecracker title “Why Most Published Research Findings are False”, and who has published an eye-watering number of papers outlining problems in clinical trials, economics, psychology, statistics, nutrition research and more.

Like Eysenck, he’s been a critic of meta-analysis: in a 2016 paper, he argued that scientists were cranking out far too many such analyses — not only because of the phenomenon of Garbage-In-Garbage-Out, but because the meta-analyses themselves are done poorly. He’s also argued that we should be much more transparent about conflicts of interest in research: even about conflicts we wouldn’t normally think of, such as nutrition researchers being biased towards finding health effects of a particular diet because it’s the one that they themselves follow.

Ioannidis’s contribution to science has been to make it far more open, honest, and self-reflective about its flaws. How odd it is, then, to see his failure to follow his own advice.

First, in mid-March, as the pandemic was making its way to America, Ioannidis wrote an article for STAT News where he argued that we should avoid rushing into big decisions like country-wide lockdowns without what he called “reliable data” on the virus. The most memorable part of the article was his prediction — on the basis of his analysis of the cursed cruise ship Diamond Princess — that around 10,000 people in the US would die from COVID-19 — a number that, he said, “is buried within the noise of the estimate of deaths from ‘influenza-like illness’”. As US deaths have just hit 125,000, I don’t need to emphasise how wrong that prediction was.
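The figure came from a simple calculation: a 0.3% fatality rate applied to an assumed 1% of the US population becoming infected (the relevant passage is reproduced in the comments below). A minimal sketch of that arithmetic, assuming a US population of roughly 330 million:

```python
# Rough reconstruction of the scenario arithmetic behind the ~10,000 figure.
# The fatality rate and infected share are the ones given in the STAT article
# (quoted in the comments below); the ~330 million US population is an
# assumption used here purely for illustration.
us_population = 330_000_000
share_infected = 0.01      # the article's assumption: 1% of Americans infected
fatality_rate = 0.003      # the article's "mid-range guess": 0.3% of the infected die

expected_deaths = us_population * share_infected * fatality_rate
print(f"{expected_deaths:,.0f} deaths")   # ≈ 9,900, i.e. "about 10,000"
```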

So far, so fair enough: everyone makes bad predictions sometimes. But some weeks later, it emerged that Ioannidis had helped co-author the infamous Santa Clara County study, where Stanford researchers estimated that the number of people who had been infected with the coronavirus was considerably higher than had been previously supposed. The message was that the “infection fatality rate” of the virus (the proportion of people who, once infected, die from the disease), must be very low, since the death rate had to be divided across a much larger number of infections. The study became extremely popular in anti-lockdown quarters and in the Right-wing populist media. The virus is hardly a threat, they argued — lift the lockdown now!

But the study had serious problems. When you do a study of the prevalence of a virus, your sample needs to be as random as possible. Here, though, the researchers had recruited participants using Facebook and via email, emphasising that they could get a test if they signed up to the study. In this way, it’s probable that they recruited disproportionate numbers of people who were worried they were (or had been) infected, and who thus wanted a test. If so, the study was fundamentally broken, with an artificially-high COVID infection rate that didn’t represent the real population level of the virus (there were also other issues relating to the false-positive rate of the test they used).
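To see why the false-positive point matters, here is a minimal sketch with purely illustrative numbers (not the Santa Clara study’s actual figures): when the true prevalence is low, even a modest false-positive rate can make the raw positive rate in a sample more than double the truth.

```python
# Minimal sketch (invented numbers) of how a small false-positive rate inflates
# an apparent prevalence when the true prevalence is low.
true_prevalence = 0.01        # assume 1% of the sample has really been infected
sensitivity = 0.80            # assume the test detects 80% of true infections
false_positive_rate = 0.015   # assume 1.5% of uninfected people test positive anyway

raw_positive_rate = (true_prevalence * sensitivity
                     + (1 - true_prevalence) * false_positive_rate)
print(f"raw positive rate: {raw_positive_rate:.1%}")   # ≈ 2.3%, more than double the true 1%
```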

Then, an investigation by Stephanie Lee of BuzzFeed News revealed that the study had been part-funded by David Neeleman, the founder of the airline JetBlue — a company that would certainly have benefited from a shorter lockdown. Lee reported that Neeleman appeared to have been in direct contact with Ioannidis and the other Stanford researchers while the study was going on, and knew about their conclusions before they published their paper. Even if these conversations didn’t influence the conduct of the study in any way (as argued by Ioannidis and his co-authors), it was certainly odd — given Ioannidis’s record of advocating for radical transparency — that none of this was mentioned in the paper, even just to be safe.

Ioannidis didn’t stop there. He then did his own meta-analysis of prevalence studies, in an attempt to estimate the true infection fatality rate of the virus. His conclusion — once again — was that the infection fatality rate wasn’t far off that for the flu. But he had included flawed studies like his own one from Santa Clara, as well as several studies of the prevalence that only included young people — biasing the death rate substantially downwards and, again, not representing the rate in the population (several other issues are noted in a critique by the epidemiologist Hilda Bastian). That German accent you can hear faintly in the background is the ghost of Hans Eysenck, warning us about the “mega-silliness” of meta-analysing low-quality studies.
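That garbage-in-garbage-out worry is easy to make concrete. As a toy illustration (the numbers below are invented, not drawn from the studies Ioannidis actually pooled): if a naive average includes several estimates from samples that skew young, whose fatality rate is far lower, the pooled figure lands well below anything representative of the whole population.

```python
# Toy illustration (all numbers invented) of how pooling fatality estimates from
# studies of mostly-young samples drags a naive average downwards.
study_ifrs = {
    "young-skewed sample A": 0.0002,      # 0.02% infection fatality rate
    "young-skewed sample B": 0.0003,      # 0.03%
    "age-representative sample": 0.008,   # 0.8%
}
naive_pooled = sum(study_ifrs.values()) / len(study_ifrs)
print(f"naively pooled IFR: {naive_pooled:.2%}")   # ≈ 0.28%, far below the representative 0.8%
```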

His most recent contribution is an article on forecasting COVID-19, upbraiding the researchers and politicians who predicted doomsday scenarios with overwhelmed hospitals. His own drastic under-prediction of 10,000 US deaths? Not mentioned once.

Although Ioannidis has at least sounded as if he’s glad to receive criticism, some of his discussion of the more mainstream epidemiological models has sounded Eysenckian — for instance, where he described the Imperial College model of the pandemic as having been “astronomically wrong”. There is, of course, a genuine debate to be had on how and when we should lift our lockdowns. There’s also a great deal that we don’t know about the virus (though more reliable estimates suggest, contra Ioannidis, that its infection fatality rate is many times higher than that of the flu). But Ioannidis’s constant string of findings that all confirm his initial belief — that the virus is far less dangerous than scientists are telling you — gives the impression of someone who has taken a position and is now simply defending it against all comers.

And for that reason, it’s an important reminder of what we often forget: scientists are human beings, and are subject to very human flaws. Most notably, they’re subject to bias, and a strong aversion to having their cherished theories proved wrong. The fact that Ioannidis, the world’s most famous sceptic of science, is himself subject to this bias is the strongest possible confirmation of its psychological power. The Eysenck and Ioannidis stories differ in very many ways, but they both tell us how contrarianism and iconoclasm — both crucial forces for the process of constant scepticism that science needs to progress — can go too far, leading researchers not to back down, but to double-down in the face of valid criticism.

Above, I should really have said that John Ioannidis was a hero of mine. Because this whole episode has reminded me that those self-critical, self-correcting principles of science simply don’t allow for hero-worship. Even the strongest critics of science need themselves to be criticised; those who raise the biggest questions about the way we do research need themselves to be questioned. Healthy science needs a whole community of sceptics, all constantly arguing with one another — and it helps if they’re willing to admit their own mistakes. Who watches the watchmen in science? The answer is, or at least should be: all of us.

Stuart Ritchie is a Lecturer in the Social, Genetic and Developmental Psychiatry Centre at King’s College London. His new book, Science Fictions: Exposing Fraud, Bias, Negligence and Hype in Science, is published on July 16.

 



Comments
Gerald Heys
4 years ago

An interesting article, but I don’t think Professor Ioannidis actually predicted that there would be 10,000 deaths in the USA. What he said was:

‘If we assume that case fatality rate among individuals infected by SARS-CoV-2 is 0.3% in the general population – a mid-range guess from my Diamond Princess analysis – and that 1% of the U.S. population gets infected (about 3.3 million people), this would translate to about 10,000 deaths.’

To me, this reads more like a scenario than a prediction. And his estimate of 0.3% is very close to the more recently arrived at IFR figure of 0.26% from the CDC, which, of course, is based on a lot more data.

Mike Hearn
4 years ago
Reply to  Gerald Heys

Thanks. Logged in just to say that. His final projection was as wrong as everyone else’s, which reinforces the current futility of epidemiology as a field, but this comes with some severe caveats:

1. His analysis was almost entirely about CFR not total population infected. The number is wrong because 1% of the population getting infected was too low, not because his estimate of the fatality of the disease was too low. The latter number is

2. He also proposed an alternative scenario with 60% infected and 1% CFR (the Imperial College assumptions) and says this would match the 1918 flu pandemic, except not really because virtually all the dead would be very old so in life years the impact would be far less.

3. His key point was the following:

If we had not known about a new virus out there, and had not checked individuals with PCR tests, the number of total deaths due to “influenza-like illness” would not seem unusual this year. At most, we might have casually noted that flu this season seems to be a bit worse than average

And if you look at excess death data, that conclusion appears to be true. If we didn’t have rt-PCR technology, if we couldn’t sequence DNA so fast and thus could only observe the presence of the virus from hospitalisations and excess deaths, we’d have hardly noticed. There might have been a few news reports about a few overloaded hospitals as there have been in many previous years, and a few may have wondered why the “flu season” is so late this year, but that’d have been about it.

Jeremy Stone
4 years ago

Ioannidis hasn’t behaved in a way that is wholly consistent with his principles or your hero-worship. But what he has done on Covid-19 is neither as mad nor as bad as Eysenck. Nobody knows what the Attack Ratio is anywhere, nor the right number of deaths to attribute, and thus the Infection Fatality Ratio is not knowable with certainty. But it is certain to be a lot closer in the US and the UK to the 0.1 per cent characteristic of flu than the 3.4 per cent number that Ioannidis disputed when it was publicised by the WHO. That really needed the treatment, and it is just a pity he was a bit too cavalier. The Ferguson epidemiological model was astronomically wrong, and completely circular besides (with extremely clear arguments as to this in a paper by Homburg and Kuhbandner recently issued in pre-print). I would say that cancellation of Ioannidis would be premature.

Helma Hesse
4 years ago

John Ioannidis always wrote in terms of could AND would AND assuming AND the lack of data AND 10,000 deaths per 1% of population infected.
The discussion about COVID is more emotional than anything else and lacking science and credibility, thus leading to even more anxiety and panic among the population. Also here.

David Bell
4 years ago

Unfortunately, in climate science and in pandemic modelling they have not learned self-correction, only doubling down!

Dougie Undersub
4 years ago

Nowhere is that community of sceptics more needed than in the field of climate science. Unfortunately, questioning the “settled” view in that field is a very career-limiting, funding-eliminating thing to do.

Jeremy Ford
4 years ago

Is this article about science in general or COVID ‘science’? Given the state of hysteria surrounding COVID, it appears silly to try to make a valid argument about transparency and skepticism in science in general, even if the thesis is valid. It is too early. Thus the arguments presented are undermined.

Gerald Heys
4 years ago

An interesting article, but I don’t think that Prof Ioannidis actually predicted that there would only be 10,000 deaths in the USA from the virus. In the article, he said:

‘If we assume that case fatality rate among individuals infected by SARS-CoV-2 is 0.3% in the general population – a mid-range guess from my Diamond Princess analysis – and that 1% of the U.S. population gets infected (about 3.3 million people), this would translate to about 10,000 deaths.’

To me, this reads more along the lines of a scenario rather than a prediction. And that 0.3% estimate of a fatality rate is very close to the current 0.26% IFR estimate from the CDC. As to whether the 0.26% is correct, there are countries (Singapore, Qatar) that have large numbers of cases and significantly lower case fatality rates than this. The IFRs in those countries are, of course, very likely to be even lower.

Michael Dawson
4 years ago

A very good article. I’m sure I suffer from my share of biases as well, but it’s much more fun to point out other people’s, so… The virus provides amazingly good examples of confirmation bias, to take the most obvious bias, among posters on this site and also ConHome, which I also frequent. The virus started off being more or less a mystery and is revealing its nature frustratingly slowly, but this has not stopped lots of people jumping to conclusions about it and how to manage it, seizing on any old evidence to support general ideologies and prejudices they’ve had for years.

Last week’s article about face masks/coverings was a case in point. I could not believe the very high proportion of anti-mask posts. Their common theme seemed to be a doctrinaire belief in unbridled freedom, combined with ignorance of what research there is on the topic and the potential benefits – or a wilful urge to ignore contradicting evidence. I was surprised because for me this is a simple empirical issue – if masks significantly reduce risks, to the wearer and others, then they should be worn in enclosed public places until infection levels are low enough that masks don’t make any real difference. I did not start off with any ideological wish to impose masks – quite the opposite. And I don’t like wearing one myself. But I don’t have a problem looking at the evidence and making my mind up based on what I see. At least not in this case.

Paul Hayes
4 years ago

Healthy science needs a whole community of sceptics, all constantly arguing with one another – and it helps if they’re willing to admit their own mistakes.

For various reasons, e.g. weaknesses in peer review, I think a willingness to admit error is more important to science – and more widely* – than merely helpful.

* “Science”, understood properly, is applicable almost everywhere and Cliffordian norms are for everyone.

Mike Hearn
4 years ago

Good essay, thanks for writing it.
I think a major part of the COVID story is about how “science” (really: academia) is not anywhere near as reliable as we tend to assume. In psychology the news about this has slowly started to enter the mainstream; in economics, people clocked that it’s unreliable a long time ago. At the start of COVID I think most people were tabula rasa with respect to the reliability of epidemiology as a field, and assumed competence for public health more generally.
There are different kinds of problems but a major one is the severe ideological bias present in academia, which corrupts everything. Take the discussion of funding. This psychologist believes Ioannidis might have been corrupted by his funding source – maybe so! But there’s another reason to hide this information: academics are very left wing and the left have a long history of dismissing any research paid for, even partly, by non-governmental sources. They argue that if it was funded by the private sector it must automatically be wrong.
This is based in a deeply rooted assumption in left-wing thinking that receiving funding from the government doesn’t create bias, but receiving funding from companies does. But that isn’t the case: the way you get funding from governments is by claiming to have found incredibly interesting results that are critical to public policy. Given this fact, is it any wonder that academic fields with the worst reliability problems like epidemiology, nutrition, psychology, economics and (yes) climatology produce a constant stream of papers with dodgy statistics/methods/models, along with a flood of “scientific” policy advice – advice that invariably involves massive state control of people’s lives? Yet nobody in the journalistic mainstream dismisses papers or their conclusions on the grounds that they were funded by governments. If people treated funding sources equally we’d probably see more disclosure.

Fraser Bailey
4 years ago

What a racket it all is. And for so long we all thought that scientists were honest brokers. Mind you, I once thought that of the serious press and the BBC, and look what happened there.

Martin Z
4 years ago

Excellent piece. I have also found it deeply disappointing to see Ioannidis continually doubling down and subtly and not-so-subtly reframing his claims in order not to have to admit that he simply got it wrong initially. There would be no shame in that – indeed, it would set a good example.