James Ball

Modelling coronavirus is an imperfect science


We don’t know if our model for estimating immigration into the United Kingdom works. It’s a long-standing dataset, produced by the Office for National Statistics – one of the best statistical agencies in the world. The model estimates the numbers of people entering and leaving the UK, using counts taken at ports and airports, and its figures are of intense political interest and concern. And despite all of that, we’re still not sure it’s good enough to be classed as a gold-standard ‘national statistic’.

In the modern era, almost any number we ‘know’ – be it population, immigration, unemployment, inflation, or GDP – is actually an estimate produced by complicated statistical modelling.

Coronavirus is no exception. The decisions currently being made by the UK government on our response to the virus, which have potentially seismic impacts, are being informed by advanced modelling.

Given we are dealing with a fast-moving disease, there are few other choices. But models can disagree, as we have seen very publicly: epidemiologists at Imperial College estimated up to 510,000 deaths in the UK if we did not take severe and immediate steps to curtail coronavirus, while theoretical modelling by an Oxford University team suggested potentially less severe scenarios.

And despite the huge importance of the decisions resting on them, these models have gone through very little of the normal scientific scrutiny: peer review, field testing, even opening up their workings to outside experts.

Neil Ferguson noted last month that Imperial’s coronavirus code is ‘all in my head, completely undocumented. Nobody would be able to use it’. To most of the world, including much of the expert world, the Imperial model is a black box: data goes in and their estimates come out, and we either trust them or we don’t.

This matters, because it’s not as if we haven’t been led astray by modelling, or even simple spreadsheet errors, before. In the wake of the financial crisis, renowned Harvard economists Carmen Reinhart and Kenneth Rogoff published a paper with the stark finding that a high level of national debt led to a slowdown in economic growth. This was used as the intellectual basis for austerity programmes internationally.

It was only several years later that another team of researchers discovered that, thanks to a spreadsheet error, Reinhart and Rogoff had excluded several countries from their calculations. Once these were included, their finding disappeared entirely.

Even the best in their fields can make these kinds of human errors in their calculations and models. In 1999, Nasa famously lost a £100m ($125m) Mars probe thanks to a failure by different teams of engineers to realise one was working in metric units, and the other in imperial. Academic research suggests up to 94 per cent of all spreadsheets contain at least one calculation error.

It’s to counter these kinds of potential errors – as well as deeper issues, such as the risk of omitted data or unconsidered variables – that the normal scientific process includes safeguards like peer review and open data. But due to the pace of decision-making needed during coronavirus, many of these traditional processes are simply too slow.

Usually any intervention upon which even hundreds of lives depend will be subject to years of rigorous trials and testing. The typical process for approval of new drugs, even ones broadly similar to those on the market, can take a decade. Cars and other products are subjected to the nearest possible simulations of real-world tests. National statistics, such as those on immigration, are constantly re-examined, both internally and by external statisticians.

That this has not been possible for some of the models upon which our governments are basing their coronavirus response is certainly not the fault of the people making them. They are dedicated experts who have stepped up to try to offer information, analysis, and answers when they are all but impossible to find.

It is in many ways the fault of the international system, which includes numerous bodies and agencies that are supposed to fill this role during a global crisis. Which agency or government has come forward to standardise data, pull together scientists and give them the resources they need to hone their models? The WHO has not stepped up. The EU has not stepped up. The US has certainly not stepped up.

During a global pandemic, a mathematical model whose documentation lies entirely in one academic’s head should not be the basis of decision-making for one of the world’s largest economies, and yet that is the world we are in. We should all be glad Ferguson and his colleagues have stepped up to the plate. But we need to make sure it won’t be necessary next time.