OT: Review of Ferguson/Imperial model

Seems a bit worrying, to me:

[link]
Reply to
Tim Streater

Thank you for the link. So should we spend money so that we have better code?

Reply to
Michael Chare

It did make interesting reading but has been written by someone who did not reveal her real name and is hosted on a site apparently run by Toby Young of The Spectator.

The Ferguson code is obviously buggy, badly documented, and runs inconsistently on different processors, which is always a bad sign. But I'd assumed that would be the case - it's code written by an academic with, as far as one can see from the evidence, very little input from experienced software engineers.

It's more worrying that the Imperial models have a pretty poor record in past epidemics. Here are the results of Professor Ferguson's earlier modelling efforts:

Bird Flu: "200m globally" - actual 282
Swine flu: "65,000 UK" - actual 457
Mad Cow: "50-50,000 UK" - actual 177

The short-term solution is to use models developed by other groups in the same field as well. Unfortunately their competitors like the Oxford group don't seem to be much better. But the more the merrier I'd suggest.

Reply to
Clive Page

Sue Denim - pseudonym, so someone who doesn't want to be identified. There's a discussion of the article over on WUWT

[link]
The programming used by Ferguson has been described as unprofessional, amateur and spaghetti code.

Reply to
Chris Hogg

If it's non-deterministic, couldn't the RNG be seeded to a known state before test runs (including comparisons on different processors)?
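
Just to illustrate what I mean, a minimal C++ sketch (nothing to do with the actual Imperial code, and the names are made up):

    #include <cstdint>
    #include <iostream>
    #include <random>

    int main() {
        const std::uint32_t kTestSeed = 12345;  // fixed seed, used only for test/regression runs
        std::mt19937 rng(kTestSeed);            // mt19937's raw output sequence is fully specified by the C++ standard

        for (int i = 0; i < 5; ++i)
            std::cout << rng() << '\n';         // same five numbers on every run, on any conforming platform
        return 0;
    }

Production runs could still seed from the clock or an entropy source; only the test harness needs the fixed value.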

Reply to
Andy Burns

When they haven't actually yet nailed down:

a) the most prevalent and effective means of transmission - hard and soft surfaces, droplets, aerosols etc. etc.

b) whether masks, either in general or of specific types, actually offer any protection to the wearer - rather than acting as a possible barrier to outward transmission.

c) how much virus is present per cc in the aerosol being pumped out by infected joggers at the start of their runs, after a mile, etc. etc.

d) how long that aerosol remains airborne at say 4 ft off the ground,

e) along with plenty of other questions, no doubt.

All of which seem quite amenable to experimentation and testing; certainly after 4 months of global exposure. And compared with which, IMHO any bugs in the code behind the modelling pale into insignificance.

michael adams

...

Reply to
michael adams

Interesting reading, and comments from Clive.

It does seem disappointing, in this day and age, that the Imperial team have not received more local scrutiny and challenge; it looks as though they had a serious silo mentality. In the old days some academics did valid work with quick and dirty home-brews - I am thinking of Lovelock's Daisyworld, and Dawkins did the same when modelling morphology - but as a way of illustrating principles, rather than claiming to be accurate models for prediction.

I'm out of touch with Imperial now, but they used to do serious fluid dynamics. I would be very surprised if their current Engineering and Materials departments did not have very competent programmers using large codes and modern methods.

Clive's point noted about the reviewer, but they do seem to know "which way is up". I've been the first to defend Ferguson on his recent mistreatment by the Tory media, but I can't forgive him for such flawed methodology. I bet the people who run the Treasury Model would have laughed their socks off, if they had been allowed to see the raw code. The irony is that you don't really need any code at all to see that COVID had the potential to saturate the NHS without drastic action, once some numbers had come out of China.

As an aside, I have always had similar concerns about Climate Change Modelling, but I have the impression there is a bit more openness and scrutiny of code.

Still on the programming thread, there's an interesting paper by some big wheel in the official cyber-security organisation about the coding of the Covid tracking app. Evidently we do have some people in authority who take the subject seriously.

[link]

Reply to
newshound

Really Tim?

Care to be a bit more specific?

Which specific points in the article do you think carry the greatest weight?

As to the conclusion of the article, which I'd imagine most readers can fully comprehend without there being any suspicion of their having the wool pulled over their eyes -

<quote>

On a personal level, I'd go further and suggest that all academic epidemiology be defunded. This sort of work is best done by the insurance sector. Insurers employ modellers and data scientists, but also employ managers whose job is to decide whether a model is accurate enough for real world usage and professional software engineers to ensure model software is properly tested, understandable and so on. Academic efforts don't have these people, and the results speak for themselves.

</quote>

Apart from the fact that any deficiencies in the modelling are most likely the result of under-funding as it is, perhaps the author of the article might care to explain how data scientists and modellers working on behalf of the insurance industry will in future gain their expertise if academic epidemiology is defunded.

michael adams

...

Reply to
michael adams

Not insignificant if main policy decisions are being made as a result of a flawed model.

Reply to
alan_m

Thank you for replying to a troll.

Reply to
GB

Nice link. This is one of (many) good comments from someone called michael.

"The main problem may be one that the piece doesn?t consider, that is, the excessive quest for detail under the impression that this is related to accuracy.

You see this in business models all the time. You are trying to forecast takeup of some new product type. The department starts out with a back of the envelope one liner of total sales and price.

As the investment required rises, in the effort to satisfy senior management that they are thinking rigorously, the department breaks every significant parameter down further. So they end up forecasting by product type, by region, by single, married, city, rural... etc.

By the time this process gets through, and this happened in the notorious wireless spectrum auctions during the dotcom bubble, you have a model in Excel covering hundreds of pages with extensive use of macros. Now the code will probably be unstructured and uncommented. But that is not the real problem with it.

The real problem is that it has become impossible for decision makers to have a sensible argument about the key parameters. They no longer have anything on which they can bring their knowledge and experience to bear.

Whereas at the start they could sit around a table and argue whether those sales estimates were reasonable, drawing on examples from experience of how they could happen in this or that mix or with this or that blend of different prices, now they sit staring at the output from a model they have not seen and cannot understand, while some bright spark from Finance or Marketing explains to them that this is what the model shows. And offers, as proof of its legitimacy, that everything has been carefully modelled down to the last detail.

As in Ferguson's model - it's modelling, apparently, hotels differently from other vectors. When you read that, you know immediately that we are in the realm of arbitrary assumptions at an excessive level of detail. But you can't see how arbitrary all the assumptions are, because they are buried somewhere in pages of code, and you can't question the result without having to question all the detail, which you can't get at. Certainly not in a meeting of reasonable length.

Any model which is going to be used as the basis for public policy should fit on one A4 and have no macros or VB in it. Then managers can actually bring their experience and intuition to bear on the key drivers. The way it's usually done, and the effect of the Ferguson model, is to turn experienced and qualified people into Yes/No switches."

Reply to
newshound

The model is "flawed" by definition if its based on all the untested assumptions which I listed before.

That's the whole point.

How can you possibly construct a useful model if you can't even answer the following basic questions?

a) what is the most prevalent and effective means of transmission - hard and soft surfaces, droplets, aerosols, what?

b) do masks, either in general or of specific types actually offer any protection to the wearer - rather than acting as a possible barrier to outward transmission.

c) how much virus is present per cc in the aerosol being pumped out by infected joggers at the start of their runs, after a mile, etc. etc.

d) how long that aerosol remains airborne at say 4 ft off the ground,

e) along with plenty of other questions, no doubt.

Nothing I've read in this thread so far, nor in the comments at the foot of the article, suggests that the article in question is anything more than a hoax or a fraud which seems to have fooled a lot of people.

But as always I stand to be corrected.

In agreeing with one comment, that a model should take up no more than one side of an A4 sheet of paper, exactly the same could be said of any critique, which should limit itself to one or more of the more damning points. Anything more simply gives the appearance of obfuscation deliberately designed to bamboozle the credulous.

michael adams

...

Reply to
michael adams

Churchill famously insisted that wartime briefs were confined to one side of a single sheet of paper. He was, though, lucky to live in the days before word processors. I'm sure the stories of getting "over-long" scientific papers accepted for publication by increasing the margins and reducing the font size are not apocryphal.

Reply to
newshound

One of our maths teachers at school stated that the value of a solution was inversely proportional to its length.

Reply to
charles

No axe to grind there then ;-)

A lot of academic code is like that even in the hard sciences.

I agree it is a bit worrying that their code does not return consistent results with the same starting seeds or when the dataset is saved and restarted. We always found a few bugs in what was notionally extremely well tested code whenever it was ported to a new platform. Back in the days when floating point hardware came in a variety of actual lengths. Sometimes we found bugs in the hardware itself.

Even compiling it on a Z80 (which was done for a bet) resulted in finding some non-compliance with the strictest Fortran standards of the day that had been tolerated on the mainframe. The Z80 implementation was minimalist.

I always find it a bit worrying today when things don't behave themselves on multi-threaded CPUs and have to be run single-core.
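
For what it's worth, here is a minimal, generic C++ sketch (not from the model itself) of why threaded runs can drift even with identical seeds: floating-point addition isn't associative, so if threads deposit their partial sums in whatever order they happen to finish, the total can change from run to run.

    #include <iostream>
    #include <numeric>
    #include <vector>

    int main() {
        std::vector<double> x{1e16, 1.0, -1e16, 1.0};

        // Same four numbers, two different summation orders:
        double left_to_right = std::accumulate(x.begin(), x.end(), 0.0);  // ((1e16 + 1) - 1e16) + 1 = 1
        double regrouped     = (x[0] + x[2]) + (x[1] + x[3]);             // (1e16 - 1e16) + (1 + 1) = 2

        std::cout << left_to_right << " vs " << regrouped << '\n';        // prints "1 vs 2"
        return 0;
    }

Forcing a fixed reduction order (or, as above, running single-core) is the usual workaround.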

The difficulty is in knowing how the human to human transmission goes.

In the case of Covid-19 I don't think he is all that far out, if the infection were allowed to run its course unchecked. I'd be surprised when the dust settles if he is wrong by more than a factor of three (assuming that we do not find and deploy a vaccine soon enough).

The Royal Society would certainly agree with you, and there is an ab initio modelling group working to do just that via Ramp. They will necessarily be a bit late to the party though, from a standing start and with remote working. Their team is a who's who of scientific and statistical wizards, with a smattering of immunologists and epidemiologists thrown in.

[link]

Reply to
Martin Brown

And increasing the number of authors.

Reply to
Tim Streater

The comments are well worth reading, and contain links to other interesting stuff.

Reply to
Spike

He dealt with that point. For testing/regression, reproducible seeds should be used. They are intended to be used in the code, but aren't. This kind of implies the code base is out of control.

Coders are being pressured into adding new features and regression testing has gone out the window. I suspect a few of us will have that particular T-shirt.
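
The sort of regression check being talked about is cheap to write. A minimal sketch in C++, where run_model and everything inside it is an illustrative stand-in rather than the real code:

    #include <cassert>
    #include <cstdint>
    #include <random>

    // Stand-in for one stochastic simulation run.
    double run_model(std::uint32_t seed) {
        std::mt19937 rng(seed);
        std::uniform_real_distribution<double> draw(0.0, 1.0);
        double total = 0.0;
        for (int day = 0; day < 100; ++day)
            total += draw(rng);                // pretend this is one simulated day
        return total;
    }

    int main() {
        double a = run_model(42);
        double b = run_model(42);              // same seed, same build
        assert(a == b);                        // regression: determinism must not have been broken
        return 0;
    }

Run something like that after every change and the "different answers from the same seed" problem gets caught the day it is introduced.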

Reply to
Pancho

But it's been stated that if you re-start it with the same seeds it gives different outputs. It's impressive to make computer code - the very definition of a deterministic system - behave nondeterministically.

Non-predictably, easy - Conway's Game of Life is a simple example. But non-deterministically - there's something fundamentally going wrong when running a calculation gives different answers to the same inputs.

(And it's not something like accumulating floating point errors. If you do the same long run of floating point operations you will still end up with the same "wrong" answer at the end, not different answers.)

The documented fix is "run the code with the same inputs several times and then average them". If it's run with the same inputs, by definition, it should give the same outputs. What the "average the outputs" thing is for is when you run a system with the inputs changing slightly.
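
What legitimate averaging looks like, as a rough sketch (run_model below is just an illustrative stand-in, not the Imperial code): you vary the seed, i.e. the stochastic realisation, and average those.

    #include <cstdint>
    #include <iostream>
    #include <random>

    // Stand-in for one stochastic run of a model.
    double run_model(std::uint32_t seed) {
        std::mt19937 rng(seed);
        std::poisson_distribution<int> new_cases(10.0);
        int total = 0;
        for (int day = 0; day < 30; ++day)
            total += new_cases(rng);
        return static_cast<double>(total);
    }

    int main() {
        const std::uint32_t kRuns = 100;
        double sum = 0.0;
        for (std::uint32_t seed = 0; seed < kRuns; ++seed)
            sum += run_model(seed);            // vary the seed, then average the realisations
        std::cout << "mean over " << kRuns << " seeds: " << sum / kRuns << '\n';
        return 0;
    }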

Reply to
jgh
