Monday, May 04, 2020

Predictions

Laurie Garrett cheering essential workers, New York, New York Times

In an earlier blog post I mention the effects of Bill Gates' prescient TED Talk (conspiracy theories); in another post I mention the Cassandra Complex (impotent foreknowledge). So I am very happy to be introduced to Laurie Garrett in a profile by Frank Bruni. Garrett is a journalist who won a Pulitzer for her work tracking the 1995 Ebola outbreak, and is also mentioned alongside Gates and several others in a Vanity Fair article by David Ewing Duncan as a 'Cassandra'. Because she foresaw both HIV and COVID-19, she proclaims herself a 'Double Cassandra'. In her book, The Coming Plague, she also foresaw that there would be Cassandras – so she's also a 'Meta-Cassandra'.

So being a Cassandra is a thing. She tells us what life might be like as we re-open from the 'lockdown':
This is history right in front of us. Did we go ‘back to normal’ after 9/11? No. We created a whole new normal. We securitized the United States. We turned into an anti-terror state. And it affected everything. We couldn’t go into a building without showing ID and walking through a metal detector, and couldn’t get on airplanes the same way ever again. That’s what’s going to happen with this.
The statement in the article that really takes my breath is her comment about the CDC in Atlanta, Georgia:
I’ve heard from every CDC in the world — the European CDC, the African CDC, China CDC — and they say, ‘Normally, our first call is to Atlanta, but we ain’t hearing back.’ There’s nothing going on down there. They’ve gutted that place. They’ve gagged that place. I can’t get calls returned anymore. Nobody down there is feeling like it’s safe to talk. Have you even seen anything important and vital coming out of the CDC?
If you want to know her better, and have some time, there is a fascinating and very personal talk with Garrett at Columbia's Earth Institute. Watch and be convinced, as I am, that she has some serious journalistic guts, and a ground-up understanding of both the value of building a strong public health infrastructure and the challenges facing it.


Speaking of Georgia, I have a reply here from the Department of Public Health from my last post (how about that?) – they explain:
This graph was added last week. If you read the data explanation at the top of the page, as well as the footnote directly under the graph you sent, you will understand. We are tracking cases based on date of onset of symptoms (if possible) or date test was sought. The date the test was confirmed is not a good indicator of when someone was a case in the state, as LabCorp and Quest saw significant delays in reporting results at one point. The average case rate accounts for things like delays in testing or batching that labs do when reporting results.
So let me just say that, of course, I read the explanations and the footnotes, but the methodology is still unclear and the graphs still require a lot of clarification. There is no way to know, for example, if the records for May 1st (23 Confirmed Cases) and May 2nd (4 Confirmed Cases) are awaiting ten, or a hundred, or a thousand test results. Nor is there any way to account for future tests that would add to the counts for those onset dates. Basically, they are graphing the trend line of an incomplete dataset, and that is worse than meaningless.
The chart (above) presents the number of newly confirmed COVID-19 cases over time. This chart is meant to aid understanding whether the outbreak is growing, leveling off, or declining and can help to guide the COVID-19 response.
This "chart is meant to aid understanding"? It does not. What's the opposite of aid?

So according to the DPH, if I understand this correctly, once they get a test back as 'confirmed', they determine an onset date for that result, and then increment the record for that onset date. For example, they announced their first confirmed case on March 2nd, but recorded that as occurring on February 2nd, so now the dataset is stretched over an extra month. One can only assume that, as they have caught up with testing, all the days before the fourteen-day window are complete (as opposed to a one-month window).

That fourteen-day window line is like a Wonkawash, everything squishes into it and comes out the other side unrealistically clean.
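To make the onset-date bookkeeping concrete, here is a minimal sketch of how I understand it to work. The records below are entirely made up for illustration (they are not DPH data), and the function name is my own; the point is only that the count for a recent onset date keeps growing as later results are confirmed.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (estimated onset date, date the lab confirmed the result).
RECORDS = [
    (date(2020, 4, 28), date(2020, 4, 30)),
    (date(2020, 4, 28), date(2020, 5, 3)),
    (date(2020, 5, 1), date(2020, 5, 2)),
    (date(2020, 5, 1), date(2020, 5, 6)),
    (date(2020, 5, 1), date(2020, 5, 9)),
    (date(2020, 5, 2), date(2020, 5, 8)),
]

def counts_by_onset(records, as_of):
    """Count cases per onset date, using only results confirmed on or before as_of."""
    return Counter(onset for onset, confirmed in records if confirmed <= as_of)

# The same onset dates, as the chart would show them on two different days.
early = counts_by_onset(RECORDS, date(2020, 5, 4))
later = counts_by_onset(RECORDS, date(2020, 5, 10))
for day in sorted(later):
    print(day, "| as of May 4:", early[day], "| as of May 10:", later[day])
```

In this toy data the bar for May 1st triples between the two snapshots, which is why a trend line drawn through the most recent days says very little.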


Their reasoning is that "(t)he date the test was confirmed is not a good indicator of when someone was a case in the state". Well, what is a good indicator of the number of cases? In order for these graphs to be meaningful in that way, Georgia needs a much more robust dataset, and needs to do much more extensive testing. To date, Georgia has tested at a rate of 17,772 per million; Portugal, by contrast, has tested at 44,132 per million (and has completed a testing sweep through the nation's nursing homes).

By extension, in order for Georgia to justify its re-opening on April 24th and manage the outbreak going forward, it must do much more than massage a few graphs. As public health officials have warned, even ones working in the Trump Administration, there must be much more testing and tracing. Remember, Georgia and Portugal have almost the exact same population sizes (just over ten million), and both reported their first confirmed cases on the same day (March 2nd).

Since I am not a statistician, I cannot call this 'statistically dishonest' (I think it is), but I can certainly call it 'graphically dishonest', as the complete and incomplete records are represented in the exact same way. Here are some changes the DPH could use to make things more 'honest':
  • discontinue the rolling average inside the fourteen-day window – displaying a trend line for an incomplete dataset is totally misleading;
  • similarly, discontinue the connected blue line for data points inside the fourteen-day window;
  • perhaps, display the data points inside the fourteen-day window as a series of vertical bars, with confirmed cases for that onset date in blue, and pending tests for that onset date in yellow, thus providing a graphic range or area (in yellow) where the final results would fall;
  • to make it very clear that other graphed results inside the fourteen-day window are incomplete, display that data in a manner that is graphically dissimilar to the confirmed data – where appropriate, state and/or illustrate if values are likely to rise.
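The first two suggestions can be sketched in a few lines. This is only an illustration under my own assumptions: the daily counts are invented, the seven-day span for the rolling average is a guess (the DPH page does not need to match it), and the function names are hypothetical. The idea is to split the series at the fourteen-day cutoff and compute the trend only over the complete side.

```python
from datetime import date, timedelta

WINDOW_DAYS = 14  # the provisional window discussed above

def split_series(daily_counts, as_of, window_days=WINDOW_DAYS):
    """Split {date: count} into (complete, provisional) lists of (date, count)."""
    cutoff = as_of - timedelta(days=window_days)
    complete = sorted((d, c) for d, c in daily_counts.items() if d <= cutoff)
    provisional = sorted((d, c) for d, c in daily_counts.items() if d > cutoff)
    return complete, provisional

def rolling_mean(series, span=7):
    """Trailing mean over `span` consecutive entries; assumes one entry per day."""
    out = []
    for i in range(span - 1, len(series)):
        window = [count for _, count in series[i - span + 1 : i + 1]]
        out.append((series[i][0], sum(window) / span))
    return out

# Hypothetical daily counts for April 5th through May 4th (not real data).
start = date(2020, 4, 5)
counts = {start + timedelta(days=i): 600 + 10 * i for i in range(30)}

complete, provisional = split_series(counts, as_of=date(2020, 5, 4))
trend = rolling_mean(complete)  # the trend line stops at the cutoff

print("trend line ends on:", trend[-1][0])
print("provisional days drawn separately:", len(provisional))
```

The provisional days would then be drawn as their own bars, in a different colour, with no line connecting them to the trend.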
It does seem like the only statistic that has any tangible value is the number of confirmed cases – apply them 'from testing date' if you are concerned about the processing time, but the public has already heard about the delays in testing. The onset date is an approximate value, adjustable from a day to a month, and cannot provide any precision for the graph. It is, therefore, easily manipulated to shape the graph as desired. Otherwise, they should change this graph's title to: 'COVID-19 Confirmed Cases from Estimated Disease Onset Date, Most Recent Fourteen Days Shown Incomplete'.

Meanwhile the world has reached three and a half million cases and a quarter of a million deaths. The US has reached one million two hundred thousand cases and is nearing seventy thousand deaths. President Trump has revised his 'expected death total' upward again to one hundred thousand; the IHME is still showing a total of 72,433 by August 4th on their projections web page.


Portugal is beginning to re-open today. As we walk down our street, art suppliers, hairdressers, copy shops, and the stationers are all open. There are a lot more people moving about. In the park, folks are playing paddle-ball again, and the courts are being cleaned for more matches. It feels like the epidemic here is really slowing down, with just over twenty-five thousand cases and one thousand deaths – averaging about two hundred new cases per day over the three-day weekend (Georgia averaged about eight hundred per day). As we walk I think: if I could have predicted a pandemic early last year, I would have proposed retiring and moving to Portugal.

cases: 3,619,504 global • 1,200,794 USA • 25,524 Portugal
deaths: 250,478 global • 69,116 USA • 1,063 Portugal

Cassandra this: MSNBC is reporting on a story from the New York Times regarding leaked CDC projections that the US daily death toll will nearly double to three thousand per day by the end of May, with about two hundred thousand new cases per day. This completely negates Trump's claim today that total US deaths from COVID-19 will be held to one hundred thousand. So, something important and vital from the CDC.
As President Trump presses for states to reopen their economies, his administration is privately projecting a steady rise in the number of coronavirus cases and deaths over the next several weeks. The daily death toll will reach about 3,000 on June 1, according to an internal document obtained by The New York Times, nearly double the current number of about 1,750.
The projections, based on government modeling pulled together in chart form by the Federal Emergency Management Agency, forecast about 200,000 new cases each day by the end of the month, up from about 25,000 cases a day currently.

UPDATE [May 5, 9AM DST]: IHME is now showing a total of 134,475 by August 4th on their projections web page.
