Saturday, May 2, 2020

Loving Data In The Time Of Covid

I've been having discussions on various social media platforms about the impact of the Covid-19 virus on the US.  My interlocutors have been pointing to the total number of Covid-19 deaths in the United States as indicative of gross incompetence by the Trump administration.  My response was to suggest that the US has (thus far) done about as well as Europe.  That conclusion was based on some rough calculations...

...that were wrong!  Specifically, I had used a population for Europe that was grossly below the actual population count.  I looked at the European population on several sites and came up with different answers as it all depends on what one counts as being "Europe".  Sometimes Russia is included.  Sometimes it isn't.  Sometimes only part of Russia is included.

Regardless of how one defines the population of Europe, my estimate was way low.  This was a significant deficiency in my response.

So I set out to correct that deficiency.  The Covid data that follows came from https://ncov2019.live/data/europe and the NYC health department.   The population data came from Wikipedia entries for each nation in the database presented by ncov2019.live.  All data was as of 5/1/2020.

To restate, I was of the opinion that simply looking at the gross number of deaths in the US was an inadequate measure of the administration's response to the Covid pandemic.  A better way of judging the government's response is to compare our experience with that of other, comparable nations.

And the only way to make that comparison is to look at those deaths as a percentage of the population.  We have had over 60,000 deaths while Belgium has had over 7,700 deaths.  Simply looking at those raw numbers would suggest that the US government is failing badly.

Yet the deaths in Belgium have been 0.0669% of their total population while in the US, the deaths have only been 0.0196% of our population.  Measured as a percentage of the population, Covid deaths in Belgium are more than three times as high as those in the US.
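
The comparison above is simple arithmetic, and a minimal Python sketch makes it easy to check.  The two percentages are the ones quoted in this post.  The function name and the Belgian population of 11.5 million are my own assumptions for illustration (a round number, not a figure from the dataset), and 7,700 is only the lower bound on deaths given above.

```python
def percent_of_population(deaths: int, population: int) -> float:
    """Return deaths as a percentage of the total population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * 100


# Percentages quoted in the post (data as of 5/1/2020).
BELGIUM_PCT = 0.0669
US_PCT = 0.0196

# How many times higher Belgium's per-capita death rate is than the US rate.
ratio = BELGIUM_PCT / US_PCT

# Rough sanity check of the Belgium figure.
# ASSUMPTION: 11.5 million is a rounded population, not the value used in the post.
belgium_check = percent_of_population(7_700, 11_500_000)

print(f"Belgium / US ratio: {ratio:.2f}")
print(f"Belgium check: {belgium_check:.4f}%")
```

The ratio works out to a bit over 3.4, which is where "more than three times" comes from.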

The ncov2019.live dataset lists 50 nations as being in Europe.  I omitted the Vatican City data as they have fewer than 1000 citizens and they have managed to have a negative death due to Covid.  Being the Vatican, I'm assuming that Lazarus is involved in some way.  Omitting 1000 citizens out of a population of over 832 million isn't liable to alter the statistical result.

I added the US data as that is the comparison that I'm trying to make.  I also added New York City as a separate entity and added a line for US data with the NYC data removed.  While New York City is definitely part of America, they are also having a pretty unique experience with the Covid virus.  I also added a line for all of Europe.
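
For anyone who wants to reproduce the added lines, here is a rough sketch of how an aggregate line and a "with one region removed" line can be computed.  The `Region` type, the helper names, and every number in it are hypothetical placeholders, not the actual data.  The one point it is meant to show is that the raw counts get added or subtracted first and the percentage is computed afterward; averaging the percentages themselves would give the wrong answer.

```python
from typing import Iterable, NamedTuple


class Region(NamedTuple):
    name: str
    deaths: int
    population: int

    @property
    def death_pct(self) -> float:
        """Deaths as a percentage of the population."""
        return self.deaths / self.population * 100


def combine(name: str, regions: Iterable[Region]) -> Region:
    """Aggregate several regions by summing the raw counts."""
    regions = list(regions)
    return Region(
        name,
        sum(r.deaths for r in regions),
        sum(r.population for r in regions),
    )


def exclude(name: str, whole: Region, part: Region) -> Region:
    """Remove one region's counts from a larger total that contains it."""
    return Region(
        name,
        whole.deaths - part.deaths,
        whole.population - part.population,
    )


# PLACEHOLDER numbers only, chosen to make the arithmetic easy to follow.
whole = Region("Whole country", 1_000, 1_000_000)
hotspot = Region("Hotspot city", 400, 100_000)
rest = exclude("Country without hotspot", whole, hotspot)

for r in (whole, hotspot, rest):
    print(f"{r.name}: {r.death_pct:.4f}%")
```

With these made-up numbers, the hotspot pulls the overall figure well above what the rest of the country is experiencing, which is the same effect I was trying to isolate by breaking out New York City.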

Here's the data for Europe and the US.  Note that there are several modern nations in Europe that have had a much higher death rate than what has occurred in the United States.  Also note that when the data from NYC is removed, the average experience of the rest of the United States is almost identical to that of all of Europe.

[Click Images to Embiggen]



While this isn't evidence of stellar performance by the administration, it also isn't exactly proof of malfeasance.

Another way of looking at the data is with respect to the confirmed cases.  The number of confirmed cases is a bit fraught as we have been behind the curve on mass testing.  Here is the data.


New York City isn't the only outlier in the dataset.  Russia is another one.  They have only recently acknowledged that they have failed to halt the spread of Covid in their country.  So the data from Russia is likely to change pretty dramatically over the next couple of weeks.  As a result, I then added a line for Europe with the data from Russia removed.

Note that while the US without NYC is about the same as all of Europe, including Russia, Europe without Russia is worse than all of the US, including NYC.



One of the people with whom I was conversing suggested that the ratio of the percentage of confirmed cases to the percentage of deceased cases indicates something ominous about the lack of testing.  Essentially, the lower that ratio, the fewer people have been tested, and thus the greater the odds of future deaths.

If the ratio of the percentage of confirmed cases to the percentage of deceased is meaningful in any way, then Europe is doing half as well as the United States.
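
It is worth spelling out what this ratio actually measures.  Both percentages are taken over the same population, so the population cancels and the ratio is just confirmed cases divided by deaths, which is the inverse of the fatality rate among confirmed cases.  A small sketch, using placeholder numbers and a function name of my own invention rather than anything from the dataset:

```python
import math


def cases_per_death(cases_pct: float, deaths_pct: float) -> float:
    """Ratio of the confirmed-case percentage to the death percentage."""
    if deaths_pct <= 0:
        raise ValueError("deaths_pct must be positive")
    return cases_pct / deaths_pct


# PLACEHOLDER numbers, not real data.
population = 1_000_000
cases = 2_000
deaths = 100

cases_pct = cases / population * 100
deaths_pct = deaths / population * 100

ratio = cases_per_death(cases_pct, deaths_pct)

# The population cancels, so the ratio equals cases / deaths.
assert math.isclose(ratio, cases / deaths)

# Its inverse is the share of confirmed cases that ended in death.
fatality_pct = 100 / ratio

print(f"Confirmed cases per death: {ratio:.1f}")
print(f"Deaths as a share of confirmed cases: {fatality_pct:.1f}%")
```

A low ratio therefore means a high share of confirmed cases ending in death, which is at least consistent with testing that only catches the most serious cases.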


There were a couple of good conclusions to be drawn from the data.

The first one is to be more careful with the data.  My estimate of the population of Europe was pretty low.  I was not including Russia in my estimate.  Nor was I counting nations such as Kazakhstan or a host of other former Soviet satellite states.  That was an obvious error on my part.

The second conclusion is that despite the error in my estimates of the European population, my impression that the impact of Covid-19 on the United States has been on par with the impact on Europe is largely correct based on the data that is available today.

While there are several points in our response to Covid where the administration has clearly fallen short, the net result is not significantly different from other, comparable nations.  Suggestions that the administration's performance is defined by gross incompetence and/or malfeasance are not justified by the data.

My perspective remains unchanged.  The US is a big country.  We should expect a large number of infections & deaths.  The US Covid experience is reasonably comparable with Europe when viewed on a percentage basis.

People who focus solely on the total number of cases and/or deaths while ignoring the relative size of our population are not seeking to educate and inform.  They are looking to wave "the bloody shirt" and whip up people's passions.

Passionate and uneducated people rarely make wise decisions.