2019-nCov-log-growth-Italy

Introduction

The number of new COVID-19 cases reported each day in Italy is still rising exponentially, as is the total number of cases. There has been no obviously visible change since ten northern regions were put under restricted movement regulations on 23rd February, nor since this was extended to all Italian territory between 8th and 11th March 2020.

However, studying the logarithm of the number of new cases reported each day does reveal a change.

On a logarithmic scale an exponential rise becomes a straight line. The slope of that line is the rate constant of the exponential: if cases grow as exp( m * t ), their logarithm is m * t plus a constant.

The linear regression of the logarithm shows the slope reduced by nearly a factor of two after the measures were introduced.
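As an illustration of why a straight-line fit to the logarithm recovers the growth rate, here is a minimal Python sketch. It uses a synthetic, noise-free series ( not the ECDC data ) and a hand-written ordinary least squares routine called `fit_line`, which is a name chosen for this example and is not part of the gnuplot analysis.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + c; returns (m, c)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c

# Synthetic series for illustration only (an assumption, not real case data):
# cases(t) = 50 * exp(0.2 * t)
days = list(range(10))
cases = [50.0 * math.exp(0.2 * t) for t in days]

# Fit a straight line to log(cases): the slope is the exponential rate constant.
slope, intercept = fit_line(days, [math.log(n) for n in cases])
print(f"slope = {slope:.3f}, exp(intercept) = {math.exp(intercept):.1f}")
```

With real data the points scatter around the line, and the fitted slope carries an uncertainty.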

This will mean that ( all other conditions remaining the same ) the outbreak will take about twice as long to reach its peak, but the slower rise in daily new cases will make it more manageable for the local health services. This strategy is often referred to as “flattening the curve”. Over the limited period since these measures were put in place they appear to have had a detectable effect, though it remains limited in magnitude.

On the 21st March 2020, a limited military deployment was established in the northern region of Lombardy, the centre of the epidemic, to more effectively enforce the existing measures.

While every death is a tragedy for those who are close, fatalities need to be taken in context. People are always dying. On average in recent years, there are 8 million deaths per year in Europe. That is around 22,000 deaths per day, each day. So far there have been around 8,000 deaths attributed to COVID-19 in Europe.


Method

There was an obvious under-reporting issue on 15th March, with the missing cases apparently being added to the figure for the 16th. Since the reported figures do not seem physically credible, an estimated correction was applied by transferring 3000 cases from the 16th back to the 15th. The corrected figures are:

date    cases   deaths
17th    4000    347
16th    3230    370     # reported 6230 cases
15th    3090    173     # reported 90 cases
14th    2547    252
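The correction simply moves cases from one day to the previous one, so the two-day total is unchanged. A minimal Python sketch of that bookkeeping follows; the helper `move_cases` is a hypothetical name used only for this illustration, and the reported values are those quoted in the table comments.

```python
# Daily new cases for 15th and 16th March as reported (from the table comments).
reported = {15: 90, 16: 6230}

def move_cases(series, from_day, to_day, n):
    """Return a copy of series with n cases moved from from_day to to_day."""
    corrected = dict(series)
    corrected[from_day] -= n
    corrected[to_day] += n
    return corrected

# Transfer the estimated 3000 cases from the 16th back to the 15th.
corrected = move_cases(reported, from_day=16, to_day=15, n=3000)
print(corrected)

# The correction only redistributes cases; the two-day total is preserved.
assert sum(corrected.values()) == sum(reported.values())
```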

The very small numbers in the early part of the epidemic’s spread in Italy lead to large swings in the logarithmic plot, so the early erratic data ( grey line ) were excluded from the linear analysis. The two fitting periods were from day 54 ( 23rd February ) to day 70 ( 10th March ) and from day 71 to the current end of data.
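The day numbers count from 1st January 2020, with 1st January as day 1. The following Python sketch checks the two boundary dates and shows how each day falls into a fitting period. The names `day_number` and `fit_window` are illustrative and do not appear in the original analysis.

```python
from datetime import date

def day_number(d):
    """Day count from 1 January 2020, with 1 January as day 1."""
    return (d - date(2020, 1, 1)).days + 1

P1 = day_number(date(2020, 2, 23))  # start of the first fitting period
P2 = day_number(date(2020, 3, 10))  # end of the first fitting period

def fit_window(day):
    """Classify a day number as excluded, first period or second period."""
    if day < P1:
        return "excluded"  # early erratic data, left out of the fits
    return "first" if day <= P2 else "second"

print(P1, P2)
print(fit_window(60), fit_window(75))
```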

Results

The following results were obtained from the nonlinear least squares fitting routine in gnuplot, m1 and m2 being the slopes of the linear model for the earlier and later periods respectively, and c1 and c2 the corresponding constant terms. The ratio m1/m2 is 1.83.

Final set of parameters            Asymptotic Standard Error
=======================            ==========================
m1              = 0.219408         +/- 0.01823      (8.309%)
c1              = -1.06877         +/- 0.5722       (53.53%)

Final set of parameters            Asymptotic Standard Error
=======================            ==========================
m2              = 0.119872         +/- 0.01592      (13.28%)
c2              = 2.63721          +/- 0.7347       (27.86%)

The time for the daily number of new cases to double under such exponential growth is ln(2) / slope, so the doubling time has gone from 3.16 days ( +0.29, -0.24 ) to 5.78 days ( +0.88, -0.68 ).
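The doubling times can be recomputed from the fitted slopes with a few lines of Python. This sketch assumes the asymmetric bounds were obtained by evaluating ln(2) / slope at the slope plus and minus one asymptotic standard error; the text does not state this, but it reproduces the quoted bounds to within rounding.

```python
import math

def doubling_time(slope, err):
    """Return (T, plus, minus): T = ln(2)/slope, bounds from slope -/+ err."""
    t = math.log(2) / slope
    longer = math.log(2) / (slope - err)   # smaller slope, slower doubling
    shorter = math.log(2) / (slope + err)  # larger slope, faster doubling
    return t, longer - t, t - shorter

# Slopes and asymptotic standard errors copied from the gnuplot fit output.
fits = {"before": (0.219408, 0.01823), "after": (0.119872, 0.01592)}
results = {name: doubling_time(m, e) for name, (m, e) in fits.items()}

for name, (t, plus, minus) in results.items():
    print(f"{name}: {t:.2f} days ( +{plus:.2f}, -{minus:.2f} )")

# Ratio of the two slopes, equal to the ratio of the doubling times.
ratio = fits["before"][0] / fits["after"][0]
print(f"slope ratio m1/m2 = {ratio:.2f}")
```

The bounds are asymmetric because the doubling time depends on the reciprocal of the slope.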

Conclusion

There is a gradual change in slope just after the restrictions were introduced which, considering the median incubation period of 5 days, is consistent with the restrictions being a contributory factor. However, a similar analysis of the death toll shows a similar change in slope at a similar point in time. This is not consistent with direct attribution to the movement restrictions, since there is a further delay between infection and an eventual fatality. This urges caution in drawing simplistic conclusions from such limited data. Clearly other factors are significant.

The lengthening of the doubling time for new cases will allow more time for already swamped emergency services to adapt to the increasing case load, but the growth remains exponential.

So far there is no indication of a peak in cases or an end to the exponential phase of the epidemic in Italy.

Appendix

Data source:
European Centre for Disease Prevention and Control
https://www.ecdc.europa.eu/en/publications-data

The dates in the data file are the date of issue of the ECDC report and are generally from the previous day’s national medical statistics. The hour at which each accounting period starts and ends seems unclear and may be variable over time and from country to country.

Data acquisition and extraction of Italian data:
curl https://www.ecdc.europa.eu/en/publications-data -o COVID.csv

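awk -F, '
# days(): dd/mm/yyyy to day number from 1st January 2020 ( January to March only )
function days(x){ split(x,dt,"/"); d=dt[1]+0; m=dt[2]; if(m>1){d+=31}; if(m>2){d+=29}; return d }
# column 7 = country; print day number, new cases, deaths
($7=="Italy"){ print days($1)" "$5" "$6 }
' COVID.csv > Italy_days.dat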
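For readers without awk, the same extraction can be sketched in Python. The column positions ( date in column 1, new cases in column 5, deaths in column 6, country in column 7 ) are taken from the awk command. The contents of columns 2 to 4 and the non-Italian row in the sample are placeholders assumed for this illustration, and the function names are hypothetical.

```python
import csv
import io

# Small in-memory sample. Cases and deaths for Italy are the reported figures
# quoted in the Method section; columns 2-4 and the last row are placeholders.
SAMPLE = """\
15/03/2020,15,3,2020,90,173,Italy
16/03/2020,16,3,2020,6230,370,Italy
16/03/2020,16,3,2020,0,0,Elsewhere
"""

# Days elapsed before each month of 2020 (January to March only, as in the awk).
DAYS_BEFORE_MONTH = {1: 0, 2: 31, 3: 31 + 29}

def awk_days(ddmmyyyy):
    """Mirror the awk days() function: dd/mm/yyyy to day number in 2020."""
    d, m, _ = (int(part) for part in ddmmyyyy.split("/"))
    return d + DAYS_BEFORE_MONTH[m]

def italy_rows(text):
    """Yield (day, new_cases, deaths) for rows whose 7th column is Italy."""
    for row in csv.reader(io.StringIO(text)):
        if len(row) >= 7 and row[6] == "Italy":
            yield awk_days(row[0]), int(row[4]), int(row[5])

rows = list(italy_rows(SAMPLE))
for day, new_cases, deaths in rows:
    print(day, new_cases, deaths)
```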

gnuplot console commands to recreate graph:

### Italy log analysis. ###

It="Italy_days.dat"
lin1(x)=m1*(x-of1)+c1; of1=31
lin2(x)=m2*(x-of2)+c2; of2=31
#p1=57;p2=70;col=3;src="fatalities"
p1=54;p2=70;col=2;src="cases"
c1=-5;m1=11; fit [p1:p2] lin1(x) It u 1:(log(column(col))) via m1,c1
c2=-5;m2=11; fit [p2+1:*] lin2(x) It u 1:(log(column(col))) via m2,c2


set xlab "time ( days ) from 1st January 2020 \n"."https://www.ecdc.europa.eu/en/publications-data"
set xrange [0:*]
set yrange [0:5000]


set tit "COVID-19 daily reported ".src." in Italy"
set xlab "time ( days ) relative to 1st Jan 2020 \n"."https://www.ecdc.europa.eu/en/publications-data"
set ylab "reported new ".src." ( logarithmic scale ) "
set key top left Left rev
set xr [45:95]
unset lab; unset arrow
set label 3 "Nation-wide restrictions" at 69,graph .6 rotate by -90; # March 8th, in force 9th. plot 8th: allow for lag in data
set arrow 3 nohead from 69,0 to 69,2300
set label 4 "11 town quarantine" at 58,graph .4 rotate by -90; # feb 27th
set arrow 4 nohead from 58,0 to 58,2300

set lab 1 sprintf(" pre-restrictions slope %3.2f : doubling period = %3.1f days",m1,log(2)/m1) at 72,graph 0.15 ;
set lab 2 sprintf("post-restrictions slope %3.2f : doubling period = %3.1f days",m2,log(2)/m2) at 72,graph 0.11 ;


plot s=0 \
 , "Italy_days.dat" u (($1<=p1)?$1:NaN):((column(col))*1) w l tit "Italy: daily new ".src linecol rgb "light-gray"\
 , "Italy_days.dat" u ((($1>=p1)&&($1<=p2))?$1:NaN):((column(col))*1) w l tit "" linecol rgb "dark-green"\
 , "Italy_days.dat" u (($1>=p2)?$1:NaN):((column(col))*1) w l tit "" linecol rgb "light-green"\
 , exp(lin1(x)) linecol rgb "red" tit sprintf("linear fit to log of ".src." numbers: days %2d-%2d",p1,p2)\
 , exp(lin2((x>p2)?(x):NaN))  linecol rgb "orange" tit sprintf("linear fit to log of ".src." numbers after day %2d",p2)\



The free, open source software, Gnuplot, used for plotting and least squares analysis can be obtained from http://www.gnuplot.info