[CovsirPhy] COVID-19 Python package for data analysis: SIR-F model

Introduction

We are developing CovsirPhy, a Python package that lets you easily download and analyze COVID-19 data (such as the number of PCR-positive cases). We plan to publish articles with analysis examples that use the package, along with what we learned while creating it (Python, GitHub, Sphinx, ...).

The English version of the documentation is available as CovsirPhy: COVID-19 analysis with phase-dependent SIRs and Kaggle: COVID-19 data with SIR model.

**This time, I would like to introduce the SIR-F model.** No actual data is used in this article. English version: Usage (details: theoretical datasets)

1. Execution environment

CovsirPhy can be installed as follows. Please use Python 3.7 or above, or Google Colaboratory.

- Stable version: `pip install covsirphy --upgrade`
- Development version: `pip install "git+https://github.com/lisphilar/covid19-sir.git#egg=covsirphy"`

# For data display
from pprint import pprint
# CovsirPhy
import covsirphy as cs
cs.__version__
# '2.8.2'

Execution environment:

- OS: Windows Subsystem for Linux
- Python version: 3.8.5

2. What is the SIR-F model?

The SIR-F model is a derivative of the well-known basic SIR model [^1]. I created it while analyzing Kaggle data [^2].

(I think it is a novel model, but if you know of an original paper published before February 2020, please let me know! I am not an infectious disease expert...)

[^1]: [CovsirPhy] COVID-19 Python package for data analysis: SIR model

SIR model

First, the SIR model defines the effective contact rate $\beta$ [1/min] as the probability of infection when a Susceptible person contacts an Infected person. $\gamma$ [1/min] is the recovery rate, that is, the probability of moving from Infected to Recovered [^3] [^4].

\begin{align*}
\mathrm{S} \overset{\beta I}{\longrightarrow} \mathrm{I} \overset{\gamma}{\longrightarrow} \mathrm{R}  \\
\end{align*}

SIR-D model

However, the SIR model either ignores Fatal (the number of deaths) or lumps it into Recovered. For COVID-19, data on the number of confirmed cases (the number of PCR-positive cases), recovered cases, and deaths are collected by Johns Hopkins University and others [^5], so they can be used as model variables. The number of confirmed cases is the sum of the number of infected people $I$, recovered people $R$, and deaths $D$.

SIR-D model, with $\alpha_2$ [1/min] as the mortality rate of infected people:

\begin{align*}
\mathrm{S} \overset{\beta  I}{\longrightarrow}\ & \mathrm{I} \overset{\gamma}{\longrightarrow} \mathrm{R}  \\
& \mathrm{I} \overset{\alpha_2}{\longrightarrow} \mathrm{D}  \\
\end{align*}

SIR-F model

Furthermore, in the case of COVID-19, a definitive diagnosis of infection is difficult to make, and many deaths before a definitive diagnosis were reported, especially in the early stages. The model that reflects these cases is shown below. $S^\ast$ denotes infected people with a definitive diagnosis, and $\alpha_1$ [-] is the fraction (dimensionless) of $S^\ast$ who had already died at the time of the definitive diagnosis.

SIR-F model:

\begin{align*}
\mathrm{S} \overset{\beta I}{\longrightarrow} \mathrm{S}^\ast \overset{\alpha_1}{\longrightarrow}\ & \mathrm{F}    \\
\mathrm{S}^\ast \overset{1 - \alpha_1}{\longrightarrow}\ & \mathrm{I} \overset{\gamma}{\longrightarrow} \mathrm{R}    \\
& \mathrm{I} \overset{\alpha_2}{\longrightarrow} \mathrm{F}    \\
\end{align*}

When $\alpha_1 = 0$, the SIR-F model is identical to the SIR-D model.

3. Simultaneous ordinary differential equations

With the total population $N = S + I + R + F$:

\begin{align*}
& \frac{\mathrm{d}S}{\mathrm{d}T}= - N^{-1}\beta S I  \\
& \frac{\mathrm{d}I}{\mathrm{d}T}= N^{-1}(1 - \alpha_1) \beta S I - (\gamma + \alpha_2) I  \\
& \frac{\mathrm{d}R}{\mathrm{d}T}= \gamma I  \\
& \frac{\mathrm{d}F}{\mathrm{d}T}= N^{-1}\alpha_1 \beta S I + \alpha_2 I  \\
\end{align*}
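
As a minimal sketch, the right-hand side of these equations can be written as a plain Python function. The function name `sirf_rhs` and the numeric values below are illustrative assumptions and are not part of CovsirPhy. Because every person who leaves one compartment enters another, the four derivatives should sum to zero, which keeps $N$ constant.

```python
def sirf_rhs(state, n, alpha1, alpha2, beta, gamma):
    """Return (dS/dT, dI/dT, dR/dT, dF/dT) for the SIR-F equations above."""
    s, i, r, f = state
    new_cases = beta * s * i / n  # N^{-1} * beta * S * I
    ds = -new_cases
    di = (1 - alpha1) * new_cases - (gamma + alpha2) * i
    dr = gamma * i
    df = alpha1 * new_cases + alpha2 * i
    return ds, di, dr, df


# Illustrative values only (assumed for this sketch, not estimated from data)
state = (999_000.0, 1_000.0, 0.0, 0.0)  # (S, I, R, F)
n = sum(state)
derivs = sirf_rhs(state, n, alpha1=0.002, alpha2=0.005, beta=0.2, gamma=0.075)

# The total population is conserved, so the derivatives cancel out
print(abs(sum(derivs)) < 1e-9)
```

Setting `alpha1=0` in the same function leaves only the `alpha2 * i` term in `df`, which is the SIR-D case.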

4. Dimensionless parameters

The equations can be used as they are, but we non-dimensionalize them so that the parameter range is limited to $(0, 1)$. Although not covered in this article, this is useful when estimating parameters from actual data.

$(S, I, R, F) = N \times (x, y, z, w)$, $(T, \alpha_1, \alpha_2, \beta, \gamma) = (\tau t, \theta, \tau^{-1} \kappa, \tau^{-1} \rho, \tau^{-1} \sigma)$, $1 \leq \tau \leq 1440$ [min]

\begin{align*}
& \frac{\mathrm{d}x}{\mathrm{d}t}= - \rho x y  \\
& \frac{\mathrm{d}y}{\mathrm{d}t}= \rho (1-\theta) x y - (\sigma + \kappa) y  \\
& \frac{\mathrm{d}z}{\mathrm{d}t}= \sigma y  \\
& \frac{\mathrm{d}w}{\mathrm{d}t}= \rho \theta x y + \kappa y  \\
\end{align*}

Under this scaling,

\begin{align*}
& 0 \leq (x, y, z, w, \theta, \kappa, \rho, \sigma) \leq 1  \\
\end{align*}
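
For readers who want to see where the dimensionless form comes from, here is the substitution worked out for the first two equations (the other two follow the same pattern). This is only a sketch using the variable changes stated above, namely $S = Nx$, $I = Ny$, $T = \tau t$, $\alpha_1 = \theta$, $\alpha_2 = \tau^{-1}\kappa$, $\beta = \tau^{-1}\rho$, and $\gamma = \tau^{-1}\sigma$.

\begin{align*}
& \frac{\mathrm{d}S}{\mathrm{d}T} = \frac{N}{\tau}\frac{\mathrm{d}x}{\mathrm{d}t} = - N^{-1} (\tau^{-1}\rho) (Nx)(Ny) = - \frac{N}{\tau} \rho x y
\ \Longrightarrow\ \frac{\mathrm{d}x}{\mathrm{d}t} = - \rho x y  \\
& \frac{\mathrm{d}I}{\mathrm{d}T} = \frac{N}{\tau}\frac{\mathrm{d}y}{\mathrm{d}t} = N^{-1} (1 - \theta) (\tau^{-1}\rho) (Nx)(Ny) - \tau^{-1}(\sigma + \kappa) N y
\ \Longrightarrow\ \frac{\mathrm{d}y}{\mathrm{d}t} = \rho (1 - \theta) x y - (\sigma + \kappa) y  \\
\end{align*}

The common factor $N / \tau$ cancels on both sides, so neither $N$ nor $\tau$ appears in the dimensionless equations.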

5. (Basic / Effective) reproduction number

The (basic / effective) reproduction number of the SIR-F model is defined as follows, by extending the definition for the SIR model [^6].

\begin{align*}
R_t = \rho (1 - \theta) (\sigma + \kappa)^{-1} = \beta (1 - \alpha_1) (\gamma + \alpha_2)^{-1}
\end{align*}
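
The formula can be checked with a few lines of Python. This is a hedged sketch: the function names are hypothetical helpers written for this article, not CovsirPhy's API, and the parameter values are the ones used in this article's data example. It also illustrates that $\tau$ cancels, so the dimensional and dimensionless forms give the same value.

```python
import math


def reproduction_number(theta, kappa, rho, sigma):
    """R_t = rho * (1 - theta) / (sigma + kappa), dimensionless form."""
    return rho * (1 - theta) / (sigma + kappa)


def reproduction_number_dimensional(alpha1, alpha2, beta, gamma):
    """R_t = beta * (1 - alpha1) / (gamma + alpha2), dimensional form."""
    return beta * (1 - alpha1) / (gamma + alpha2)


theta, kappa, rho, sigma = 0.002, 0.005, 0.2, 0.075
tau = 1440  # [min], assumed here only to build the dimensional parameters

rt = reproduction_number(theta, kappa, rho, sigma)
rt_dim = reproduction_number_dimensional(
    alpha1=theta, alpha2=kappa / tau, beta=rho / tau, gamma=sigma / tau
)

print(f"{rt:.3f}")  # 2.495
print(math.isclose(rt, rt_dim))  # True, because tau cancels out
```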

6. Data example

Set the parameters $(\theta, \kappa, \rho, \sigma) = (0.002, 0.005, 0.2, 0.075)$ and the initial values, then plot the result.

# Parameters
pprint(cs.SIRF.EXAMPLE, compact=True)
# {'param_dict': {'kappa': 0.005, 'rho': 0.2, 'sigma': 0.075, 'theta': 0.002},
#  'population': 1000000,
#  'step_n': 180,
#  'y0_dict': {'Fatal': 0,
#              'Infected': 1000,
#              'Recovered': 0,
#              'Susceptible': 999000}}

(Basic / Effective) reproduction number:

# Reproduction number
eg_dict = cs.SIRF.EXAMPLE.copy()
model_ins = cs.SIRF(
    population=eg_dict["population"],
    **eg_dict["param_dict"]
)
model_ins.calc_r0()
# 2.5

Graph display:

# Set tau value and start date of records
example_data = cs.ExampleData(tau=1440, start_date="01Jan2020")
# Add records with SIR-F model
model = cs.SIRF
area = {"country": "Full", "province": model.NAME}
example_data.add(model, **area)
# Change parameter values if needed
# example_data.add(model, param_dict={"theta": 0.001, "kappa": 0.002, "rho": 0.4, "sigma": 0.0150}, **area)
# Records with model variables
df = example_data.specialized(model, **area)
# Plotting
cs.line_plot(
    df.set_index("Date"),
    title=f"Example data of {model.NAME} model",
    y_integer=True,
    filename="sirf.png"
)

Figure: sirf.png (line plot titled "Example data of SIR-F model")
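
As a rough cross-check that does not depend on CovsirPhy, the dimensionless equations from section 4 can be integrated with the standard library alone. This is a sketch under stated assumptions: the fourth-order Runge-Kutta integrator, the step size of one dimensionless time unit, and all function names are my own choices for illustration. It is not how the package solves the model internally, so the numbers may differ slightly from the package's records.

```python
def sirf_dimensionless(state, theta, kappa, rho, sigma):
    """Right-hand side of the dimensionless SIR-F equations."""
    x, y, z, w = state
    dx = -rho * x * y
    dy = rho * (1 - theta) * x * y - (sigma + kappa) * y
    dz = sigma * y
    dw = rho * theta * x * y + kappa * y
    return dx, dy, dz, dw


def rk4_step(state, dt, params):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = sirf_dimensionless(state, **params)
    k2 = sirf_dimensionless(shifted(state, k1, dt / 2), **params)
    k3 = sirf_dimensionless(shifted(state, k2, dt / 2), **params)
    k4 = sirf_dimensionless(shifted(state, k3, dt), **params)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(y0, step_n, params, dt=1.0):
    """Return the list of states from step 0 to step_n (inclusive)."""
    states = [y0]
    for _ in range(step_n):
        states.append(rk4_step(states[-1], dt, params))
    return states


# Values taken from cs.SIRF.EXAMPLE as printed in this article
population = 1_000_000
params = {"theta": 0.002, "kappa": 0.005, "rho": 0.2, "sigma": 0.075}
y0 = (999_000 / population, 1_000 / population, 0.0, 0.0)  # (x, y, z, w)

trajectory = simulate(y0, step_n=180, params=params)

# Convert the final fractions back to head counts (S, I, R, F)
final_counts = [round(population * v) for v in trajectory[-1]]
print(final_counts)
```

Multiplying each state by `population` recovers $(S, I, R, F)$, and with `tau=1440` each step corresponds to one day.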

7. Next time

Next time, I will explain the procedure for downloading and checking the actual data.
