[PYTHON] Analyzing data on the number of corona patients in Japan

We will analyze the number of corona patients in Japan and create a predictive model.

here ・ How to use the library called Prophet provided by Facebook ・ Random forest regression ・ XGBoost I will try three methods.

environment

Use Jupyter Notebook which is installed when you install Anaconda on Windows 10

version information: Jupyter Notebook 6.1.4 Python 3.8.5

Library import

fbprophet, plotly, and xgboost with Anacona Prompt respectively  conda install -c conda-forge fbprophet  pip install plotly  pip install xgboost After installing with, I imported the library on Jupyter Notebook by the following.

import pandas as pd
import seaborn as sns
import numpy as np
from scipy.stats import norm
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
pd.pandas.set_option('display.max_columns', None)
from scipy import stats
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline
from sklearn.linear_model import Lasso
from sklearn.feature_selection import SelectFromModel
pd.pandas.set_option('display.max_columns', None)
import matplotlib.pyplot as plt
from fbprophet import Prophet
import plotly.express as px
from sklearn.ensemble import RandomForestRegressor
from xgboost import XGBRegressor
from sklearn.metrics import accuracy_score
from sklearn import metrics
from sklearn.model_selection import ParameterGrid
from tqdm import tqdm
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA

Data to use

--nhk_news_covid19_domestic_daily_data.csv: Corona patient number data released by NHK

Data provider: https://www3.nhk.or.jp/news/special/coronavirus/data-all/

Display patient count data as a data frame:

path = "../input/covid19data_from_NHK/nhk_news_covid19_domestic_daily_data.csv"
df_Jap = pd.read_csv(path,encoding = 'utf-8')
df_Jap

image.png

Graph display:

fig = px.area(df_Jap, x="date", y="Number of infected people in Japan_Number of announcements per day", height=600, width=700,
             color_discrete_sequence = ["blue"])
fig.update_layout(xaxis_rangeslider_visible=True)
fig.show()

image.png

You can see that the first wave around April 2020, the second wave around August, and the third wave around December are coming.

Training data and test data

I divide the data into training data and test data, but here Training data: 2020/7/14-2020/11/30 Test data: 2020/12/1-2020/12/31 will do. The reason for setting it from July 14, 2020 is that before that, the number of patients was close to 0 and the characteristics of the data would change.

Here, the "date" column is converted to datetime64 type by the to_datetime method of pandas to make it easier to handle.

df_Jap['Date (datetime64)'] = pd.to_datetime(df_Jap['date'], errors='coerce')
import datetime as dt
Train_Jap=df_Jap[(dt.datetime(2020,7,14) <= df_Jap['Date (datetime64)']) & (df_Jap['Date (datetime64)']  <= dt.datetime(2020,11,30) )]
Test_Jap=df_Jap[(dt.datetime(2020,12,1) <= df_Jap['Date (datetime64)']) & (df_Jap['Date (datetime64)']  <= dt.datetime(2020,12,31) )]

Learning with Prophet

Prophet is a time series forecasting library developed by Facebook, details are described below. https://facebook.github.io/prophet/docs/quick_start.html

Feature: --Prophet input must be a dataframe with columns ds and y --ds is the date stamp (date), y is the numerical measurement data you want to predict

Renamed for use with Prophet:

Train_Jap=Train_Jap[["date","Number of infected people in Japan_Number of announcements per day"]].rename(columns={"date":"ds","Number of infected people in Japan_Number of announcements per day":"y"})
Test_Jap=Test_Jap[["date","Number of infected people in Japan_Number of announcements per day"]].rename(columns={"date":"ds","Number of infected people in Japan_Number of announcements per day":"y"})

Prophet settings

The Prophet model is trained against the training data (Train_Jap) and predicted against the test data (Test_Jap), but the following settings can be made.

You can select growth ='linear' and growth ='logistic' (logistic function). With growth ='logistic', you can set the upper limit ('cap' (carrying capacity)) and lower limit of the data, which is used here. 'cap' is set to 10000 here because of the visibility of the graph. (This is fine as long as the number of infected people per day does not exceed 10,000)

Train_Jap['cap']=10000
Test_Jap['cap']=10000

As a trend change point '2020/8/7', '2020/9/7', '2020/10/24', '2020/11/20' To set. (Read visually from the graph)

seasonality_mode ='additive' seasonality_mode ='multiplicative' You can choose between the two, but if you look at the graph above, the weekly periodicity increases in proportion to the number of infected people, so the multiplicative seasonality is adopted.

A parameter that adjusts the degree to which the periodicity of the model fits the data. First, let's set it to 1 temporarily, but later we will try to enter multiple values ​​by grid search.

It represents the variance of the Laplace distribution, which is the prior distribution of the trend term, and the larger it is, the easier it is to detect the change point. If you make this term too large, quite a few points will be detected as change points. First, let's set it to 1 temporarily, but later we will try to enter multiple values ​​by grid search.

Information source: https://www.atmarkit.co.jp/ait/articles/1906/07/news004_2.html

Represents the width of the uncertainty interval, which is 0.8 by default. I will try to enter multiple values ​​later in the grid search.

Execution of learning

model=Prophet(growth='logistic', seasonality_mode='multiplicative', seasonality_prior_scale=1, changepoint_prior_scale=1, changepoints=['2020/8/7', '2020/9/7', '2020/10/24', '2020/11/20'])
model.fit(Train_Jap)
forecast = model.predict(Test_Jap)
fig = model.plot_components(forecast)

Trends (movements excluding changes in annual and weekly cycles) and weekly learning results are available ↓ The annual cycle is not displayed because there is no data for one year. I knew that the number of infected people would decrease on Monday due to the small number of tests on Sunday, but on the contrary, it seems that there is a tendency for the number of people to be infected on Thursday, Friday and Saturday to increase:

image.png

View training data and forecast results. The change point of the set trend is displayed as a red dashed line, the forecast trend line is displayed as a red line, and the forecast result is displayed as a blue line.

from fbprophet.plot import add_changepoints_to_plot
plot = model.plot(forecast)
a = add_changepoints_to_plot(plot.gca(), model, forecast)
plt.ylim([-0, 5000])

image.png

Add the predicted value yhat obtained by training to Test_Jap as a column:

Test_Jap['yhat']=forecast['yhat'].values

Compare the actual number of infected people in December with the predicted results:

plt.figure(figsize=(20, 8))
plt.plot(Test_Jap['ds'], Test_Jap['y'], 'b-', label = 'Actual')
plt.plot(Test_Jap['ds'], Test_Jap['yhat'], 'r--', label = 'Prediction')
plt.xlabel('Date',rotation=90); plt.ylabel('Sales'); plt.title('Actual vs Prediction')
plt.xticks(rotation=90)
plt.legend();

image.png

Let's calculate the accuracy:

Test_Jap['diff']=(Test_Jap.y-Test_Jap.yhat).abs()
acc_ts2=(1-(Test_Jap['diff'].sum()/Test_Jap['y'].sum()))*100
acc_ts2

#85.55183564855452

The accuracy this time was about 86%. As an index MAE (Mean Absolute Error) MSE (Mean Squared Error) RMSE (Root Mean Square Error) Let's calculate.

MAE_ts=metrics.mean_absolute_error(Test_Jap['y'], Test_Jap['yhat'])
MSE_ts=metrics.mean_squared_error(Test_Jap['y'], Test_Jap['yhat'])
RMSE_ts=np.sqrt(metrics.mean_squared_error(Test_Jap['y'], Test_Jap['yhat']))
print('MAE:', MAE_ts)
print('MSE:', MSE_ts)
print('RMSE:', RMSE_ts)

MAE: 404.44140578238205
MSE: 305342.084316079
RMSE: 552.5776726543328

I don't think it's bad already, but we will tune the parameters by grid search below.

Grid search

I have set the following values ​​for'changepoint_prior_scale','seasonality_prior_scale', and'interval_width'.

params_grid = {'changepoint_prior_scale':[0.1, 0.5, 1, 2, 10],
               'seasonality_prior_scale':[0.1, 0.5, 1, 2, 10],
               'interval_width':[0.8, 0.85, 0.9, 0.95]
              }
grid = ParameterGrid(params_grid)
model_parameters = pd.DataFrame(columns = ['Acc','Parameters'])
for p in tqdm(grid):
    
    Train=Train_Jap.copy()
    Valid=Test_Jap[['ds','y','cap']].reset_index()
            
    m=Prophet(growth='logistic', 
              seasonality_mode='multiplicative', 
              seasonality_prior_scale=p['seasonality_prior_scale'], 
              changepoint_prior_scale=p['changepoint_prior_scale'], 
              changepoints=['2020/8/7', '2020/9/7', '2020/10/24', '2020/11/20'],
              interval_width = p['interval_width']
             )
            
    m.fit(Train_Jap)
            

    forecast = m.predict(Valid[['ds','cap']])
    forecast = forecast.astype({"ds": object,"cap": object})
    Valid=pd.concat([Valid.reset_index()[['ds','y']],forecast['yhat']], axis=1)
    
    #performance metric
    Valid['diff']=(Valid.y-Valid.yhat).abs()
    acc=(1-((Valid['diff'].sum()/Valid['y'].sum())))*100
    
    model_parameters = model_parameters.append({'Acc':acc,'Parameters':p},ignore_index=True)
            
parameters = model_parameters.sort_values(by=['Acc'],ascending=False)
parameters = parameters.reset_index(drop=True)
        
best_parameters=parameters['Parameters'][0]

The best parameter is 'changepoint_prior_scale': 0.5, 'interval_width': 0.8, 'seasonality_prior_scale': 2 have become. Let's try graph drawing and accuracy calculation again using these parameters.

m = Prophet(growth='logistic', 
            seasonality_mode='multiplicative', 
            seasonality_prior_scale=best_parameters['seasonality_prior_scale'],
            changepoint_prior_scale=best_parameters['changepoint_prior_scale'],
            changepoints=['2020/8/7', '2020/9/7', '2020/10/24', '2020/11/20'],
            interval_width =best_parameters['interval_width']
            )
m.fit(Train_Jap)
forecast=m.predict(Test_Jap)
from fbprophet.plot import add_changepoints_to_plot
plot = model.plot(forecast)
a = add_changepoints_to_plot(plot.gca(), model, forecast)
plt.ylim([-0, 5000])

image.png

Test_Jap['yhat']=forecast['yhat'].values
plt.figure(figsize=(20, 8))
plt.plot(Test_Jap['ds'], Test_Jap['y'], 'b-', label = 'Actual')
plt.plot(Test_Jap['ds'], Test_Jap['yhat'], 'r--', label = 'Prediction')
plt.xlabel('Date',rotation=90); plt.ylabel('Sales'); plt.title('Actual vs Prediction')
plt.xticks(rotation=90)
plt.legend();

image.png

Test_Jap_best['diff']=(Test_Jap_best.y-Test_Jap_best.yhat).abs()
acc_ts2=(1-(Test_Jap_best['diff'].sum()/Test_Jap_best['y'].sum()))*100
acc_ts2

#90.92890097106074

The accuracy was about 91% and the MAE, MSE, and RMSE were as follows:

MAE_ts=metrics.mean_absolute_error(Test_Jap['y'], Test_Jap['yhat'])
MSE_ts=metrics.mean_squared_error(Test_Jap['y'], Test_Jap['yhat'])
RMSE_ts=np.sqrt(metrics.mean_squared_error(Test_Jap['y'], Test_Jap['yhat']))
print('MAE:', MAE_ts)
print('MSE:', MSE_ts)
print('RMSE:', RMSE_ts)

#MAE: 253.92347110782634
#MSE: 111763.48089497152
#RMSE: 334.31045585648604

Learning with random forest regression

In using random forest regression, we will use the month, day, and day of the week as explanatory variables (here, variables that are thought to affect the number of patients with corona).

It may be more accurate to use the number of PCR tests and the number of travelers from overseas as explanatory variables here, but when predicting the number of corona patients beyond the present, such data cannot be used, so we will use it here. No.

def create_date_features(df):
    df['Moon']=df['Date (datetime64)'].dt.month
    df['Day']=df['Day付(datetime64)'].dt.day
    df['Day of the week (number)'] = df['Date (datetime64)'].dt.weekday
    return df

reg_Jap=create_date_features(df_Jap)
def getdayofweek(day):
    if (day == 0):
        return "Monday"
    elif(day == 1):
        return "Tuesday"
    elif(day ==2):
        return "Wednesday"
    elif(day == 3):
        return "Thursday"
    elif(day ==4):
        return "Friday"
    elif(day == 5):
        return "Saturday"
    elif(day ==6):
        return "Sunday"

reg_Jap['Day of the week'] = reg_Jap['Day of the week(番号)'] .apply(getdayofweek)
reg_Jap=pd.get_dummies(reg_Jap,columns=['Day of the week'])
reg_Jap

image.png

Training data:

Train_reg_Jap=reg_Jap[(dt.datetime(2020,7,14) <= reg_Jap['Date (datetime64)']) & (reg_Jap['Date (datetime64)']  <= dt.datetime(2020,11,30) )]
x_train_reg_Jap=Train_reg_Jap[['Moon','Day','曜Day_Moon曜','曜Day_Tuesday','曜Day_Wednesday','曜Day_Thursday','曜Day_Friday','曜Day_Saturday','曜Day_Day曜']]
y_train_reg_Jap=Train_reg_Jap[['Number of infected people in Japan_Number of announcements per day']]

test data:

Test_reg_Jap=reg_Jap[(dt.datetime(2020,12,1) <= reg_Jap['Date (datetime64)']) & (reg_Jap['Date (datetime64)']  <= dt.datetime(2020,12,31) )]
x_test_reg_Jap=Test_reg_Jap[['Moon','Day','曜Day_Moon曜','曜Day_Tuesday','曜Day_Wednesday','曜Day_Thursday','曜Day_Friday','曜Day_Saturday','曜Day_Day曜']]
y_test_reg_Jap=Test_reg_Jap[['Number of infected people in Japan_Number of announcements per day']]

Perform learning. First, set only n_estimators = 100 as a parameter, and use the default values ​​for the others.

--n_estimators: Number of decision trees

The parameters for random forest regression can be found here. https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor :

rf_Jap = RandomForestRegressor(n_estimators=100)
rf_Jap.fit(x_train_reg_Jap,y_train_reg_Jap)
pred_rf = rf_Jap.predict(x_test_reg_Jap)

Accuracy calculation:

y_test_reg_Jap['diff']=abs((y_test_reg_Jap['Number of infected people in Japan_Number of announcements per day'].values-pred_rf))
acc_rf=(1-(y_test_reg_Jap['diff'].sum()/y_test_reg_Jap['Number of infected people in Japan_Number of announcements per day'].sum()))*100
acc_rf

#57.238258785021955

It became a low value of about 57%.

Grid search

--max_features: Number of features of decision tree

For “auto”, max_features = n_features, For “sqrt”, max_features = sqrt (n_features), For “log2”, max_features = log2 (n_features) Will be.

--min_samples_split: Minimum number of samples when splitting a tree

--bootstrap: Whether to use bootstrap sampling (a method to allow duplication and retrieve data randomly)

param_grid = { 
        "n_estimators"      : [10,20,300,100,200,500],
        "max_features"      : ["auto", "sqrt", "log2"],
        "min_samples_split" : [2,4,6,8],
        "bootstrap": [True, False],
            }
grid = ParameterGrid(param_grid)
model_parameters = pd.DataFrame(columns = ['Acc','Parameters'])
for p in tqdm(grid):
    
    X_Train=x_train_reg_Jap.copy()
    Y_Train=y_train_reg_Jap.copy()
    X_Valid=x_test_reg_Jap.copy()
    Y_Valid=y_test_reg_Jap.copy()
    m = RandomForestRegressor(n_estimators = p['n_estimators'],
               max_features = p['max_features'],
               min_samples_split = p['min_samples_split'],
               bootstrap=p['bootstrap']
        )
    
    m.fit(X_Train,Y_Train)
    pred_rf2 = m.predict(X_Valid)
            
    Y_Valid['yhat']=pred_rf2
    
    #performance metric
    Y_Valid_diff=(Y_Valid['Number of infected people in Japan_Number of announcements per day']-Y_Valid.yhat).abs()
    acc=(1-((Y_Valid_diff.sum()/Y_Valid['Number of infected people in Japan_Number of announcements per day'].sum())))*100
    
    model_parameters = model_parameters.append({'Acc':acc,'Parameters':p},ignore_index=True)
            
parameters = model_parameters.sort_values(by=['Acc'],ascending=False)
parameters = parameters.reset_index(drop=True)
        
best_parameters=parameters['Parameters'][0]
m2 = RandomForestRegressor(
        bootstrap=best_parameters['bootstrap'],
        max_features=best_parameters['max_features'],
        min_samples_split=best_parameters['min_samples_split'],
        n_estimators=best_parameters['n_estimators']
        )
m2.fit(x_train_reg_Jap,y_train_reg_Jap)
pred_rf2 = m2.predict(x_test_reg_Jap)
y_test_reg_Jap_diff2=abs((y_test_reg_Jap['Number of infected people in Japan_Number of announcements per day'].values-pred_rf2))
acc_rf2=(1-(y_test_reg_Jap_diff2.sum()/y_test_reg_Jap['Number of infected people in Japan_Number of announcements per day'].sum()))*100
acc_rf2

#57.96572037955858

Even if I used the best parameters, it was almost the same as about 57%. In addition, due to the randomness of random forest regression, there were cases where the best parameter was lower in accuracy in the above code.

Learning with XGBoost

XGBoost (extreme gradient boosting) is a type of gradient boosting that uses the error between the measured value and the predicted value when creating a new decision tree using the results of the previous decision tree. Is a feature.

The parameters of XGBoost are described here. https://xgboost.readthedocs.io/en/latest/parameter.html

Also, the explanation on this site was easy to understand. http://kamonohashiperry.com/archives/209

--nthread: Number of parallel threads used for calculation --learning_rate: Learning rate --max_depth: Maximum depth of decision tree --min_child_weight: The minimum total value of data weighting in child nodes --subsample: Percentage of samples randomly extracted in each tree --colsample_bytree: Percentage of columns randomly extracted in each tree

The data uses x_train_reg_Jap, y_train_reg_Jap, x_test_reg_Jap, and y_test_reg_Jap used in random forest regression. (Explanatory variables are month, day and day of the week)

Execution of learning:

xgb_Jap = XGBRegressor(n_estimators=100)
xgb_Jap.fit(x_train_reg_Jap,y_train_reg_Jap)
xgb_pred = xgb_Jap.predict(x_test_reg_Jap)

Accuracy calculation:

y_test_reg_us['diff']=(y_test_reg_us['Number of infected people in Japan_Number of announcements per day']-xgb_pred).abs()
acc_xg=(1-(y_test_reg_us['diff'].sum()/y_test_reg_us['Number of infected people in Japan_Number of announcements per day'].sum()))*100
acc_xg
#57.22788239483825

Grid search

param_grid = { 
            'nthread':[4], #when use hyperthread, xgboost may become slower,
            'learning_rate': [.03, 0.05, .07], #so called `eta` value
            'max_depth': [5, 6, 7],
            'min_child_weight': [1,4],
            'subsample': [0.7],
            'colsample_bytree': [0.7],
            'n_estimators': [100,200,500]
            }
grid = ParameterGrid(param_grid)
model_parameters = pd.DataFrame(columns = ['Acc','Parameters'])
for p in tqdm(grid):
    
    X_Train=x_train_reg_Jap.copy()
    Y_Train=y_train_reg_Jap.copy()
    X_Valid=x_test_reg_Jap.copy()
    Y_Valid=y_test_reg_Jap.copy()
    m = XGBRegressor(nthread = p['nthread'],
               learning_rate = p['learning_rate'],
               max_depth=p['max_depth'],
               min_child_weight = p['min_child_weight'],
               subsample = p['subsample'],
               colsample_bytree=p['colsample_bytree'],
               n_estimators=p['n_estimators']             )
    
    
    
    m.fit(X_Train,Y_Train)
    pred_xg2 = m.predict(X_Valid)
            
    Y_Valid['yhat']=pred_xg2
    
    #performance metric
    Y_Valid['diff']=(Y_Valid['Number of infected people in Japan_Number of announcements per day']-Y_Valid.yhat).abs()
    acc=(1-((Y_Valid['diff'].sum()/Y_Valid['Number of infected people in Japan_Number of announcements per day'].sum())))*100
    
    model_parameters = model_parameters.append({'Acc':acc,'Parameters':p},ignore_index=True)
            
parameters = model_parameters.sort_values(by=['Acc'],ascending=False)
parameters = parameters.reset_index(drop=True)
        
best_parameters=parameters['Parameters'][0]
m = XGBRegressor(
        nthread=best_parameters['nthread'],
        learning_rate=best_parameters['learning_rate'],
        max_depth=best_parameters['max_depth'],
        min_child_weight=best_parameters['min_child_weight'],
        subsample=best_parameters['subsample'],
        colsample_bytree=best_parameters['colsample_bytree'],
        n_estimators=best_parameters['n_estimators']
        )
m.fit(x_train_reg_Jap,y_train_reg_Jap)
pred_xg2 = m.predict(x_test_reg_Jap)
y_test_reg_Jap['diff']=(y_test_reg_Jap['Number of infected people in Japan_Number of announcements per day']-pred_xg2).abs()
acc_xg2=(1-(y_test_reg_Jap['diff'].sum()/y_test_reg_Jap['Number of infected people in Japan_Number of announcements per day'].sum()))*100
acc_xg2

#57.36199327011835

Again, the best parameters are almost the same as the previous values.

Summary

・ Prophet ・ Random forest regression ・ XGBoost In these three methods, a model was trained using data on the number of Japanese coronaviruses infected from July 14th to November 30th in 2020, and the number of infected people from December 1st to December 31st was predicted. The accuracy was about 91%, but it was as low as 57% for random forest regression and XGBoost.

I think it's not unreasonable because I could only use the month, day, and day of the week as explanatory variables.

My impression is that Prophet reads the pattern of the 2D graph and predicts it, and I think that it is more suitable than random forest regression and XGBoost in predicting the number of corona patients like this time.

Recommended Posts

Analyzing data on the number of corona patients in Japan
Count the number of characters in the text on the clipboard on mac
[Homology] Count the number of holes in data with Python
Posted the number of new corona positives in Tokyo to Slack (deployed on Heroku)
How strong is your Qiita? Statistics on the number of Contributes seen in the data
Organize useful blogs in the field of data science (overseas & Japan)
Get the number of readers of a treatise on Mendeley in Python
Output the number of CPU cores in Python
Find the number of days in a month
The story of reading HSPICE data in Python
Download the wind data of the Japan Meteorological Agency
Scraping the rainfall data of the Japan Meteorological Agency and displaying it on M5Stack
Set an upper limit on the number of recursive function iterations in Python
Create a BOT that displays the number of infected people in the new corona
Visualize the timeline of the number of issues on GitHub assigned to you in Python
How to get the number of digits in Python
Count the number of parameters in the deep learning model
Try to estimate the number of likes on Twitter
About the inefficiency of data transfer in luigi on-memory
Get the size (number of elements) of UnionFind in Python
Not being aware of the contents of the data in python
Difference in results depending on the argument of multiprocess.Process
Let's use the open data of "Mamebus" in Python
Find out the maximum number of characters in multi-line text stored in a data frame
Try scraping the data of COVID-19 in Tokyo with Python
How to find the optimal number of clusters in k-means
Maya | Find out the number of polygons in the selected object
A note on the default behavior of collate_fn in PyTorch
10. Counting the number of lines
Examine the margin of error in the number of deaths from pneumonia
Survey on the use of machine learning in real services
Get the number of digits
Get the number of specific elements in a python list
Python --Find out number of groups in the regex expression
Data analysis based on the election results of the Tokyo Governor's election (2020)
I created a Discord bot on Docker that reports the number of corona infected people in Tokyo at a specified time.
Get the number of occurrences for each element in the list
Calculate the number of changes
What I saw by analyzing the data of the engineer market
Arrange songs played on Spotify in order of number of views
I drew a Python graph using public data on the number of patients positive for the new coronavirus (COVID-19) in Tokyo + with a link to the national version of practice data
I tried fitting the exponential function and logistics function to the number of COVID-19 positive patients in Tokyo
Verification of the spread of hoaxes in the "State of Emergency Declaration on April 1"
Count the number of Thai and Arabic characters well in Python
[Python] Let's reduce the number of elements in the result in set operations
Django: Fluctuate the number of child forms depending on the number of input items
Seaborn basics for beginners ① Aggregate graph of the number of data (Countplot)
Find the rank of a matrix in the XOR world (rank of a matrix on F2)
Start data science on the cloud
Get the number of views of Qiita
The story of participating in AtCoder
Calculation of the number of Klamer correlations
The story of the "hole" in the file
Analyzing the age-specific severity of coronavirus
Check the data summary in CASTable
Get the number of Youtube subscribers
The meaning of ".object" in Django
Considering the situation in Japan by statistician Nate Silver, "The number of people infected with coronavirus is meaningless"
Save the audio data acquired by the browser in wav format on the server
Generate a list packed with the number of days in the current month.
Check the in-memory bytes of a floating point number float in Python