Since the stock market crashed due to the influence of the new coronavirus, I tried to visualize the performance of my investment trust with Python.

As of March 2020, stock prices have plummeted due to the global expansion of the new coronavirus.

Since it's a big deal, I tried to visualize the results of the investment trusts that I have accumulated steadily with Python and investigated how much impact it had. Please take a look at the results of my mutual fund.

Data collection

I am purchasing an investment trust at Rakuten Securities. If you go to Account Management> Investment Trusts> Investment Trusts After from the top menu, you can see the transition of the entire investment trust including all canceled stocks. If it's just the transition, you can fully check it on this screen, but we will get the raw data for further analysis. スクリーンショット 2020-03-05 22.04.33.png

Right-click on the screen to view the page source. Then, at the bottom (from the 1649th line onward), there are such arrays data1, data2, data3, data4, so copy them.

スクリーンショット 2020-03-05 22.10.39.png

You can also check it from the developer tools, so get it the way you like.

Check data with Google Colaboratory

Start Google Colaboratory and try to handle the two-dimensional array as it is for the time being.

#It's actually longer.
data1 = [[1520175600000,0.000000],[1520262000000,-2.000000],[1520348400000,-1.000000]]
data2 = [[1520175600000,3758.000000],[1520262000000,3756.000000],[1520348400000,3757.000000]]
data3 = [[1520175600000,3758.000000],[1520262000000,3758.000000],[1520348400000,3758.000000]]
data4 = [[1520175600000,0.000000],[1520262000000,0.000000],[1520348400000,0.000000]]

All lengths were 488. (It depends on the person.)

len(data1), len(data2), len(data3), len(data4)
# => (488, 488, 488, 488)

Make it a Pandas Dataframe. Originally it was a two-dimensional array of JavaScript, but since the shape of the array is the same in Python, you can read the copied one as it is with Pandas. From here, data1 is extracted and displayed.

import pandas as pd

df1 = pd.DataFrame(data1)
df1
0 1
0 1520175600000 0.0
1 1520262000000 -2.0
2 1520348400000 -1.0
3 1520434800000 -2.0
4 1520521200000 -2.0
... ... ...

The first column looks like a time stamp, but it doesn't seem to have a decimal point. Divide by 1000 to align the digits, index and then convert to Datetime.

df1[0] = df1[0] // 1000
df1 = df1.set_index([0])
df1.index = pd.to_datetime(df1.index, unit='s')
df1
1
0
2018-03-04 15:00:00 0.0
2018-03-05 15:00:00 -2.0
2018-03-06 15:00:00 -1.0
2018-03-07 15:00:00 -2.0
2018-03-08 15:00:00 -2.0
... ...
2020-02-26 15:00:00 59802.0
2020-02-27 15:00:00 19223.0
2020-03-01 15:00:00 -245.0
2020-03-02 15:00:00 37929.0
2020-03-03 15:00:00 9971.0

The data for the last two years is displayed. Since it is explained as Transition of the entire investment trust including all canceled stocks, the acquisition period may differ depending on the person.

For the time being, let's visualize this data1 with matplotlib.

import matplotlib.pyplot as plt
df1.plot()
plt.show()

Here are the results. total_return.png

If you compare it with Rakuten Securities' Investment Trust Ashiato screen, you can see that data1 is a total return. The profit of about 140,000 yen accumulated from around January 2019 has disappeared in an instant due to the influence of the coronavirus.

Try plotting the rest of data2 ~ 3 in the same way. datas.png

Apparently, data1 ~ 4 obtained from the source of the Rakuten Securities page correspond to the following items in order.

--Total return

After that, I put them together in one Dataframe to make it easier to handle, and changed the column names to match the contents. (Excluded this time because we have not purchased investment trusts with distributions) I also excluded the first six months, which was mostly dormant.

return value invest
time
2018-11-01 15:00:00 -1.0 1629.0 1630.0
2018-11-04 15:00:00 -1.0 0.0 1.0
2018-11-05 15:00:00 0.0 20000.0 20000.0
2018-11-06 15:00:00 67.0 20067.0 20000.0
2018-11-07 15:00:00 439.0 20439.0 20000.0
... ... ... ...

Make that kind of visualization with Plotly.

I personally like the atmosphere such as the color of Plotly, so I will try to visualize it with Plotly.

pip install plotly

First, let's reproduce the screen of Rakuten Securities.

Total return

import plotly.graph_objects as go

data = [
    go.Scatter(x=df.index, y=df['return'], name='Total return'),
]

layout = go.Layout(
    width=800,
    height=450,
    xaxis = dict(
        title='time',
        type='date',
        tickformat="%Y-%m",
        dtick='M1',
        tickangle=90,
        showgrid=False
    ),  
    yaxis = dict(
        title = 'Amount of money'
    ),
    showlegend=True,
    legend=dict(
        x=0.025,
        y=0.95,
        font=dict(
            size=15
        ),
        borderwidth=2
    ),
)

fig = go.Figure(
    data=data,
    layout=layout
)
fig.show()

スクリーンショット 2020-03-07 16.30.34.png

Valuation and real investment

The code is almost the same, so collapse (click to expand)
data = [
    go.Scatter(x=df.index, y=df['value'], name='Valuation'),
    go.Scatter(x=df.index, y=df['invest'], name='Investment amount')
]

layout = go.Layout(
    width=800,
    height=450,
    xaxis = dict(
        title='time',
        type='date',
        tickformat="%Y-%m",
        dtick='M1',
        tickangle=90,
        showgrid=False
    ),  
    yaxis = dict(
        title = 'Amount of money'
    ),
    legend=dict(
        x=0.025,
        y=0.95,
        font=dict(
            size=15
        ),
        borderwidth=2
    ),
)

fig = go.Figure(
    data=data,
    layout=layout
)
fig.show()

スクリーンショット 2020-03-07 16.05.42.png

Pre-return business day difference / rate of change

df_diff = df.diff()
The code is almost the same, so collapse (click to expand)
data = [
    go.Scatter(x=df_diff.index, y=df_diff['return'], name='Business day difference before return', line = dict(color='purple'))
]

layout = go.Layout(
    width=800,
    height=450,
    xaxis = dict(
        title='time',
        type='date',
        tickformat="%Y-%m",
        dtick='M1',
        tickangle=90,
        showgrid=False
    ),  
    yaxis = dict(
        title = 'Amount of money',
    ),
    showlegend=True,
    legend=dict(
        x=0.025,
        y=0.95,
        font=dict(
            size=15
        ),
        borderwidth=2
    ),
)

fig = go.Figure(
    data=data,
    layout=layout
)
fig.show()

スクリーンショット 2020-03-07 17.16.08.png

Obviously, as the investment amount increases compared to the absolute value, the change in return also increases, so let's look at the ratio to the valuation amount at that time.

df['return_change_ratio'] = df['return'].diff() / df['value']
df
The code is almost the same, so collapse (click to expand)
data = [
    go.Scatter(x=df.index, y=df['return_change_ratio'], name='Rate of change on business days before return', line = dict(color='purple'))
]

layout = go.Layout(
    width=800,
    height=450,
    xaxis = dict(
        title='time',
        type='date',
        tickformat="%Y-%m",
        dtick='M1',
        tickangle=90,
        showgrid=False
    ),  
    yaxis = dict(
        title = 'Rate of change',
        tickformat='%'
    ),
    showlegend=True,
    legend=dict(
        x=0.025,
        y=0.05,
        font=dict(
            size=15
        ),
        borderwidth=2
    ),
)

fig = go.Figure(
    data=data,
    layout=layout
)
fig.show()

スクリーンショット 2020-03-08 12.04.52.png

If you look at this, you can see that the day when assets fluctuate by ± 5% in one day is coming on a regular basis.

I arranged the data in descending order of the rate of decline.

df.sort_values('return_change_ratio', ascending=True).head(10)
return_change_ratio
time
2020-02-24 15:00:00 -0.055890
2020-02-27 15:00:00 -0.051792
2018-12-24 15:00:00 -0.049762
2020-03-05 15:00:00 -0.043918
2019-08-25 15:00:00 -0.039766
2019-01-03 15:00:00 -0.037813
2020-02-25 15:00:00 -0.035507
2018-12-04 15:00:00 -0.034236
2020-03-03 15:00:00 -0.033716
2019-08-14 15:00:00 -0.031938

The 1st, 2nd, 4th, 7th and 9th from the top are thought to be due to the corona shock, which accounted for 5 out of 10 days, accounting for half.

On the contrary, I arranged them in descending order of increase.

df.sort_values('return_change_ratio', ascending=False).head(10)
return_change_ratio
time
2020-03-02 15:00:00 0.044621
2018-12-26 15:00:00 0.040878
2020-03-04 15:00:00 0.038827
2019-01-06 15:00:00 0.026219
2019-08-13 15:00:00 0.022233
2020-02-04 15:00:00 0.021108
2019-11-04 15:00:00 0.019685
2019-06-04 15:00:00 0.018888
2018-11-07 15:00:00 0.018200
2019-09-05 15:00:00 0.017959

The first and third are probably the effects of the corona shock. It seems that the daily rate of increase also reached the highest level in my investment history.

Rate of change histogram

data = [go.Histogram(x=df['return_change_ratio'], xbins=dict(size=0.001))]

layout = go.Layout(
    width=800,
    height=450,
    xaxis = dict(
        title="Rate of change", 
        dtick=0.01,
        range=[-0.06, 0.06],
        tickformat='%'
    ),
    yaxis = dict(
        title="frequency"
    )
)

fig = go.Figure(
    data=data,
    layout=layout
)
fig.show()

スクリーンショット 2020-03-08 12.34.47.png

Two of the three outliers of frequency 1 on the right hem and the first, second, fourth, seventh, and ninth of the ten consecutive frequency 1s on the left hem are those of the corona shock.

Monthly return

I resampled it monthly with OHLC, took the difference between Open and Close, and examined the change in monthly return. (Is there a simpler way to do it?)

df_month = df['return'].resample('M').ohlc()
df_month['diff'] = df_month['close'] - df_month['open']
df_month
The code is almost the same, so collapse (click to expand)

スクリーンショット 2020-03-07 23.18.06.png

data = [
    go.Bar(x=df_month.index.strftime('%Y-%m'), y=df_month['diff'], name='Monthly return', marker_color ='orange')
]

layout = go.Layout(
    width=800,
    height=450,
    xaxis = dict(
        title='time',
        tickformat="%Y-%m",
        dtick='M1',
        tickangle=90,
        showgrid=False
    ),  
    yaxis = dict(
        title = 'Amount of money',
    ),
    showlegend=True,
    legend=dict(
        x=0.025,
        y=0.05,
        font=dict(
            size=15
        ),
        borderwidth=2
    ),
)

fig = go.Figure(
    data=data,
    layout=layout
)
fig.show()

When I resampled with Pandas, the last date in the period became an index and it was misaligned with the label of the plot, so I made it a character string.

スクリーンショット 2020-03-07 23.29.31.png

Monthly investment yield

In addition to daily savings, we buy more and sell irregularly, so if you want to calculate the monthly yield strictly, you have to consider how much you bought and sold on what day of the month. However, here, I simply calculated the return ratio based on the average value of the investment amount for the month.

That said, most are fixed-amount reserve investments, so it shouldn't be much different from this simple result. (It is possible to calculate exactly with only the data prepared this time.)

import numpy as np

df_month = df.resample('M').agg({'return' : lambda x: x[-1] - x[0], 'invest': np.mean})
df_month['interest'] = df_month['return'] / df_month['invest']
df_month
return invest interest
time
2018-11-30 -113.0 25081.550000 -0.004505
2018-12-31 -4160.0 39805.473684 -0.104508
2019-01-31 6285.0 85135.350000 0.073824
2019-02-28 8414.0 174170.210526 0.048309
2019-03-31 1161.0 238031.650000 0.004878
2019-04-30 5696.0 243301.684211 0.023411
2019-05-31 -16022.0 284249.052632 -0.056366
2019-06-30 20011.0 320820.857143 0.062374
2019-07-31 3929.0 352307.818182 0.011152
2019-08-31 -4843.0 399468.000000 -0.012124
2019-09-30 17063.0 460678.400000 0.037039
2019-10-31 19022.0 517172.190476 0.036781
2019-11-30 20319.0 571007.684211 0.035584
2019-12-31 21238.0 645114.238095 0.032921
2020-01-31 15858.0 688004.947368 0.023049
2020-02-29 -56921.0 732071.222222 -0.077753
2020-03-31 7343.0 817248.000000 0.008985

(Even if you don't use OHLC, you can calculate the difference in monthly returns by using the agg function.)

The code is almost the same, so collapse (click to expand)
data = [
    go.Bar(x=df_month.index.strftime('%Y-%m'), y=df_month['interest'], name='Monthly yield', marker_color ='pink')
]

layout = go.Layout(
    width=800,
    height=450,
    xaxis = dict(
        title='time',
        tickformat="%Y-%m",
        dtick='M1',
        tickangle=90,
        showgrid=False
    ),  
    yaxis = dict(
        title='yield',
        range = [-0.15, 0.15],
        tickformat='%'
    ),
    showlegend=True,
    legend=dict(
        x=0.025,
        y=0.95,
        font=dict(
            size=15
        ),
        borderwidth=2
    ),
)

fig = go.Figure(
    data=data,
    layout=layout
)
fig.show()

スクリーンショット 2020-03-08 22.18.04.png

In February 2020, when the coronavirus began to spread worldwide, the investment yield was about -7.78%`. There are still no signs of convergence, and the impact on the economy is expected to begin in earnest, so we are paying close attention to future trends.

The worst monthly investment yield was -10.45% in December 2018. By the way, there was a crash at the end of the year.

end

Since it was visualized on an ad hoc basis, it may be a misguided view or mistake. In that case, I would appreciate it if you could give us a quick comment.

Also, if you have this data, please suggest any opinions or advice that such analysis would be interesting. Thank you. Thank you for watching until the end.

Recommended Posts

Since the stock market crashed due to the influence of the new coronavirus, I tried to visualize the performance of my investment trust with Python.
I tried to automatically send the literature of the new coronavirus to LINE with Python
I tried to visualize the characteristics of new coronavirus infected person information with wordcloud
I tried to streamline the standard role of new employees with Python
I tried to predict the behavior of the new coronavirus with the SEIR model.
I tried to easily visualize the tweets of JAWS DAYS 2017 with Python + ELK
I tried to find the entropy of the image with python
[Python] I tried to visualize the follow relationship of Twitter
I tried to improve the efficiency of daily work with Python
[Python] I tried to visualize the prize money of "ONE PIECE" over 100 million characters with matplotlib.
[Python] I tried to visualize the night on the Galactic Railroad with WordCloud!
I tried to get the authentication code of Qiita API with Python.
I tried to visualize the text of the novel "Weathering with You" with WordCloud
I tried to get the movie information of TMDb API with Python
I tried to get and analyze the statistical data of the new corona with Python: Data of Johns Hopkins University
Since it is the 20th anniversary of the formation, I tried to visualize the lyrics of Perfume with Word Cloud
I tried to touch the CSV file with Python
I tried to solve the soma cube with python
I tried to visualize the spacha information of VTuber
I tried to solve the problem with Python Vol.1
I tried to summarize the string operations of Python
I tried to put out the frequent word ranking of LINE talk with Python
Python practice 100 knocks I tried to visualize the decision tree of Chapter 5 using graphviz
I tried to automate the article update of Livedoor blog with Python and selenium.
[First data science ⑥] I tried to visualize the market price of restaurants in Tokyo
I tried to visualize the running data of the racing game (Assetto Corsa) with Plotly
I tried to compare the processing speed with dplyr of R and pandas of Python
The 15th offline real-time I tried to solve the problem of how to write with python
I tried "gamma correction" of the image with Python + OpenCV
I tried to simulate how the infection spreads with Python
I tried to find the average of the sequence with TensorFlow
How to write offline real time I tried to solve the problem of F02 with Python
I tried to tabulate the number of deaths per capita of COVID-19 (new coronavirus) by country
I evaluated the strategy of stock system trading with Python.
I tried to get the number of days of the month holidays (Saturdays, Sundays, and holidays) with python
[Python] I tried to visualize tweets about Corona with WordCloud
I wrote a doctest in "I tried to simulate the probability of a bingo game with Python"
I tried to divide the file into folders with Python
[Python] The status of each prefecture of the new coronavirus is only published in PDF, but I tried to scrape it without downloading it.
I tried to find out how to streamline the work flow with Excel × Python, my article summary ★
I tried to verify the speaker identification by the Speaker Recognition API of Azure Cognitive Services with Python. # 1
Stock price plummeted with "new corona"? I tried to get the Nikkei Stock Average by web scraping
I tried to verify the speaker identification by the Speaker Recognition API of Azure Cognitive Services with Python. # 2
I tried to predict the number of domestically infected people of the new corona with a mathematical model
I tried to solve the first question of the University of Tokyo 2019 math entrance exam with python sympy
I tried scraping the ranking of Qiita Advent Calendar with Python
I tried to automate the watering of the planter with Raspberry Pi
I want to output the beginning of the next month with Python
I tried to create a list of prime numbers with python
I tried to fix "I tried stochastic simulation of bingo game with Python"
I tried to expand the size of the logical volume with LVM
I tried to visualize Boeing of violin performance by pose estimation
I want to check the position of my face with OpenCV!
I tried to automatically collect images of Kanna Hashimoto with Python! !!
I tried to visualize the common condition of VTuber channel viewers
PhytoMine-I tried to get the genetic information of plants with Python
I tried to open the latest data of the Excel file managed by date in the folder with Python
I tried to output the rpm list of SSH login destination to an Excel sheet with Python + openpyxl.
I tried to refer to the fun rock-paper-scissors poi for beginners with Python
Did the number of store closures increase due to the impact of the new coronavirus?
I tried to visualize the age group and rate distribution of Atcoder