As of March 2020, stock prices have plummeted due to the global expansion of the new coronavirus.
Since it's a big deal, I tried to visualize the results of the investment trusts that I have accumulated steadily with Python and investigated how much impact it had. Please take a look at the results of my mutual fund.
I am purchasing an investment trust at Rakuten Securities.
If you go to Account Management> Investment Trusts> Investment Trusts After
from the top menu, you can see the transition of the entire investment trust including all canceled stocks.
If it's just the transition, you can fully check it on this screen, but we will get the raw data for further analysis.
Right-click on the screen to view the page source.
Then, at the bottom (from the 1649th line onward), there are such arrays data1, data2, data3, data4
, so copy them.
You can also check it from the developer tools, so get it the way you like.
Start Google Colaboratory and try to handle the two-dimensional array as it is for the time being.
#It's actually longer.
data1 = [[1520175600000,0.000000],[1520262000000,-2.000000],[1520348400000,-1.000000]]
data2 = [[1520175600000,3758.000000],[1520262000000,3756.000000],[1520348400000,3757.000000]]
data3 = [[1520175600000,3758.000000],[1520262000000,3758.000000],[1520348400000,3758.000000]]
data4 = [[1520175600000,0.000000],[1520262000000,0.000000],[1520348400000,0.000000]]
All lengths were 488. (It depends on the person.)
len(data1), len(data2), len(data3), len(data4)
# => (488, 488, 488, 488)
Make it a Pandas Dataframe.
Originally it was a two-dimensional array of JavaScript, but since the shape of the array is the same in Python, you can read the copied one as it is with Pandas
. From here, data1 is extracted and displayed.
import pandas as pd
df1 = pd.DataFrame(data1)
df1
0 | 1 | |
---|---|---|
0 | 1520175600000 | 0.0 |
1 | 1520262000000 | -2.0 |
2 | 1520348400000 | -1.0 |
3 | 1520434800000 | -2.0 |
4 | 1520521200000 | -2.0 |
... | ... | ... |
The first column looks like a time stamp, but it doesn't seem to have a decimal point. Divide by 1000 to align the digits, index and then convert to Datetime.
df1[0] = df1[0] // 1000
df1 = df1.set_index([0])
df1.index = pd.to_datetime(df1.index, unit='s')
df1
1 | |
---|---|
0 | |
2018-03-04 15:00:00 | 0.0 |
2018-03-05 15:00:00 | -2.0 |
2018-03-06 15:00:00 | -1.0 |
2018-03-07 15:00:00 | -2.0 |
2018-03-08 15:00:00 | -2.0 |
... | ... |
2020-02-26 15:00:00 | 59802.0 |
2020-02-27 15:00:00 | 19223.0 |
2020-03-01 15:00:00 | -245.0 |
2020-03-02 15:00:00 | 37929.0 |
2020-03-03 15:00:00 | 9971.0 |
The data for the last two years is displayed.
Since it is explained as Transition of the entire investment trust including all canceled stocks
, the acquisition period may differ depending on the person.
For the time being, let's visualize this data1 with matplotlib
.
import matplotlib.pyplot as plt
df1.plot()
plt.show()
Here are the results.
If you compare it with Rakuten Securities' Investment Trust Ashiato
screen, you can see that data1 is a total return.
The profit of about 140,000 yen accumulated from around January 2019 has disappeared in an instant due to the influence of the coronavirus.
Try plotting the rest of data2 ~ 3 in the same way.
Apparently, data1 ~ 4 obtained from the source of the Rakuten Securities page correspond to the following items in order.
--Total return
After that, I put them together in one Dataframe to make it easier to handle, and changed the column names to match the contents. (Excluded this time because we have not purchased investment trusts with distributions) I also excluded the first six months, which was mostly dormant.
return | value | invest | |
---|---|---|---|
time | |||
2018-11-01 15:00:00 | -1.0 | 1629.0 | 1630.0 |
2018-11-04 15:00:00 | -1.0 | 0.0 | 1.0 |
2018-11-05 15:00:00 | 0.0 | 20000.0 | 20000.0 |
2018-11-06 15:00:00 | 67.0 | 20067.0 | 20000.0 |
2018-11-07 15:00:00 | 439.0 | 20439.0 | 20000.0 |
... | ... | ... | ... |
I personally like the atmosphere such as the color of Plotly, so I will try to visualize it with Plotly.
pip install plotly
First, let's reproduce the screen of Rakuten Securities.
import plotly.graph_objects as go
data = [
go.Scatter(x=df.index, y=df['return'], name='Total return'),
]
layout = go.Layout(
width=800,
height=450,
xaxis = dict(
title='time',
type='date',
tickformat="%Y-%m",
dtick='M1',
tickangle=90,
showgrid=False
),
yaxis = dict(
title = 'Amount of money'
),
showlegend=True,
legend=dict(
x=0.025,
y=0.95,
font=dict(
size=15
),
borderwidth=2
),
)
fig = go.Figure(
data=data,
layout=layout
)
fig.show()
data = [
go.Scatter(x=df.index, y=df['value'], name='Valuation'),
go.Scatter(x=df.index, y=df['invest'], name='Investment amount')
]
layout = go.Layout(
width=800,
height=450,
xaxis = dict(
title='time',
type='date',
tickformat="%Y-%m",
dtick='M1',
tickangle=90,
showgrid=False
),
yaxis = dict(
title = 'Amount of money'
),
legend=dict(
x=0.025,
y=0.95,
font=dict(
size=15
),
borderwidth=2
),
)
fig = go.Figure(
data=data,
layout=layout
)
fig.show()
df_diff = df.diff()
data = [
go.Scatter(x=df_diff.index, y=df_diff['return'], name='Business day difference before return', line = dict(color='purple'))
]
layout = go.Layout(
width=800,
height=450,
xaxis = dict(
title='time',
type='date',
tickformat="%Y-%m",
dtick='M1',
tickangle=90,
showgrid=False
),
yaxis = dict(
title = 'Amount of money',
),
showlegend=True,
legend=dict(
x=0.025,
y=0.95,
font=dict(
size=15
),
borderwidth=2
),
)
fig = go.Figure(
data=data,
layout=layout
)
fig.show()
Obviously, as the investment amount increases compared to the absolute value, the change in return also increases, so let's look at the ratio to the valuation amount at that time.
df['return_change_ratio'] = df['return'].diff() / df['value']
df
data = [
go.Scatter(x=df.index, y=df['return_change_ratio'], name='Rate of change on business days before return', line = dict(color='purple'))
]
layout = go.Layout(
width=800,
height=450,
xaxis = dict(
title='time',
type='date',
tickformat="%Y-%m",
dtick='M1',
tickangle=90,
showgrid=False
),
yaxis = dict(
title = 'Rate of change',
tickformat='%'
),
showlegend=True,
legend=dict(
x=0.025,
y=0.05,
font=dict(
size=15
),
borderwidth=2
),
)
fig = go.Figure(
data=data,
layout=layout
)
fig.show()
If you look at this, you can see that the day when assets fluctuate by ± 5% in one day is coming on a regular basis.
I arranged the data in descending order of the rate of decline.
df.sort_values('return_change_ratio', ascending=True).head(10)
return_change_ratio | |
---|---|
time | |
2020-02-24 15:00:00 | -0.055890 |
2020-02-27 15:00:00 | -0.051792 |
2018-12-24 15:00:00 | -0.049762 |
2020-03-05 15:00:00 | -0.043918 |
2019-08-25 15:00:00 | -0.039766 |
2019-01-03 15:00:00 | -0.037813 |
2020-02-25 15:00:00 | -0.035507 |
2018-12-04 15:00:00 | -0.034236 |
2020-03-03 15:00:00 | -0.033716 |
2019-08-14 15:00:00 | -0.031938 |
The 1st, 2nd, 4th, 7th and 9th from the top are thought to be due to the corona shock, which accounted for 5 out of 10 days, accounting for half.
On the contrary, I arranged them in descending order of increase.
df.sort_values('return_change_ratio', ascending=False).head(10)
return_change_ratio | |
---|---|
time | |
2020-03-02 15:00:00 | 0.044621 |
2018-12-26 15:00:00 | 0.040878 |
2020-03-04 15:00:00 | 0.038827 |
2019-01-06 15:00:00 | 0.026219 |
2019-08-13 15:00:00 | 0.022233 |
2020-02-04 15:00:00 | 0.021108 |
2019-11-04 15:00:00 | 0.019685 |
2019-06-04 15:00:00 | 0.018888 |
2018-11-07 15:00:00 | 0.018200 |
2019-09-05 15:00:00 | 0.017959 |
The first and third are probably the effects of the corona shock. It seems that the daily rate of increase also reached the highest level in my investment history.
data = [go.Histogram(x=df['return_change_ratio'], xbins=dict(size=0.001))]
layout = go.Layout(
width=800,
height=450,
xaxis = dict(
title="Rate of change",
dtick=0.01,
range=[-0.06, 0.06],
tickformat='%'
),
yaxis = dict(
title="frequency"
)
)
fig = go.Figure(
data=data,
layout=layout
)
fig.show()
Two of the three outliers of frequency 1 on the right hem and the first, second, fourth, seventh, and ninth of the ten consecutive frequency 1s on the left hem are those of the corona shock.
I resampled it monthly with OHLC, took the difference between Open and Close, and examined the change in monthly return. (Is there a simpler way to do it?)
df_month = df['return'].resample('M').ohlc()
df_month['diff'] = df_month['close'] - df_month['open']
df_month
data = [
go.Bar(x=df_month.index.strftime('%Y-%m'), y=df_month['diff'], name='Monthly return', marker_color ='orange')
]
layout = go.Layout(
width=800,
height=450,
xaxis = dict(
title='time',
tickformat="%Y-%m",
dtick='M1',
tickangle=90,
showgrid=False
),
yaxis = dict(
title = 'Amount of money',
),
showlegend=True,
legend=dict(
x=0.025,
y=0.05,
font=dict(
size=15
),
borderwidth=2
),
)
fig = go.Figure(
data=data,
layout=layout
)
fig.show()
When I resampled with Pandas, the last date in the period became an index and it was misaligned with the label of the plot, so I made it a character string.
In addition to daily savings, we buy more and sell irregularly, so if you want to calculate the monthly yield strictly, you have to consider how much you bought and sold on what day of the month. However, here, I simply calculated the return ratio based on the average value of the investment amount for the month.
That said, most are fixed-amount reserve investments, so it shouldn't be much different from this simple result. (It is possible to calculate exactly with only the data prepared this time.)
import numpy as np
df_month = df.resample('M').agg({'return' : lambda x: x[-1] - x[0], 'invest': np.mean})
df_month['interest'] = df_month['return'] / df_month['invest']
df_month
return | invest | interest | |
---|---|---|---|
time | |||
2018-11-30 | -113.0 | 25081.550000 | -0.004505 |
2018-12-31 | -4160.0 | 39805.473684 | -0.104508 |
2019-01-31 | 6285.0 | 85135.350000 | 0.073824 |
2019-02-28 | 8414.0 | 174170.210526 | 0.048309 |
2019-03-31 | 1161.0 | 238031.650000 | 0.004878 |
2019-04-30 | 5696.0 | 243301.684211 | 0.023411 |
2019-05-31 | -16022.0 | 284249.052632 | -0.056366 |
2019-06-30 | 20011.0 | 320820.857143 | 0.062374 |
2019-07-31 | 3929.0 | 352307.818182 | 0.011152 |
2019-08-31 | -4843.0 | 399468.000000 | -0.012124 |
2019-09-30 | 17063.0 | 460678.400000 | 0.037039 |
2019-10-31 | 19022.0 | 517172.190476 | 0.036781 |
2019-11-30 | 20319.0 | 571007.684211 | 0.035584 |
2019-12-31 | 21238.0 | 645114.238095 | 0.032921 |
2020-01-31 | 15858.0 | 688004.947368 | 0.023049 |
2020-02-29 | -56921.0 | 732071.222222 | -0.077753 |
2020-03-31 | 7343.0 | 817248.000000 | 0.008985 |
(Even if you don't use OHLC, you can calculate the difference in monthly returns by using the agg function.)
data = [
go.Bar(x=df_month.index.strftime('%Y-%m'), y=df_month['interest'], name='Monthly yield', marker_color ='pink')
]
layout = go.Layout(
width=800,
height=450,
xaxis = dict(
title='time',
tickformat="%Y-%m",
dtick='M1',
tickangle=90,
showgrid=False
),
yaxis = dict(
title='yield',
range = [-0.15, 0.15],
tickformat='%'
),
showlegend=True,
legend=dict(
x=0.025,
y=0.95,
font=dict(
size=15
),
borderwidth=2
),
)
fig = go.Figure(
data=data,
layout=layout
)
fig.show()
In February 2020, when the coronavirus began to spread worldwide, the investment yield was about -7.78%`. There are still no signs of convergence, and the impact on the economy is expected to begin in earnest, so we are paying close attention to future trends.
The worst monthly investment yield was -10.45%
in December 2018. By the way, there was a crash at the end of the year.
Since it was visualized on an ad hoc basis, it may be a misguided view or mistake. In that case, I would appreciate it if you could give us a quick comment.
Also, if you have this data, please suggest any opinions or advice that such analysis would be interesting. Thank you. Thank you for watching until the end.
Recommended Posts