[PYTHON] Analysis of financial data by pandas and its visualization (2)

Return index and cumulative return

Continuing from yesterday's post, we go on analyzing financial data.

When analyzing a stock portfolio, a return usually means the percentage change in an asset's price. Here we compute the percentage change in Apple's stock price using data from Yahoo! Finance.

pandas DataFrames have powerful functions for frequency conversion.

| Function | Description |
|---|---|
| resample | Convert data to a fixed frequency |
| reindex | Conform data to a new index |
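As a rough illustration of the two functions in the table, here is a minimal sketch on a made-up daily series (the dates and values are assumptions for illustration, not market data). `resample` aggregates into fixed-width bins, while `reindex` conforms the data to a new index and can optionally fill the gaps.

```python
import pandas as pd

# Made-up daily series for illustration (not real market data)
idx = pd.date_range("2014-01-01", periods=6, freq="D")
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)

# resample: aggregate into fixed 2-day bins, keeping the last value of each bin
two_day = s.resample("2D").last()

# reindex: conform the data to a new index; dates missing from the original become NaN
new_idx = pd.date_range("2014-01-05", periods=4, freq="D")
reindexed = s.reindex(new_idx)

# reindex can also fill the gaps, here by carrying the last known value forward
filled = s.reindex(new_idx, method="ffill")

print(two_day)
print(reindexed)
print(filled)
```

Note that `resample` always needs an aggregation step such as `.last()` or `.mean()`, whereas `reindex` only selects and aligns existing values.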

See the pandas reference (http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.html) for other DataFrame methods.

The [adjusted closing price](http://www.yahoo-help.jp/app/answers/detail/p/546/a_id/45316/~/%E8%AA%BF%E6%95%B4%E5%BE%8C%E7%B5%82%E5%80%A4%E3%81%A8%E3%81%AF) (Adj Close) is the closing price adjusted so that the data stays continuous before and after a stock split or dividend. Earlier prices are restated on the post-split basis.
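To make the idea of split adjustment concrete, the following is a minimal sketch with made-up prices and a hypothetical 2-for-1 split (the names `split_date` and `split_ratio` are assumptions for illustration). It only handles a single split; the actual adjustment method used by a data provider, including dividend handling, may differ.

```python
import pandas as pd

# Made-up raw closing prices; a hypothetical 2-for-1 split takes effect on 2014-01-03
idx = pd.date_range("2014-01-01", periods=4, freq="D")
close = pd.Series([100.0, 102.0, 51.0, 52.0], index=idx)
split_date = pd.Timestamp("2014-01-03")  # hypothetical split date
split_ratio = 2.0                        # hypothetical split ratio

# Keep prices on or after the split as-is; divide earlier prices by the ratio
adj_close = close.where(close.index >= split_date, close / split_ratio)

print(close.pct_change())      # shows an artificial -50% "return" on the split date
print(adj_close.pct_change())  # the artificial drop disappears
```

This is why returns are computed from the adjusted series: on the raw series, the split itself looks like a large loss.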

The return index is a time series that shows the value of one unit of investment over time, with the stock's dividends taken into account. Apple's return index can be computed with the cumprod method.
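Before fetching real data, the definition can be checked offline. The sketch below uses a made-up price series (an assumption for illustration) and shows that compounding the daily returns with `cumprod` gives the same result as dividing each price by the first price.

```python
import pandas as pd

# Made-up adjusted prices for illustration (not real market data)
price = pd.Series(
    [100.0, 110.0, 99.0, 108.9],
    index=pd.date_range("2014-01-01", periods=4, freq="D"),
)

returns = price.pct_change()         # first value is NaN (no previous price)
ret_index = (1 + returns).cumprod()  # compound the daily returns
ret_index.iloc[0] = 1                # one unit invested at the start

# The product telescopes, so this matches the price relative to the first day
relative = price / price.iloc[0]

print(ret_index)
print(relative)
```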

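import pandas as pd
# pandas.io.data was removed from pandas; the separate pandas-datareader package
# provides the same interface (availability of the Yahoo! endpoint is not guaranteed)
import pandas_datareader.data as web

# Fetch Apple's adjusted closing prices from the end of 2009 onward
price = web.get_data_yahoo('AAPL', '2009-12-31')['Adj Close']
returns = price.pct_change()
ret_index = (1 + returns).cumprod()  # Calculate the return index
ret_index.iloc[0] = 1  # The first row is NaN (no previous price), so set it to 1
print(ret_index)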
# => 
# Date
# 2009-12-31    1.000000
# 2010-01-04    1.015602
# 2010-01-05    1.017330
# 2010-01-06    1.001136
# 2010-01-07    0.999309
# 2010-01-08    1.005974
# 2010-01-11    0.997087
# 2010-01-12    0.985731
# 2010-01-13    0.999654
# 2010-01-14    0.993828
# 2010-01-15    0.977239
# 2010-01-19    1.020490
# 2010-01-20    1.004789
# 2010-01-21    0.987410
# 2010-01-22    0.938432
# ...
# 2014-02-19    2.653155
# 2014-02-20    2.622445
# 2014-02-21    2.593315
# 2014-02-24    2.604671
# 2014-02-25    2.577565
# 2014-02-26    2.554310
# 2014-02-27    2.605263
# 2014-02-28    2.598203
# 2014-03-03    2.605708
# 2014-03-04    2.622889
# 2014-03-05    2.628419
# 2014-03-06    2.620470
# 2014-03-07    2.618939
# 2014-03-10    2.621309
# 2014-03-11    2.646835

# Calculate monthly returns from the return index (last business day of each month)
m_returns = ret_index.resample('BM').last().pct_change()  # use 'BME' on pandas 2.2 or later
print(m_returns.loc['2014'])  # Show 2014
# =>
# Date
# 2014-01-31   -0.107696
# 2014-02-28    0.057514
# 2014-03-31    0.018718

# Monthly returns can also be calculated by compounding daily returns while resampling.
m_returns = (1 + returns).resample('M').prod().to_period('M') - 1  # use 'ME' in resample on pandas 2.2 or later
print(m_returns.loc['2014'])  # Show 2014 (same result)
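The two ways of computing monthly returns above can be compared without network access. The following is a minimal sketch on randomly generated prices (made-up data with a fixed seed). As an assumption for portability, it groups by calendar month with `to_period("M")` instead of `resample`, because the frequency aliases accepted by `resample` differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Made-up business-day prices from the end of 2013 through March 2014
idx = pd.bdate_range("2013-12-31", "2014-03-31")
rng = np.random.default_rng(0)
price = pd.Series(100 * np.cumprod(1 + rng.normal(0, 0.01, len(idx))), index=idx)

returns = price.pct_change()
ret_index = (1 + returns).cumprod()
ret_index.iloc[0] = 1

# Approach 1: take the month-end value of the return index, then the percentage change
month = ret_index.index.to_period("M")
from_index = ret_index.groupby(month).last().pct_change()

# Approach 2: compound the daily returns within each month
compounded = (1 + returns).groupby(month).prod() - 1

# Skip the first month (2013-12), which has no previous month to compare against
print(from_index.iloc[1:])
print(compounded.iloc[1:])
```

Both approaches agree because the month-end return index is itself the running product of the daily gross returns.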

When you print() a large Series or DataFrame, pandas automatically truncates the output and shows only the beginning and the end.
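If the default truncation is not what you want, the display can be controlled explicitly. This is a small sketch on a made-up Series; the row limit of 10 is an arbitrary choice for illustration.

```python
import pandas as pd

s = pd.Series(range(1000))

# Look at only the beginning or the end explicitly
first = s.head(3)
last = s.tail(3)
print(first)
print(last)

# Temporarily change how many rows are shown before pandas truncates the output
with pd.option_context("display.max_rows", 10):
    text = repr(s)
print(text)
```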

Cumulative return calculation and visualization of each company's stock portfolio

Let's plot the cumulative returns of a stock portfolio in the financial and IT sectors from 2010 through March 11 of this year (2014), with particular attention to the three years after the earthquake.

def get_px(stock, start, end):
    return web.get_data_yahoo(stock, start, end)['Adj Close']

names = ['AAPL', 'GOOG', 'MSFT', 'DELL', 'GS', 'MS', 'BAC', 'C']
px = pd.DataFrame( {n: get_px(n, '1/1/2010', '3/11/2014') for n in names} )

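import matplotlib.pyplot as plt

px = px.asfreq('B').ffill()  # Align to business days and carry the last price forward
rets = px.pct_change()
result = (1 + rets).cumprod() - 1  # Cumulative return of each stock

result.plot()
plt.savefig("image.png")  # Save before show(); saving after show() can produce an empty image
plt.show()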

[Figure: cumulative returns of each stock, saved as image.png]

From here you can calculate portfolio returns over a period of time and backtest your strategy with various visualizations.
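As one possible next step, portfolio returns can be built from the per-stock returns. The sketch below uses made-up prices for two hypothetical tickers `A` and `B`, and assumes an equal-weighted portfolio that is rebalanced every day. It is an illustration under those assumptions, not a full backtest.

```python
import numpy as np
import pandas as pd

# Made-up prices for two hypothetical tickers
idx = pd.bdate_range("2014-01-01", periods=5)
px = pd.DataFrame(
    {"A": [100.0, 101.0, 102.0, 100.0, 103.0],
     "B": [50.0, 49.0, 50.0, 51.0, 52.0]},
    index=idx,
)

rets = px.pct_change().dropna()

# Hypothetical equal weights, assumed to be rebalanced daily
weights = pd.Series({"A": 0.5, "B": 0.5})

# Daily portfolio return is the weighted sum of the individual returns
port_rets = rets.dot(weights)

# Cumulative portfolio return over the period
port_cum = (1 + port_rets).cumprod() - 1

print(port_rets)
print(port_cum)
```

With real data, `px` would be the DataFrame of adjusted closing prices built earlier, and the weights would come from the strategy being tested.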

By handling financial data in DataFrames, which are easy to visualize and come with a rich set of functions, we found that ad hoc analysis can be done without relying on expensive software.

Reference

Introduction to Data Analysis with Python: Data Processing Using NumPy and pandas, http://www.oreilly.co.jp/books/9784873116556/
