While using Python with various libraries, I realized that a little code goes a long way, and that a handy script can be written in about five steps. So I have been listing Python and other commands in this series. As ideas come to me, I will also post scripts of about ten steps on an irregular basis.
For the **3rd** installment, I will fetch data with datareader and write it to CSV.
CSV and Excel data are often passed around in-house, and even data published on the web is frequently offered as CSV alongside JSON and other formats.
The data to fetch is the Nikkei 225 from FRED, starting January 1, 2016. This data is stored in a pandas data frame and written to CSV.
Finally, as a supplement, I plot the data with the plot function.
【Environment】
- Linux: Debian 10
- Python: 3.7.3
- pandas: 1.0.3
- pandas-datareader: 0.8.1
To fetch the Nikkei 225 data with datareader, the syntax is `pdr.DataReader('NIKKEI225', 'fred', start)`.
Writing to CSV is done with pandas. The syntax is `dataframe.to_csv('output_filename.csv')`.
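As a minimal sketch of the `to_csv` call on its own, the following writes a small hand-made data frame to a temporary directory and prints the resulting file. The frame `df_sample`, its column name `value`, and the file name `sample.csv` are placeholders for illustration and are not part of the script in this article.

```python
import os
import tempfile

import pandas as pd

# Hypothetical two-row frame with a named date index (illustration only)
df_sample = pd.DataFrame(
    {"value": [1.5, 2.5]},
    index=pd.to_datetime(["2020-01-06", "2020-01-07"]).rename("DATE"),
)

# Write into a temporary directory rather than the working directory
out_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
df_sample.to_csv(out_path)

# By default the index is written as the first column of the CSV
with open(out_path) as f:
    csv_text = f.read()
print(csv_text)
```

Because the index is written as the first column, the dates survive the trip to CSV without any extra arguments.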
The code was run in Jupyter.
```python
# Get Nikkei225 data from fred with datareader and write to csv
# outfile = ('./nikkei225_20200428.csv')
import pandas as pd
from pandas_datareader import data as pdr  # aliased so that pdr.DataReader below resolves
import datetime
import matplotlib.pyplot as plt

# Data acquisition / storage in data frame
start = datetime.datetime(2016, 1, 1)
df_nikkei225 = pdr.DataReader('NIKKEI225', 'fred', start)

# Write to csv
df_nikkei225.to_csv('./nikkei225_20200428.csv')
```
In the script above, the part that stores the data in a data frame consists of two lines. The first line specifies the start date, and the second line fetches the Nikkei 225 from FRED from that date onward and stores it in a data frame.
```python
start = datetime.datetime(2016, 1, 1)
df_nikkei225 = pdr.DataReader('NIKKEI225', 'fred', start)
```
The data frame is named `df_nikkei225` here, but any name will do.
The CSV was written as `nikkei225_20200428.csv` in the current directory.
Let's take a look at the acquired data. `df_nikkei225` is the data frame used in the previous code.
```
df_nikkei225  # Contents of the stored data frame

            NIKKEI225
DATE
2016-01-01        NaN
2016-01-04   18450.98
2016-01-05   18374.00
2016-01-06   18191.32
2016-01-07   17767.34
...               ...
2020-04-22   19137.95
2020-04-23   19429.44
2020-04-24   19262.00
2020-04-27   19783.22
2020-04-28   19771.19

1128 rows × 1 columns
```
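To check a CSV written this way, it can be read back with `pd.read_csv`. The sketch below uses an in-memory string holding the first few rows shown above instead of the real file, so it runs without network access. The names `csv_text` and `df_check` are made up for this example.

```python
import io

import pandas as pd

# First rows of the output above, laid out the way to_csv writes them
# (a missing value becomes an empty field)
csv_text = """DATE,NIKKEI225
2016-01-01,
2016-01-04,18450.98
2016-01-05,18374.0
"""

# index_col restores DATE as the index; parse_dates converts it to datetimes
df_check = pd.read_csv(io.StringIO(csv_text), index_col="DATE", parse_dates=True)

print(df_check)
print(type(df_check.index).__name__)
```

With the real file, the first argument would be the path `'./nikkei225_20200428.csv'` in place of the `io.StringIO` object.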
As a 【supplement】, I will draw a graph and save it as a jpg. The easiest way to plot the above data takes two lines of code. First, matplotlib has already been imported with `import matplotlib.pyplot as plt`.
With that in place, calling `data_frame_name.plot()` draws the plot. For a data frame named `df`, that is `df.plot()`. To save the graph as a jpg, for example, call `plt.savefig("output_filename.jpg")`.
```python
# Graph drawing / output
df_nikkei225.plot()
plt.savefig("nikkei225_20200428.jpg")
```
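A self-contained sketch of the same two plotting lines follows. It assumes a non-interactive run outside Jupyter, so it selects the `Agg` backend, and it plots only a few rows copied from the data frame output shown earlier rather than the full data. `df_demo` and `demo_plot.jpg` are placeholder names.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # draw without a display; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# A few rows copied from the data frame output shown earlier
df_demo = pd.DataFrame(
    {"NIKKEI225": [18450.98, 18374.00, 18191.32, 17767.34]},
    index=pd.to_datetime(
        ["2016-01-04", "2016-01-05", "2016-01-06", "2016-01-07"]
    ).rename("DATE"),
)

# DataFrame.plot() draws a line chart with the index on the x axis
ax = df_demo.plot()

# savefig chooses the image format from the file extension
jpg_path = os.path.join(tempfile.mkdtemp(), "demo_plot.jpg")
plt.savefig(jpg_path)
plt.close("all")
```

Inside Jupyter the backend line can be dropped, and the figure is displayed inline in addition to being saved.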
```
df_nikkei225.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1128 entries, 2016-01-01 to 2020-04-28
Data columns (total 1 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   NIKKEI225  1056 non-null   float64
dtypes: float64(1)
memory usage: 17.6 KB
```

```
df_nikkei225.isnull().sum()

NIKKEI225    72
dtype: int64
```

```
df_nikkei225.head()

            NIKKEI225
DATE
2016-01-01        NaN
2016-01-04   18450.98
2016-01-05   18374.00
2016-01-06   18191.32
2016-01-07   17767.34
```
There is no data for days without trading. Here the acquired data is plotted as is, but for a real graph I think the missing values would be handled first. I will touch on that in a separate article.
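Purely as an illustration of what such handling might look like, the sketch below counts the missing values and drops them with `dropna()`, using a few rows copied from the `head()` output above. This is only one simple option and not necessarily the method the separate article uses. `df_demo`, `n_missing`, and `df_clean` are placeholder names.

```python
import numpy as np
import pandas as pd

# First rows from the head() output above, including the NaN on the holiday
df_demo = pd.DataFrame(
    {"NIKKEI225": [np.nan, 18450.98, 18374.00]},
    index=pd.to_datetime(["2016-01-01", "2016-01-04", "2016-01-05"]).rename("DATE"),
)

# Count missing values, then drop the rows that contain them
n_missing = int(df_demo["NIKKEI225"].isnull().sum())
df_clean = df_demo.dropna()

print(n_missing)
print(df_clean)
```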
**That concludes data acquisition and CSV output with datareader.**