[PYTHON] [Stock price analysis] Learning pandas with fictitious data (002: Log output)

Continuing from last time (loading a DataFrame).

After some trial and error, I finally got read_csv working. (There was no particular reason for not using read_table last time; to be honest, I think read_csv and read_table behave the same here.)

[Stock price analysis] Learning pandas with fictitious data (001: Environment preparation - File reading)

~~ ↑ I linked the previous article because its address is https://qiita.com/waka_taka/items/93049e603dcdd046cc01, but when I tried to embed it, it was converted to https://camo.qiitausercontent.com/d76aa803f668e38e03042f90af5a95c8ce768712/68747470733a2f2f71696974612e636f6d2f77616b615f74616b612f6974656d732f3933303439653630336463646430343663633031 and the link didn't work. (Is this a Qiita quirk? I'm not used to Markdown yet, so it's probably just a simple careless mistake...) ~~

It was indeed just a careless mistake... I had written it with the image-embedding syntax.

Implementing a logging function

Printing debug information with print statements one by one gets awkward, so while I'm at it I'll implement proper logging while studying the logging module. (This article has nothing to do with stock price analysis itself.)

Success_case01.py


import pandas as pd
import logging

# Specify the log output file name and log output level
logging.basicConfig(filename='CodeLog.log', level=logging.INFO)

# Specify the character code of the CSV file (SampleStock01_t1.csv)
dframe = pd.read_csv('SampleStock01_t1.csv', encoding='SJIS',
                     header=1, sep='\t')

# Output debug information
logging.info(dframe)

CodeLog.log


INFO:root:          Date    Open    High     Low   Close
0     2016/1/4   9,934  10,055   9,933  10,000
1     2016/1/5  10,062  10,092   9,942  10,015
2     2016/1/6   9,961  10,041   9,928  10,007
3     2016/1/7   9,946  10,060   9,889   9,968
4     2016/1/8   9,812   9,952   9,730   9,932
..         ...     ...     ...     ...     ...
937  2019/11/1  13,956  15,059  13,940  14,928
938  2019/11/5  13,893  15,054  13,820  14,968
939  2019/11/6  14,003  15,155  13,919  15,047
940  2019/11/7  14,180  15,054  14,057  15,041
941  2019/11/8  14,076  15,052  13,939  15,041

[942 rows x 5 columns]

Formatting the log output (adding the date and other information to the log)

I formatted the log output with reference to "logging --- Logging facility for Python" (the official documentation).

Success_case02.py


import pandas as pd
import logging

# Log format specification
# %(asctime)s  : human-readable time when the LogRecord was created
# %(funcName)s : name of the function containing the logging call
# %(levelname)s: text logging level of the message
# %(lineno)d   : source line number where the logging call was issued
# %(message)s  : the logged message, computed as msg % args
formatter = '%(asctime)s:%(funcName)s:%(levelname)s:%(lineno)d:\n%(message)s'

# Specify the log output file name and log output level
# Add the log format specification (format parameter)
logging.basicConfig(filename='CodeLog.log', level=logging.INFO, format=formatter)

# Specify the character code of the CSV file (SampleStock01_t1.csv)
dframe = pd.read_csv('SampleStock01_t1.csv', encoding='SJIS',
                     header=1, sep='\t')

logging.info(dframe)

CodeLog.log


2019-11-11 12:49:17,060:<module>:INFO:20:
          Date    Open    High     Low   Close
0     2016/1/4   9,934  10,055   9,933  10,000
1     2016/1/5  10,062  10,092   9,942  10,015
2     2016/1/6   9,961  10,041   9,928  10,007
3     2016/1/7   9,946  10,060   9,889   9,968
4     2016/1/8   9,812   9,952   9,730   9,932
..         ...     ...     ...     ...     ...
937  2019/11/1  13,956  15,059  13,940  14,928
938  2019/11/5  13,893  15,054  13,820  14,968
939  2019/11/6  14,003  15,155  13,919  15,047
940  2019/11/7  14,180  15,054  14,057  15,041
941  2019/11/8  14,076  15,052  13,939  15,041

[942 rows x 5 columns]

Using a logger

In application development, the usual practice is to configure logging in the **main module** and use a logger everywhere else, so I use a logger out of habit even in a script this small, where the need for one is hard to see.

Success_case03.py


import pandas as pd
import logging

# Log format specification
# %(asctime)s  : human-readable time when the LogRecord was created
# %(funcName)s : name of the function containing the logging call
# %(levelname)s: text logging level of the message
# %(lineno)d   : source line number where the logging call was issued
# %(message)s  : the logged message, computed as msg % args
formatter = '%(asctime)s:%(funcName)s:%(levelname)s:%(lineno)d:\n%(message)s'

# Specify the log output file name and log output level
# Add the log format specification (format parameter)
logging.basicConfig(filename='CodeLog.log', level=logging.INFO, format=formatter)

# Logger settings (INFO log level)
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Specify the character code of the CSV file (SampleStock01_t1.csv)
dframe = pd.read_csv('SampleStock01_t1.csv', encoding='SJIS',
                     header=1, sep='\t')

# Changed to use the logger
logger.info(dframe)

CodeLog.log


2019-11-11 13:00:59,279:<module>:INFO:25:
          Date    Open    High     Low   Close
0     2016/1/4   9,934  10,055   9,933  10,000
1     2016/1/5  10,062  10,092   9,942  10,015
2     2016/1/6   9,961  10,041   9,928  10,007
3     2016/1/7   9,946  10,060   9,889   9,968
4     2016/1/8   9,812   9,952   9,730   9,932
..         ...     ...     ...     ...     ...
937  2019/11/1  13,956  15,059  13,940  14,928
938  2019/11/5  13,893  15,054  13,820  14,968
939  2019/11/6  14,003  15,155  13,919  15,047
940  2019/11/7  14,180  15,054  14,057  15,041
941  2019/11/8  14,076  15,052  13,939  15,041

[942 rows x 5 columns]

Using a handler

To be honest, I don't yet understand why it has to be written this way, but as a beginner I think it's better to get used to this style. If I come to appreciate the value of this approach later, I'll revise this section accordingly.

Success_case04.py


import pandas as pd
import logging

# Log format specification
# %(asctime)s  : human-readable time when the LogRecord was created
# %(funcName)s : name of the function containing the logging call
# %(levelname)s: text logging level of the message
# %(lineno)d   : source line number where the logging call was issued
# %(message)s  : the logged message, computed as msg % args
formatter = logging.Formatter('%(asctime)s:%(funcName)s:%(levelname)s:%(lineno)d:\n%(message)s')

# Logger settings (INFO log level)
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Handler settings (output file / log level / log format)
handler = logging.FileHandler('handler_log.log')
handler.setLevel(logging.INFO)
handler.setFormatter(formatter)

logger.addHandler(handler)

# Specify the character code of the CSV file (SampleStock01_t1.csv)
dframe = pd.read_csv('SampleStock01_t1.csv', encoding='SJIS',
                     header=1, sep='\t')

# Changed to use the logger
logger.info(dframe)

handler_log.log


2019-11-11 13:31:56,161:<module>:INFO:28:
          Date    Open    High     Low   Close
0     2016/1/4   9,934  10,055   9,933  10,000
1     2016/1/5  10,062  10,092   9,942  10,015
2     2016/1/6   9,961  10,041   9,928  10,007
3     2016/1/7   9,946  10,060   9,889   9,968
4     2016/1/8   9,812   9,952   9,730   9,932
..         ...     ...     ...     ...     ...
937  2019/11/1  13,956  15,059  13,940  14,928
938  2019/11/5  13,893  15,054  13,820  14,968
939  2019/11/6  14,003  15,155  13,919  15,047
940  2019/11/7  14,180  15,054  14,057  15,041
941  2019/11/8  14,076  15,052  13,939  15,041

[942 rows x 5 columns]
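For what it's worth, the value of handlers becomes clearer once a logger has more than one: each handler gets its own destination, level, and format. A hedged sketch (file and logger names are examples) that keeps everything in a file while only warnings reach the console:

```python
import logging

formatter = logging.Formatter('%(levelname)s:%(message)s')

logger = logging.getLogger('demo')
logger.setLevel(logging.DEBUG)

# File handler: records everything from DEBUG up
file_handler = logging.FileHandler('all.log')
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(formatter)

# Console handler: only WARNING and above
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.WARNING)
console_handler.setFormatter(formatter)

logger.addHandler(file_handler)
logger.addHandler(console_handler)

logger.debug('detail kept in the file only')
logger.warning('shown on the console and in the file')
```

The single-handler version in Success_case04.py is the same pattern with just the file handler attached.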

Finally

When I try to post to Qiita several times in a row, I get a message asking me to wait a while before posting again. It's presumably to limit server load, but when I want to write up what I've just studied, it's a shame that what I meant to write slips out of my head in the meantime. That said, Qiita is very easy to use, so I'm not really complaining.

This post was completely unrelated to stock price analysis, but next time I'd like to write an article about playing with pandas and matplotlib. Log output is important enough that I wanted to cover it early on.
