Did you know? Handy snippets for data analysis with Python, pandas, and NumPy

Snippets you want within reach whenever you need them, mainly for pandas

**I meant to publish this as an Advent Calendar entry, but sorry, I did not finish in time.**

I will add more soon.

The examples below assume the following aliases:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Show columns and rows without truncation when displaying

pd.set_option('display.max_columns', 100)
pd.set_option('display.max_rows', 500)
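
If changing the limits for the whole session is more than you need, pandas also provides pd.option_context, which applies the options only inside a with block. A minimal sketch; the frame `wide` is made up purely for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical wide frame used only for illustration: 3 rows x 50 columns.
wide = pd.DataFrame(np.zeros((3, 50)))

default_max_columns = pd.get_option('display.max_columns')

# The options are changed only inside the with-block.
with pd.option_context('display.max_columns', 100, 'display.max_rows', 500):
    inside_max_columns = pd.get_option('display.max_columns')
    print(wide)  # all 50 columns are shown here

# After the block, the previous setting is back in effect.
restored_max_columns = pd.get_option('display.max_columns')
```

This keeps the wider display scoped to the one place where you need to inspect a large frame.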

Use describe() for non-numeric columns as well

DF.describe(include='all')
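
As a small illustration of what include='all' changes, here is a sketch with made-up data (the frame `sample` and its columns are hypothetical). By default only numeric columns are summarized; with include='all' the string column gets count, unique, top, and freq.

```python
import pandas as pd

# Hypothetical mixed-type data for illustration.
sample = pd.DataFrame({
    'city': ['Tokyo', 'Osaka', 'Tokyo', 'Nagoya'],
    'sales': [10, 20, 30, 40],
})

numeric_only = sample.describe()              # summarizes only 'sales'
everything = sample.describe(include='all')   # 'city' is summarized too

print(everything)
```

Statistics that do not apply to a column (for example, mean for 'city') show up as NaN in the combined table.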

Time series data

Reference: https://note.nkmk.me/python-pandas-time-series-resample-asfreq/

Create date data

df = pd.DataFrame({'data' : range(0, 30, 2)},
                 index=pd.date_range(start='2018/1/1', end='2018/1/30', freq='2D'))

Extract data at regular intervals

For example, to extract data every 3 days, use asfreq(). Dates that have no data in the original become NaN.

df.asfreq(freq='3D')
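
To see which dates end up empty, the following sketch rebuilds the same 2-day data and counts the missing rows. The variable names `every_3_days` and `missing` are mine.

```python
import pandas as pd

df = pd.DataFrame({'data': range(0, 30, 2)},
                  index=pd.date_range(start='2018/1/1', end='2018/1/30', freq='2D'))

every_3_days = df.asfreq(freq='3D')

# 2018-01-04 is not in the original index, so its value is NaN;
# 2018-01-07 is, so its value (6) is carried over unchanged.
missing = every_3_days['data'].isna()
print(every_3_days.head(4))
print('rows without data:', int(missing.sum()))
```

Because NaN is a float, the 'data' column becomes float64 after this step.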

To extract every Wednesday and fill the missing (NaN) entries with the preceding value, use asfreq(freq='W-WED', method='pad').

df.asfreq(freq='W-WED', method='pad')

Conversely, to fill with the following value, use asfreq(method='bfill').

df.asfreq(freq='W-WED', method='bfill')
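
Putting the two fill methods side by side makes the difference easier to see. This is a small sketch on the same data; the `comparison` frame is only there for display.

```python
import pandas as pd

df = pd.DataFrame({'data': range(0, 30, 2)},
                  index=pd.date_range(start='2018/1/1', end='2018/1/30', freq='2D'))

padded = df.asfreq(freq='W-WED', method='pad')        # take the previous value
backfilled = df.asfreq(freq='W-WED', method='bfill')  # take the next value

comparison = pd.DataFrame({'pad': padded['data'], 'bfill': backfilled['data']})
print(comparison)
# 2018-01-10 has no data: pad uses 2018-01-09 (8), bfill uses 2018-01-11 (10).
```

On Wednesdays that already exist in the original index (2018-01-03 and 2018-01-17), both methods return the same value.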

Weekly totals from dated data

Use resample().

df.resample('W').sum()
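
Unlike asfreq(), which only picks rows, resample() aggregates every row that falls in each bin. A quick sketch on the same 2-day data, checking that no value is lost:

```python
import pandas as pd

df = pd.DataFrame({'data': range(0, 30, 2)},
                  index=pd.date_range(start='2018/1/1', end='2018/1/30', freq='2D'))

# 'W' bins end on Sunday, so the first bin covers 2018-01-01 .. 2018-01-07.
weekly = df.resample('W').sum()
print(weekly)

# The weekly totals add up to the total of the original data.
print(weekly['data'].sum() == df['data'].sum())
```

Each row of the result is labeled with the Sunday that closes its week, so the last label (2018-02-04) lies beyond the last date in the data.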

Totals per date when several rows share the same date

The data looks like this:

df2 = pd.DataFrame({'data': range(17)},
                    index=pd.date_range('2018-08-01', '2018-08-05', freq='6H'))

Use resample('D').sum() to get the daily sum.

df2.resample('D').sum()

For the mean, use resample('D').mean().

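
A sketch of both daily aggregations on equivalent 6-hourly data. One assumption of mine: the interval is passed as a pd.Timedelta with periods=17 instead of a frequency string, which produces the same 17 timestamps without depending on how the hourly alias is spelled in a given pandas version.

```python
import pandas as pd

# 17 readings, one every 6 hours, from 2018-08-01 00:00 to 2018-08-05 00:00.
six_hourly = pd.DataFrame(
    {'data': range(17)},
    index=pd.date_range('2018-08-01', periods=17, freq=pd.Timedelta(hours=6)))

daily_sum = six_hourly['data'].resample('D').sum()
daily_mean = six_hourly['data'].resample('D').mean()

print(pd.DataFrame({'sum': daily_sum, 'mean': daily_mean}))
```

The last day contains only the midnight reading, so its sum and mean are both 16.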

Visualization (daily)

Prepare dummy data from 2018/1/1 to 2018/12/31

np.random.seed(0)
df3 = pd.DataFrame({'data': 
                    np.maximum(0, np.minimum(np.arange(0, -365, -1) + 182, 0) + np.minimum(np.arange(365), 182) + np.floor(np.random.normal(50, 50, 365)))},
                    index=pd.date_range('2018-01-01', '2018-12-31'))

Plot it as follows:

plt.figure(figsize=(18, 6), facecolor='white')
plt.plot(df3.index, df3['data'], marker='o')
plt.xticks(rotation=90)
plt.show()
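
The one-line expression that builds the dummy data is dense, so here is my reading of it, split into named steps. The names `rising`, `falling`, `trend`, and `noise` are mine; the arithmetic is the same as in the snippet above.

```python
import numpy as np

days = np.arange(365)

rising = np.minimum(days, 182)                          # 0, 1, ..., 182, then flat
falling = np.minimum(np.arange(0, -365, -1) + 182, 0)   # 0 until day 182, then -1, -2, ...
trend = rising + falling                                # triangle peaking at day 182

np.random.seed(0)
noise = np.floor(np.random.normal(50, 50, 365))         # mean 50, standard deviation 50
data = np.maximum(0, trend + noise)                     # negative values are clipped to 0
```

In other words, the series climbs until mid-year, falls back afterwards, and carries fairly large random noise on top.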

Visualization (weekly)

Plot the value on every Monday

np.random.seed(0)
df3 = pd.DataFrame({'data': 
                    np.maximum(0, np.minimum(np.arange(0, -365, -1) + 182, 0) + np.minimum(np.arange(365), 182) + np.floor(np.random.normal(50, 50, 365)))},
                    index=pd.date_range('2018-01-01', '2018-12-31'))

_df = df3.asfreq('W-MON')
plt.figure(figsize=(18, 6), facecolor='white')
plt.plot(_df.index, _df['data'], marker='o')
plt.xticks(rotation=90)
plt.show()

Visualization (monthly)

Plot the value at the end of each month

np.random.seed(0)
df3 = pd.DataFrame({'data': 
                    np.maximum(0, np.minimum(np.arange(0, -365, -1) + 182, 0) + np.minimum(np.arange(365), 182) + np.floor(np.random.normal(50, 50, 365)))},
                    index=pd.date_range('2018-01-01', '2018-12-31'))

_df = df3.asfreq('1M')
plt.figure(figsize=(18, 6), facecolor='white')
plt.plot(_df.index, _df['data'])
plt.xticks(rotation=90)
plt.show()

asfreq('1M') selects the last day of each month (recent pandas releases deprecate the 'M' alias in favor of 'ME')

df3.asfreq('1M')

Plot the value on the first day of each month

np.random.seed(0)
df3 = pd.DataFrame({'data': 
                    np.maximum(0, np.minimum(np.arange(0, -365, -1) + 182, 0) + np.minimum(np.arange(365), 182) + np.floor(np.random.normal(50, 50, 365)))},
                    index=pd.date_range('2018-01-01', '2018-12-31'))

_df = df3.asfreq('1MS')
plt.figure(figsize=(18, 6), facecolor='white')
plt.plot(_df.index, _df['data'])
plt.xticks(rotation=90)
plt.show()

asfreq('1MS') selects the first day of each month

df3.asfreq('1MS')
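
A quick check that the monthly selections land on the expected dates. For month ends, this sketch filters with the index attribute is_month_end instead of a frequency alias; that is my substitution, chosen because, as far as I know, the month-end alias is spelled differently across pandas versions ('M' in older releases, 'ME' in newer ones). The frame `daily` is made-up data.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data for 2018; the value is simply the day number.
daily = pd.DataFrame({'data': np.arange(365)},
                     index=pd.date_range('2018-01-01', '2018-12-31'))

month_starts = daily.asfreq('MS')              # first day of each month
month_ends = daily[daily.index.is_month_end]   # last day of each month

print(month_starts.index.day.tolist())
print(month_ends.index.day.tolist())
```

Both results have 12 rows; only the day of the month differs.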

Visualization (quarterly)

np.random.seed(0)
df3 = pd.DataFrame({'data': 
                    np.maximum(0, np.minimum(np.arange(0, -365, -1) + 182, 0) + np.minimum(np.arange(365), 182) + np.floor(np.random.normal(50, 50, 365)))},
                    index=pd.date_range('2018-01-01', '2018-12-31'))

_df = df3.asfreq('1QS')
plt.figure(figsize=(18, 6), facecolor='white')
plt.plot(_df.index, _df['data'], marker='o')
plt.xticks(rotation=90)
plt.show()
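
Note that asfreq('1QS') only takes a snapshot on the first day of each quarter. If a total per quarter is what you want, resample('QS').sum() is one option. A sketch with made-up data where every day has the value 1, so each total equals the number of days in that quarter:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one unit per day throughout 2018.
ones = pd.DataFrame({'data': np.ones(365, dtype=int)},
                    index=pd.date_range('2018-01-01', '2018-12-31'))

snapshots = ones.asfreq('QS')          # value on Jan 1, Apr 1, Jul 1, Oct 1
totals = ones.resample('QS').sum()     # sum over each whole quarter

print(snapshots.index.month.tolist())
print(totals['data'].tolist())
```

The snapshot keeps four of the 365 rows, while the resampled totals account for all of them.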

Visualization (weekly totals)

np.random.seed(0)
df3 = pd.DataFrame({'data': 
                    np.maximum(0, np.minimum(np.arange(0, -365, -1) + 182, 0) + np.minimum(np.arange(365), 182) + np.floor(np.random.normal(50, 50, 365)))},
                    index=pd.date_range('2018-01-01', '2018-12-31'))

_df = df3.resample('W').sum()
plt.figure(figsize=(18, 6), facecolor='white')
plt.plot(_df.index, _df['data'], marker='o')
plt.xticks(rotation=90)
plt.show()

Visualization (monthly totals)

np.random.seed(0)
df3 = pd.DataFrame({'data': 
                    np.maximum(0, np.minimum(np.arange(0, -365, -1) + 182, 0) + np.minimum(np.arange(365), 182) + np.floor(np.random.normal(50, 50, 365)))},
                    index=pd.date_range('2018-01-01', '2018-12-31'))

_df = df3.resample('MS').sum()
plt.figure(figsize=(18, 6), facecolor='white')
plt.plot(_df.index, _df['data'], marker='o')
plt.xticks(rotation=90)
plt.show()
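
One thing to keep in mind when reading these total plots is that bins can have different sizes. The sketch below uses made-up data with the value 1 on every day, so each total is simply the number of days in its bin:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one unit per day throughout 2018.
ones = pd.Series(np.ones(365, dtype=int),
                 index=pd.date_range('2018-01-01', '2018-12-31'))

monthly = ones.resample('MS').sum()   # labeled by the first day of each month
weekly = ones.resample('W').sum()     # labeled by the Sunday closing each week

print(monthly.tolist())
print(len(weekly), weekly.iloc[0], weekly.iloc[-1])
```

The last weekly bin holds only 2018-12-31, which is why the final point of a weekly-total plot can dip sharply even when the underlying data does not.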

Time series interpolation

I often need time series interpolation in my line of work.

asfreq() is a good starting point: it inserts the missing dates, leaving their values as NaN.

df = pd.DataFrame({'data' : range(0, 30, 2)},
                 index=pd.date_range(start='2018/1/1', end='2018/1/30', freq='2D'))

df2 = df.asfreq('1D')
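
On its own, asfreq('1D') only adds the missing dates as NaN rows. How to fill them is a separate choice; the sketch below shows two common options, interpolate() and ffill(). Which one the original workflow had in mind is my assumption.

```python
import pandas as pd

df = pd.DataFrame({'data': range(0, 30, 2)},
                  index=pd.date_range(start='2018/1/1', end='2018/1/30', freq='2D'))

daily = df.asfreq('1D')          # every second row is NaN

linear = daily.interpolate()     # straight line between neighboring known values
carried = daily.ffill()          # repeat the last known value

print(pd.DataFrame({'raw': daily['data'],
                    'linear': linear['data'],
                    'ffill': carried['data']}).head(5))
```

Linear interpolation suits quantities that change smoothly, while forward fill suits values that stay in effect until the next observation.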

Date → day of the week

Use weekday or dayofweek on a DatetimeIndex (for a datetime Series, go through the .dt accessor)

0 is Monday and 6 is Sunday

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DatetimeIndex.weekday.html

np.random.seed(0)
df3 = pd.DataFrame({'data': 
                    np.maximum(0, np.minimum(np.arange(0, -365, -1) + 182, 0) + np.minimum(np.arange(365), 182) + np.floor(np.random.normal(50, 50, 365)))},
                    index=pd.date_range('2018-01-01', '2018-12-31'))
df3.index.weekday
Int64Index([0, 1, 2, 3, 4, 5, 6, 0, 1, 2,
            ...
            5, 6, 0, 1, 2, 3, 4, 5, 6, 0],
           dtype='int64', length=365)
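
Building on the weekday codes, here is a sketch that stores them as columns and derives a weekend flag. The frame `frame` and the column names are made up for illustration.

```python
import pandas as pd

idx = pd.date_range('2018-01-01', '2018-12-31')
frame = pd.DataFrame({'data': range(365)}, index=idx)

frame['weekday'] = frame.index.weekday        # 0 = Monday ... 6 = Sunday
frame['day_name'] = frame.index.day_name()    # 'Monday', 'Tuesday', ...
frame['is_weekend'] = frame['weekday'] >= 5   # Saturday or Sunday

# For a datetime Series (not an index), go through the .dt accessor.
as_series = pd.Series(idx)
same_codes = as_series.dt.dayofweek

print(frame.head(7))
```

With the weekday stored as a column, something like frame.groupby('weekday')['data'].mean() gives a per-weekday average.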
