I really want to see my deposit balance in Python-Visualize cash stock flow with seaborn

Introduction

Even if you want to save money, you don't know if you are increasing or decreasing cash in the first place. It may be possible to visualize it using Zaim or Moneyforward, but analysis requires becoming a premium member, and it is troublesome to enter each time, so it is discouraging to manage. Since I want to visualize the cash deposit / withdrawal status, I implemented a script that visualizes the increase / decrease of money in the deposit account using Python's Seaborn.

policy

Most online banking can export the deposit / withdrawal history for about one year with csv. Use this csv as an input to visualize the cash deposit / withdrawal history. There are many ways to look at visualization of money flow. Here, we will roughly visualize two things, "** stock " and " flow **". If you compare a deposit account to a bathtub and cash to water, stock and flow can be explained as follows. image.png

** (1) Stock ** The amount of water in the bathtub at some point. That is, it represents the deposit balance of a certain month. If you compare the stocks of different months, you can see the ** increasing / decreasing trend of the deposit balance **.

** (2) Flow ** The amount of water that has flowed in a certain period. Furthermore, it can be divided into (2a) inflow and (2b) outflow, and represents the income and expenditure of a certain month. If you compare the inflows and outflows in the same month, you can see the ** profit and loss amount ** in a certain month.

Implementation

Each section is supposed to be written in a notebook such as Jupyter Notebook. Once you write it in your notebook, even if the deposit is updated later, you can replace the csv file and execute it sequentially, so it's easy. Version should work as long as you use Python3.

In addition, the data used is the one in which a value like that is randomly generated. ~~ It's more than my own take ~~.

Package import

import numpy as np
import pandas as pd
import codecs
import seaborn as sns
sns.set(font='Hiragino Sans')
import matplotlib.pyplot as plt

If you are using ʻanaconda, you should be able to import everything without any problems. In the case of mac, seaborn does not recognize Japanese as it is, so specify a font that supports Japanese with sns.set ()`.

Data import / preprocessing

code

#Read deposit / withdrawal history file
with codecs.open("../input/20200517212253.csv", "r", "Shift-JIS", "ignore") as file:
    df = pd.read_table(file, delimiter=",")

#Preprocessing
"""
Implement the preprocessing of the read df.
df uses "transaction date" as an index, and has the following four columns.
 1.Transaction name: Has two values, "payment" or "payment".
 2.Amount: Transaction amount (flow)
 3.Post-transaction balance: Deposit balance (stock) at that time
"""

When reading csv, specify the character code to avoid garbled characters. The preprocessing code is omitted because it depends on the item output by the net banking used. As described in the code, the index should have the transaction date, and also have the "transaction name", "amount (= flow)", and "post-transaction balance (= stock)".

** df sample ** image.png

By the way, in the case of net banking, the output result column may be divided into "year", "month", and "day", but you can combine them into one column with datetime type as follows.

df["Transaction date"] = [pd.Timestamp(year=int(row["Handling date year"]), 
                            month=int(row["Handling date month"]), 
                            day=int(row["Handling date"])) for i, row in df.iterrows()]

① Visualization of stock --Plot the increase / decrease trend of the deposit balance

Compare the deposit balances (stocks) of different months and check whether the balances are on an increasing trend or a decreasing trend.

image.png

Since the csv of the original data is inserted based on the deposit / withdrawal history, that is, the transaction occurrence base, it is not possible to do something like "Graph the monthly transition starting from the 1st of the month". Therefore, draw a line graph that plots all the columns of df "Balance after transaction".

code

#Generate a data frame for stock drawing
df4stock = df.copy()

#Visualization
fig, ax = plt.subplots(figsize=(20, 10))
sns.lineplot(x=df4stock.index, y="Post-transaction balance", data=df4stock, estimator=None);
fig.gca().ticklabel_format(style='plain', axis='y')

** Output result **

image.png

The blue circle marker indicates the occurrence of the transaction and the balance at that time. Salary income and card payment expenditure make up a monthly periodicity. The overall increase / decrease trend is estimated from the monthly maximum balance (payday).

In time series analysis, the overall trend of increase / decrease is defined as ** trend component **, and the increase / decrease with cycles such as monthly cycle is defined as ** seasonal component **. If you want to extract the trend component, you can insert the dummy data of the day when the transaction did not occur and use tsa.seasonal_decompose of stats model to separate the trend component and the seasonal component. Implementation is omitted in this article. (Reference: Sakutto Trend Extraction: Introduction to Time Series Analysis with Python statsmodels)

(2) Flow visualization-Plot monthly deposit / withdrawal amount

Compare the amount of income and expenditure (flow) in a certain month and check how much the balance has increased or decreased.

image.png

The original data is grouped by year and month, and a new data frame with the amount of expenditure and income for each month is generated. Draw a bar graph that plots the balance and difference of each month based on the created data frame.

code

#Generate data frame for flow drawing
df4flow = df.groupby([df.index.year, df.index.month, "Transaction name"]).sum().reset_index("Transaction name")
df4flow

#Visualization
fig, ax = plt.subplots(figsize=(20, 10))
ax = sns.barplot(x=df4flow.index, y="Amount of money", hue="Transaction name", data=df4flow);

#Store the amount of each bar displayed by the value label in the list
height = [0 if np.isnan(i.get_height()) else i.get_height() for i in ax.patches]


for i, patch in enumerate(ax.patches):
    
    diff = height[i//2] - height[i//2 + len(height)//2]
    maxh = max(height[i//2], height[i//2 + len(height)//2])
    
    #Displaying value labels
    if i % 2 == 0:
        colidx = i//2
        ax.text(i//2 - patch.get_width()/2.0, height[colidx]/2.0, int(height[colidx]), color='black', ha="center")
        ax.text(i//2, maxh+10000, int(diff), color='black', ha="center")
        
    else:
        colidx = i//2 + len(height)//2
        ax.text(i//2 + patch.get_width()/2.0, height[colidx]/2.0, int(height[colidx]), color='black', ha="center")

** Output result **

image.png

The newly generated df4flow data frame for the graph has a column with the total transaction amount by income / expenditure with the year and month in the MultiIndex. In drawing, in addition to the bar graph, the value labels of each income / expenditure and the difference are drawn by the helper method ʻax.text () . The output value is obtained from the Rectangle object ʻax.patches, which has information as a rectangle for each bar graph. Since the height of the rectangle is the amount as it is, get the height with the method get_height () of the same object. The position of the value label is calculated and placed so that it is in the middle of each bar.

code

for i in ax.patches:
    print(i.get_height())

** Output result ** image.png It can be seen that ʻax.pathces [0: 4]has income (blue) and ʻax.pathces [5: 9]has expenditure (vermilion) bar information. Note that the output order is different from the data frame.

At the end

By visualizing the cash stock flow, it became possible to see whether the money is increasing or decreasing (I gave up buying a PC because it was decreasing). As mentioned above, if you put the above code together in Notebook, all you have to do is update the input data, which makes management easier.

It is surprisingly difficult to do graph output with Seaborn or Matplotlib like you do with Excel. It seems that we need to know more about the structure of the ʻArtist` object generated by the graph.

Reference: Basic knowledge of matplotlib that I wanted to know early, or the story of an artist who can adjust the appearance.

Recommended Posts

I really want to see my deposit balance in Python-Visualize cash stock flow with seaborn
Even in JavaScript, I want to see Python `range ()`!
I want to work with a robot in python.
I want to solve APG4b with Python (only 4.01 and 4.04 in Chapter 4)
I want to run Rails with rails s even in vagrant environment
I want to check the position of my face with OpenCV!
I really want to use GitHub Flavored Markdown (GFM) with Pelican!
I want to do ○○ with Pandas
I want to get angry with my mom when my memory is tight
(Matplotlib) I want to draw a graph with a size specified in pixels
I want to use the Django Debug Toolbar in my Ajax application
I want to express my feelings with the lyrics of Mr. Children
Environment maintenance made with Docker (I want to post-process GrADS in Python
I want to detect objects with OpenCV
I want to print in a comprehension
I want to blog with Jupyter Notebook
I want to pip install with PythonAnywhere
I want to analyze logs with Python
I want to play with aws with python
I want to embed Matplotlib in PySimpleGUI
I really wanted to copy with selenium
I want to see a list of WebDAV files in the Requests module
I want to do Dunnett's test in Python
I want to have recursion come to my mind
I want to use MATLAB feval with python
I want to analyze songs with Spotify API 2
I want to create a window in Python
I tried to integrate with Keras in TFv1.1
I want to display multiple images with matplotlib.
I want to make a game with Python
I want to be an OREMO with setParam!
I want to store DB information in list
I want to analyze songs with Spotify API 1
I want to merge nested dicts in Python
I want to make fits from my head
I want to use Temporary Directory with Python2
I don't want to use -inf with np.log
#Unresolved I want to compile gobject-introspection with Python3
I want to use ip vrf with SONiC
I want to solve APG4b with Python (Chapter 2)
I want to start over with Django's Migrate
I want to write to a file with Python
I want to display the progress in Python!
I want to save a file with "Do not compress images in file" set in OpenPyXL
I want to get an error message in Japanese with django Password Change Form
I want to extract only pods with the specified label using Label Selector in Client-go