[Python] Reasons to use logarithms

Purpose

I was looking at source code that someone published on Kaggle. I am recording it here because I want to understand it.

The environment uses Python 3, matplotlib and pandas.

This time I will try to understand the logarithm (log) in my own way. I have not used a logarithm even once in my 10 years in the workforce, and I have only a faint memory of what I studied as a student.

So I looked into why the logarithm (log) is necessary. I read the article "Difference between seismic intensity and magnitude? Simple and clear!".

When numbers are too large to handle comfortably, taking the logarithm (log) makes them easier to work with.
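As a small illustration of that idea, the sketch below takes the base-10 logarithm of a few values that span six orders of magnitude. The values in `amounts` are made up for this example and do not come from the Kaggle data.

```python
import math

# Hypothetical values spanning six orders of magnitude (not from the dataset)
amounts = [1, 10, 1_000, 1_000_000]

# log10 turns "how many times bigger" into "how many steps bigger"
logs = [math.log10(value) for value in amounts]

for value, log_value in zip(amounts, logs):
    print(f"{value:>9,} -> log10 = {log_value:.1f}")
```

A value one million times larger than another ends up only six units away after the logarithm, which is why large ranges become easier to handle.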

Without the logarithm (log)

Without the logarithm, the plot looks like this. Most of the histogram bars are too short to see, so it is hard to tell what the distribution looks like.

python


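import matplotlib.pyplot as plt
import pandas as pd

# Load the credit card transactions (expects creditcard.csv in the working directory)
df = pd.read_csv("./creditcard.csv")

# Two stacked histograms that share the x-axis: fraud on top, normal below
f, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(12, 4))

bins = 30

# Class == 1 selects the transactions plotted as 'Fraud'
ax1.hist(df.Amount[df.Class == 1], bins=bins)
ax1.set_title('Fraud')

# Class == 0 selects the transactions plotted as 'Normal'
ax2.hist(df.Amount[df.Class == 0], bins=bins)
ax2.set_title('Normal')

plt.xlabel('Amount ($)')
plt.ylabel('Number of Transactions')

plt.show()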

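[Image: histograms of Amount ($) against Number of Transactions for the 'Fraud' and 'Normal' panels, linear y-axis]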
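To see why most bars disappear on a linear axis, it can help to look at the raw bin counts. The following is a minimal sketch using synthetic, heavily skewed data as a stand-in for the `Amount` column. The log-normal distribution and its parameters are assumptions chosen only for illustration, not properties of creditcard.csv.

```python
import numpy as np

# Synthetic skewed "amounts" (assumption: a stand-in for the real Amount column)
rng = np.random.default_rng(0)
amounts = rng.lognormal(mean=3.0, sigma=1.5, size=100_000)

# Same binning idea as plt.hist(..., bins=30), but returning the numbers
counts, edges = np.histogram(amounts, bins=30)

share_first_bin = counts[0] / counts.sum()
print("first five bin counts:", counts[:5])
print("last five bin counts: ", counts[-5:])
print(f"share of values in the first bin: {share_first_bin:.1%}")

# log10 of the non-empty bins (log10(0) is undefined, so empty bins are skipped)
nonzero_counts = counts[counts > 0]
print("log10 of non-empty counts:", np.round(np.log10(nonzero_counts), 2))
```

When one bin holds most of the data, every other bar is drawn at a tiny fraction of its height. After taking the logarithm, the counts sit much closer together, so the small bins become visible.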

With the logarithm (log)

Add the following line to the source code from the version without the logarithm (log), just before plt.show().

plt.yscale('log')

python


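import matplotlib.pyplot as plt
import pandas as pd

# Load the credit card transactions (expects creditcard.csv in the working directory)
df = pd.read_csv("./creditcard.csv")

# Two stacked histograms that share the x-axis: fraud on top, normal below
f, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(12, 4))

bins = 30

# Class == 1 selects the transactions plotted as 'Fraud'
ax1.hist(df.Amount[df.Class == 1], bins=bins)
ax1.set_title('Fraud')

# Class == 0 selects the transactions plotted as 'Normal'
ax2.hist(df.Amount[df.Class == 0], bins=bins)
ax2.set_title('Normal')

# The plt.* calls below act on the current axes, which here is ax2 (the last one created)
plt.xlabel('Amount ($)')
plt.ylabel('Number of Transactions')
# Switch the y-axis to a logarithmic scale so that small counts stay visible
plt.yscale('log')
plt.show()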

[Image: the same 'Fraud' and 'Normal' histograms after adding plt.yscale('log')]

With the logarithm (log), the overall trend is easier to see than without it.
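One detail worth noting: `plt.yscale('log')` changes only the current axes, so in a figure with two subplots it affects just one panel. If a logarithmic y-axis is wanted on both panels, a possible variation is to call `set_yscale('log')` on each axes object. The sketch below is an assumption-laden example, not the Kaggle author's code: it uses synthetic data in place of creditcard.csv, and the output file name `amount_hist_log.png` is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for creditcard.csv (assumption: skewed amounts, rare Class == 1)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Amount": rng.lognormal(mean=3.0, sigma=1.5, size=10_000),
    "Class": (rng.random(10_000) < 0.01).astype(int),
})

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(12, 4))

ax1.hist(df.Amount[df.Class == 1], bins=30)
ax1.set_title("Fraud")

ax2.hist(df.Amount[df.Class == 0], bins=30)
ax2.set_title("Normal")

# Apply the log scale and the y label to each panel explicitly
for ax in (ax1, ax2):
    ax.set_yscale("log")
    ax.set_ylabel("Number of Transactions")
ax2.set_xlabel("Amount ($)")

fig.savefig("amount_hist_log.png")  # hypothetical file name
```

Working through the axes objects makes it explicit which panel each setting applies to, at the cost of a few more lines than the `plt.*` shortcuts.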
