[PYTHON] Pandas basics for beginners ⑧ Digit processing

What is pandas

pandas is a Python library for handling structured (tabular) data. It makes it easy to read files and then process or extract the data in an SQL-like way, and it is indispensable for data preprocessing in tasks such as machine learning. The other articles in this series are listed in the series table of contents.

Introduction

This article covers how to control the number of digits. The first thing to understand is the difference between adjusting the number of digits for pandas as a whole (a display setting) and adjusting it for an individual DataFrame or variable. Also note that pandas does not round half up; it rounds half to even (also called banker's rounding), so a value exactly halfway between two candidates goes to the even one. If you are not familiar with rounding half to even, look it up before continuing.
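
The rounding rule mentioned above can be checked with a minimal sketch. The values are illustrative and not taken from the Titanic data, and the `decimal`-based alternative is only one possible way to get round-half-up, not something pandas provides.

```python
import pandas as pd
from decimal import Decimal, ROUND_HALF_UP

# Illustrative values that sit exactly halfway between two integers
halves = pd.Series([0.5, 1.5, 2.5, 3.5])

# pandas sends each tie to the nearest even integer
rounded_even = halves.round().tolist()
print(rounded_even)  # [0.0, 2.0, 2.0, 4.0]

# If round-half-up is really needed, the standard-library decimal module is one option
rounded_up = [
    float(Decimal(str(v)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    for v in halves
]
print(rounded_up)  # [1.0, 2.0, 3.0, 4.0]
```

Note that 0.5 and 2.5 go down while 1.5 and 3.5 go up, which is the behavior to expect from `round()` later in this article.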

Preparation

First, import the library. Import pandas under the name pd.

python


import pandas as pd

The examples use the Titanic data. If you are not familiar with it, search for "kaggle Titanic".

python


dataframe = pd.read_csv('train.csv')

Adjusting the number of digits in pandas

pandas settings are managed through `options`. (There are many other options, so check them out if you are interested.) The format used to display floating-point numbers is controlled by `display.float_format`, which takes a function that converts a float to a string, and the number of digits shown after the decimal point is controlled by `display.precision`. Let's check the current values.

In


print(pd.options.display.float_format)
print(pd.options.display.precision)

Out


None
6

`display.float_format` is None, so no custom format is applied, and up to 6 digits are displayed after the decimal point. Looking at the actual data, Fare, for example, is displayed with four digits after the decimal point. This is because the original CSV data has only 4 decimal digits; values with more digits would be displayed with up to 6.

Next, set display.precision to 2 so that two decimal places are displayed. (Fare is then shown with 2 decimal places.) This changes only how values are displayed, not the stored data.

python


pd.options.display.precision = 2
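
As a rough check that this option affects only the display, here is a small sketch. The `sample` DataFrame is a hypothetical stand-in for the Fare column, not the real Titanic file, and the sketch resets the option at the end so it leaves no setting behind.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic Fare column
sample = pd.DataFrame({"Fare": [7.25, 71.2833, 8.05]})

pd.options.display.precision = 2
shown = sample.to_string()            # the text rendering follows the display option
stored = sample.loc[1, "Fare"]        # the underlying value is untouched
pd.reset_option("display.precision")  # restore the default so later output is unaffected

print(shown)
print(stored)
```

The rendered text shows 71.28, while the value held in the DataFrame is still 71.2833.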

Use reset_option to restore the default value.

python


pd.reset_option('display.precision')
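
display.float_format is not demonstrated above, so the following is a minimal sketch of how it might be used. It assumes a made-up `sample` DataFrame and uses `pd.option_context`, which applies an option only inside the `with` block, so nothing needs to be reset afterwards.

```python
import pandas as pd

# Hypothetical values chosen so that a thousands separator becomes visible
sample = pd.DataFrame({"Fare": [7.25, 71.2833, 1234.5]})

# float_format takes a callable that turns one float into a string
with pd.option_context("display.float_format", "{:,.2f}".format):
    formatted = sample.to_string()

default_view = sample.to_string()  # outside the block the default applies again

print(formatted)
print(default_view)
```

Inside the block the largest value is rendered as 1,234.50; outside it, the default formatting is back in effect.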

How to set individually

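Use round() to set the digits individually. Unlike the display options, round() changes the values themselves and returns a new DataFrame. To round to 2 digits after the decimal point, write the following. (Fare is rounded to 2 decimal places.)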

python


dataframe.round(2)
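
A small sketch of what round() does and does not touch, using a hypothetical two-column `sample` instead of the full Titanic data:

```python
import pandas as pd

# Hypothetical stand-in for a text column and a numeric column
sample = pd.DataFrame({
    "Name": ["A", "B"],
    "Fare": [7.2512, 71.2833],
})

rounded = sample.round(2)  # returns a new DataFrame

print(rounded["Fare"].tolist())  # [7.25, 71.28]
print(sample["Fare"].tolist())   # [7.2512, 71.2833] (original is unchanged)
print(rounded["Name"].tolist())  # ['A', 'B'] (non-numeric columns are left as-is)
```

To keep the rounded values, assign the result to a variable, for example `dataframe = dataframe.round(2)`.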

To set the digits for each column, pass a dict that maps column names to the number of decimal places. (Example: Age to 1 digit and Fare to 3 digits.)

python


dataframe.round({'Age':1, 'Fare':3})
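
The per-column form can be tried on a toy DataFrame. The values and the `Other` column below are hypothetical; they are only there to show that columns not listed in the dict are left alone, and that a single column can also be rounded on its own.

```python
import pandas as pd

# Hypothetical values; "Other" is not listed in the dict below
sample = pd.DataFrame({
    "Age": [22.46, 38.04],
    "Fare": [7.2512, 71.2833],
    "Other": [0.123456, 0.654321],
})

by_column = sample.round({"Age": 1, "Fare": 3})
print(by_column["Age"].tolist())    # [22.5, 38.0]
print(by_column["Fare"].tolist())   # [7.251, 71.283]
print(by_column["Other"].tolist())  # [0.123456, 0.654321]

# A single column (Series) can also be rounded directly
fare_1 = sample["Fare"].round(1).tolist()
print(fare_1)  # [7.3, 71.3]
```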

Finally

This series summarizes, in short articles that beginners can follow, the knowledge needed to implement machine learning in Python. I hope you will refer to the other articles as well; they are listed in the series table of contents.
