[Python / Pandas] An apparent bug when replacing NaN in a DataFrame with `None` using `replace`

What happened

I ran into what looked like a bug when I tried to replace np.nan with None using the pandas DataFrame replace method.


Tested on Google Colaboratory.

Source code

1. Create DataFrame before replacement

The DataFrame used to check the behavior:

import datetime

import numpy as np
import pandas as pd

# Timestamps used as the DataFrame index
indexes = [
    datetime.datetime(2020, 1, 1, 11, 50),
    datetime.datetime(2020, 1, 1, 12, 50),
    datetime.datetime(2020, 1, 1, 12, 52),
    datetime.datetime(2020, 1, 1, 18, 50),
    datetime.datetime(2020, 1, 1, 19, 50),
    datetime.datetime(2020, 1, 1, 21, 50),
]

# Sample data with np.nan / None mixed into numeric, string, and boolean columns
df = pd.DataFrame({
    'high': [1, np.nan, 3, np.nan, np.nan, 11],
    'close': [4, 5, 6, 7, np.nan, 2],
    'memo': ['sign', '', np.nan, 'sign2', np.nan, 'sign3'],
    'bool': [True, None, True, False, None, False],
    'stoploss': [True, None, True, False, None, False]
}, index=indexes)

->                    high   close   memo   bool	stoploss
2020-01-01 11:50:00   1.0    4.0     sign   True	True
2020-01-01 12:50:00   NaN    5.0            None	None
2020-01-01 12:52:00   3.0    6.0     NaN    True	True
2020-01-01 18:50:00   NaN    7.0     sign2  False	False
2020-01-01 19:50:00   NaN    NaN     NaN    None	None
2020-01-01 21:50:00   11.0   2.0     sign3  False	False

2. Replace method 1

The version that misbehaves

df.replace(np.nan, None)
->                   high	close	memo	bool	stoploss
2020-01-01 11:50:00  1.0	4.0     sign	True	True
2020-01-01 12:50:00  1.0	5.0             True    True
2020-01-01 12:52:00  3.0	6.0             True	True
2020-01-01 18:50:00  3.0	7.0     sign2   False	False
2020-01-01 19:50:00  3.0	7.0     sign2   False   False
2020-01-01 21:50:00  11.0	2.0     sign3	False	False

...What is this?! Where there was np.nan, the value is not None. It has been filled with the previous value instead (it looks as if fillna with a forward fill had been applied).
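To check the "it looks like fillna" impression, here is a minimal sketch that applies an explicit forward fill to the two numeric columns only (a cut-down copy of the DataFrame above, which is my simplification). If the guess is right, the result should match the high and close columns of the surprising output. Note that what `df.replace(np.nan, None)` itself returns may depend on the pandas version, so the sketch calls `ffill()` directly rather than relying on `replace`.

```python
import datetime

import numpy as np
import pandas as pd

# Same timestamps as the sample DataFrame above
times = [(11, 50), (12, 50), (12, 52), (18, 50), (19, 50), (21, 50)]
indexes = [datetime.datetime(2020, 1, 1, h, m) for h, m in times]

# Numeric columns only (a simplification of the original DataFrame)
df = pd.DataFrame({
    'high': [1, np.nan, 3, np.nan, np.nan, 11],
    'close': [4, 5, 6, 7, np.nan, 2],
}, index=indexes)

# Explicit forward fill: each NaN takes the last valid value above it
filled = df.ffill()
print(filled)
```

The high column becomes 1, 1, 3, 3, 3, 11 and close becomes 4, 5, 6, 7, 7, 2, which are the same values as in the output of replace method 1.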

3. Replace method 2

The version that works (?)

df.replace({np.nan: None})
->                    high   close   memo   bool	stoploss
2020-01-01 11:50:00   1      4       sign   True	True
2020-01-01 12:50:00   None   5	            None	None
2020-01-01 12:52:00   3      6       None   True	True
2020-01-01 18:50:00   None   7       sign2  False	False
2020-01-01 19:50:00   None   None    None   None	None
2020-01-01 21:50:00   11     2       sign3  False	False

As expected (?). No wait, I noticed that the floats have somehow all turned into integers... Is that okay? (help)

... I panicked for a moment (well, more than 30 minutes), but when I looked closely, the contents were still floats.

tmp_df = df.replace({np.nan: None})
tmp_df.values
-> array([[1.0, 4.0, 'sign', True, True],
       [None, 5.0, '', None, None],
       [3.0, 6.0, None, True, True],
       [None, 7.0, 'sign2', False, False],
       [None, None, None, None, None],
       [11.0, 2.0, 'sign3', False, False]], dtype=object)
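Another way to confirm that only the display changed is to look at the column dtype and at the type of a single element. The following is a rough sketch on a small stand-in DataFrame (the two-column data here is an assumption for brevity, not the full sample): after the dict-style replace the columns should hold generic Python objects, while the numbers themselves remain floats.

```python
import numpy as np
import pandas as pd

# Small stand-in for the sample DataFrame (assumed data, numeric columns only)
df = pd.DataFrame({
    'high': [1, np.nan, 3],
    'close': [4, 5, np.nan],
})

tmp_df = df.replace({np.nan: None})

# Column dtypes after the replacement
print(tmp_df.dtypes)

# Inspect individual elements: a number and a replaced missing value
first = tmp_df['high'].iloc[0]
missing = tmp_df['high'].iloc[1]
print(type(first), first)
print(type(missing), missing)
```

Since the columns now mix floats and None, pandas apparently prints 1.0 as 1 in the table view, but the element itself is still a float.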

Phew, what a relief.

I have to remember this way of writing it: df.replace({np.nan: None})

Reference material

For the record, the official pandas documentation does mention this behavior. It took me a long time to find it, though, so I decided to write it down here.

When value=None and to_replace is a scalar, list or tuple, replace uses the method parameter (default ‘pad’) to do the replacement. So this is why the ‘a’ values are being replaced by 10 in rows 1 and 2 and ‘b’ in row 4 in this case. The command s.replace('a', None) is actually equivalent to s.replace(to_replace='a', value=None, method='pad'):

-Excerpt from pandas.DataFrame.replace
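The excerpt refers to a small Series example with 'a' and 'b' values. Below is a minimal sketch of that idea, assuming the Series is `[10, 'a', 'a', 'b', 'a']` (my reading of the documentation example). Because the treatment of an explicit `value=None` seems to differ between pandas versions, the sketch does not call `s.replace('a', None)`. It shows the dict form, which maps 'a' to None explicitly, next to a hand-written equivalent of the pad behavior.

```python
import pandas as pd

# Assumed to match the Series used in the pandas documentation example
s = pd.Series([10, 'a', 'a', 'b', 'a'])

# Dict form: 'a' is explicitly mapped to None
explicit = s.replace({'a': None})
print(explicit.tolist())

# Hand-written equivalent of pad: blank out the matches, then forward-fill
padded = s.where(s != 'a').ffill()
print(padded.tolist())
```

The padded result is 10, 10, 10, 'b', 'b', which lines up with the excerpt: the 'a' values in rows 1 and 2 become 10, and the one in row 4 becomes 'b'.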

If it had been written in Japanese, I might have noticed it a little sooner ...

Other related materials

I'm not sure how closely related this is, but going the other way and replacing None with np.nan seems to cause a different problem.

StackOverflow : Replace None with NaN in pandas dataframe
