Examples and countermeasures for the "A value is trying to be set on a copy of a slice from a DataFrame." warning in pandas

# What this article covers

When analyzing data with pandas, I often run into the following warning:

A value is trying to be set on a copy of a slice from a DataFrame.

It appears when you assign a value to a DataFrame that was itself extracted from another DataFrame under some condition.

pandas seems to issue it because, at that point, it is unclear whether the assigned value will also be reflected in the original DataFrame or only in the extracted one.

This article summarizes the case I encountered and how I dealt with it.

# Environment
- Python 3.7.4
- pandas 0.25.3
- numpy 1.16.1

# Code example

## Example code that gives a warning

```python
import numpy as np
import pandas as pd

# Create a DataFrame
df = pd.DataFrame(np.arange(20).reshape((4, 5)), columns=list("abcde"))
print(df)

#     a   b   c   d   e
# 0   0   1   2   3   4
# 1   5   6   7   8   9
# 2  10  11  12  13  14
# 3  15  16  17  18  19

df["f"] = 3  # Adding a column to the original DataFrame is not an assignment to a slice, so no warning

# Extract rows from the DataFrame with .loc
# Extracting by condition -> the result is regarded as a slice of df
df_sub = df.loc[df["e"] % 2 == 0]

df_sub["g"] = 100  # Regarded as an assignment to a slice, so a warning is issued
print(df_sub)
#     a   b   c   d   e  f    g
# 0   0   1   2   3   4  3  100
# 2  10  11  12  13  14  3  100
# c:\program files\python37\lib\site-packages\ipykernel_launcher.py:1: SettingWithCopyWarning: 
# A value is trying to be set on a copy of a slice from a DataFrame.
# Try using .loc[row_indexer,col_indexer] = value instead

# See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
```
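
The warning text itself suggests `.loc[row_indexer,col_indexer] = value`. The following is a minimal sketch of that form, applied directly to the original DataFrame instead of to an extracted subset. The variable name `mask` is my own choice for illustration, and the sample frame omits the extra column "f" for brevity.

```python
import numpy as np
import pandas as pd

# Same sample frame as in the example (without the extra column "f")
df = pd.DataFrame(np.arange(20).reshape((4, 5)), columns=list("abcde"))

# Boolean mask for the rows where column "e" is even
mask = df["e"] % 2 == 0

# Select rows and column in a single .loc call on the original frame,
# so there is no intermediate subset that could be a copy
df.loc[mask, "g"] = 100

print(df)
```

Note that this writes to `df` itself: the rows that do not match the condition get a missing value (NaN) in the new column "g". It is therefore a different operation from adding a column only to the extracted subset.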

## Countermeasure example: make it explicit with .copy() that this is a separate DataFrame

If you call `.copy()` when extracting by condition, you state explicitly that the result is an independent copy rather than a slice, and the warning is no longer issued. Assigned values are then reflected only in the copy, not in the source DataFrame.

```python
df_sub = df.loc[df["e"] % 2 == 0].copy()
df_sub["g"] = 100
```
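
One way to check this behavior is to record warnings while running the assignment. The sketch below assumes a pandas version that still emits SettingWithCopyWarning (such as the 0.25.3 listed in the environment section). The name `copy_warnings` is hypothetical, and the warning is matched by class name so that the check does not depend on where pandas defines the class.

```python
import warnings

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape((4, 5)), columns=list("abcde"))

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning raised in this block
    df_sub = df.loc[df["e"] % 2 == 0].copy()
    df_sub["g"] = 100

# Keep only the records whose category is named SettingWithCopyWarning
copy_warnings = [w for w in caught if w.category.__name__ == "SettingWithCopyWarning"]

print(len(copy_warnings))      # 0: no warning when assigning to an explicit copy
print("g" in df_sub.columns)   # True: the copy has the new column
print("g" in df.columns)       # False: the source DataFrame is unchanged
```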

If you want the change reflected in the source DataFrame as well, you can combine the copy back into it, for example with `pd.merge()`.
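
As a rough sketch of the `pd.merge()` idea, the new column of the copy can be left-joined back onto the source by index. Joining on the index and the result name `df_merged` are my assumptions; the article does not specify how the merge should be done.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape((4, 5)), columns=list("abcde"))

# Work on an explicit copy, as in the countermeasure
df_sub = df.loc[df["e"] % 2 == 0].copy()
df_sub["g"] = 100

# Left-join only the new column back onto the source, matching rows by index
df_merged = pd.merge(df, df_sub[["g"]], left_index=True, right_index=True, how="left")

print(df_merged)
```

With `how="left"` every row of `df` is kept, and rows that were not part of `df_sub` get a missing value in "g". `pd.merge()` returns a new DataFrame, so `df` itself stays unchanged unless you reassign the result to it.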
