Introduction

When I was building an app that manipulates csv files with pandas, the result was edited as expected, but pandas issued a SettingWithCopyWarning.

I think SettingWithCopyWarning is issued because the data may have been passed by reference (as a view of the original) rather than as a copy. When you extract part of the original data and then assign to part of that extract, pandas cannot tell whether you wanted to modify "that part of the original data" or to create "new data with some changes".
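The "extract a part, then assign to a part" pattern can be sketched as below. This is a minimal illustration under my own assumptions: the in-memory DataFrame and the names `subset` and `caught` are made up for the example, and whether a warning is actually recorded depends on the pandas version (newer releases with Copy-on-Write behavior may record none).

```python
import warnings

import pandas as pd

# Hypothetical sample data, not the article's csv file.
df = pd.DataFrame({"fru_name": ["apple", "orange", "banana"],
                   "count": [8, 15, 4]})

# Step 1: extract part of the original data.
subset = df[df["count"] > 5]

# Step 2: assign into that part, recording any warnings pandas issues.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    subset.loc[subset["fru_name"] == "apple", "count"] = 9

# Names of the recorded warning categories (may be empty on newer pandas).
print([w.category.__name__ for w in caught])

# The assignment changed `subset`; the original df still holds 8 for apple.
print(df.loc[0, "count"], subset.loc[0, "count"])
```

In this sketch the assignment only affects `subset`, which is exactly the ambiguity the warning is pointing at: pandas cannot know whether that was the intent.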

Feel free to leave feedback in the comments.

Solution

Calling copy() to make it explicit that the data is a copy, not a reference, resolved the SettingWithCopyWarning.
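As a small check of what copy() buys you, the sketch below builds the same four rows in memory (an assumption for illustration, instead of reading data.csv) and shows that editing the copy leaves the original DataFrame alone. The variable names `original_apple` and `updated_apple` are made up for this example.

```python
import pandas as pd

# In-memory stand-in for the article's data.csv.
df = pd.DataFrame({"fru_name": ["apple", "orange", "banana", "apple"],
                   "count": [8, 15, 4, 1]})

# copy() makes new_data an independent DataFrame.
new_data = df.drop_duplicates(subset="fru_name").copy()
new_data.loc[new_data["fru_name"] == "apple", "count"] = 9

original_apple = df.loc[df["fru_name"] == "apple", "count"].tolist()
updated_apple = new_data.loc[new_data["fru_name"] == "apple", "count"].tolist()

print(original_apple)  # [8, 1]
print(updated_apple)   # [9]
```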

An app that puts your favorite fruits into a csv file

First, create a csv file named data.csv in a directory (folder) called output_files.

Its contents are as follows.

`data.csv`


apple,8
orange,15
banana,4
apple,1

The goal is to edit this csv file so that the apple,1 row is removed and apple,8 is rewritten to apple,9:

`data.csv`


apple,9
orange,15
banana,4

`check_data.py`


import pandas as pd

def check():
    df = pd.read_csv('output_files/data.csv', names=['fru_name', 'count'])
    # Store the most frequent (duplicated) fruit in vc
    vc = df['fru_name'].value_counts().index[0]
    # Store its number of occurrences in fre
    fre = df['fru_name'].value_counts().iat[0]

    if fre > 1:
        # Remove the duplicated fruit rows and store the result in new_data
        new_data = df.drop_duplicates(subset='fru_name')

        # Store the original count of the duplicated fruit in dup_count
        dup_count = int(new_data.loc[new_data['fru_name'] == vc, 'count'].iat[0])
        # Add 1 to dup_count
        dup_count += 1
        new_data.loc[new_data['fru_name'] == vc, 'count'] = dup_count

        # Overwrite data.csv
        new_data.to_csv('output_files/data.csv', index=False, header=False)

check()

`python`

SettingWithCopyWarning

The csv file was edited as intended:

`data.csv`


apple,9
orange,15
banana,4

However, a SettingWithCopyWarning was issued.
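To see what each step of check_data.py produces, the intermediate values can be traced in memory. This is only a sketch: it reads the same four rows from a string via io.StringIO instead of output_files/data.csv, and the names `csv_text`, `counts`, and `deduped` are introduced here for illustration.

```python
import io

import pandas as pd

# Same rows as data.csv, supplied as a string instead of a file.
csv_text = "apple,8\norange,15\nbanana,4\napple,1\n"
df = pd.read_csv(io.StringIO(csv_text), names=["fru_name", "count"])

# value_counts() sorts by frequency, so the first entry is the duplicated fruit.
counts = df["fru_name"].value_counts()
vc = counts.index[0]
fre = int(counts.iat[0])
print(vc, fre)  # apple 2

# drop_duplicates() keeps the first row for each fruit by default.
deduped = df.drop_duplicates(subset="fru_name")
print(deduped["fru_name"].tolist())  # ['apple', 'orange', 'banana']
print(deduped["count"].tolist())     # [8, 15, 4]
```

So the apple,1 row is the one that gets dropped, and the remaining apple,8 row is the one whose count is then incremented.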

Improvement

To make it explicit with copy() that new_data is a copy, not a reference, change

`python`


new_data = df.drop_duplicates(subset='fru_name')

to

`python`


# Use copy() to make it explicitly a copy, not a reference
new_data = df.drop_duplicates(subset='fru_name').copy()

`check_data2.py`


import pandas as pd

def check():
    df = pd.read_csv('output_files/data.csv', names=['fru_name', 'count'])
    # Store the most frequent (duplicated) fruit in vc
    vc = df['fru_name'].value_counts().index[0]
    # Store its number of occurrences in fre
    fre = df['fru_name'].value_counts().iat[0]

    if fre > 1:
        # Remove the duplicated fruit rows and store the result in new_data
        new_data = df.drop_duplicates(subset='fru_name').copy()

        # Store the original count of the duplicated fruit in dup_count
        dup_count = int(new_data.loc[new_data['fru_name'] == vc, 'count'].iat[0])
        # Add 1 to dup_count
        dup_count += 1
        new_data.loc[new_data['fru_name'] == vc, 'count'] = dup_count

        # Overwrite data.csv
        new_data.to_csv('output_files/data.csv', index=False, header=False)

check()

With this change, the csv file is edited as intended and SettingWithCopyWarning is no longer issued.
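As a side note, here is a different technique that might avoid the extract-then-assign step altogether: grouping by fruit and summing the counts. This is not what check_data2.py does (that script adds 1 to the first row's count, while this sketch adds the duplicate rows together), although both give apple,9 for this sample data. The in-memory DataFrame and the names `merged` and `rows` are assumptions for illustration.

```python
import pandas as pd

# In-memory stand-in for the article's data.csv.
df = pd.DataFrame({"fru_name": ["apple", "orange", "banana", "apple"],
                   "count": [8, 15, 4, 1]})

# sort=False keeps fruits in order of first appearance;
# as_index=False keeps fru_name as a regular column.
merged = df.groupby("fru_name", sort=False, as_index=False)["count"].sum()

rows = list(zip(merged["fru_name"].tolist(), merged["count"].tolist()))
print(rows)  # [('apple', 9), ('orange', 15), ('banana', 4)]
```

Writing `merged` out with to_csv(index=False, header=False) would then produce the same file layout as before.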

[PYTHON] pandas SettingWithCopyWarning
