Isaac B

Reputation: 33

Avoid SettingWithCopyWarning in Pandas

I have the following code and I'm not sure how to rewrite it to avoid the SettingWithCopyWarning. Or should I just disable the warning?

The code works. I just want to assign the left attribute of each pd.cut interval to a new column if the number is positive, and the right attribute if it is negative.

import numpy as np
import pandas as pd


bins = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])
test_data = [{"ID": 1, "Value": -0.5}, {"ID": 2, "Value": 1.5}]

df = pd.DataFrame(test_data)

df["Bin"] = 0.0
df["Bin"][df["Value"] > 0.0] = [d['left'] for d in [{fn: getattr(f, fn) for fn in ['left']} for f in pd.cut(df["Value"], bins)]]
df["Bin"][df["Value"] < 0.0] = [d['right'] for d in [{fn: getattr(f, fn) for fn in ['right']} for f in pd.cut(df["Value"], bins)]]

print(df)

Running the code produces

test.py:11: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["Bin"][df["Value"] > 0.0] = [d['left'] for d in [{fn: getattr(f, fn) for fn in ['left']} for f in pd.cut(df["Value"], bins)]]
test.py:12: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["Bin"][df["Value"] < 0.0] = [d['right'] for d in [{fn: getattr(f, fn) for fn in ['right']} for f in pd.cut(df["Value"], bins)]]
   ID  Value  Bin
0   1   -0.5 -0.5
1   2    1.5  1.0

Upvotes: 0

Views: 637

Answers (2)

Partha Mandal

Reputation: 1441

Try this:

Edit:

If all values are positive, pd.cut(df.loc[df["Value"]<0,'Value'], bins, labels=bins[1:]) returns an empty result (Series([], Name: Value, dtype: category)), and the assignment then raises an error.

Suppressing the ValueError with contextlib.suppress (equivalent to a simple try/except) should avoid that:

from contextlib import suppress

# Positive values: label each bin with its left edge
with suppress(ValueError):
    df.loc[df["Value"] > 0.0, "Bin"] = pd.cut(df.loc[df["Value"] > 0.0, "Value"], bins, labels=bins[:-1])
# Negative values: label each bin with its right edge
with suppress(ValueError):
    df.loc[df["Value"] < 0.0, "Bin"] = pd.cut(df.loc[df["Value"] < 0.0, "Value"], bins, labels=bins[1:])

By the way, labels=bins[:-1] and labels=bins[1:] do the job of left and right in your original code.
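
To see why those two label arrays stand in for left and right, here is a minimal sketch that bins the question's two values three ways. The names values, as_intervals, left_edges and right_edges are only illustrative, and the conversion to float is an assumption about what the Bin column should hold.

```python
import numpy as np
import pandas as pd

bins = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])
values = pd.Series([-0.5, 1.5])

# With default labels, each element is the interval the value falls into
as_intervals = pd.cut(values, bins)

# bins[:-1] labels every bin with its lower edge, bins[1:] with its upper edge.
# The result is categorical, so it is converted to plain floats here.
left_edges = pd.cut(values, bins, labels=bins[:-1]).astype(float)
right_edges = pd.cut(values, bins, labels=bins[1:]).astype(float)

print(as_intervals.tolist())
print(left_edges.tolist())
print(right_edges.tolist())
```

Because pd.cut closes bins on the right by default, -0.5 lands in the bin from -1.0 to -0.5 and 1.5 in the bin from 1.0 to 1.5, so the left edges are -1.0 and 1.0 and the right edges are -0.5 and 1.5.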

Upvotes: 1

Daniel Geffen

Reputation: 1862

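Replace the chained indexing (df["Bin"][mask] = ...) with a single loc call. For the first assignment, that looks like this, and the second changes the same way: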

df.loc[df["Value"] > 0.0, "Bin"] = [d['left'] for d in [{fn: getattr(f, fn) for fn in ['left']} for f in pd.cut(df["Value"], bins)]]
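
Because the right-hand side above is computed for every row while the mask selects only some of them, another option is to compute both edges for all rows and pick one per row in a single assignment. The following is a rough sketch of that idea, not the only way to do it. It assumes every value lies inside the bins (otherwise pd.cut returns NaN positions and the indexing fails), and idx, left and right are illustrative names.

```python
import numpy as np
import pandas as pd

bins = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])
df = pd.DataFrame([{"ID": 1, "Value": -0.5}, {"ID": 2, "Value": 1.5}])

# labels=False makes pd.cut return the integer position of each value's bin.
# Assumption: all values fall inside the bins, so no position is NaN.
idx = pd.cut(df["Value"], bins, labels=False).to_numpy()

left = bins[idx]       # lower edge of each row's bin
right = bins[idx + 1]  # upper edge of each row's bin

# One whole-column assignment: left edge for positive values,
# right edge for negative values, 0.0 otherwise.
df["Bin"] = np.where(df["Value"] > 0.0, left,
                     np.where(df["Value"] < 0.0, right, 0.0))

print(df)
```

Assigning the whole column at once involves no chained indexing, so there is nothing for SettingWithCopyWarning to flag, and the resulting Bin values match the output shown in the question (-0.5 and 1.0).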

Upvotes: 0
