Reputation: 33
I have the following code and I'm not sure how to rewrite it to avoid the SettingWithCopyWarning. Or should I just disable the warning?
The code works. I just want to assign the left attribute of each pd.cut interval to a new column if the number is positive, and the right attribute if it is negative.
import numpy as np
import pandas as pd
bins = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])
test_data = [{"ID": 1, "Value": -0.5}, {"ID": 2, "Value": 1.5}]
df = pd.DataFrame(test_data)
df["Bin"] = 0.0
df["Bin"][df["Value"] > 0.0] = [d['left'] for d in [{fn: getattr(f, fn) for fn in ['left']} for f in pd.cut(df["Value"], bins)]]
df["Bin"][df["Value"] < 0.0] = [d['right'] for d in [{fn: getattr(f, fn) for fn in ['right']} for f in pd.cut(df["Value"], bins)]]
print(df)
Running the code produces:
test.py:11: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df["Bin"][df["Value"] > 0.0] = [d['left'] for d in [{fn: getattr(f, fn) for fn in ['left']} for f in pd.cut(df["Value"], bins)]]
test.py:12: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df["Bin"][df["Value"] < 0.0] = [d['right'] for d in [{fn: getattr(f, fn) for fn in ['right']} for f in pd.cut(df["Value"], bins)]]
   ID  Value  Bin
0   1   -0.5 -0.5
1   2    1.5  1.0
Upvotes: 0
Views: 637
Reputation: 1441
Try this:
Edit: if all values are positive, pd.cut(df.loc[df["Value"]<0,'Value'], bins, labels=bins[1:]) returns an empty Series (Series([], Name: Value, dtype: category)), which raises an error on assignment.
Suppressing the ValueError, which works like a simple try/except, should avoid that:
from contextlib import suppress
# Positive values: label each bin with its left edge
with suppress(ValueError):
    df.loc[df["Value"] > 0.0, "Bin"] = pd.cut(df.loc[df["Value"] > 0, "Value"], bins, labels=bins[:-1])
# Negative values: label each bin with its right edge
with suppress(ValueError):
    df.loc[df["Value"] < 0.0, "Bin"] = pd.cut(df.loc[df["Value"] < 0, "Value"], bins, labels=bins[1:])
Btw, here labels=bins[:-1] and labels=bins[1:] are doing the job of left and right in your original code.
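If the two suppress blocks feel heavy, one possible alternative is to cut the whole column once per edge and then pick a value per row with np.select, so no selection is ever empty. This is only a sketch built on the question's sample data; the names left_edge and right_edge are mine, and rows equal to 0.0 keep 0.0 as in the original code.

```python
import numpy as np
import pandas as pd

bins = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])
df = pd.DataFrame([{"ID": 1, "Value": -0.5}, {"ID": 2, "Value": 1.5}])

# Label every bin with its left edge, then with its right edge,
# and convert the categorical results to plain floats.
left_edge = pd.cut(df["Value"], bins, labels=bins[:-1]).astype(float)
right_edge = pd.cut(df["Value"], bins, labels=bins[1:]).astype(float)

# Positive rows take the left edge, negative rows the right edge,
# anything else (exactly 0.0) falls back to 0.0.
df["Bin"] = np.select(
    [(df["Value"] > 0.0).to_numpy(), (df["Value"] < 0.0).to_numpy()],
    [left_edge.to_numpy(), right_edge.to_numpy()],
    default=0.0,
)
print(df)
```

Because the column is assigned in one statement on df itself, there is no chained indexing and so nothing for SettingWithCopyWarning to complain about.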
Upvotes: 1
Reputation: 1862
You should replace the chained indexing with loc:
df.loc[df["Value"] > 0.0, "Bin"] = [d['left'] for d in [{fn: getattr(f, fn) for fn in ['left']} for f in pd.cut(df["Value"], bins)]]
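Note that the list on the right-hand side has one entry per row of df, which may not match the number of rows the mask selects. A minimal sketch of one way to keep both sides consistent, assuming the question's sample data: build index-aligned Series of the interval edges first, then filter them with the same mask. The names intervals, left, right, positive and negative are illustrative, not from the original post.

```python
import numpy as np
import pandas as pd

bins = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])
df = pd.DataFrame([{"ID": 1, "Value": -0.5}, {"ID": 2, "Value": 1.5}])

# Turn the categorical result of pd.cut into an IntervalIndex so the
# left/right edges are available as whole arrays.
intervals = pd.IntervalIndex(pd.cut(df["Value"], bins))
left = pd.Series(intervals.left.to_numpy(), index=df.index)
right = pd.Series(intervals.right.to_numpy(), index=df.index)

positive = df["Value"] > 0.0
negative = df["Value"] < 0.0

df["Bin"] = 0.0
# A single .loc call selects rows and column together, and the
# right-hand side is filtered with the same mask.
df.loc[positive, "Bin"] = left[positive]
df.loc[negative, "Bin"] = right[negative]
print(df)
```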
Upvotes: 0