Reputation: 1
I am trying to categorize the "DistArea_ID" column of my dataset, cf_all, by grouping the areas according to their mean of Y (Counterfeit_Sales).
Code:
round(cf_all.groupby("DistArea_ID")["Counterfeit_Sales"].mean(), 2)
for col in range(len(cf_all)):
    if cf_all["DistArea_ID"][col] in \
            ["Area013", "Area017", "Area018", "Area035", "Area045", "Area046", "Area049"]:
        cf_all.loc[col, "DistArea_ID"] = "DistArea_2000"
    if cf_all["DistArea_ID"][col] in ["Area010", "Area019"]:
        cf_all.loc[col, "DistArea_ID"] = "DistArea_400"
    if cf_all["DistArea_ID"][col] in ["Area027"]:
        cf_all.loc[col, "DistArea_ID"] = "DistArea_3000"
Error:
The truth value of a Series is ambiguous. Use a.empty, a.bool(),
a.item(), a.any() or a.all().
Can someone please guide me with this error?
Upvotes: 0
Views: 83
Reputation: 862406
I suggest using Series.replace, or Series.map with Series.fillna:
d = {"DistArea_2000": ["Area013", "Area017", "Area018",
                       "Area035", "Area045", "Area046", "Area049"],
     "DistArea_400": ["Area010", "Area019"],
     "DistArea_3000": ["Area027"]}
# swap keys and values in the dict, so each area maps to its group label
# http://stackoverflow.com/a/31674731/2901002
d1 = {k: oldk for oldk, oldv in d.items() for k in oldv}
cf_all["DistArea_ID"] = cf_all["DistArea_ID"].replace(d1)
# alternative, usually faster: map, then fill the unmapped values back in
#cf_all["DistArea_ID"] = cf_all["DistArea_ID"].map(d1).fillna(cf_all["DistArea_ID"])
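
To check that both variants give the same result, here is a minimal sketch on toy data. The small cf_all frame, the "Area099" value, and the sales figures are made up for illustration and are not from the question; only the area groups come from the code above.

```python
import pandas as pd

# Hypothetical toy frame standing in for cf_all (values are invented).
cf_all = pd.DataFrame({
    "DistArea_ID": ["Area013", "Area010", "Area027", "Area099"],
    "Counterfeit_Sales": [2000.0, 400.0, 3000.0, 1500.0],
})

# Group label -> list of areas, as in the answer.
d = {"DistArea_2000": ["Area013", "Area017", "Area018",
                       "Area035", "Area045", "Area046", "Area049"],
     "DistArea_400": ["Area010", "Area019"],
     "DistArea_3000": ["Area027"]}

# Invert it: area -> group label.
d1 = {area: group for group, areas in d.items() for area in areas}

# Variant 1: replace leaves values without a dict entry unchanged.
replaced = cf_all["DistArea_ID"].replace(d1)

# Variant 2: map turns unmatched values into NaN, so fill them
# back in from the original column.
mapped = cf_all["DistArea_ID"].map(d1).fillna(cf_all["DistArea_ID"])

print(replaced.tolist())
# ['DistArea_2000', 'DistArea_400', 'DistArea_3000', 'Area099']
print(mapped.tolist() == replaced.tolist())
# True
```

"Area099" is not in the dict, so it stays as it is in both variants. Neither variant loops over rows, so no single if statement ever has to evaluate a whole Series as True or False.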
Upvotes: 1