Reputation: 15
I am trying to create a new column whose value depends on which range another column's value falls into. However, I get ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I use the same column twice in one condition to define the range, but it does not work. What is the problem?
df.loc[(df["count_words"] > 100 & df["count_words"] <= 300), "length"] = "keskipitkä"
df.loc[df["count_words"] <= 100, "length"] = "lyhyt"
df.loc[df["count_words"] > 300, "length"] = "pitkä"
Upvotes: 1
Views: 101
Reputation: 862406
The problem is missing parentheses. Because of operator precedence, & binds more tightly than > and <=, so each comparison has to be wrapped in ():
df.loc[(df["count_words"] > 100) & (df["count_words"] <= 300), "length"] = "keskipitkä"
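To see why the parentheses matter, here is a minimal sketch on a made-up Series (the sample values are an assumption for illustration, not data from the question). Without parentheses, Python reads the condition as the chained comparison s > (100 & s) <= 300, which needs the truth value of a whole Series and so raises the ValueError.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
s = pd.Series([10, 100, 200, 300, 4999], name="count_words")

# Without parentheses: & binds tighter than > and <=, so this becomes the
# chained comparison s > (100 & s) <= 300, which calls bool() on a Series
try:
    s > 100 & s <= 300
    raised = False
except ValueError:
    raised = True

# With parentheses each comparison runs first, then & combines them elementwise
mask = (s > 100) & (s <= 300)

print(raised)
print(mask.tolist())
```

The mask built with parentheses is an ordinary boolean Series, so it can be passed straight to df.loc.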
Another option is to use pd.cut:
import numpy as np
import pandas as pd

df = pd.DataFrame({'count_words': [10, 100, 200, 300, 4999]})
# Bins are right-closed by default: (-inf, 100], (100, 300], (300, inf)
df["length"] = pd.cut(df["count_words"],
                      bins=[-np.inf, 100, 300, np.inf],
                      labels=['lyhyt', 'keskipitkä', 'pitkä'])
print(df)
   count_words      length
0           10       lyhyt
1          100       lyhyt
2          200  keskipitkä
3          300  keskipitkä
4         4999       pitkä
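As a quick sanity check, the sketch below compares the two approaches on the same sample values used above. It assumes the default right=True for pd.cut, so that the bin edges 100 and 300 fall into the lower bin, matching the <= conditions in the .loc version.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"count_words": [10, 100, 200, 300, 4999]})

# Approach 1: boolean masks with .loc, each comparison in parentheses
df.loc[df["count_words"] <= 100, "length"] = "lyhyt"
df.loc[(df["count_words"] > 100) & (df["count_words"] <= 300), "length"] = "keskipitkä"
df.loc[df["count_words"] > 300, "length"] = "pitkä"

# Approach 2: pd.cut with right-closed bins (right=True is the default)
binned = pd.cut(df["count_words"],
                bins=[-np.inf, 100, 300, np.inf],
                labels=["lyhyt", "keskipitkä", "pitkä"])

# Compare the labels as plain Python strings
same = df["length"].tolist() == binned.astype(str).tolist()
print(same)
```

One difference worth keeping in mind: pd.cut returns a categorical column, while the .loc assignments produce plain strings.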
Upvotes: 1