Reputation: 61
I have read many posts but have not been successful. I have a column 'percent' that I wish to put into categories 1, 2, 3, 4. The DataFrame is called 'data'. I tried:
for i in data.index:
    if i > 0.7:
        df.at[i,"percent"] =1
    if i <0.7 and i>0:
        df.at[i, "percent"] = 2
    if i <0 and i > -0.4:
        df.at[i, "percent"] = 3
    if i < 0.4:
        df.at[i, "percent"] = 4
But it looks like everything is replaced with 1. What am I doing wrong?
Upvotes: 1
Views: 271
Reputation: 391
import pandas as pd
import numpy as np
df = pd.DataFrame([[0.4,"x"],[0.5,"x"], [0.6,"y"], [0.7,"z"], [0.8,"z"]], columns=["pc","val"])
df['pc_quant'] = np.digitize(df['pc'], [.4, .7])
print(df)
gives you:
    pc val  pc_quant
0  0.4   x         1
1  0.5   x         1
2  0.6   y         1
3  0.7   z         2
4  0.8   z         2
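As for what went wrong in the loop: it compares the index labels `i` against the thresholds rather than the values in 'percent', so with a default integer index (0, 1, 2, ...) nearly every row satisfies `i > 0.7` and is set to 1. It also writes to `df` while the DataFrame is called `data`. For the four categories in the question, a minimal sketch with `pd.cut` might look like this. It assumes the last test was meant to be `< -0.4`, and the sample values are made up because the real data is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values; the asker's real data is not shown.
data = pd.DataFrame({"percent": [0.9, 0.3, -0.2, -0.8]})

# Bin edges taken from the question, assuming the last test was meant to be "< -0.4".
edges = [-np.inf, -0.4, 0.0, 0.7, np.inf]

# pd.cut uses right-closed intervals by default:
# (-inf, -0.4], (-0.4, 0.0], (0.0, 0.7], (0.7, inf]
# Labels follow the order of the intervals, so the lowest bin gets 4.
data["category"] = pd.cut(data["percent"], bins=edges, labels=[4, 3, 2, 1]).astype(int)

print(data)
```

The result goes into a new column (`category`, a name chosen here for illustration) so the original values are not overwritten. Values exactly on an edge (for example 0.7) fall into the lower interval; pass `right=False` to `pd.cut` if the opposite is wanted.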
Upvotes: 1