Reputation: 279
I need to group by pandas with weight
Name weight(kg)
Person1 4.44
Person2 37.3
Person3 36.38
Person4 39.52
Person5 81.57
Person6 43.55
Person7 91.11
Person8 5
Person9 36.48
Person10 38.25
My code is below. Need to group by according to if condition. My code is below. But I am getting 0 to 20 for all the rows.
if 0 <= data_file['weight(kg)'].all() < 20:
data_file['target'] = "0 to 20%"
if 20 < data_file['weight(kg)'].all() < 40:
data_file['target'] = "20 to 40%"
if 40 < data_file['weight(kg)'].all() < 60:
data_file['target'] = "40 to 60%"
if 60 < data_file['weight(kg)'].all() < 80:
data_file['target'] = "60 to 80%"
if 80 < data_file['weight(kg)'].all() <= 100:
data_file['target'] = "80 to 100%"
Expected out
Name weight(kg) Target
Person1 4.44 0 to 20
Person2 37.3 20 to 40
Person3 36.38 20 to 40
Person4 39.52 20 to 40
Person5 81.57 80 to 100
Person6 43.55 40 to 60
Person7 91.11 80 to 100
Person8 5 0 to 20
Person9 36.48 20 to 40
Person10 38.25 20 to 40
Upvotes: 2
Views: 1672
Reputation: 2811
You can do it using np.select
conditions = [
(0 <= df['weight(kg)']) & (df['weight(kg)'] < 20)
, (20 < df['weight(kg)']) & (df['weight(kg)'] < 40)
, (40 < df['weight(kg)']) & (df['weight(kg)'] < 60)
, (60 < df['weight(kg)']) & (df['weight(kg)'] < 80)
, (80 < df['weight(kg)']) & (df['weight(kg)'] <= 100)
]
results = [
"0 to 20%"
,"20 to 40%"
,"40 to 60%"
,"60 to 80%"
,"80 to 100%"
]
df['Target'] = np.select(conditions, results)
output:
Name weight(kg) Target
0 Person1 4.44 0 to 20%
1 Person2 37.30 20 to 40%
2 Person3 36.38 20 to 40%
3 Person4 39.52 20 to 40%
4 Person5 81.57 80 to 100%
Upvotes: 3
Reputation: 113
Very simple, just try using apply in pandas and lambda function:
def classify(x):
if 0 <= x < 20:
y = "0 to 20%"
if 20 < x < 40:
y = "20 to 40%"
if 40 < x < 60:
y = "40 to 60%"
if 60 < x < 80:
y = "60 to 80%"
if 80 < x <= 100:
y = "80 to 100%"
return y
the assuming your Dataframe has two columns 'Name' and 'Weight' we should type:
df['Target'] = df['weight'].apply(lambda x: classify(x))
I hope it helps
Extra: if you want progress bar you can add these lines:
from tqdm import tqdm
tqdm.pandas()
df['Target'] = df['weight'].progress_apply(lambda x: classify(x))
Upvotes: 3
Reputation: 3224
Use pd.cut
df.assign(bins = pd.cut(df["weight(kg)"], [0,20,40,60,80,100],
labels=['0 to 20', '20 to 40', '40 to 60', '60 to 80', '80 to 100']))
result
Name weight(kg) bins
0 Person1 4.44 0 to 20
1 Person2 37.30 20 to 40
2 Person3 36.38 20 to 40
3 Person4 39.52 20 to 40
4 Person5 81.57 80 to 100
5 Person6 43.55 40 to 60
6 Person7 91.11 80 to 100
7 Person8 5.00 0 to 20
8 Person9 36.48 20 to 40
9 Person10 38.25 20 to 40
Upvotes: 6