Reputation: 37
I have a dataframe that is something like this:
   Max  Min   Id
1   10    5  AAA
2   15   10  AAB
3   10    7  AAC
4   20   15  AAD
5   15   10  AAE
I want to add another column that groups rows by their Max and Min values: rows with the same Max and the same Min, such as AAB and AAE, should get the same "classification" or "family".
The result would look something like this:
   Max  Min   Id Family
1   10    5  AAA      J
2   15   10  AAB      K
3   10    7  AAC      L
4   20   15  AAD      M
5   15   10  AAE      K
What's the best way to do this?
Upvotes: 0
Views: 61
Reputation: 786
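Records with the same Max and Min get the same family number. Here the family number is the index label of the first row with that Max/Min pair.
Note that apply with axis=1 rescans the whole dataframe for every row, so it can be slow on larger dataframes.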
def func(row):
    # All matching rows share the same Max, so idxmin() returns the first matching index label.
    return df.loc[(df['Max'] == row['Max']) & (df['Min'] == row['Min'])]['Max'].idxmin()

df['Family'] = df.apply(func, axis=1)
   Max  Min  Family
0   10    5       0
1   15   10       1
2   10    7       2
3   20   15       3
4   15   10       1
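EDIT: A faster way to get the same grouping. The family numbers themselves can differ from the apply version, because ngroup numbers the groups in sorted key order by default:
df['Family'] = df.groupby(['Max', 'Min']).ngroup()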
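For reference, here is a minimal, self-contained sketch of the groupby approach on the sample data from the question. Passing sort=False is my assumption about what is wanted: it numbers the families in order of first appearance instead of sorted key order. The FamilyLabel column and the letter mapping starting at "J" are hypothetical, added only to mirror the letters in the question; that mapping runs out of capital letters after 17 families.

```python
import pandas as pd

# Sample data reconstructed from the question (index 1..5 as shown there).
df = pd.DataFrame(
    {
        "Max": [10, 15, 10, 20, 15],
        "Min": [5, 10, 7, 15, 10],
        "Id": ["AAA", "AAB", "AAC", "AAD", "AAE"],
    },
    index=[1, 2, 3, 4, 5],
)

# sort=False numbers each (Max, Min) group in order of first appearance
# rather than in sorted key order.
df["Family"] = df.groupby(["Max", "Min"], sort=False).ngroup()

# Hypothetical letter labels: 0 -> "J", 1 -> "K", and so on.
df["FamilyLabel"] = df["Family"].map(lambda n: chr(ord("J") + n))

print(df)
```

With this data, AAB and AAE both end up in family 1 (label K), which matches the grouping asked for in the question.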
Upvotes: 1