Reputation: 129
I have a pandas dataframe that looks something like this:
df = pd.DataFrame({'Name' : ['Kate', 'John', 'Peter','Kate', 'John', 'Peter'],'Distance' : [23,16,32,15,31,26], 'Time' : [3,5,2,7,9,4]})
df
Distance Name Time
0 23 Kate 3
1 16 John 5
2 32 Peter 2
3 15 Kate 7
4 31 John 9
5 26 Peter 2
I want to add a column that tells me, for each Name, what's the order of the times.
I want something like this:
Order Distance Name Time
0 16 John 5
1 31 John 9
0 23 Kate 3
1 15 Kate 7
0 32 Peter 2
1 26 Peter 4
I can do it using a for loop:
df2 = df[df['Name'] == 'aaa'].reset_index().reset_index() # I did this just to create an empty data frame with the columns I want
for name, row in df.groupby('Name').count().iterrows():
table = df[df['Name'] == name].sort_values('Time').reset_index().reset_index()
to_concat = [df2,table]
df2 = pd.concat(to_concat)
df2.drop('index', axis = 1, inplace = True)
df2.columns = ['Order', 'Distance', 'Name', 'Time']
df2
This works, the problem is (apart from being very unpythonic), for large tables (my actual table has about 50 thousand rows) it takes about half an hour to run.
Can someone help me write this in a simpler way that runs faster? I'm sorry if this has been answered somewhere, but I didn't really know how to search for it.
Best,
Upvotes: 2
Views: 1065
Reputation: 210832
In [67]: df = df.sort_values(['Name','Time']) \
.assign(Order=df.groupby('Name').cumcount())
In [68]: df
Out[68]:
Distance Name Time Order
1 16 John 5 0
4 31 John 9 1
0 23 Kate 3 0
3 15 Kate 7 1
2 32 Peter 2 0
5 26 Peter 4 1
PS I'm not sure this is the most elegant way to do this...
Upvotes: 2
Reputation: 862481
Use sort_values
with cumcount
:
df = df.sort_values(['Name','Time'])
df['Order'] = df.groupby('Name').cumcount()
print (df)
Distance Name Time Order
1 16 John 5 0
4 31 John 9 1
0 23 Kate 3 0
3 15 Kate 7 1
2 32 Peter 2 0
5 26 Peter 4 1
If need first column use insert
:
df = df.sort_values(['Name','Time'])
df.insert(0, 'Order', df.groupby('Name').cumcount())
print (df)
Order Distance Name Time
1 0 16 John 5
4 1 31 John 9
0 0 23 Kate 3
3 1 15 Kate 7
2 0 32 Peter 2
5 1 26 Peter 4
Upvotes: 2