Reputation: 453
I have dataset which has columns:
Unnamed: 0 int64
id object
number_from object
number_to object
time object
And I applied this function on it: data1.groupby('number_from').apply(lambda x: x.sort_values('time'))
I get something like this:
Unnamed id number_from number_to time
17699 d20b3e 934 674 2017-07-03 06:36:20.000
17700 d20b81 934 674 2017-07-03 06:36:22.000
17701 d20b96 934 674 2017-07-03 06:36:23.000
**17703 d20c17 612 235 2017-07-03 06:36:28.000**
17707 d20db5 934 658 2017-07-03 06:36:45.000
17708 d20de9 934 658 2017-07-03 06:36:47.000
17710 d20e05 934 658 2017-07-03 06:36:49.000
17711 d20e41 934 658 2017-07-03 06:36:51.000
17712 d20e73 934 658 2017-07-03 06:36:53.000
17713 d20ecc 934 702 2017-07-03 06:36:57.000
17714 d20ef1 934 702 2017-07-03 06:36:59.000
17715 d20f32 934 702 2017-07-03 06:37:01.000
17716 d20f77 934 702 2017-07-03 06:37:03.000
17717 d20f8d 934 702 2017-07-03 06:37:05.000
17718 d20fd8 934 262 2017-07-03 06:37:08.000
17719 d21017 934 262 2017-07-03 06:37:11.000
17720 d21032 934 262 2017-07-03 06:37:12.000
17721 d2103e 934 262 2017-07-03 06:37:13.000
17722 d2106d 934 262 2017-07-03 06:37:15.000
**17723 d210c4 396 048 2017-07-03 06:37:19.000**
17725 d21147 934 691 2017-07-03 06:37:24.000
17726 d21167 934 691 2017-07-03 06:37:26.000
Note: this is just the subset of the values in the dataframe that I have. It just sorted values and it did not grouped.
And I want to group and to sort time values in each group independently of others.
I also tried data1.sort_values(['time'], ascending=False).groupby('number_from')
but the result is same. What I'm doing wrong?
Upvotes: 2
Views: 573
Reputation: 862511
I believe you need sorting by multiple columns together, but first convert column time
to datetimes:
data1['time'] = pd.to_datetime(data1['time'])
#if need to compare integers
data1['number_from'] = data1['number_from'].astype(int)
data1.sort_values(['number_from','time'])
EDIT:
Tested both solution, working same:
data1['time'] = pd.to_datetime(data1['time'])
df1 = data1.sort_values(['number_from','time'])
print (df1)
id number_from number_to time
Unnamed
17723 d210c4 396 48 2017-07-03 06:37:19
17703 d20c17 612 235 2017-07-03 06:36:28
17699 d20b3e 934 674 2017-07-03 06:36:20
17700 d20b81 934 674 2017-07-03 06:36:22
17701 d20b96 934 674 2017-07-03 06:36:23
17707 d20db5 934 658 2017-07-03 06:36:45
17708 d20de9 934 658 2017-07-03 06:36:47
17710 d20e05 934 658 2017-07-03 06:36:49
17711 d20e41 934 658 2017-07-03 06:36:51
17712 d20e73 934 658 2017-07-03 06:36:53
17713 d20ecc 934 702 2017-07-03 06:36:57
17714 d20ef1 934 702 2017-07-03 06:36:59
17715 d20f32 934 702 2017-07-03 06:37:01
17716 d20f77 934 702 2017-07-03 06:37:03
17717 d20f8d 934 702 2017-07-03 06:37:05
17718 d20fd8 934 262 2017-07-03 06:37:08
17719 d21017 934 262 2017-07-03 06:37:11
17720 d21032 934 262 2017-07-03 06:37:12
17721 d2103e 934 262 2017-07-03 06:37:13
17722 d2106d 934 262 2017-07-03 06:37:15
17725 d21147 934 691 2017-07-03 06:37:24
17726 d21167 934 691 2017-07-03 06:37:26
Added parameter group_keys=False)
for avoid creating level from number_from
column:
df2 = data1.groupby('number_from', group_keys=False).apply(lambda x: x.sort_values('time'))
print (df2)
id number_from number_to time
Unnamed
17723 d210c4 396 48 2017-07-03 06:37:19
17703 d20c17 612 235 2017-07-03 06:36:28
17699 d20b3e 934 674 2017-07-03 06:36:20
17700 d20b81 934 674 2017-07-03 06:36:22
17701 d20b96 934 674 2017-07-03 06:36:23
17707 d20db5 934 658 2017-07-03 06:36:45
17708 d20de9 934 658 2017-07-03 06:36:47
17710 d20e05 934 658 2017-07-03 06:36:49
17711 d20e41 934 658 2017-07-03 06:36:51
17712 d20e73 934 658 2017-07-03 06:36:53
17713 d20ecc 934 702 2017-07-03 06:36:57
17714 d20ef1 934 702 2017-07-03 06:36:59
17715 d20f32 934 702 2017-07-03 06:37:01
17716 d20f77 934 702 2017-07-03 06:37:03
17717 d20f8d 934 702 2017-07-03 06:37:05
17718 d20fd8 934 262 2017-07-03 06:37:08
17719 d21017 934 262 2017-07-03 06:37:11
17720 d21032 934 262 2017-07-03 06:37:12
17721 d2103e 934 262 2017-07-03 06:37:13
17722 d2106d 934 262 2017-07-03 06:37:15
17725 d21147 934 691 2017-07-03 06:37:24
17726 d21167 934 691 2017-07-03 06:37:26
print (df1.equals(df2))
True
Upvotes: 2