ttina
ttina

Reputation: 97

How to sort a DataFrame based on a column

I have a dataframe and I group by it based on a value of a column. Here is what I have tried and it works.

grouped_df = df.groupby(["sender"])

However I want for each group based on 'rcvTime' column, the data of each group gets sorted in ascending order, here is what I have tried and it doesn't work.

for key,item in grouped_df:
  a_group = grouped_df[["sender","rcvTime","pos_x","pos_y","pos_z","spd_x","spd_y","spd_z","acl_x","acl_y","acl_z","hed_x","hed_y","hed_z"]].get_group(key)    
  a_group.sort_values("rcvTime")

Here is my original sample data:

row number,type,rcvTime,sender,pos_x,pos_y,pos_z,spd_x,spd_y,spd_z,acl_x,acl_y,acl_z,hed_x,hed_y,hed_z
0,2,26062.0,2325,766.13,389.14,0.0,-1.97,-0.69,0.0,-1.87,-0.65,0.0,-0.94,-0.33,0.0
1,3,26062.02,2217,403.53,273.79,0.0,12.47,4.31,0.0,0.5,0.17,0.0,0.96,0.28,0.0
2,3,26062.08,2103,904.46,445.78,0.0,13.21,3.84,0.0,0.96,0.28,0.0,0.97,0.24,0.0
3,3,26062.11,1995,893.31,443.23,0.0,0.93,0.33,0.0,0.96,0.28,0.0,0.97,0.25,0.0
4,3,26062.27,1995,893.61,438.85,0.0,0.95,0.27,0.0,0.96,0.28,0.0,0.97,0.23,0.0
5,3,26062.44,1995,880.28,439.89,0.0,0.92,0.24,0.0,0.96,0.28,0.0,0.97,0.25,0.0
6,3,26062.61,1995,881.19,435.89,0.0,0.94,0.28,0.0,0.96,0.28,0.0,0.97,0.23,0.0
7,3,26062.77,1995,869.6,437.17,0.0,0.93,0.24,0.0,0.96,0.28,0.0,0.97,0.25,0.0
8,3,26062.94,1995,715.11,371.04,0.0,-13.35,-4.66,0.0,-0.13,-0.05,0.0,-0.99,0.12,0.0
9,2,26063.0,2325,763.44,388.09,0.0,-3.79,-1.32,0.0,-1.96,-0.68,0.0,-0.95,-0.31,0.0
10,3,26063.02,2217,415.89,278.11,0.0,12.41,4.29,0.0,-0.17,-0.06,0.0,0.96,0.29,0.0
11,3,26063.08,2103,914.7,452.66,0.0,5.09,12.75,0.0,0.08,0.21,0.0,0.41,0.91,0.0
12,3,26063.27,1995,913.5,439.57,0.0,0.08,0.21,0.0,0.08,0.21,0.0,0.41,0.91,0.0
13,3,26063.94,1995,701.54,366.36,0.0,-13.39,-4.67,0.0,0.52,0.18,0.0,-0.99,0.14,0.0
...

And here is one of group result which is not sorted:

a_group:
         sender  rcvTime   pos_x   pos_y  pos_z  spd_x  spd_y  spd_z  acl_x  \
195997        9  25268.6  233.06  799.33    0.0  -2.36  14.31    0.0   0.06   
196001        9  25269.6  230.79  813.62    0.0  -2.36  14.36    0.0  -0.10   
196005        9  25270.6  228.36  827.93    0.0  -2.28  14.30    0.0   0.12   
196009        9  25271.6  226.15  842.24    0.0  -2.25  14.30    0.0   0.08   
196013        9  25272.6  224.15  856.55    0.0  -2.18  14.13    0.0   0.23   
...         ...      ...     ...     ...    ...    ...    ...    ...    ...   
2074950       9  25229.6  220.33  382.88    0.0  -0.99   6.64    0.0   0.60   
2074955       9  25230.6  219.61  388.16    0.0  -0.62   4.15    0.0   0.48   
2074960       9  25231.6  219.76  392.18    0.0   0.06   4.94    0.0   0.03   
2074965       9  25232.6  222.61  397.54    0.0   3.77   5.95    0.0   0.84   
2074970       9  25233.6  229.14  402.37    0.0   7.37   5.10    0.0   1.94   

         acl_y  acl_z  hed_x  hed_y  hed_z  
195997   -0.37    0.0  -0.07   1.00    0.0  
196001    0.58    0.0  -0.07   1.00    0.0  
196005   -0.75    0.0  -0.07   1.00    0.0  
196009   -0.51    0.0  -0.08   1.00    0.0  
196013   -1.47    0.0  -0.07   1.00    0.0  
...        ...    ...    ...    ...    ...  
2074950  -4.04    0.0  -0.07   1.00    0.0  
2074955  -3.20    0.0  -0.07   1.00    0.0  
2074960   2.60    0.0   0.09   1.00    0.0  
2074965   1.33    0.0   0.59   0.80    0.0  
2074970   1.34    0.0   0.86   0.51    0.0 

Upvotes: 1

Views: 47

Answers (1)

safex
safex

Reputation: 2514

While you should provide sample data , in this simple case you can simply sort twice:

df = df.sort_values(['sender', 'rcvTime'], ascending=[True, False])

You can change the flags in ascending to change the ordering, see also the documentation.

Upvotes: 1

Related Questions