Pandas pivot table and sort by multiple values

Question

I am trying to sort this table first by IN_FID ascending from top to bottom, and then by Jurisdiction, ascending from left to right. I was able to pivot the table and sort by IN_FID, but how do I add a second sort from left to right.

df = pd.read_csv(r'C:my\path\myfile.csv')

df['Key']=df.groupby('IN_FID').cumcount()+1
s=df.pivot_table(index='IN_FID',columns='Key',values=['Jurisdiction','CURR_VOL'],aggfunc='first')
s=s.sort_index(level=1,axis=1)
s.columns=s.columns.map('{0[0]}_{0[1]}'.format)                   

s.to_csv(r'C:\my\path\mynewfile.csv')

Where myfile.csv looks like this:

ROUTE_NAME  CURR_VOL    IN_FID  NEAR_RANK   Jurisdiction
test1       test1       1       test1       2
test1       test1       1       test1       3
test2       test2       2       test2       1
test3       test3       3       test3       2
test3       test3       3       test3       1

And mynewfile.csv would look like this:

IN_FID  CURR_VOL_1  Jurisdiction_1  CURR_VOL_2  Jurisdiction_2
1       test1       2               test1       3
2       test2       1       
3       test3       1               test3       2

Currently mynewfile.csv looks like this:

IN_FID  CURR_VOL_1  Jurisdiction_1  CURR_VOL_2  Jurisdiction_2
1       test1       2               test1       3
2       test2       1       
3       test3       2               test3       1

Any tips would be greatly appreciated.

Sumanth · Accepted Answer

You can try this using groupby on IN_FID and then do unstack().

df_new = df.sort_values(['IN_FID','Jurisdiction']) \
         .groupby('IN_FID')['CURR_VOL','Jurisdiction'] \
         .apply(lambda x: pd.DataFrame(x.values, columns['CURR_VOL','Jurisdiction'])) \
         .unstack().sort_index(1, level=1)

df_new.columns = df_new.columns.droplevel(1)
df_new.reset_index(inplace=True)

Output looks like this

IN_FID  CURR_VOL    Jurisdiction    CURR_VOL    Jurisdiction
1       test1       2               test1       3
2       test2       1               None        None
3       test3       1               test3       2

Now you can use this df_new by renaming the columns as you like.

Pandas pivot table and sort by multiple values

Answers (1)

Related Questions