Reputation: 47
Is there a pythonic way of sorting the dataframe described below by a combo key - first by ORD_DT_KEY then by ORD_TM_KEY and then by ORD_KEY; and store the rank in another column ORD_RANK?
Code below does not work since it creates 3 separate ranks and throws an error when I try to assign it to single column.
import pandas as pd
#Read orders data stored in a pipe delimeted file
df_raw_orders = pd.read_table("Dummy_order_data.txt", delimiter="|")
print df_raw_orders.head()
df_sorted_orders = df_raw_orders
cols = ['ORD_DT_KEY','ORD_TM_KEY','ORD_KEY']
df_sorted_orders['ORD_RANK'] = df_sorted_orders[cols].rank(method='dense')
Input data, Dummy_order_data.txt, is as below:
ORD_KEY|ORD_DT_KEY|ORD_TM_KEY|QTY|SKU_KEY
10001|1|0|1|1
10001|1|0|1|2
10001|1|0|1|3
10002|2|0|1|1
10002|2|0|1|3
10003|3|0|1|4
10004|4|0|1|4
10004|4|0|1|5
10005|5|0|1|1
10006|6|0|1|1
10006|6|0|1|4
10007|7|0|1|3
10007|7|0|1|4
10008|8|0|1|5
10009|9|0|1|1
10009|9|0|1|4
10009|9|0|1|5
10010|10|0|2|1
10010|10|0|2|2
10010|10|0|2|3
10011|11|0|1|1
10011|11|0|1|3
10012|12|0|1|4
10012|12|0|1|5
10013|13|0|1|1
10014|14|0|2|1
10014|14|0|2|4
Required output:
ORD_KEY ORD_DT_KEY ORD_TM_KEY SKU_KEY QTY ORD_RANK
10001 1 0 1 1 1
10001 1 0 2 1 1
10001 1 0 3 1 1
10002 2 0 1 1 2
10002 2 0 3 1 2
10003 3 0 4 1 3
10004 4 0 4 1 4
10004 4 0 5 1 4
10005 5 0 1 1 5
10006 6 0 1 1 6
10006 6 0 4 1 6
10007 7 0 3 1 7
10007 7 0 4 1 7
10008 8 0 5 1 8
10009 9 0 1 1 9
10009 9 0 4 1 9
10009 9 0 5 1 9
10010 10 0 1 2 10
10010 10 0 2 2 10
10010 10 0 3 2 10
10011 11 0 1 1 11
10011 11 0 3 1 11
10012 12 0 4 1 12
10012 12 0 5 1 12
10013 13 0 1 1 13
10014 14 0 1 2 14
10014 14 0 4 2 14
Upvotes: 1
Views: 297
Reputation: 323236
Cause in your left there return 3 columns so you have to assign those back two 3 columns rather than one
for x in cols:
df[x + 'rank'] = df[x].rank(method='dense')
Upvotes: 2