Reputation: 105
I have data with duplicate parts that looks like this:
Part | Location | ONHand
A | XY | 5
A | XW | 4
B | XC | 6
B | XV | 8
C | XQ | 9
And I'm trying to convert it all into one row per part, listing all the locations and quantities on hand in each location.
I tried using this code
df_f = df.assign(cc=df.groupby('Part').cumcount()+1).set_index(['Part', 'cc']).unstack()
df_f.columns = [f'{col[0]}{col[1]}' for col in df_f.columns]
df_f.to_csv('parts_multi_location.csv')
But the problem is it returns Location 1, 2, 3 and then ONHand 1, 2, 3 and so forth.
I need the end result to return Location 1, Onhand 1, Location 2, Onhand 2, so the headers should look like this:
Part | Location_1 | Onhand_1 | Location 2| Onhand 2
A | XY | 5 | XW | 4
B | XC | 6 | XV | 8
C | XQ | 9
Upvotes: 2
Views: 38
Reputation: 23099
IIUC, Quang's solution is much slicker, but lets see how we could solve this using the Sorted method from the standard library.
if you use
f'{col[0]}_{col[1]}'
as Quang has suggested we can split by the _
to get the numerical value and use that to sort your columns.
d = {k : int(k.split('_')[1]) for k in df_f.columns}
#{'Location_1': 1, 'Location_2': 2, 'ONHand_1': 1, 'ONHand_2': 2}
df_f[sorted(d,key=d.get)]
Location_1 ONHand_1 Location_2 ONHand_2
Part
A XY 5.0 XW 4.0
B XC 6.0 XV 8.0
C XQ 9.0 NaN NaN
Upvotes: 0
Reputation: 150785
You did most of the job. The only thing missing is sort_index
:
df_f = df.assign(cc=df.groupby('Part').cumcount()+1).set_index(['Part', 'cc']).unstack()
# this is what you are missing
df_f = df_f.sort_index(level=(1,0), axis=1)
df_f.columns = [f'{col[0]}{col[1]}' for col in df_f.columns]
Output:
Location1 ONHand1 Location2 ONHand2
Part
A XY 5.0 XW 4.0
B XC 6.0 XV 8.0
C XQ 9.0 NaN NaN
Upvotes: 2