Reputation: 3751
Consider the pd.DataFrame below:
import numpy as np
import pandas as pd
temp = pd.DataFrame({'label_0':[1,1,1,2,2,2],'label_1':['a','b','c',np.nan,'c','b'], 'values':[0,2,4,np.nan,8,5]})
print(temp)
   label_0 label_1  values
0        1       a     0.0
1        1       b     2.0
2        1       c     4.0
3        2     NaN     NaN
4        2       c     8.0
5        2       b     5.0
My desired output is:
  label_1    1    2
0       a  0.0  NaN
1       b  2.0  5.0
2       c  4.0  8.0
3     NaN  NaN  NaN
I have tried pd.pivot and wrangling with groupby, but cannot get the desired output because of duplicate entries. Any help is much appreciated.
Upvotes: 1
Views: 69
Reputation: 150785
A straightforward pivot seems to work:
temp.pivot(columns='label_0', index='label_1', values='values')
Output:
label_0    1    2
label_1
NaN      NaN  NaN
a        0.0  NaN
b        2.0  5.0
c        4.0  8.0
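If the real data has repeated (label_1, label_0) pairs, pivot raises a ValueError about duplicate entries, which may be what the question ran into. A minimal sketch, assuming made-up data (the frame `dup` is hypothetical, not from the question) and assuming that averaging the duplicates is acceptable:

```python
import pandas as pd

# Hypothetical data: the pair ('a', 1) appears twice.
dup = pd.DataFrame({
    'label_0': [1, 1, 2],
    'label_1': ['a', 'a', 'b'],
    'values': [0.0, 2.0, 5.0],
})

try:
    dup.pivot(index='label_1', columns='label_0', values='values')
    pivot_failed = False
except ValueError:
    # pivot refuses to reshape when an index/column pair is not unique
    pivot_failed = True

# pivot_table aggregates the duplicates instead of raising
# (aggfunc='mean' is an assumption; pick whatever fits the data)
wide = dup.pivot_table(index='label_1', columns='label_0',
                       values='values', aggfunc='mean')
print(pivot_failed)
print(wide)
```

Note that pivot_table drops NaN keys by default, so it is not a drop-in replacement when the NaN label has to be kept.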
Upvotes: 0
Reputation: 294488
# Build {label_1: {label_0: value}} one row at a time
d = {}
for _0, _1, v in zip(*map(temp.get, temp)):
    d.setdefault(_1, {})[_0] = v
pd.DataFrame.from_dict(d, orient='index')
       1    2
a    0.0  NaN
b    2.0  5.0
c    4.0  8.0
NaN  NaN  NaN
Or, to get label_1 back as a regular column:
pd.DataFrame.from_dict(d, orient='index').rename_axis('label_1').reset_index()
  label_1    1    2
0       a  0.0  NaN
1       b  2.0  5.0
2       c  4.0  8.0
3     NaN  NaN  NaN
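The zip(*map(temp.get, temp)) expression is terse; it simply walks the rows as (label_0, label_1, values) tuples. A more explicit sketch of the same idea, using itertuples (my substitution, not the code above) and descriptive loop names:

```python
import numpy as np
import pandas as pd

temp = pd.DataFrame({
    'label_0': [1, 1, 1, 2, 2, 2],
    'label_1': ['a', 'b', 'c', np.nan, 'c', 'b'],
    'values': [0, 2, 4, np.nan, 8, 5],
})

# Nested dict: outer key is label_1 (future row), inner key is label_0 (future column)
d = {}
for label_0, label_1, value in temp.itertuples(index=False, name=None):
    d.setdefault(label_1, {})[label_0] = value

# orient='index' turns the outer keys into the row index
wide = pd.DataFrame.from_dict(d, orient='index')
print(wide)
```

Because this is a plain Python loop, it never goes through the reshape check that pivot performs; a repeated (label_1, label_0) pair would just overwrite the earlier value.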
Upvotes: 3
Reputation: 153500
Another way is to use set_index and unstack:
temp.set_index(['label_0','label_1'])['values'].unstack(0)
Output:
label_0    1    2
label_1
NaN      NaN  NaN
a        0.0  NaN
b        2.0  5.0
c        4.0  8.0
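To get from this to the flat layout asked for in the question, the axis name can be dropped and label_1 moved back into a column. A sketch, assuming the NaN label survives unstack as in the output above; the sort is only there to put the NaN row last:

```python
import numpy as np
import pandas as pd

temp = pd.DataFrame({
    'label_0': [1, 1, 1, 2, 2, 2],
    'label_1': ['a', 'b', 'c', np.nan, 'c', 'b'],
    'values': [0, 2, 4, np.nan, 8, 5],
})

wide = temp.set_index(['label_0', 'label_1'])['values'].unstack(0)

flat = (
    wide.rename_axis(columns=None)  # drop the 'label_0' name above the columns
        .reset_index()              # turn label_1 back into a regular column
        .sort_values('label_1', na_position='last', ignore_index=True)
)
print(flat)
```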
Upvotes: 3
Reputation: 323326
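You can use fillna and then pivot (recent pandas versions require keyword arguments for pivot):
temp.fillna('NaN').pivot(index='label_0', columns='label_1', values='values').T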
Out[251]:
label_0    1    2
label_1
NaN      NaN  NaN
a          0  NaN
b          2    5
c          4    8
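Calling fillna('NaN') on the whole frame also writes the string 'NaN' into the values column, which turns it into object dtype. A sketch that fills only label_1 so the numbers stay floats (the placeholder string 'NaN' is an arbitrary choice):

```python
import numpy as np
import pandas as pd

temp = pd.DataFrame({
    'label_0': [1, 1, 1, 2, 2, 2],
    'label_1': ['a', 'b', 'c', np.nan, 'c', 'b'],
    'values': [0, 2, 4, np.nan, 8, 5],
})

# Replace the missing label only; the 'values' column keeps its real NaN
filled = temp.assign(label_1=temp['label_1'].fillna('NaN'))

wide = filled.pivot(index='label_1', columns='label_0', values='values')
print(wide)
```

The row labelled 'NaN' here is an ordinary string, so it can be selected with wide.loc['NaN'] like any other label.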
Upvotes: 2