Reputation: 501
My df:
import pandas as pd

d = {'project_id': [19,20,19,20,19,20],
     'task_id': [11,22,11,22,11,22],
     "task": ["task_1","task_1","task_1","task_1","task_1","task_1"],
     "username": ["tom","jery","tom","jery","tom","jery"],
     "image_id": [101,202,303,404,505,606],
     "frame": [0,0,9,8,11,11],
     "label": ['foo','foo','bar','xyz','bar','bar']}
df = pd.DataFrame(data=d)
So my df is in long format, with some duplicated values; only image_id is unique.
I am trying to reshape my df to wide format by username, using pd.pivot and pd.merge.
My code:
pd.pivot(df, index=['task','frame','image_id'], columns = 'username', values='label')
As you can see, I don't really need image_id in my output. I just want a summary of which label each user used per frame.
Upvotes: 1
Views: 62
Reputation: 260420
You can add a groupby.first after the pivot:
(pd.pivot(df, index=['task','frame','image_id'],
columns='username', values='label')
.groupby(level=['task','frame']).first()
)
Or use pivot_table with aggfunc='first':
pd.pivot_table(df, index=['task','frame'],
columns='username', values='label',
aggfunc='first')
Output:
username      jery   tom
task   frame
task_1 0       foo   foo
       8       xyz  None
       9      None   bar
       11      bar   bar
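To check that the two approaches agree on the sample data, here is a minimal self-contained sketch. The names wide_a, wide_b and same are only illustrative, and missing cells are filled with an empty string before comparing because the two routes may represent them differently (None vs NaN) depending on the pandas version. Passing a list as index to pd.pivot assumes a reasonably recent pandas.

```python
import pandas as pd

# Sample data from the question
d = {'project_id': [19, 20, 19, 20, 19, 20],
     'task_id': [11, 22, 11, 22, 11, 22],
     'task': ['task_1'] * 6,
     'username': ['tom', 'jery', 'tom', 'jery', 'tom', 'jery'],
     'image_id': [101, 202, 303, 404, 505, 606],
     'frame': [0, 0, 9, 8, 11, 11],
     'label': ['foo', 'foo', 'bar', 'xyz', 'bar', 'bar']}
df = pd.DataFrame(data=d)

# Approach 1: pivot on the unique key, then collapse the image_id level
wide_a = (pd.pivot(df, index=['task', 'frame', 'image_id'],
                   columns='username', values='label')
            .groupby(level=['task', 'frame']).first())

# Approach 2: let pivot_table aggregate with 'first' directly
wide_b = pd.pivot_table(df, index=['task', 'frame'],
                        columns='username', values='label',
                        aggfunc='first')

# Compare cell by cell, treating any missing value as an empty string
same = wide_a.fillna('').to_dict() == wide_b.fillna('').to_dict()
print(same)
```

Note that 'first' keeps only one label per user and frame. That is fine for this sample, where each user has at most one row per frame, but it would silently drop extra labels if a user tagged the same frame more than once.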
Upvotes: 1