Reputation: 1241
I have the following pandas dataframe:
count event date
0 1544 'strike' 2016-11-01
1 226 'defense' 2016-11-01
2 1524 'strike' 2016-12-01
3 246 'defense' 2016-12-01
4 1592 'strike' 2017-01-01
5 245 'defense' 2017-01-01
I want to pivot/transform it in such a way the final output looks like this:
event 2016-11-01 2016-12-01 2017-01-01 2017-02-01 2017-03-01
'strike' 1544 1524 1592 1608 1654
'defense' 226 246 245 210 254
but what i'm getting now upon pivoting is this:
count count count count count\
date 2016-11-01 2016-12-01 2017-01-01 2017-02-01 2017-03-01
event
'strike' 1544 1524 1592 1608 1654
'defense' 226 246 245 210 254
is there any way i could remove the entire empty row ahead of the event
index-name and rename the date
index-name with event
as its index-name and also remove the unwanted count
appearing in the first row of the data frame? The data seems to be transforming correctly i just want to get rid of these headers and indexes and have the renamed and removed properly. I also don't want the row labels in the desired output.
This is what i've been trying till now:
output = df.pivot(index='event', columns='date')
print(output)
Upvotes: 4
Views: 13485
Reputation: 31
I would recommend using the more general version of pd.pivot()
, which is pd.pivot_table()
, like so:
x = pd.pivot_table(df, index = 'event', columns = 'date', values = 'count')
You will get:
date 01/01/2017 01/11/2016 01/12/2016
event
'defense' 245 226 246
'strike' 1592 1544 1524
Next, you can get rid of the 'date' string by setting:
x.columns.name = ' '
Additionally, if you want to change the order of the events, you might want to set the variable up as a categorical variable, before doing the pivoting:
df.event = df.event.astype('category') # cast to categorical
df.event.cat.set_categories(your_list, inplace = True) # force order
where your_list
is the list of your categories, in order.
Hope this helps.
Upvotes: 3
Reputation: 862661
Solution is add parameter values
to pivot
, then add reset_index
for column from index
and rename_axis
fro remove column name:
output=df.pivot(index='event',columns='date',values='count').reset_index().rename_axis(None,1)
print(output)
event 2016-11-01 2016-12-01 2017-01-01
0 'defense' 226 246 245
1 'strike' 1544 1524 1592
What happens if omit it?
print (df)
count event date count1
0 1544 'strike' 2016-11-01 1
1 226 'defense' 2016-11-01 7
2 1524 'strike' 2016-12-01 8
3 246 'defense' 2016-12-01 3
4 1592 'strike' 2017-01-01 0
5 245 'defense' 2017-01-01 1
pivot
use each not used column and create MultiIndex
for distinguish original columns:
output = df.pivot(index='event', columns='date')
print(output)
count count1
date 2016-11-01 2016-12-01 2017-01-01 2016-11-01 2016-12-01 2017-01-01
event
'defense' 226 246 245 7 3 1
'strike' 1544 1524 1592 1 8 0
Upvotes: 5