Reputation: 4044
I am currently having an Dataframe looking as follow :
df = pd.DataFrame({'browser': ['Chrome', 'Chrome', 'firefox', 'firefox', 'Safari'],
'version': ['87', '88', '92', '93', '12'],
'Monday': ['11am', '1pm', '1pm', '12pm', '2pm'],
'Tuesday': ['11am', '1pm', '12pm', '10am', '2pm'],
'Wednesday': ['12pm', '10am', '1pm', '12pm', '12pm'],
'Thursday': ['10am', '11am', '10am', '10am', '12pm'],
'Friday': ['2pm', '10am', '10am', '12pm', '11am'],
'Saturday': ['11am', '12pm', '10am', '11am', '11am'],
'Sunday': ['11am', '12pm', '11am', '12pm', '11am']})
I am trying to pivot the above dataframe so that
The final dataframe would look like this (showing top 5 rows but in total will be 24 rows - 1 for each hour):
I tried looking at the code for pd.pivot
especially the argument index and tried to provide my index but that results in error.
Eg.
index_for_new_df = ['12am'] + [f'{x}am' for x in range(1,12)] + ['12pm'] + [f'{x}pm' for x in range(1,12)]
# will generate list like [... ,'10am', '11am', '12pm', '1pm', '2pm' ,...]
pd.pivot(df,index=index_for_new_df,...)
Is there any other function that can be used ?
I am trying to avoid writing a too many nested FOR
loop and trying to see if there any inbuilt function that can be used ( For loop is fine as long as majority of processing is done by pandas). Is there any other function to try ?
Upvotes: 1
Views: 1198
Reputation: 150745
Try melt
then pivot_table
:
(df.melt(['browser','version'], value_name='hour', var_name='day')
.pivot_table(index='hour', columns='day', values='browser',
aggfunc='size', fill_value=0)
)
Output:
day Friday Monday Saturday Sunday Thursday Tuesday Wednesday
hour
10am 2 0 1 0 3 1 1
11am 1 1 3 3 1 1 0
12pm 1 1 1 2 1 1 3
1pm 0 2 0 0 0 1 1
2pm 1 1 0 0 0 1 0
Upvotes: 2