Spandan Brahmbhatt
Spandan Brahmbhatt

Reputation: 4044

Pandas Pivot the data based on all unique values in Dataframe

I am currently having an Dataframe looking as follow :

df.head()

df = pd.DataFrame({'browser': ['Chrome', 'Chrome', 'firefox', 'firefox', 'Safari'],
 'version': ['87', '88', '92', '93', '12'],
 'Monday': ['11am', '1pm', '1pm', '12pm', '2pm'],
 'Tuesday': ['11am', '1pm', '12pm', '10am', '2pm'],
 'Wednesday': ['12pm', '10am', '1pm', '12pm', '12pm'],
 'Thursday': ['10am', '11am', '10am', '10am', '12pm'],
 'Friday': ['2pm', '10am', '10am', '12pm', '11am'],
 'Saturday': ['11am', '12pm', '10am', '11am', '11am'],
 'Sunday': ['11am', '12pm', '11am', '12pm', '11am']})

I am trying to pivot the above dataframe so that

  1. The index are 24 hours of the day i.e the values seen in columns Monday...Sunday
  2. The columns are Monday to Sunday
  3. Values to be filled is unique count of browser+version ( Assuming each row in input df will always have unique browser+version when string concated)

The final dataframe would look like this (showing top 5 rows but in total will be 24 rows - 1 for each hour): final_df.head()

I tried looking at the code for pd.pivot especially the argument index and tried to provide my index but that results in error. Eg.

index_for_new_df = ['12am'] + [f'{x}am' for x in range(1,12)] + ['12pm'] + [f'{x}pm' for x in range(1,12)]
# will generate list like [... ,'10am', '11am', '12pm', '1pm', '2pm' ,...]
pd.pivot(df,index=index_for_new_df,...)

Is there any other function that can be used ? I am trying to avoid writing a too many nested FOR loop and trying to see if there any inbuilt function that can be used ( For loop is fine as long as majority of processing is done by pandas). Is there any other function to try ?

Upvotes: 1

Views: 1198

Answers (1)

Quang Hoang
Quang Hoang

Reputation: 150745

Try melt then pivot_table:

(df.melt(['browser','version'], value_name='hour', var_name='day')
   .pivot_table(index='hour', columns='day', values='browser', 
                aggfunc='size', fill_value=0)
)

Output:

day   Friday  Monday  Saturday  Sunday  Thursday  Tuesday  Wednesday
hour                                                                
10am       2       0         1       0         3        1          1
11am       1       1         3       3         1        1          0
12pm       1       1         1       2         1        1          3
1pm        0       2         0       0         0        1          1
2pm        1       1         0       0         0        1          0

Upvotes: 2

Related Questions