Viren

Reputation: 180

How to de-normalize data with pandas dataframe

I have a pandas DataFrame created from a CSV file. It looks like this:

srvr_name log_type       hour  
server1   impressionWin  18:00:00 
server1   transactionWin 18:00:00 
server2   impressionWin  18:00:00 
server2   transactionWin 18:00:00 

What I would like to get from this is:

srvr_name impressionWin transactionWin hour
server1   true          true           18:00:00
server2   true          true           18:00:00 

What is the best way to achieve this in pandas?

Upvotes: 2

Views: 143

Answers (2)

Joe

Reputation: 12417

You can use this:

df = pd.crosstab([df.srvr_name, df.hour], df.log_type).astype(bool).rename_axis(None, axis=1).reset_index()

Output:

  srvr_name      hour  impressionWin  transactionWin
0   server1  18:00:00           True            True
1   server2  18:00:00           True            True
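
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the question (the real data comes from a CSV), and the names `df` and `wide` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; the real frame is read from a CSV.
df = pd.DataFrame({
    "srvr_name": ["server1", "server1", "server2", "server2"],
    "log_type": ["impressionWin", "transactionWin",
                 "impressionWin", "transactionWin"],
    "hour": ["18:00:00"] * 4,
})

# Count how often each log_type occurs per (srvr_name, hour) pair,
# then turn the counts into presence flags.
wide = (
    pd.crosstab([df["srvr_name"], df["hour"]], df["log_type"])
    .astype(bool)
    .rename_axis(None, axis=1)  # drop the "log_type" label on the columns
    .reset_index()              # move srvr_name and hour back into columns
)
print(wide)
```

Because crosstab fills missing combinations with 0, a log type that never occurs for a given server and hour should come out as False instead of being left out.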

Upvotes: 1

user3483203

Reputation: 51175

You can also use join with get_dummies:

df[['srvr_name', 'hour']].join(pd.get_dummies(df.log_type)).groupby(['srvr_name', 'hour']).sum().astype(bool)

                    impressionWin  transactionWin
srvr_name hour
server1   18:00:00           True            True
server2   18:00:00           True            True
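
A runnable sketch of the same get_dummies idea, flattened to the column layout asked for in the question. The sample frame is typed in by hand from the question, and `flags` and `result` are illustrative names. It assumes that only the grouping keys should sit next to the indicator columns, since on recent pandas versions a string column such as log_type may otherwise be carried into the sum.

```python
import pandas as pd

# Sample data copied from the question; the real frame is read from a CSV.
df = pd.DataFrame({
    "srvr_name": ["server1", "server1", "server2", "server2"],
    "log_type": ["impressionWin", "transactionWin",
                 "impressionWin", "transactionWin"],
    "hour": ["18:00:00"] * 4,
})

# One indicator column per distinct log_type value.
flags = pd.get_dummies(df["log_type"])

# Keep only the grouping keys beside the indicators, add the indicators up
# per (srvr_name, hour) pair, and convert the totals into True/False flags.
result = (
    df[["srvr_name", "hour"]]
    .join(flags)
    .groupby(["srvr_name", "hour"])
    .sum()
    .astype(bool)
    .reset_index()
)

# Reorder the columns to match the layout requested in the question.
result = result[["srvr_name", "impressionWin", "transactionWin", "hour"]]
print(result)
```

The final column selection is optional; without it, srvr_name and hour simply come first.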

Upvotes: 2
