applying pandas pivot using existing column names with suffix

Question

I'm using pd.pivot for the first time and struggling to write it correctly. I have the following dataframe:

siteid   contactid name   add    
a01      Mr1       Abe    rand1  
a01      Mr2       Sam    rand2  
a02      Ms1       Ann    rand3  
a03      Mr2       Amy    rand2  
a03      Ms2       Ann    rand3

I want to flatten this so that I have a single row for each siteid as follows.

siteid   contactid_1 name_1   add_1    contactid_2 name_2   add_2    contactid_3 name_3   add_3
a01      Mr1         Abe      rand1    Mr2         Sam      rand2   
a02      Ms1         Ann      rand3    
a03      Mr2         Amy      rand2    Ms2         Ann      rand3      Ms5       Dick     rand4

I don't know how many contacts there may be per site (though I don't think it will be more than 6), so I need to allow for more columns.

I'm not sure if pivot is the correct way to do this, as when I tried it, it wants to aggregate the data...

r-beginners · Accepted Answer

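First, number the rows within each siteid with a cumulative count (1, 2, ...). This number becomes the column suffix when the data is spread horizontally. Next, reshape with pd.pivot_table(), passing that counter as the columns argument. Finally, flatten the resulting two-level column index into single names such as contactid_1.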

import pandas as pd
import io

data = '''
siteid contactid name add    
a01 Mr1 Abe rand1  
a01 Mr2 Sam rand2  
a02 Ms1 Ann rand3  
a03 Mr2 Amy rand2  
a03 Ms2 Ann rand3
'''

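# A raw string avoids an invalid-escape warning for the regex separator.
df = pd.read_csv(io.StringIO(data), sep=r'\s+')
# Number the contacts within each siteid: 1, 2, ...
df['flg'] = df.groupby('siteid').cumcount() + 1
# Each (siteid, flg) cell holds exactly one value, so 'first' simply passes it through.
df2 = pd.pivot_table(df, index=['siteid'], values=['contactid','name','add'], columns=['flg'], fill_value='', aggfunc='first')
# Flatten the (field, number) column pairs into names such as add_1.
new_cols = ['{}_{}'.format(x,y) for x,y in df2.columns]
df2.columns = new_cols
df2 = df2.reset_index()
df2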

| siteid   | add_1   | add_2   | contactid_1   | contactid_2   | name_1   | name_2   |
|:---------|:--------|:--------|:--------------|:--------------|:---------|:---------|
| a01      | rand1   | rand2   | Mr1           | Mr2           | Abe      | Sam      |
| a02      | rand3   |         | Ms1           |               | Ann      |          |
| a03      | rand2   | rand3   | Mr2           | Ms2           | Amy      | Ann      |
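
The question asks for columns grouped per contact (contactid_1, name_1, add_1, contactid_2, ...), whereas the table above is sorted alphabetically by field name. Here is a minimal sketch of one way to get the requested order, assuming the same sample data. It swaps pivot_table for set_index plus unstack, which reshapes without any aggregation function; the counter column n and the variable names are hypothetical, chosen for illustration only.

```python
import io
import pandas as pd

data = """siteid contactid name add
a01 Mr1 Abe rand1
a01 Mr2 Sam rand2
a02 Ms1 Ann rand3
a03 Mr2 Amy rand2
a03 Ms2 Ann rand3
"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")

# Number each contact within its site: 1, 2, 3, ...
df["n"] = df.groupby("siteid").cumcount() + 1

# Reshape without aggregating: one row per siteid, one (field, n) column per value.
wide = df.set_index(["siteid", "n"]).unstack("n")

# Flatten the two-level columns into names such as contactid_1.
wide.columns = [f"{field}_{n}" for field, n in wide.columns]

# Group the columns by contact number, keeping the original field order.
fields = ["contactid", "name", "add"]
max_n = int(df["n"].max())
ordered = [f"{field}_{n}" for n in range(1, max_n + 1) for field in fields]

# Blank out the cells for sites with fewer contacts and restore siteid as a column.
wide = wide[ordered].fillna("").reset_index()
print(wide)
```

Because the counter is derived from the data, the number of contact columns grows automatically with the largest number of contacts at any site. Note that unstack raises an error if a (siteid, n) pair is duplicated, which cannot happen here since cumcount is unique within each group.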
