Reputation: 974

python pandas: pivot table

I have a table like so:

Name   | ID | Contact_method | Contact
sarah    1   house            h1
sarah    1   mobile           m1
sarah    1   email            sarah@mail
bob      2   house            h2
bob      2   mobile           m2
bob      2   email            bob@mail
jones    3   house            h3
jones    3   mobile           m3
jones    3   email            jones@mail
jones    4   house            h4
jones    4   mobile           m4
jones    4   email            jones2@mail

And I want it like so:

Name  | ID | house | mobile | email
sarah   1    h1      m1       sarah@mail
bob     2    h2      m2       bob@mail
jones   3    h3      m3       jones@mail
jones   4    h4      m4       jones2@mail

I can already do this, but only through a very expensive pd.concat operation iterated over all the unique IDs. Is there a simple way to do this? I've also tinkered with pivot() and transpose(). Note that the duplicate name is there so that I can't rely on uniqueness of column values to, say, do a join.

Upvotes: 2

Answers (3)

jezrael

Reputation: 863611

I think piRSquared's solution is very nice, but if get:

ValueError: Index contains duplicate entries, cannot reshape

print (df)
     Name  ID Contact_method      Contact
0   sarah   1          house           h1
1   sarah   1         mobile           m1
2   sarah   1          email   sarah@mail
3     bob   2          house           h2
4     bob   2         mobile           m2
5     bob   2          email     bob@mail
6   jones   3          house           h3
7   jones   3         mobile           m3
8   jones   3          email   jones@mail <-for same Name,ID and Contact_method get duplicate
9   jones   3          email     joe@mail <-for same Name,ID and Contact_method get duplicate
10  jones   4          house           h4
11  jones   4         mobile           m4
12  jones   4          email  jones2@mail

use pivot_table or groubpy with aggregating join:

cols = ['Name','ID','house','mobile','email']
df1 = df.pivot_table(index=['ID','Name'],
                     columns='Contact_method', 
                     values='Contact', 
                     aggfunc=','.join)
        .rename_axis(None, 1)
        .reset_index()
        .reindex_axis(cols, axis=1)
print (df1)
    Name  ID house mobile                email
0  sarah   1    h1     m1           sarah@mail
1    bob   2    h2     m2             bob@mail
2  jones   3    h3     m3  jones@mail,joe@mail <- join duplicates
3  jones   4    h4     m4          jones2@mail

df1 = df.groupby(['Name', 'ID', 'Contact_method'])['Contact']
        .apply(','.join)
        .unstack()
        .rename_axis(None, 1)
        .reset_index()
        .reindex_axis(cols, axis=1)
print (df1)
    Name  ID house mobile                email
0  sarah   1    h1     m1           sarah@mail
1    bob   2    h2     m2             bob@mail
2  jones   3    h3     m3  jones@mail,joe@mail <- join duplicates
3  jones   4    h4     m4          jones2@mail

Upvotes: 0

piRSquared

Reputation: 294516

Set the index with all columns except 'Contact_method', then unstack

df.set_index(
    ['Name', 'ID', 'Contact_method']
)['Contact'].unstack().rename_axis(None, 1).reset_index()

    Name  ID        email house mobile
0    bob   2     bob@mail    h2     m2
1  jones   3   jones@mail    h3     m3
2  jones   4  jones2@mail    h4     m4
3  sarah   1   sarah@mail    h1     m1

Upvotes: 2

greedy52

Reputation: 1405

One way is to build a contact dictionary (of dictionaries) based on ID, 'manualy'. Not sure if it is more efficient though.

people = dict()
for index, row in pd.iterrows():
    ID = row['ID']
    if ID not in people:
        people[ID] = {'ID': ID, 'Name': row['Name']}
    people[ID][row['Contact_method']] = row['Contact']

print pandas.DataFrame(people).transpose()

And output is:

  ID   Name        email house mobile
1  1  sarah   sarah@mail    h1     m1
2  2    bob     bob@mail    h2     m2
3  3  jones   jones@mail    h3     m3
4  4  jones  jones2@mail    h4     m4

Upvotes: 0

python pandas: pivot table

Answers (3)

Related Questions