Gihan

Reputation: 73

Transposing values in a column of a Pandas data frame

For each name in df1, I want to spread the values of the 'atoms' column across columns (a0, a1, ...) and repeat that row for every original row of the name, so that the resulting data frame looks like df2.

df1:

    name    atoms
0   CH4     C
1   CH4     H
2   CH4     H
3   CH4     H
4   CH4     H
5   NH3     N
6   NH3     H
7   NH3     H
8   NH3     H

df2:

    name    a0  a1  a2  a3  a4
0   CH4     C   H   H   H   H
1   CH4     C   H   H   H   H
2   CH4     C   H   H   H   H
3   CH4     C   H   H   H   H
4   CH4     C   H   H   H   H
5   NH3     N   H   H   H   NaN
6   NH3     N   H   H   H   NaN
7   NH3     N   H   H   H   NaN
8   NH3     N   H   H   H   NaN

Is there a way to achieve this using Pandas? I used groupby for this as follows:

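import numpy as np
import pandas as pd

# Empty frame with one column per atom position
df2 = pd.DataFrame(columns=['name','a0','a1','a2','a3','a4'], index=np.arange(9))
c = df1.groupby('name')
df2['name'] = df1['name']

# For each molecule, write its atoms into every row belonging to that molecule
for mol in df1.name.unique():
    df2.iloc[c.indices[mol], np.arange(1, len(c.indices[mol]) + 1)] = c.get_group(mol)['atoms'].values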

But I feel like there should be a less complicated and faster way to do this.

Upvotes: 1

Views: 73

Answers (1)

user3483203

Reputation: 51165

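This is mostly a crosstab, but with a couple of additional steps: number the atoms within each name first, then repeat the crosstab rows to match the original index.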

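# Number each atom within its name (0, 1, 2, ...) and index by name
u = df1.assign(key=df1.groupby('name').cumcount()).set_index('name')

# One row per name, one column per atom position
i = pd.crosstab(u.index, u['key'], u['atoms'], aggfunc='first')

# Cleanup and formatting
i.reindex(u.index).add_prefix('a').rename_axis(None, axis=1).reset_index()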

  name a0 a1 a2 a3   a4
0  CH4  C  H  H  H    H
1  CH4  C  H  H  H    H
2  CH4  C  H  H  H    H
3  CH4  C  H  H  H    H
4  CH4  C  H  H  H    H
5  NH3  N  H  H  H  NaN
6  NH3  N  H  H  H  NaN
7  NH3  N  H  H  H  NaN
8  NH3  N  H  H  H  NaN
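
If it helps to see the same idea end to end, here is a minimal self-contained sketch that swaps the crosstab for a pivot followed by a join. It is an alternative, not the code above. The construction of df1 is copied from the question, and the names key and wide are only illustrative.

```python
import pandas as pd

# df1 as shown in the question
df1 = pd.DataFrame({
    "name": ["CH4"] * 5 + ["NH3"] * 4,
    "atoms": ["C", "H", "H", "H", "H", "N", "H", "H", "H"],
})

# Position of each atom within its name: 0, 1, 2, ... per group
key = df1.groupby("name").cumcount()

# One row per name, one column per position (a0, a1, ...)
wide = (
    df1.assign(key=key)
       .pivot(index="name", columns="key", values="atoms")
       .add_prefix("a")
       .rename_axis(None, axis=1)
)

# Repeat the wide row once for every original row of that name
df2 = df1[["name"]].join(wide, on="name")
print(df2)
```

The join keeps the original 0..8 index, so no reindex/reset_index step is needed; names with fewer atoms than the longest one get NaN in the trailing columns.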

Upvotes: 1
