Reputation: 73
I want to create a data frame containing the values of the 'atoms' column in df1 transposed, so that the resulting data frame looks like df2.
df1:
  name atoms
0  CH4     C
1  CH4     H
2  CH4     H
3  CH4     H
4  CH4     H
5  NH3     N
6  NH3     H
7  NH3     H
8  NH3     H
df2:
  name a0 a1 a2 a3   a4
0  CH4  C  H  H  H    H
1  CH4  C  H  H  H    H
2  CH4  C  H  H  H    H
3  CH4  C  H  H  H    H
4  CH4  C  H  H  H    H
5  NH3  N  H  H  H  NaN
6  NH3  N  H  H  H  NaN
7  NH3  N  H  H  H  NaN
8  NH3  N  H  H  H  NaN
Is there a way to achieve this using Pandas? I used groupby for this as follows:
import numpy as np
import pandas as pd

df2 = pd.DataFrame(columns=['name','a0','a1','a2','a3','a4'], index=np.arange(9))
c = df1.groupby('name')
df2['name'] = df1['name']
for mol in df1.name.unique():
    # fill the atom columns on every row that belongs to this molecule
    df2.iloc[c.indices[mol], np.arange(1, len(c.indices[mol]) + 1)] = c.get_group(mol)['atoms'].values
But I feel like there should be a less complicated and faster way to do this.
Upvotes: 1
Views: 73
Reputation: 51165
This is mostly a crosstab, but with a couple of additional steps.
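# Number each atom within its molecule (0, 1, 2, ...) and index by name
u = df1.assign(key=df1.groupby('name').cumcount()).set_index('name')
# One row per molecule, one column per atom position
i = pd.crosstab(u.index, u['key'], u['atoms'], aggfunc='first')
# Cleanup and formatting
i.reindex(u.index).add_prefix('a').rename_axis(None, axis=1).reset_index()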
  name a0 a1 a2 a3   a4
0  CH4  C  H  H  H    H
1  CH4  C  H  H  H    H
2  CH4  C  H  H  H    H
3  CH4  C  H  H  H    H
4  CH4  C  H  H  H    H
5  NH3  N  H  H  H  NaN
6  NH3  N  H  H  H  NaN
7  NH3  N  H  H  H  NaN
8  NH3  N  H  H  H  NaN
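For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. Note that it swaps crosstab for pivot plus a join, which is a variation and not the code above; df1 is rebuilt by hand from the table in the question, and the names key and wide are only illustrative.

```python
import pandas as pd

# df1 rebuilt by hand from the table in the question (assumption)
df1 = pd.DataFrame({
    "name": ["CH4"] * 5 + ["NH3"] * 4,
    "atoms": ["C", "H", "H", "H", "H", "N", "H", "H", "H"],
})

# Position of each atom within its molecule: 0, 1, 2, ...
key = df1.groupby("name").cumcount()

# One row per molecule, one column per atom position (a0, a1, ...)
wide = (
    df1.assign(key=key)
       .pivot(index="name", columns="key", values="atoms")
       .add_prefix("a")
       .rename_axis(None, axis=1)
)

# Repeat each molecule's row once for every original row of df1
df2 = df1[["name"]].join(wide, on="name")
print(df2)
```

The join on name plays the same role as the reindex step: it broadcasts each molecule's single wide row back onto every row that molecule had in df1. Molecules with fewer atoms than the longest one get NaN in the trailing columns.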
Upvotes: 1