Pandas column concatenation

Question

I have a dataframe (example DF1) with 300 columns of experimental data, where some of the experiments are repeated several times. I am able to use the set default method to get the column names (index), and I was wondering if there was a was to vertically append columns with similar names to a new data frame (example DF2)? I appreciate any help :)

ALollz · Accepted Answer

You can melt then use groupby + cumcount to determine the row label and then you pivot.

Sample Data

import pandas as pd
import numpy as np
df = pd.DataFrame(np.arange(1,25).reshape(8,3).T,
                  columns=['E1', 'E1', 'E2', 'E3', 'E4', 'E4', 'E4', 'E5'])

Code

df2 = df.melt()
df2['idx'] = df2.groupby('variable').cumcount()
df2 = (df2.pivot(index='idx', columns='variable', values='value')
          .rename_axis(index=None, columns=None))

    E1   E2    E3    E4    E5
0  1.0  7.0  10.0  13.0  22.0
1  2.0  8.0  11.0  14.0  23.0
2  3.0  9.0  12.0  15.0  24.0
3  4.0  NaN   NaN  16.0   NaN
4  5.0  NaN   NaN  17.0   NaN
5  6.0  NaN   NaN  18.0   NaN
6  NaN  NaN   NaN  19.0   NaN
7  NaN  NaN   NaN  20.0   NaN
8  NaN  NaN   NaN  21.0   NaN

Pandas column concatenation

Answers (1)

Sample Data

Code

Related Questions