Rearrange the columns according to the order present in a data frame in pandas

Question

I have a data frame

df_in = pd.DataFrame([[1,2,3,4,5,6,7,8,9]], columns=["ab","ef","cd","ij","klm","kln","ghw","ghx","klo"])

I have another data frame, where the order is defined

df_order = pd.DataFrame([["ab","gh"],["cd","ij"],["ef","kl"]], columns=["col1","col2"])

I want to rearrange the columns of the data frame df_in using df_order on the below way.

1st Column name present in col1 then all the columns that start with string present in col2. Then, the column name present in col1 then all the columns that start with string present in col2, then again next row and repeat.

Expected output:

df_out = pd.DataFrame([[1,7,8,3,4,2,5,6,9]], columns=["ab","ghw","ghx","cd","ij","ef","klm","kln","klo"])

How to do it?

sushanth · Accepted Answer

Here is a solution you can try out,

from itertools import chain

# create a numeric index for each key to sort latter.
order_ = {
    v: idx for idx, v in enumerate(chain.from_iterable(df_order.to_numpy()))
}

df_in.loc[:, sorted(df_in.columns, key=lambda x: order_[x[:2]])]

   ab  ghw  ghx  cd  ij  ef  klm  kln  klo
0   1    7    8   3   4   2    5    6    9

Rearrange the columns according to the order present in a data frame in pandas

Answers (1)

Related Questions