Apply permutation matrix to pandas DataFrame

Question

I have two identically sized DataFrames (call them A and B) with the same set of index and column names. A and B have a different ordering of their (row/column) labels and I want them to be identically labeled so I can directly manipulate the matrices in other programs. Mathematically, there is a permutation matrix P that reshuffles one matrix labels to another, so I can apply this transformation by constructing the matrix. I feel however, that this is overkill and a solution should exist within pandas itself.

Is there an easy way to do this pandas without the ugly hack below?

# construct A,B ...
import numpy as np
import pandas as pd

# Construct the permutation matrix
permutation = [np.where(B.index==a_idx)[0][0] for a_idx in A.index]
P = np.zeros(A.shape)
for k,row in zip(permutation,P):  row[k] = 1

B2 = (P).dot(B.values).dot(P.T)
B2 = pd.DataFrame(B2, index = A.index, columns = A.index)

Alex Riley · Accepted Answer

You could use reindex_like to reorder the rows/columns of one DataFrame to conform to another DataFrame.

>>> vals = np.arange(9).reshape(3,3)
>>> df1 = pd.DataFrame(vals, columns=list('def'), index=list('abc'))
>>> df2 = pd.DataFrame(vals, columns=list('efd'), index=list('bac'))

Then to reorder df1 so that it is the same as df2:

>>> df1.reindex_like(df2)
   e  f  d
b  4  5  3
a  1  2  0
c  7  8  6

Apply permutation matrix to pandas DataFrame

Answers (1)

Related Questions