Joining key columns on an index

Question

There are two DataFrames, left and right.

import numpy as np
import pandas as pd

left = pd.DataFrame({'C': ['C0', 'C1'],
                     'D': ['D0', 'D1']},
                    index=['K0', 'K1'])

right = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                      'B': ['B0', 'B1', 'B2', 'B3'],
                      'key': ['K0', 'K1', 'K0', 'K1']})

left:

     C   D
K0  C0  D0
K1  C1  D1

right:

    A   B key
0  A0  B0  K0
1  A1  B1  K1
2  A2  B2  K0
3  A3  B3  K1

In the left frame the keys are K0 and K1; they form the index and are unique. In the right frame the keys are K0, K1, K0, K1, and each key can occur any number of times. The question is: how can these frames be joined so that each key of left ends up on a single row together with all of its matching rows from right?

jezrael · Accepted Answer

First, reshape right with pivot_table, using the per-key occurrence counter from cumcount as the new columns:

right1 = right.pivot_table(index='key', 
                           columns=right.groupby('key').cumcount(),
                           aggfunc='first').sort_index(axis=1, level=1)
right1.columns = right1.columns.droplevel(0)
print (right1)
      0   0   1   1
key                
K0   A0  B0  A2  B2
K1   A1  B1  A3  B3
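
The reshape depends on what groupby('key').cumcount() returns, so a minimal sketch of that step alone may help. It rebuilds the right frame from the question; the variable name occurrence is only illustrative and is not part of the answer.

```python
import pandas as pd

right = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                      'B': ['B0', 'B1', 'B2', 'B3'],
                      'key': ['K0', 'K1', 'K0', 'K1']})

# Number each row by its position within its own 'key' group:
# the first K0 row gets 0, the second K0 row gets 1, and likewise for K1.
occurrence = right.groupby('key').cumcount()
print(occurrence.tolist())  # [0, 0, 1, 1]
```

Used as the columns of the reshaped frame, this counter is what places the first match of each key in one block of columns and the second match in the next.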

Another way to build right1 is groupby with GroupBy.first, followed by unstack:

g = right.groupby('key').cumcount()
right1 = right.groupby([right.key, g]).first().unstack().sort_index(axis=1, level=1)
right1.columns = right1.columns.droplevel(0)
print (right1)
      0   0   1   1
key                
K0   A0  B0  A2  B2
K1   A1  B1  A3  B3
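
After droplevel(0) the column labels are 0, 0, 1, 1, so they are no longer unique. The sketch below checks this; it assumes set_index plus unstack as a stand-in for the reshape (enough here because each key/occurrence pair appears once), and the name wide is illustrative.

```python
import pandas as pd

right = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                      'B': ['B0', 'B1', 'B2', 'B3'],
                      'key': ['K0', 'K1', 'K0', 'K1']})

occurrence = right.groupby('key').cumcount()

# Stand-in for the reshape: index by (key, occurrence), then move the
# occurrence level into the columns and order the columns by occurrence.
wide = (right.set_index(['key', occurrence])
             .unstack()
             .sort_index(axis=1, level=1))

# Dropping the outer level (A/B) leaves only the occurrence numbers.
wide.columns = wide.columns.droplevel(0)
print(wide.columns.tolist())   # [0, 0, 1, 1]
print(wide.columns.is_unique)  # False
```

Duplicate labels make selections such as wide[0] return several columns at once, which is one reason to assign fresh column names after joining.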

Finally, use concat with an inner join, or merge (how='inner' can be omitted there because it is the default):

df = pd.concat([left, right1], axis=1, join='inner')
#assign default column names (0,1,2...)
df.columns = np.arange(len(df.columns))
df.index.name = None
print (df)
     0   1   2   3   4   5
K0  C0  D0  A0  B0  A2  B2
K1  C1  D1  A1  B1  A3  B3

df1 = pd.merge(left, right1, left_index=True, right_index=True)
#assign default column names (0,1,2...)
df1.columns = np.arange(len(df1.columns))
print (df1)
     0   1   2   3   4   5
K0  C0  D0  A0  B0  A2  B2
K1  C1  D1  A1  B1  A3  B3
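
The steps above can also be wrapped into one function. This is only a sketch under stated assumptions: join_wide is a hypothetical helper name, it uses set_index plus unstack instead of pivot_table, and it labels the columns A_0, B_0, ... instead of 0..5 so that the source column stays visible.

```python
import pandas as pd

def join_wide(left, right, key='key'):
    """Hypothetical helper: spread repeated keys of `right` across
    columns, then inner-join the result onto the index of `left`."""
    # Position of each row within its key group (0, 1, 2, ...).
    occurrence = right.groupby(key).cumcount()
    wide = (right.set_index([key, occurrence])
                 .unstack()
                 .sort_index(axis=1, level=1))
    # Flatten (column, occurrence) pairs into labels such as 'A_0'.
    wide.columns = [f'{col}_{n}' for col, n in wide.columns]
    return left.join(wide, how='inner')

left = pd.DataFrame({'C': ['C0', 'C1'],
                     'D': ['D0', 'D1']},
                    index=['K0', 'K1'])
right = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                      'B': ['B0', 'B1', 'B2', 'B3'],
                      'key': ['K0', 'K1', 'K0', 'K1']})

result = join_wide(left, right)
print(result)
```

If some keys occur more often than others in right, unstack fills the missing positions with NaN.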
