Krishnang K Dalal
Krishnang K Dalal

Reputation: 2566

Pandas JOIN/MERGE/CONCAT Data Frame On Specific Indices

I want to join two data frames specific indices as per the map (dictionary) I have created. What is an efficient way to do this?

Data:

df = pd.DataFrame({"a":[10, 34, 24, 40, 56, 44], 
                    "b":[95, 63, 74, 85, 56, 43]}) 

print(df)

    a   b
0  10  95
1  34  63
2  24  74
3  40  85
4  56  56
5  44  43


df1 = pd.DataFrame({"c":[1, 2, 3, 4], 
                   "d":[5, 6, 7, 8]}) 

print(df1)
   c  d
0  1  5
1  2  6
2  3  7
3  4  8

d = { 
    (1,0):0.67,
    (1,2):0.9,
    (2,1):0.2,
    (2,3):0.34
    (4,0):0.7,
    (4,2):0.5
}

Desired Output:

  a  b  c  d  ratio
0 34 63 1  5  0.67
1 34 63 3  7  0.9
...
5 56 56 3  7  0.5

I'm able to achieve this but it takes a lot of time since my original data frames' map has about 4.7M rows to map. I'd love to know if there is a way to MERGE, JOIN or CONCAT these data frames on different indices.

My Approach:

matched_rows = []
for key in d.keys():                                              
    s = df.iloc[key[0]].tolist() + df1.iloc[key[1]].tolist() + [d[key]]
    matched_rows.append(s)

df_matched = pd.DataFrame(matched_rows, columns = df.columns.tolist() + df1.columns.tolist() + ['ratio']

I would highly appreciate your help. Thanks a lot in advance.

Upvotes: 1

Views: 26

Answers (1)

jezrael
jezrael

Reputation: 863751

Create Series and then DaatFrame by dictioanry, DataFrame.join both and last remove first 2 columns by positions:

df = (pd.Series(d).reset_index(name='ratio')
        .join(df, on='level_0')
        .join(df1, on='level_1')
        .iloc[:, 2:])
print (df)
   ratio   a   b  c  d
0   0.67  34  63  1  5
1   0.90  34  63  3  7
2   0.20  24  74  2  6
3   0.34  24  74  4  8
4   0.70  56  56  1  5
5   0.50  56  56  3  7

And then if necessary reorder columns:

df = df[df.columns[1:].tolist() + df.columns[:1].tolist()]
print (df)
    a   b  c  d  ratio
0  34  63  1  5   0.67
1  34  63  3  7   0.90
2  24  74  2  6   0.20
3  24  74  4  8   0.34
4  56  56  1  5   0.70
5  56  56  3  7   0.50

Upvotes: 2

Related Questions