Reputation: 2566
I want to join two data frames specific indices as per the map (dictionary
) I have created. What is an efficient way to do this?
Data:
df = pd.DataFrame({"a":[10, 34, 24, 40, 56, 44],
"b":[95, 63, 74, 85, 56, 43]})
print(df)
a b
0 10 95
1 34 63
2 24 74
3 40 85
4 56 56
5 44 43
df1 = pd.DataFrame({"c":[1, 2, 3, 4],
"d":[5, 6, 7, 8]})
print(df1)
c d
0 1 5
1 2 6
2 3 7
3 4 8
d = {
(1,0):0.67,
(1,2):0.9,
(2,1):0.2,
(2,3):0.34
(4,0):0.7,
(4,2):0.5
}
Desired Output:
a b c d ratio
0 34 63 1 5 0.67
1 34 63 3 7 0.9
...
5 56 56 3 7 0.5
I'm able to achieve this but it takes a lot of time since my original data frames' map has about 4.7M rows to map. I'd love to know if there is a way to MERGE
, JOIN
or CONCAT
these data frames on different indices.
My Approach:
matched_rows = []
for key in d.keys():
s = df.iloc[key[0]].tolist() + df1.iloc[key[1]].tolist() + [d[key]]
matched_rows.append(s)
df_matched = pd.DataFrame(matched_rows, columns = df.columns.tolist() + df1.columns.tolist() + ['ratio']
I would highly appreciate your help. Thanks a lot in advance.
Upvotes: 1
Views: 26
Reputation: 863751
Create Series
and then DaatFrame
by dictioanry
, DataFrame.join
both and last remove first 2 columns by positions:
df = (pd.Series(d).reset_index(name='ratio')
.join(df, on='level_0')
.join(df1, on='level_1')
.iloc[:, 2:])
print (df)
ratio a b c d
0 0.67 34 63 1 5
1 0.90 34 63 3 7
2 0.20 24 74 2 6
3 0.34 24 74 4 8
4 0.70 56 56 1 5
5 0.50 56 56 3 7
And then if necessary reorder columns:
df = df[df.columns[1:].tolist() + df.columns[:1].tolist()]
print (df)
a b c d ratio
0 34 63 1 5 0.67
1 34 63 3 7 0.90
2 24 74 2 6 0.20
3 24 74 4 8 0.34
4 56 56 1 5 0.70
5 56 56 3 7 0.50
Upvotes: 2