Reputation: 521
I've 2 data frames in pandas.
in_degree:
Target in_degree
0 2 1
1 4 24
2 5 53
3 6 98
4 7 34
out_degree
Source out_degree
0 1 4
1 2 4
2 3 5
3 4 5
4 5 5
By comparing 2 columns, I'd like to create a new data frame which should add columns "in_degree" and "out_degree" and display the result.
The Sample output should look like
Source/Target out_degree
0 1 4
1 2 5
2 3 5
3 4 29
4 5 58
Any help would be appreciated.
Thanks.
Upvotes: 1
Views: 75
Reputation: 402523
Traditionally, this would need a merge, but I think you can take advantage of pandas' index aligned arithmetic to do this a bit faster.
x = df2.set_index('Source')
y = df1.set_index('Target').rename_axis('Source')
y.columns = x.columns
x.add(y.reindex(x.index), fill_value=0).reset_index()
Source out_degree
0 1 4.0
1 2 5.0
2 3 5.0
3 4 29.0
4 5 58.0
The "traditional" SQL way of solving this would be using merge:
v = df1.merge(df2, left_on='Target', right_on='Source', how='right')
dct = dict(
Source=v['Source'],
out_degree=v['in_degree'].add(v['out_degree'], fill_value=0))
pd.DataFrame(dct).sort_values('Source')
Source out_degree
3 1 4.0
0 2 5.0
4 3 5.0
1 4 29.0
2 5 58.0
Upvotes: 2