Reputation: 35
I have two dataframes, shown below.
I would like to add a column (Col_to_create) to the second table that holds the value of b from the first table for the row where a equals Refer_to_A.
Table 2 has more than 800,000 rows, so I am looking for a fast way to do this.
First table:
a b
1 100
2 400
3 500
Second table:
id Refer_to_A Col_to_create
0 3 500
1 1 100
2 3 500
3 2 400
4 1 100
Upvotes: 1
Views: 1227
Reputation: 17824
You can use the map method:
df2['Col_to_create'] = df2['Refer_to_A'].map(df1.set_index('a')['b'])
Output:
Refer_to_A Col_to_create
id
0 3 500
1 1 100
2 3 500
3 2 400
4 1 100
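For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same map lookup. The frames are rebuilt from the tables in the question; the names lookup and missing are only illustrative, and the sketch assumes the values in a are unique.

```python
import pandas as pd

# Rebuild the two tables from the question.
df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [100, 400, 500]})
df2 = pd.DataFrame({'Refer_to_A': [3, 1, 3, 2, 1]})
df2.index.name = 'id'

# Lookup Series: the index holds a, the values hold b.
lookup = df1.set_index('a')['b']

# map looks up every Refer_to_A value in the lookup index in one vectorized call.
df2['Col_to_create'] = df2['Refer_to_A'].map(lookup)
print(df2)

# A key that is absent from df1['a'] (7 here, a made-up value) becomes NaN
# instead of raising an error.
missing = pd.Series([3, 7]).map(lookup)
print(missing)
```

Because there is no Python-level loop over rows, this form is usually a good fit for a table with several hundred thousand rows.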
Upvotes: 3
Reputation: 12669
Another option is to use apply with a function that computes the new column:
If your dataset is :
dataframe_a = pd.DataFrame({'a': [1,2,3], 'b': [100,400,500]})
dataframe_b = pd.DataFrame({'Refer_to_A': [3,1,3,2,1]})
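You can try something like this (it assumes column a holds 1, 2, 3, ... in row order, so that col-1 is the row label in dataframe_a):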
dataframe_b['Col_to_create'] = dataframe_b['Refer_to_A'].apply(lambda col: dataframe_a['b'][col-1])
output:
Refer_to_A Col_to_create
0 3 500
1 1 100
2 3 500
3 2 400
4 1 100
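The col-1 lookup above depends on a holding 1, 2, 3 in row order. If that does not hold, a left merge on the key is one position-independent alternative. The sketch below uses made-up keys (10, 20, 30) purely to illustrate that case; merged and result are hypothetical names.

```python
import pandas as pd

# Hypothetical data: the keys in a are not 1, 2, 3, so a - 1 is not a row label.
dataframe_a = pd.DataFrame({'a': [10, 20, 30], 'b': [100, 400, 500]})
dataframe_b = pd.DataFrame({'Refer_to_A': [30, 10, 30, 20, 10]})

# Left merge keeps every row of dataframe_b and matches Refer_to_A against a.
merged = dataframe_b.merge(dataframe_a, how='left',
                           left_on='Refer_to_A', right_on='a')

# Drop the duplicated key column and give b the requested name.
result = merged.drop(columns='a').rename(columns={'b': 'Col_to_create'})
print(result)
```

Like map, merge avoids calling a Python function once per row, which matters at the row counts mentioned in the question.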
Upvotes: 2