Reputation: 275
I have two dataframes loaded from two large Excel files. Dataframe 1 is always smaller than dataframe 2. The IDs in dataframe 1 are unique, while in dataframe 2 the same ID can appear many times, always with the same code.
I'm trying to add a new column to dataframe 1 that holds the 'code' from dataframe 2, filled in wherever the ID matches in both dataframes.
I managed to solve this with two nested for loops, but the process is too slow. Is there a faster alternative for adding the new column?
The following dataframes are very small and only illustrate the problem. My real data has many more rows and columns.
import pandas as pd
details_1 = {'ID': ['ID01', 'ID02', 'ID03', 'ID04', 'ID05'],
             'Qty': [1, 2, 3, 4, 5]}
details_2 = {'ID': ['IDA01', 'ID03', 'ID01', 'ID02', 'IDA02', 'IDX12', 'IDA03', 'IDA04', 'IDA05', 'ID04', 'ID05'],
             'code': ['ab', 'yz', 'acv', 'abc', 'efs', 'xw2', 'fgt', 'axf', 'ard', 'afd', 'x01']}
df1 = pd.DataFrame(details_1, columns=['ID', 'Qty'])
df2 = pd.DataFrame(details_2, columns=['ID', 'code'])
Desired output of print(df3):
     ID  Qty new_code
0  ID01    1      acv
1  ID02    2      abc
2  ID03    3       yz
3  ID04    4      afd
4  ID05    5      x01
Upvotes: 0
Views: 64
Reputation: 2614
Use the merge method instead of nested loops. It joins the two dataframes on the shared ID column in a single call.
Here is the example code:
df3 = pd.merge(df1, df2, on = "ID")
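Since the question says an ID can repeat in dataframe 2, a plain merge may return more rows than dataframe 1 has, and the new column will be named code rather than new_code. The following is a minimal sketch of one way to handle that, assuming every repeated ID really does carry the same code. The sample data is shortened and the duplicated ID03 row is made up for illustration; lookup is a hypothetical helper name.

```python
import pandas as pd

df1 = pd.DataFrame({"ID": ["ID01", "ID02", "ID03", "ID04", "ID05"],
                    "Qty": [1, 2, 3, 4, 5]})
# Illustrative data: ID03 is repeated on purpose to mimic duplicates in df2.
df2 = pd.DataFrame({"ID": ["IDA01", "ID03", "ID01", "ID02", "ID03", "ID04", "ID05"],
                    "code": ["ab", "yz", "acv", "abc", "yz", "afd", "x01"]})

# Keep one row per ID (assumes duplicates share the same code),
# and rename the column to the name wanted in the result.
lookup = df2.drop_duplicates(subset="ID").rename(columns={"code": "new_code"})

# how="left" keeps every row of df1, even if an ID has no match in df2.
df3 = df1.merge(lookup[["ID", "new_code"]], on="ID", how="left")

# Alternative that avoids a merge: map each ID through a Series indexed by ID.
mapped = df1["ID"].map(lookup.set_index("ID")["new_code"])

print(df3)
```

With how="left", IDs from dataframe 1 that are missing in dataframe 2 get NaN in new_code instead of being dropped, which is usually easier to spot than silently losing rows.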
Upvotes: 2