Reputation: 849
Considering the two dataframes below:
import pandas as pd
df = pd.DataFrame({'ID_ANTENNA':[1,2,3,4,5], 'LAT':['-21.12699', '-21.12699', '-21.48194','-21.12699', '-21.11886'],
'LONG': ['-44.26144', '-44.26144', '-44.33039', '-44.21979', '-44.16478']})
city_ids = ['SJDR1', 'LD', 'SVM', 'TR']
sjdr = [1,2]
ld = [3]
svm = [4]
tr = [5]
df2 = pd.DataFrame({'ID_ANTENNA':[sjdr, ld, svm, tr], 'CITY_IDS':city_ids})
df2['ID_ANTENNA'] = df2['ID_ANTENNA'].astype(str)
df2['ID_ANTENNA'] = df2['ID_ANTENNA'].str.replace('[','')
df2['ID_ANTENNA'] = df2['ID_ANTENNA'].str.replace(']','')
I need to compare the column ID_ANTENNA
from df
with the same column from df2
. If the value of ID_ANTENNA
appears within df2
's ID_ANTENNA
, the CITY_IDS
column must be added to df
. An example of desired output:
ID | LAT | LONG | CITY_IDS
1 |-21.12699| -44.26144| SJDR1
2 |-21.12699| -44.26144| SJDR1
3 |-21.48194| -44.33039| LD
4 |-21.12699| -44.21979| SVM
5 |-21.11886| -44.16478| TR
However, I have no idea how to iterate over this column. Can anyone help?
P.s.: I transform the ID_ANTENNA
column from df2
to string because it is the column format in the real scenario, the result of reading a .csv file.
Upvotes: 0
Views: 55
Reputation: 46
for i in range(len(df['ID_ANTENNA'])):
for j in range(len(df2['ID_ANTENNA'])):
val=map(int,df2['ID_ANTENNA'][j].split(','))
if df['ID_ANTENNA'][i] in val:
df['CITY_IDS'].iloc[i]=df2['CITY_IDS'][j]
df
Run above code to get desired output:
Upvotes: 1
Reputation: 150735
You can split the ID_ANTENNA
, explode it and merge. Note that the ID_ANTENNA
in df
is integer while ID_ANTENNA
in df2
is string type:
df.merge(df2.assign(ID_ANTENNA=df2['ID_ANTENNA'].str.split(', '))
.explode('ID_ANTENNA')
.assign(ID_ANTENNA=lambda x: x['ID_ANTENNA'].astype(int)),
on='ID_ANTENNA', how='left')
Output:
ID_ANTENNA LAT LONG CITY_IDS
0 1 -21.12699 -44.26144 SJDR1
1 2 -21.12699 -44.26144 SJDR1
2 3 -21.48194 -44.33039 LD
3 4 -21.12699 -44.21979 SVM
4 5 -21.11886 -44.16478 TR
Upvotes: 1