Iterate over pandas dataframe using value in separate dataframe, filtered by shared column

Question

I have two dataframes in the following form:

df1

id	name	df2_id
one	foo	template_x
two	bar	template_y
three	baz	template_z

df2

id	name	value
template_x	aaa	zzz
template_x	bbb	yyy
template_y	ccc	xxx
template_y	ddd	www
template_z	eee	vvv
template_z	fff	uuu

For each row in df1, I'd like to find the rows in df2 where df2.id == df1.df2_id and append the value of df1.id to name and value in each of those rows, to get:

df3

id	concat_name	concat_val
template_x	aaa_one	zzz_one
template_x	bbb_one	yyy_one
template_y	ccc_two	xxx_two
template_y	ddd_two	www_two
template_z	eee_three	vvv_three
template_z	fff_three	uuu_three

Constraints/caveats:

All relevant values are strings, no integers.
Sometimes df2.value is empty, and I would like to keep it empty.

My approach was to use a nested for loop with df.iterrows, but it's giving me trouble.

user7864386 · Accepted Answer

Seems like you can merge the DataFrames and add relevant columns together:

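# Join on df1.df2_id == df2.id; the suffixes rename df1's id column to 'id_'
merged = df1[['id','df2_id']].merge(df2, left_on='df2_id', right_on='id', suffixes=('_',''))
# Append '_<df1 id>' to both string columns
merged['name'] += '_' + merged['id_']
merged['value'] += '_' + merged['id_']
# Drop the helper columns and rename to the requested output names
merged = merged.drop(columns=['id_', 'df2_id']).rename(columns={'name':'concat_name', 'value':'concat_val'})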

Output:

           id concat_name concat_val
0  template_x     aaa_one    zzz_one
1  template_x     bbb_one    yyy_one
2  template_y     ccc_two    xxx_two
3  template_y     ddd_two    www_two
4  template_z   eee_three  vvv_three
5  template_z   fff_three  uuu_three
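The question also asks that an empty df2.value stay empty. With the code above, an empty string would become something like "_one". Below is a minimal self-contained sketch of one way to handle that, assuming "empty" means either an empty string or a missing value (None/NaN). The sample frames are rebuilt from the question, the two empty cells in value are hypothetical and added only for illustration, and the names suffix, has_value and result are my own.

```python
import pandas as pd

# Sample data from the question; the "" and None in `value` are
# hypothetical empties added to illustrate the caveat.
df1 = pd.DataFrame({
    "id": ["one", "two", "three"],
    "name": ["foo", "bar", "baz"],
    "df2_id": ["template_x", "template_y", "template_z"],
})
df2 = pd.DataFrame({
    "id": ["template_x", "template_x", "template_y",
           "template_y", "template_z", "template_z"],
    "name": ["aaa", "bbb", "ccc", "ddd", "eee", "fff"],
    "value": ["zzz", "", "xxx", None, "vvv", "uuu"],
})

# Same merge as above: df1's id column comes out as 'id_'.
merged = df1[["id", "df2_id"]].merge(
    df2, left_on="df2_id", right_on="id", suffixes=("_", "")
)

suffix = "_" + merged["id_"]
merged["concat_name"] = merged["name"] + suffix

# Only touch rows whose value is neither missing nor an empty string.
has_value = merged["value"].notna() & (merged["value"] != "")
merged["concat_val"] = merged["value"]
merged.loc[has_value, "concat_val"] = (
    merged.loc[has_value, "value"] + suffix[has_value]
)

result = merged[["id", "concat_name", "concat_val"]]
print(result)
```

Rows with an empty or missing value keep it unchanged in concat_val, while all other rows get the df1 id appended as before. If your empties are always NaN (for example after read_csv), the notna() check alone should be enough.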
