Reputation: 81
I have two dataframes in the following form:
df1
id | name | df2_id |
---|---|---|
one | foo | template_x |
two | bar | template_y |
three | baz | template_z |
df2
id | name | value |
---|---|---|
template_x | aaa | zzz |
template_x | bbb | yyy |
template_y | ccc | xxx |
template_y | ddd | www |
template_z | eee | vvv |
template_z | fff | uuu |
For each value in df1 where df2_id == df2.id, I'd like to iterate over df2 and append the value of df1.id to name and value in each row to get:
df3
id | concat_name | concat_val |
---|---|---|
template_x | aaa_one | zzz_one |
template_x | bbb_one | yyy_one |
template_y | ccc_two | xxx_two |
template_y | ddd_two | www_two |
template_z | eee_three | vvv_three |
template_z | fff_three | uuu_three |
Constraints/caveats:
df2.value can be empty, and I would like to keep it empty.
My approach was to use a nested for loop with df.iterrows, but it's giving me trouble.
Upvotes: 0
Views: 193
Reputation:
Here's a pretty sleek one-liner (note that it overwrites name and value in df2 in place):
df2[['name', 'value']] = df2[['name', 'value']].T.add('_' + df1.set_index('df2_id').loc[df2['id'], 'id'].reset_index(drop=True)).T
Output:
>>> df2
id name value
0 template_x aaa_one zzz_one
1 template_x bbb_one yyy_one
2 template_y ccc_two xxx_two
3 template_y ddd_two www_two
4 template_z eee_three vvv_three
5 template_z fff_three uuu_three
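If the one-liner is hard to read, roughly the same lookup can be spelled out with Series.map. This is only a sketch: it rebuilds the sample frames from the question, assumes each df2_id appears exactly once in df1, and the names suffix and df3 are just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "id": ["one", "two", "three"],
    "name": ["foo", "bar", "baz"],
    "df2_id": ["template_x", "template_y", "template_z"],
})
df2 = pd.DataFrame({
    "id": ["template_x", "template_x", "template_y",
           "template_y", "template_z", "template_z"],
    "name": ["aaa", "bbb", "ccc", "ddd", "eee", "fff"],
    "value": ["zzz", "yyy", "xxx", "www", "vvv", "uuu"],
})

# For every df2 row, look up the df1.id whose df2_id matches df2.id.
# Assumption: df2_id is unique in df1, otherwise map() cannot resolve the lookup.
suffix = "_" + df2["id"].map(df1.set_index("df2_id")["id"])

# Build the result as a new frame instead of overwriting df2.
df3 = df2.assign(
    concat_name=df2["name"] + suffix,
    concat_val=df2["value"] + suffix,
)[["id", "concat_name", "concat_val"]]
```

Unlike the one-liner, this leaves df2 untouched and produces the concat_name / concat_val column names asked for in the question.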
Upvotes: 1
Reputation:
Seems like you can merge the DataFrames and add relevant columns together:
merged = df1[['id','df2_id']].merge(df2, left_on='df2_id', right_on='id', suffixes=('_',''))
merged['name'] += '_' + merged['id_']
merged['value'] += '_' + merged['id_']
merged = merged.drop(columns=['id_', 'df2_id']).rename(columns={'name':'concat_name', 'value':'concat_val'})
Output:
id concat_name concat_val
0 template_x aaa_one zzz_one
1 template_x bbb_one yyy_one
2 template_y ccc_two xxx_two
3 template_y ddd_two www_two
4 template_z eee_three vvv_three
5 template_z fff_three uuu_three
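The question also asks that empty entries in df2.value stay empty. A possible way to handle that on top of the merge is sketched below. It assumes "empty" means either NaN or an empty string, uses a small made-up sample that contains both, and append_id is a hypothetical helper name, not a pandas function.

```python
import numpy as np
import pandas as pd

# Small illustrative sample with an empty string and a NaN in "value".
df1 = pd.DataFrame({"id": ["one", "two"],
                    "df2_id": ["template_x", "template_y"]})
df2 = pd.DataFrame({
    "id": ["template_x", "template_x", "template_y"],
    "name": ["aaa", "bbb", "ccc"],
    "value": ["zzz", "", np.nan],
})

merged = df1[["id", "df2_id"]].merge(
    df2, left_on="df2_id", right_on="id", suffixes=("_", "")
)

def append_id(col: pd.Series, ids: pd.Series) -> pd.Series:
    """Append '_<id>' to each entry, leaving NaN / '' entries as they are."""
    is_empty = col.isna() | (col == "")
    suffixed = col.fillna("") + "_" + ids
    # Keep the suffixed text only where the original entry was not empty.
    return suffixed.where(~is_empty, col)

merged["name"] = append_id(merged["name"], merged["id_"])
merged["value"] = append_id(merged["value"], merged["id_"])
merged = merged.drop(columns=["id_", "df2_id"]).rename(
    columns={"name": "concat_name", "value": "concat_val"}
)
```

With this sample, the first row gets zzz_one while the empty string and the NaN in value are left unchanged.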
Upvotes: 1