Reputation: 9437
I have two dataframes of word counts, let's say the first is...
word count
0 HELLO 8
1 MORE 10
2 HELP 19
3 NO 100
and second is...
word count
0 HELLO 10
1 MORE 12
2 NONE 20
3 NO 56
The result should be...
word count
0 HELLO 18
1 MORE 22
2 HELP 19
2 NONE 20
3 NO 156
The order does not matter. But I must ensure that all words are preserved. If the word exists in both dataframes, we sum the count. If one does not exist in the other, we simply add it.
I figured out how to add two dataframes...
df_add = df1.add(df2, fill_value=0)
But this is all I know. Any help is appreciated.
Upvotes: 3
Views: 290
Reputation: 410
You can also use append and groupby to get the result using this one liner -
df_1.append(df_2).groupby('word', as_index=False).sum()
Upvotes: 2
Reputation: 42916
You can use the following, first pandas.merge
the dataframes together. Then sum both count
columns and finally use fillna
to fill the NaN
df3 = pd.merge(df1, df2, on='word', how='outer', suffixes=['', '_2'])
df3['count'] = df3['count'].fillna(0) + df3['count_2'].fillna(0)
df3['count'].fillna(df3['count_2'], inplace=True)
df3.drop('count_2', axis=1, inplace=True)
print(df3)
word count
0 HELLO 18.0
1 MORE 22.0
2 HELP 19.0
3 NO 156.0
4 NONE 20.0
Upvotes: 1
Reputation: 38415
You can merge the dataframes and sum,
new_df = df1.merge(df2, on = 'word', how = 'outer')
new_df['count'] = new_df[['count_x', 'count_y']].sum(1)
new_df.drop(['count_x', 'count_y'], 1, inplace = True)
word count
0 HELLO 18.0
1 MORE 22.0
2 HELP 19.0
3 NO 156.0
4 NONE 20.0
Upvotes: 3