Reputation: 2978
I have a large pandas DataFrame and a separate subset of that DataFrame in which one column has been calculated. I want to merge the values from the subset back into the larger DataFrame without changing any of its other values.
df_large:
index  col_1  col_2  col_3
1      10     15     33
2      23     16     nan
3      33     92     34
4      132    123    nan
5      32     59     nan
And a subset:
df_small:
index  col_1  col_2  col_3
2      23     16     34
4      132    123    87
I want the resulting DataFrame to overwrite the values in df_large.col_3 with values in df_small.col_3 only where the particular index exists in df_small:
df_large:
index  col_1  col_2  col_3
1      10     15     33
2      23     16     34
3      33     92     34
4      132    123    87
5      32     59     nan
I have tried looking at merge, but I'm not sure how to do this elegantly.
Upvotes: 1
Views: 794
Reputation: 210832
One way (among many) to do it is label-based assignment (on pandas 1.0 and later, where .ix no longer exists, use .loc):
df_large.loc[df_small.index, 'col_3'] = df_small.col_3
It seems to be faster than combine_first():
In [408]: %timeit df = df_large.combine_first(df_small)
100 loops, best of 3: 6.45 ms per loop
In [409]: %timeit df_large.ix[df_small.index, 'col_3'] = df_small.col_3
100 loops, best of 3: 2.43 ms per loop
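For anyone who wants to try the assignment end to end, here is a minimal sketch. It assumes the frames look like the tables in the question (the constructor calls are my reconstruction, not the asker's code) and uses .loc for the label-based assignment:

```python
import numpy as np
import pandas as pd

# Rebuild the frames from the question; values are copied from its tables.
df_large = pd.DataFrame(
    {
        "col_1": [10, 23, 33, 132, 32],
        "col_2": [15, 16, 92, 123, 59],
        "col_3": [33, np.nan, 34, np.nan, np.nan],
    },
    index=[1, 2, 3, 4, 5],
)
df_small = pd.DataFrame(
    {"col_1": [23, 132], "col_2": [16, 123], "col_3": [34, 87]},
    index=[2, 4],
)

# Label-based assignment: the right-hand Series is aligned on its index,
# so only the rows whose labels appear in df_small (2 and 4) are overwritten.
df_large.loc[df_small.index, "col_3"] = df_small["col_3"]

print(df_large)
```

Keep in mind that combine_first() only fills values that are missing in df_large, so the two approaches agree here only because the targeted col_3 cells are nan. The .loc assignment overwrites whatever is in those cells.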
Upvotes: 2