acpigeon
acpigeon

Reputation: 1729

Combine data frames while adding values in Pandas

New to Pandas and having trouble with something that seems simple. I've tried various merge and concat commands but haven't hit on the proper one yet.

Have two dfs that look like this:

        newstatus  count()
0       field1     1
1       field2     7
2       field3     146
3       field4     94
[4 rows x 2 columns]


        newstatus  count()
0       field2     1
1       field3     32
2       field4     60
[3 rows x 2 columns]

What is the most efficient operation to combine them and add the count() field to get:

        newstatus  count()
0       field1     1
1       field2     8
2       field3     178
3       field4     154
[4 rows x 2 columns]

Note that the dfs don't necessarily have values for each field, i.e. field1 doesn't show up in the second df at all.

Upvotes: 0

Views: 57

Answers (1)

BrenBarn
BrenBarn

Reputation: 251438

Not sure if this is the most efficient performance-wise, but I think it's the most straightforward:

>>> d1.set_index('newstatus').add(d2.set_index('newstatus'), fill_value=0).reset_index()
  newstatus  count
0    field1      1
1    field2      8
2    field3    178
3    field4    154

[4 rows x 2 columns]

It is slightly more awkward because you have newstatus as a column rather than the index. If you instead set your data up with newstatus as the index, like this:

           count
newstatus       
field1         1
field2         7
field3       146
field4        94

Then you can just do

>>> d1.add(d2, fill_value=0)
           count
newstatus       
field1         1
field2         8
field3       178
field4       154

Upvotes: 2

Related Questions