Reputation: 1729
New to Pandas and having trouble with something that seems simple. I've tried various merge and concat commands but haven't hit on the proper one yet.
Have two dfs that look like this:
newstatus count()
0 field1 1
1 field2 7
2 field3 146
3 field4 94
[4 rows x 2 columns]
newstatus count()
0 field2 1
1 field3 32
2 field4 60
[3 rows x 2 columns]
What is the most efficient operation to combine them and add the count() field to get:
newstatus count()
0 field1 1
1 field2 8
2 field3 178
3 field4 154
[4 rows x 2 columns]
Note that the dfs don't necessarily have values for each field, i.e. field1 doesn't show up in the second df at all.
Upvotes: 0
Views: 57
Reputation: 251438
Not sure if this is the most efficient performance-wise, but I think it's the most straightforward:
>>> d1.set_index('newstatus').add(d2.set_index('newstatus'), fill_value=0).reset_index()
newstatus count
0 field1 1
1 field2 8
2 field3 178
3 field4 154
[4 rows x 2 columns]
It is slightly more awkward because you have newstatus
as a column rather than the index. If you instead set your data up with newstatus
as the index, like this:
count
newstatus
field1 1
field2 7
field3 146
field4 94
Then you can just do
>>> d1.add(d2, fill_value=0)
count
newstatus
field1 1
field2 8
field3 178
field4 154
Upvotes: 2