Reputation: 878
I have a LONG list of dataframes (with many columns ...) like so:
DF1
nameOne total_one total_two nameTwo nameThree ...
0 nn1 2 3 nn4 nn7
1 nn2 2 3 nn5 nn8
2 nn3 2 3 nn6 nn9
DF2
nameOne total_one total_two nameTwo nameThree. ....
0 nn1 4 6 nn4 nn7
1 nn2 4 6 nn5 nn8
2 nn3 4 6 nn6 nn9
What I want the result to look like:
RESULT_DF
nameOne total_one total_two nameTwo nameThree. ....
0 nn1 6 9 nn4 nn7
1 nn2 6 9 nn5 nn8
2 nn3 6 9 nn6 nn9
In other words, I just want to add the number columns.
I tried doing this: DF1.add(DF2, fill_value=0)
But I get this error: TypeError: unsupported operand type(s) for +: 'int' and 'str'
For brevity, here I only showed one Str column. But I have many more STR columns, and many more int/float columns.
Explanation: Please note this is not the same as THIS question because I mentioned that I have multiple string fields.
Upvotes: 0
Views: 129
Reputation: 26676
Another way
use df.filter
to filter out DF2
's numeric columns. Add them to DF1
. What results is aggregated results of numeric columns and NaNs for the str. Chain combine_first
and this updates the NaNS
Option1; 2 dataframes
If have only two datframes
RESULT_DF=(DF2.filter(like='total').add(DF1)).combine_first(DF1)
RESULT_DF
nameOne nameThree nameTwo total_one total_two
0 nn1 nn7 nn4 6 9
1 nn2 nn8 nn5 6 9
2 nn3 nn9 nn6 6 9
Option2 list of datframes
If you have lists; select non numeric dtypes and join to summation after using reduce in from functools
DF3 = DF2.copy()
DF4 = DF1.copy()
LST=[DF1,DF2,DF3]
from functools import reduce
df = DF1.select_dtypes(exclude='int64').join(reduce(lambda x, y:\
(x.filter(like='total')).add(y.filter(like='total'), fill_value=0), LST))
print(df)
nameOne nameTwo nameThree total_one total_two
0 nn1 nn4 nn7 10 15
1 nn2 nn5 nn8 10 15
2 nn3 nn6 nn9 10 15
Upvotes: 0
Reputation: 153500
You can try using select_dtypes
and pd.concat
:
pd.concat([df1.select_dtypes(include='number').add(df2.select_dtypes(include='number')),
df1.select_dtypes(exclude='number'),
df2.select_dtypes(exclude='number')], axis=1)
Output:
total_one total_two nameOne nameOne
0 6 9 nn1 nn1
1 6 9 nn2 nn2
2 6 9 nn3 nn3
df3 = df1.copy()
df4 = df2.copy()
lofdfs = [df1, df2, df3, df4]
from functools import reduce
df_add = reduce(lambda x,y: x.select_dtypes(include='number').add(y.select_dtypes('number')), lofdfs)
pd.concat([i.select_dtypes(exclude='number') for i in lofdfs]+[df_add], axis=1, keys=['df1','df2','df3','df4','df_add'])
Output:
df1 df2 df3 df4 df_add
nameOne nameOne nameOne nameOne total_one total_two
0 nn1 nn1 nn1 nn1 12 18
1 nn2 nn2 nn2 nn2 12 18
2 nn3 nn3 nn3 nn3 12 18
Moving totals to the front:
df_add = reduce(lambda x,y: x.select_dtypes(include='number').add(y.select_dtypes('number')), lofdfs)
print(pd.concat([df_add]+[i.select_dtypes(exclude='number') for i in lofdfs], axis=1, keys=['df1','df2','df3','df4','df_add']))
Output:
df1 df2 df3 df4 df_add
total_one total_two nameOne nameOne nameOne nameOne
0 12 18 nn1 nn1 nn1 nn1
1 12 18 nn2 nn2 nn2 nn2
2 12 18 nn3 nn3 nn3 nn3
Upvotes: 2