Reputation: 502
I have two lists (l1
and l2
) consisting of required columns of two dataframes (df1
and df2
) that I would like to apply operation upon.
One list has all required columns ending with an _x
and the other with an _y
.
I would like to subtract the values of these columns by index, for instance,
df_final['first_col_sub'] = first element of l2 - first element of l1
df_final['second_col_sub'] = second element of l2 - second element of l1 and so on
Actually both of the dataframes have the same column headers and I cannot directly use the column headers to perform the operation which is why I added _x
and _y
in the column headers so that it may help to do the subtraction.
For example,
df1,
1_x | 2_x | 3_x | 4_x | 5_x ...
1 | 2 | 3 | 4 | 5
df2,
1_y | 2_y | 3_y | 4_y | 5_y ...
5 | 4 | 3 | 2 | 1
df_final,
first_col_sub | first_col_sub | first_col_sub | first_col_sub | first_col_sub ...
4 | 2 | 0 | -2 | -4
How may I achieve this? Any help is greatly appreciated.
(If there is anything unclear please let me know.)
Upvotes: 3
Views: 1566
Reputation: 31
You can subtract pandas dataframes directly:
df_final = df2 - df1
Or based on index names:
for x in df_x.columns:
base = x[0:-2]
if base + '_y' in df_y.columns:
df_f[base + '_final'] = df_x[x] - df_y[base + '_y']
Upvotes: 1
Reputation: 862921
If same number of rows and same number of columns with matched values before _
use:
l1 = ['1_x','2_x'...]
l2 = ['1_y','2_y'...]
df2[l2].sub(df1[l1].to_numpy())
Or better is remove _x
and _y
for match values before _
in subtract:
f = lambda x: x[:-2]
df2[l2].rename(columns=f).sub(df1[l1].rename(columns=f))
Also is possible filter by ending y
and x
columns in DataFrame.filter
:
f = lambda x: x[:-2]
df2.filter(regex='y$').rename(columns=f).sub(df1.filter(regex='x$').rename(columns=f))
Or:
df22 = df2.filter(regex='y$')
df11 = df1.filter(regex='x$')
df22.columns = df22.columns.str[:-2]
df11.columns = df11.columns.str[:-2]
df22.sub(df11)
Upvotes: 2