operations with columns in pandas

Question

I have a df that looks like this:

   I_0 I_1 I_2 10 20 30 SF
q
0    2   3   4  5  1  1 0.2
1    5   4   3  4  7  8 0.3
2    4   8   7  9  8  7 0.4

I would like to make operations on it, such as:

multiplicate columns 10, 20 and 30 by SF
subtract those new columns from I_0, I_1 and I_2.

For example:

I_0 - (column 10 * SF)
I_1 - (column 20 * SF)
I_2 - (column 30 * SF)

and then rename the calculated columns to I_0_sub, I_1_sub and I_2_sub.

I have tried this so far:

df = pd.merge(df1, df2, how = "left", left_index = True, right_index = True)
df["SF"] = df["I_q_2"] / df[10]
test = df.subtract(df2*SF, axis=0)
test.drop(labels=["10", "20", "30", "SF"], axis = "columns", inplace = True)
test = test.add_suffix("_sub")

Getting NaN in for df.

Any help would be appreciated.

jezrael · Accepted Answer

Use filter for select columns starts with I:

df1 = df.filter(regex='^I')
print (df1)
   I_0  I_1  I_2
q               
0    2    3    4
1    5    4    3
2    4    8    7

Then select columns by subset and multiple by mul:

df2 = df[["10", "20", "30"]].mul(df["SF"], axis=0)
print (df2)
    10   20   30
q               
0  1.0  0.2  0.2
1  1.2  2.1  2.4
2  3.6  3.2  2.8

Last substract by sub - but columns are different, so convert values to numpy array:

df = df1.sub(df2.values, 0).add_suffix('_sub')
print (df)
   I_0_sub  I_1_sub  I_2_sub
q                           
0      1.0      2.8      3.8
1      3.8      1.9      0.6
2      0.4      4.8      4.2

If columns names not same in both df get NaNs, because data are not aligned:

print (df1.sub(df2, axis=0))
   10  20  30  I_0  I_1  I_2
q                           
0 NaN NaN NaN  NaN  NaN  NaN
0 NaN NaN NaN  NaN  NaN  NaN
2 NaN NaN NaN  NaN  NaN  NaN

operations with columns in pandas

Answers (1)

Related Questions