Alice_inwonderland

Reputation: 348

Perform calculation on multiple columns at once with pandas

I have a big dataframe with more than 1 million rows. The current df has only the columns X, a, b, c. I want to perform a calculation that yields the new columns new_a, new_b, new_c (see picture).

The calculation is: new_a = a/(X^2)

I already have a way to do it in python:

col_list = ['a','b','c']

def new(col,X):
    score = col/(X**2)
    return score

new_col = ['new_a','new_b','new_c']

def calculate(df):
    for i in range(len(new_col)):
        df[new_col[i]] = df.apply(lambda row: new(row[col_list[i]],row['X']),axis=1)

calculate(df)

I wonder if there is another way to achieve the same goal? This current way of doing it is fine but takes a lot of time to run and somehow yields weird results for certain operations. Thank you.

enter image description here

Upvotes: 2

Views: 5122

Answers (2)

Silenced Temporarily

Reputation: 1004

Do you want a/X^2 or a/X? You ask for one but your example shows the other.

for col in col_list:
    new_col = 'new_' + col
    df[new_col] = df[col] / (df['X']**2)

This will give you what you ask for. If what you actually want is a/X, adjust accordingly.
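In case it helps, here is a minimal, self-contained sketch of that loop on made-up data. The helper name `add_scaled_columns` and its `power` parameter are hypothetical (they are not part of pandas); `power=2` gives a/X^2 and `power=1` gives a/X.

```python
import pandas as pd

def add_scaled_columns(df, cols, power=2):
    """Hypothetical helper: add new_<col> = col / X**power for each col."""
    out = df.copy()              # leave the caller's frame untouched
    divisor = out['X'] ** power  # computed once, reused for every column
    for col in cols:
        # Whole-column (vectorized) division, no row-wise apply
        out['new_' + col] = out[col] / divisor
    return out

# Made-up sample data, for illustration only
df = pd.DataFrame({'X': [5, 7], 'a': [3, 2], 'b': [4, 4], 'c': [5, 2]})

squared = add_scaled_columns(df, ['a', 'b', 'c'])          # a / X^2
linear = add_scaled_columns(df, ['a', 'b', 'c'], power=1)  # a / X
```

Each assignment works on a whole column at once, which is why this should be much faster than calling `df.apply(..., axis=1)` once per row.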

Upvotes: 1

cs95

Reputation: 402413

import pandas as pd

col_list = ['a','b','c']
# Divide a, b, c by X**2 row-wise, prefix the results, and append them to df
df = pd.concat(
    [df, df[col_list].div(df['X'] ** 2, axis=0).add_prefix('new_')], axis=1
)

df
   X  a  b  c     new_a     new_b     new_c
0  5  3  4  5  0.120000  0.160000  0.200000
1  7  2  4  2  0.040816  0.081633  0.040816

Pandas performs index-aligned division on each column; just concatenate the result afterwards.
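To illustrate what "index-aligned" means here, a small sketch on made-up data: the divisor is deliberately stored in reverse row order, and `div(..., axis=0)` still pairs each row with its own X value by index label. The variable names are just for this example.

```python
import pandas as pd

# Made-up sample data, for illustration only
df = pd.DataFrame({'X': [5, 7], 'a': [3, 2], 'b': [4, 4], 'c': [5, 2]})

# Divisor deliberately stored in reverse row order (label 1 first, then 0)
divisor = (df['X'] ** 2).iloc[::-1]

# axis=0 matches the Series to rows by index label, not by position
scaled = df[['a', 'b', 'c']].div(divisor, axis=0).add_prefix('new_')

# concat with axis=1 also lines the two frames up by index label
result = pd.concat([df, scaled], axis=1)
```

Row 0 is still divided by 5**2 and row 1 by 7**2, even though the divisor was reordered.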

Upvotes: 3
