Conditionally Aggregating Pandas DataFrame

Question

I have a DataFrame that looks like:

import pandas as pd

df = pd.DataFrame([[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
                   [9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0],
                   [17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0]], 
                   columns=['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'])

      A     B     C     D     E     F     G     H
0   1.0   2.0   3.0   4.0   5.0   6.0   7.0   8.0
1   9.0  10.0  11.0  12.0  13.0  14.0  15.0  16.0
2  17.0  18.0  19.0  20.0  21.0  22.0  23.0  24.0

And I have a list of columns:

l = ['A', 'C', 'D', 'E']

For each element of my list, I want the mean of the columns that precede it in the list, plus twice the value in its own column. So A depends only on itself, C depends on A and itself, D depends on the mean of A and C plus itself, and E depends on A, C, D, and itself. I have accomplished what I need in the following way:

for i, col in enumerate(l):
    other_cols = l[:i]
    df['tmp_' + col] = df[other_cols].mean(axis=1) + 2.0 * df[col]

      A     B     C     D     E     F     G     H  tmp_A  tmp_C  tmp_D  \
0   1.0   2.0   3.0   4.0   5.0   6.0   7.0   8.0    NaN    7.0   10.0   
1   9.0  10.0  11.0  12.0  13.0  14.0  15.0  16.0    NaN   31.0   34.0   
2  17.0  18.0  19.0  20.0  21.0  22.0  23.0  24.0    NaN   55.0   58.0   

       tmp_E  
0  12.666667  
1  36.666667  
2  60.666667

Is there a more idiomatic (vectorized) way to accomplish the same thing without running through the for loop?
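
One possible loop-free approach, offered as a sketch rather than a definitive answer: take a cumulative sum across the listed columns, subtract each column's own value to get the sum of the columns before it, and divide by the number of preceding columns. The names `sub`, `prev_sum`, `counts`, `prev_mean`, `tmp`, and `result` are illustrative and not from the original post. The sketch assumes the columns in `l` are numeric and contain no missing values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
                   [9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0],
                   [17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0]],
                  columns=['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'])
l = ['A', 'C', 'D', 'E']

# Work only on the listed columns, in list order.
sub = df[l]

# Running sum across the listed columns minus the column itself
# gives the sum of the columns that precede it in `l`.
prev_sum = sub.cumsum(axis=1) - sub

# Number of preceding columns for each position: 0, 1, 2, 3.
counts = np.arange(len(l))

# Mean of the preceding columns; the first column is 0 / 0, i.e. NaN,
# which matches the NaN that the loop produces for tmp_A.
prev_mean = prev_sum / counts

# Add twice each column's own value and attach the prefixed columns.
tmp = (prev_mean + 2.0 * sub).add_prefix('tmp_')
result = df.join(tmp)
print(result)
```

Like the loop, this leaves `tmp_A` as NaN, because the mean of zero columns is undefined. If `A` should instead depend only on itself, replacing `prev_mean` with `prev_mean.fillna(0)` is one option under the no-missing-values assumption above, giving `tmp_A = 2 * A`.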
