finstats
finstats

Reputation: 1409

Pandas: pairwise multiplication of columns based on column name

I have the following DataFrame

>>> df = pd.DataFrame({'ap1_X':[1,2,3,4], 'as1_X':[1,2,3,4], 'ap2_X':[2,2,2,2], 'as2_X':[3,3,3,3]})
>>> df
   ap1_X  as1_X  ap2_X  as2_X
0      1      1      2      3
1      2      2      2      3
2      3      3      2      3
3      4      4      2      3

I would like to multiply ap1_X with as1_X and put that value in as1_X, similarly for ap2_X with as2_X. The common identifier here is the number that comes after the ap or as.

The final DataFrame should look like this

>>> df
   ap1_X  as1_X  ap2_X  as2_X
0      1      1      2      6
1      2      4      2      6
2      3      9      2      6
3      4      16     2      6

I know I can loop through the columns and multiply the columns that have the same 3rd character in the column name, but I was wondering if there is a more "pandas" way of doing this?

UPDATE: The number ID in the column name can be multiple digits (ex: 1, 2, ..., 12, ..., 100). So basically, the ID is the number between 'ap' or 'as', and '_X'.

Upvotes: 3

Views: 766

Answers (2)

cs95
cs95

Reputation: 402902

You can use filter here:

df.filter(like='p') * df.filter(like='s').values

   ap1_X  ap2_X
0      1      6
1      4      6
2      9      6
3     16      6

Another solution is to argsort the column names and slice. This should be very efficient.

idx = np.argsort(df.columns.str[1])
l = len(df) // 2
df.iloc[:, idx[:l]] * df.iloc[:, idx[l:]].values 

   ap1_X  ap2_X
0      1      6
1      4      6
2      9      6
3     16      6

Upvotes: 2

BENY
BENY

Reputation: 323356

You can do groupby with axis=1 and key is the common number

df.groupby(df.columns.str[2],axis=1).prod()
Out[73]: 
    1  2
0   1  6
1   4  6
2   9  6
3  16  6

Upvotes: 3

Related Questions