kjo

Reputation: 35301

Multiplication of Pandas DataFrame with Pandas Series

I have a Pandas Series v, with numeric entries v0, v1, ..., vn, and a Pandas DataFrame C, with columns C0, C1, ..., Cn. I want to generate the DataFrame whose columns are the scaled columns C0*v0, C1*v1, ..., Cn*vn.

What's the "idiomatic" expression for such a product? Does this kind of product have a standard name?

Could the best solution entail working with one or both of the underlying numpy.ndarrays, v.values and C.values?

Upvotes: 0

Views: 239

Answers (1)

Warren Weckesser

Reputation: 114781

That's matrix multiplication of the matrix C by the diagonal matrix whose diagonal entries are the elements of v.

For example, here's a Series v and a DataFrame C:

In [65]: v
Out[65]: 
0    1
1   -2
2    5
dtype: int64

In [66]: C
Out[66]: 
    0   1   2
0   0   1   2
1   3   4   5
2   6   7   8
3   9  10  11
4  12  13  14

Here's the product:

In [67]: C.dot(np.diag(v))
Out[67]: 
    0   1   2
0   0  -2  10
1   3  -8  25
2   6 -14  40
3   9 -20  55
4  12 -26  70

You could also compute that using element-wise multiplication and broadcasting. The DataFrame multiply method and the * operator handle broadcasting, matching the index labels of v against the column labels of C, so you can write:

In [102]: C * v
Out[102]: 
    0   1   2
0   0  -2  10
1   3  -8  25
2   6 -14  40
3   9 -20  55
4  12 -26  70
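
Because the broadcast is by label rather than by position, the order of the entries in v does not have to match the order of the columns in C. Here is a minimal sketch of that behavior; the string labels "a", "b", "c" are made up for illustration and are not part of the example above:

```python
import numpy as np
import pandas as pd

# Hypothetical labels, chosen only to make the alignment visible.
C = pd.DataFrame(np.arange(15).reshape(5, 3), columns=["a", "b", "c"])

# Same scale factors as above (a -> 1, b -> -2, c -> 5), listed out of order.
v = pd.Series([5, 1, -2], index=["c", "a", "b"])

# The * operator matches v's index labels to C's column labels.
scaled = C * v

# Equivalent spelling with the method form and an explicit axis.
scaled_explicit = C.mul(v, axis="columns")

print(scaled)
```

If the labels of v and the columns of C do not match at all, the result is filled with NaN instead of raising an error, so it is worth checking the labels when the output looks wrong.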

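Some testing on a DataFrame with 50 rows and 100 columns shows that it is much more efficient to work with the numpy arrays, as follows. Note that this multiplies by position, not by label, and the result is a plain numpy array rather than a DataFrame: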

In [113]: C.values * v.values
Out[113]: 
array([[  0,  -2,  10],
       [  3,  -8,  25],
       [  6, -14,  40],
       [  9, -20,  55],
       [ 12, -26,  70]])
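
If you need a DataFrame back, one option is to wrap the array and reuse the labels of C. This is only a sketch, and it assumes the entries of v are already in the same order as the columns of C, since the numpy product ignores labels:

```python
import numpy as np
import pandas as pd

v = pd.Series([1, -2, 5])
C = pd.DataFrame(np.arange(15).reshape(5, 3))

# Multiply the raw arrays; v broadcasts across the rows of C by position.
product = C.to_numpy() * v.to_numpy()

# Reattach the row and column labels of C to the result.
scaled = pd.DataFrame(product, index=C.index, columns=C.columns)

print(scaled)
```

Here to_numpy() is used in place of .values; both return the underlying array for this kind of all-numeric data.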

Upvotes: 3
