Reputation: 35301
I have a Pandas Series v, with numeric entries v0, v1, ..., vn, and a Pandas DataFrame C, with columns C0, C1, ..., Cn. I want to generate the DataFrame whose columns are the n + 1 scaled columns C0*v0, C1*v1, ..., Cn*vn.
What's the "idiomatic" expression for such a product? Does this kind of product have a standard name?
Could the best solution entail working with one or both of the underlying numpy.ndarrays, v.values and C.values?
Upvotes: 0
Views: 239
Reputation: 114781
That's matrix multiplication of the matrix C, on the right, by the diagonal matrix whose diagonal entries are v.
For example, here's a Series v and a DataFrame C:
In [65]: v
Out[65]:
0    1
1   -2
2    5
dtype: int64

In [66]: C
Out[66]:
    0   1   2
0   0   1   2
1   3   4   5
2   6   7   8
3   9  10  11
4  12  13  14
Here's the product:
In [67]: C.dot(np.diag(v))
Out[67]:
    0    1   2
0   0   -2  10
1   3   -8  25
2   6  -14  40
3   9  -20  55
4  12  -26  70
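As a sanity check, here is a minimal sketch that rebuilds the same v and C and compares the diagonal-matrix product with scaling each column by hand. The names product and by_hand are only illustrative, and the construction of C with np.arange is an assumption about how the example frame was made.

```python
import numpy as np
import pandas as pd

# Same data as in the session above (construction via arange is assumed).
v = pd.Series([1, -2, 5])
C = pd.DataFrame(np.arange(15).reshape(5, 3))

# Right-multiply C by the diagonal matrix built from v.
product = C.dot(np.diag(v))

# Scale each column j by v[j] explicitly, for comparison.
by_hand = pd.DataFrame({j: C[j] * v[j] for j in C.columns})

# Both approaches give the same numbers.
assert np.array_equal(product.to_numpy(), by_hand.to_numpy())
```

Building the full diagonal matrix costs memory quadratic in the number of columns, so this form is mostly useful for showing what the operation means.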
You could also compute that using element-wise multiplication and broadcasting. The DataFrame multiply method and the * operator handle broadcasting, matching the index of v against the column labels of C, so you can write:
In [102]: C * v
Out[102]:
    0    1   2
0   0   -2  10
1   3   -8  25
2   6  -14  40
3   9  -20  55
4  12  -26  70
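One thing worth knowing about C * v is that pandas matches by label, not by position. The sketch below uses hypothetical string labels "a", "b", "c" (not in the original example) and a Series whose index is in a different order, to show that each column is still scaled by the entry with the same label. It also shows the method form, C.mul(v, axis="columns").

```python
import numpy as np
import pandas as pd

# Hypothetical labelled version of the example data.
C = pd.DataFrame(np.arange(15).reshape(5, 3), columns=["a", "b", "c"])
v = pd.Series({"c": 5, "a": 1, "b": -2})  # same labels, different order

scaled = C * v                    # operator form: aligns v.index with C.columns
same = C.mul(v, axis="columns")   # explicit method form of the same operation

# Column "b" is scaled by v["b"] == -2, regardless of where "b" sits in v.
assert (scaled["b"] == C["b"] * -2).all()
assert (scaled["c"] == C["c"] * 5).all()
```

If the labels of v and the columns of C do not match at all, the label alignment produces NaN columns instead of an error, so it may be worth checking the labels first.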
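Some testing on a DataFrame with 50 rows and 100 columns shows that it is much more efficient to work with the underlying numpy arrays directly. Note that the result is a plain ndarray, so the index and column labels of C are lost, and the multiplication is positional rather than label-aligned: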
In [113]: C.values * v.values
Out[113]:
array([[  0,  -2,  10],
       [  3,  -8,  25],
       [  6, -14,  40],
       [  9, -20,  55],
       [ 12, -26,  70]])
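If you want the speed of the array product but still need a DataFrame, one option is to wrap the result back up with C's own index and columns. The sketch below does that and then runs a rough timing comparison on a random 50 x 100 frame; big_C, big_v and the repeat count are made up for illustration, and the timings will vary by machine and pandas version, so no numbers are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

v = pd.Series([1, -2, 5])
C = pd.DataFrame(np.arange(15).reshape(5, 3))

# Positional product on the raw arrays, rewrapped with C's labels.
# This assumes the entries of v are already in the same order as C's columns.
scaled = pd.DataFrame(C.values * v.values, index=C.index, columns=C.columns)

# Rough timing on a 50 x 100 frame (random data, illustrative only).
rng = np.random.default_rng(0)
big_C = pd.DataFrame(rng.standard_normal((50, 100)))
big_v = pd.Series(rng.standard_normal(100))

t_pandas = timeit.timeit(lambda: big_C * big_v, number=200)
t_numpy = timeit.timeit(lambda: big_C.values * big_v.values, number=200)
print(f"pandas: {t_pandas:.4f}s  numpy: {t_numpy:.4f}s")
```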
Upvotes: 3