Reputation: 2000
What is the right way to multiply two sorted pandas Series?
When I run the following
import pandas as pd
x = pd.Series([1,3,2])
x = x.sort_values()  # in-place Series.sort() was removed in later pandas versions
print(x)
w = [1]*3
print(w*x)
I get what I would expect - [1,2,3]
However, when I change it to a Series:
w = pd.Series(w)
print(w*x)
It appears to multiply by matching the index labels of the two Series, so it returns [1,3,2].
Upvotes: 1
Views: 119
Reputation: 30444
Your results are essentially the same, just sorted differently.
>>> w*x
0 1
2 2
1 3
>>> pd.Series(w)*x
0 1
1 3
2 2
>>> (w*x).sort_index()
0 1
1 3
2 2
The rule is basically this: any time you multiply a DataFrame or Series by another DataFrame or Series, the values are matched up by index label. That's what makes it pandas and not numpy. As a result, any pre-sorting is necessarily ignored.
But if you multiply a DataFrame or Series by a list or numpy array of a conforming shape/size, then the list or array is treated as having exactly the same index as the DataFrame or Series. The pre-sorting of the Series or DataFrame can be preserved in this case because there cannot be any conflict with the list or array (which has no index at all).
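Here is a minimal sketch of those two rules using the data from the question. It assumes a recent pandas, where sort_values() takes the place of the in-place sort(); the names by_position and by_label are just for illustration.

```python
import pandas as pd

x = pd.Series([1, 3, 2]).sort_values()  # values 1, 2, 3 with index labels 0, 2, 1
w = [1, 1, 1]

# Series * list: the list is matched by position, so x's sorted order survives.
by_position = x * w

# Series * Series: values are matched by index label, whatever the row order.
by_label = x * pd.Series(w)

print(by_position.tolist())            # [1, 2, 3]
print(by_label.sort_index().tolist())  # [1, 3, 2]
```

The numbers are the same in both results; only the pairing rule (position vs. label) and therefore the row order differ.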
Both of these types of behavior can be very desirable depending on what you are trying to do. That's why you will often see answers here that do something like df1 * df2.values when the second type of behavior is desired.
In this example, it doesn't really matter because your list is [1,1,1] and gives the same answer either way, but if it were [1,2,3] you would get different answers, not just differently sorted answers.
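To make that concrete, here is a small sketch with weights [1,2,3], again assuming a recent pandas with sort_values(). The names aligned and positional are only illustrative.

```python
import pandas as pd

x = pd.Series([1, 3, 2]).sort_values()  # index labels 0, 2, 1
w = pd.Series([1, 2, 3])                # index labels 0, 1, 2

aligned = x * w             # matched by label: 1*1, 3*2, 2*3
positional = x * w.values   # matched by position: 1*1, 2*2, 3*3

print(aligned.sort_index().tolist())  # [1, 6, 6]
print(positional.tolist())            # [1, 4, 9]
```

Dropping the index with .values is what switches the multiplication from label-based to position-based, so which one is "right" depends on whether the weights belong to the labels or to the sorted positions.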
Upvotes: 1