Reputation: 1495
I have a pandas dataframe 'pivoted' :
Reason            CE  CS  DG  DR  IC  IO
Warehouse Month
01        01       9   4   4   0   1   8
I also have a variable, total_issues, that accumulates the total number of issues. It is an int, and in this case its value is 626.
When I run the following line: total_percentages = pivoted/total_issues
I'm not getting the correct (or expected) results:
01 01 0.021143 0.009397 0.009397 0.000000 0.002349 0.018793
I would expect this:
01 01 0.014376 0.006389 0.006389 0.00000 0.001597 0.012780
My full code:
issue_df = pd.read_sql(issue_query, cnxn)
issue_df.rename(columns={'00001' : 'Invoices', 'OBWHID' : 'Warehouse', 'OBRTRC':'Reason', 'INV_MONTH':'Month', '00005':'Date'}, inplace=True)
pivoted = pd.pivot_table(issue_df, index=["Warehouse", "Month"], values=["Invoices"], columns=['Reason'], aggfunc='count', fill_value=0)
pivoted.loc['Column Total'] = pivoted.sum()
print(pivoted.dtypes)
#Percentages of Warehouse Returns by Month
warehouse_percentages = pivoted[:] = 100 * pivoted[:].div(pivoted[:].sum(axis=1), axis=0)
print(warehouse_percentages)
print(total_issues)
total_percentages = pivoted.div(total_issues)
Upvotes: 1
Views: 339
Reputation: 29710
With the line
warehouse_percentages = pivoted[:] = 100 * pivoted[:].div(pivoted[:].sum(axis=1), axis=0)
you are reassigning every value in pivoted to the result of the right-hand side. By the time you compute total_percentages, pivoted holds row percentages rather than the counts it held before that line, which is why the output doesn't match what you expect.
If you don't intend to modify pivoted with this operation, remove the pivoted[:] = assignment target. The [:] slices on the right-hand side aren't needed either: pandas operations almost always return a new object by default rather than operating in place.
warehouse_percentages = 100* pivoted.div(pivoted.sum(axis=1), axis=0)
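To check the fix in isolation, here is a minimal sketch. It assumes a hypothetical one-row frame built by hand from the counts shown in the question (the real pivoted comes from pd.pivot_table on the SQL result) and total_issues = 626 as stated there.

```python
import pandas as pd

# Hypothetical stand-in for the question's pivot table: one (Warehouse, Month)
# row with the counts shown in the question.
pivoted = pd.DataFrame(
    [[9, 4, 4, 0, 1, 8]],
    columns=["CE", "CS", "DG", "DR", "IC", "IO"],
    index=pd.MultiIndex.from_tuples([("01", "01")], names=["Warehouse", "Month"]),
)
total_issues = 626  # value stated in the question

# No pivoted[:] on the left-hand side, so pivoted keeps its raw counts.
warehouse_percentages = 100 * pivoted.div(pivoted.sum(axis=1), axis=0)

# pivoted still holds counts here, so each cell is count / total_issues.
total_percentages = pivoted.div(total_issues)

print(warehouse_percentages)
print(total_percentages)
```

Because div returns a new DataFrame, pivoted is unchanged after the first computation and the second division works on the original counts.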
Upvotes: 2