Someguywhocodes
Someguywhocodes

Reputation: 781

Dividing two dataframes gives NaN

I have two dataframes, one with a metric as of the last day of the month. The other contains a metric summed for the whole month. The former (monthly_profit) looks like this:

              profit
yyyy_mm_dd
2018-01-01    8797234233.0  
2018-02-01    3464234233.0
2018-03-01    5676234233.0
...
2019-10-01    4368234233.0

While the latter (monthly_employees) looks like this:

              employees
yyyy_mm_dd
2018-01-31    924358  
2018-02-28    974652
2018-03-31    146975
...
2019-10-31    255589

I want to get profit per employee, so I've done this:

profit_per_employee = (monthly_profit['profit']/monthly_employees['employees'])*100

This is the output that I get:

yyyy_mm_dd
2018-01-01   NaN
2018-01-31   NaN
2018-02-01   NaN
2018-02-28   NaN

How could I fix this? The reason that one dataframe is the last day of the month and the other is the first day of the month is due to rolling vs non-rolling data.

monthly_profit is the result of grouping and summing daily profit data:

monthly_profit = df.groupby(['yyyy_mm_dd'])[['proft']].sum()
monthly_profit = monthly_profit.resample('MS').sum()

While monthly_employees is a running total, so I need to take the current value for the last day of each month:

monthly_employees = df.groupby(['yyyy_mm_dd'])[['employees']].sum()
monthly_employees = monthly_employees.groupby([monthly_employees.index.year, monthly_employees.index.month]).tail(1)

Upvotes: 2

Views: 230

Answers (1)

jezrael
jezrael

Reputation: 863166

Change MS to M for end of months for match both DatatimeIndex:

monthly_profit = monthly_profit.resample('M').sum()

Upvotes: 1

Related Questions