Reputation: 79
df is a csv file that contains ship_date, order_date and cumulative_ordered. cumulative_ordered is the sum of orders added on every day up until ship_date. There are 30 days prior to every ship_date and those days only counts for one ship_date. after ship_date 2018-07-01, then next ship_date will be 2018-08-01 with same procedure.
My problem is that I can't get days remaining(see last code output below) when I calculate the percentage average of cumulative_ordered on each of those 30 days prior.
I have the following code, that gives me the days_remaining from a csv file with several different ship_date and order_date counting down to each separate ship_date.
df['days_remaining'] = pd.to_datetime(df['ship_date']).sub\
(pd.to_datetime(df['order_date'])).dt.days
df['difference'] = df['ship_date'] - df['order_date']
df.head()
output:
ship_date Order_date cumulative_ordered days_remaining difference
2018-07-01 2018-06-01 7 30 30 days
2018-07-01 2018-06-02 10 29 29 days
2018-07-01 2018-06-03 15 28 28 days
2018-07-01 2018-06-04 30 28 27 days
2018-07-01 2018-06-05 41 28 26 days
I then try to find total ordered on each day prior to ship_date
m = df.groupby("difference").mean()
m.head()
which gives me this output:
cumulative ordered days_remaining
difference
0 days 352.458124 0.0
1 days 291.234747 1.0
2 days 244.122137 2.0
3 days 201.178765 3.0
4 days 190.153641 4.0
I'm running into an issue when I try to find an average of cumulative ordered on each day in terms of percentage filled from cumulative_ordered output on 0 days above, by running this code:
v = m/m[m.index.days == 0].iloc[0]
v.head()
cumulative_ordered days_remaining
difference
0 days 1.000000 NaN
1 days 0.891324 inf
2 days 0.812534 inf
3 days 0.752339 inf
4 days 0.673745 inf
days_remaining changes to NaN and inf.. How can I keep it so it still gives me integers?
Upvotes: 0
Views: 49
Reputation: 2335
The NaN
and inf
result from dividing by 0.0
.
It seems that you are trying to apply the operation only to the cumulative_ordered
column, so you should just run this for your last block of code:
m['cumulative_ordered'] = m['cumulative_ordered'] / m['cumulative_ordered'][m['cumulative_ordered'].index.days == 0]
Upvotes: 1