Clueless
Clueless

Reputation: 79

find average in remaining days using python pandas

df is a csv file that contains ship_date, order_date and cumulative_ordered. cumulative_ordered is the sum of orders added on every day up until ship_date. There are 30 days prior to every ship_date and those days only counts for one ship_date. after ship_date 2018-07-01, then next ship_date will be 2018-08-01 with same procedure.

My problem is that I can't get days remaining(see last code output below) when I calculate the percentage average of cumulative_ordered on each of those 30 days prior.

I have the following code, that gives me the days_remaining from a csv file with several different ship_date and order_date counting down to each separate ship_date.

df['days_remaining'] = pd.to_datetime(df['ship_date']).sub\
(pd.to_datetime(df['order_date'])).dt.days
df['difference'] = df['ship_date'] - df['order_date']

df.head()

output:

ship_date    Order_date   cumulative_ordered   days_remaining    difference

2018-07-01   2018-06-01     7                  30               30 days
2018-07-01   2018-06-02     10                 29               29 days
2018-07-01   2018-06-03     15                 28               28 days
2018-07-01   2018-06-04     30                 28               27 days
2018-07-01   2018-06-05     41                 28               26 days

I then try to find total ordered on each day prior to ship_date

m = df.groupby("difference").mean()
m.head()

which gives me this output:

             cumulative ordered    days_remaining
difference                            
      0 days    352.458124             0.0
      1 days    291.234747             1.0
      2 days    244.122137             2.0
      3 days    201.178765             3.0
      4 days    190.153641             4.0

I'm running into an issue when I try to find an average of cumulative ordered on each day in terms of percentage filled from cumulative_ordered output on 0 days above, by running this code:

   v = m/m[m.index.days == 0].iloc[0]
   v.head()

          cumulative_ordered      days_remaining
difference                           
0 days        1.000000              NaN
1 days        0.891324              inf
2 days        0.812534              inf
3 days        0.752339              inf
4 days        0.673745              inf

days_remaining changes to NaN and inf.. How can I keep it so it still gives me integers?

Upvotes: 0

Views: 49

Answers (1)

pjw
pjw

Reputation: 2335

The NaN and inf result from dividing by 0.0.

It seems that you are trying to apply the operation only to the cumulative_ordered column, so you should just run this for your last block of code:

m['cumulative_ordered'] = m['cumulative_ordered'] / m['cumulative_ordered'][m['cumulative_ordered'].index.days == 0]

Upvotes: 1

Related Questions