Reputation: 11
I have a pandas DataFrame that looks like this:
The customer number is unique to each customer, but it repeats if the customer visits again. I want to group by customer number and then, within each group, find the duration between visits.
So far I have this:
df['Date'] = pd.to_datetime(df['Date'], format='%d %b %y')
grouped = df.groupby('Customer no')
My question is: how do I iterate over the grouped rows and find the time (in days) between subsequent visits?
Upvotes: 1
Views: 77
Reputation: 862681
I think you need groupby with diff:
print (df.groupby('Customer no')['Date'].diff())
13 NaT
22 0 days
26 0 days
Name: Date, dtype: timedelta64[ns]
# if you need the differences as a numeric number of days (requires import numpy as np)
print (df.groupby('Customer no')['Date'].diff() / np.timedelta64(1, 'D'))
13 NaN
22 0.0
26 0.0
Name: Date, dtype: float64
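For reference, here is a minimal self-contained sketch of the same idea. The sample values and the `days_since_last` column name are made up for illustration; only the `Customer no` and `Date` column names come from the question. It also sorts by customer and date first, which is an assumption on my part: `diff()` works in row order, so unsorted visits would give misleading gaps.

```python
import pandas as pd

# Hypothetical sample data: column names follow the question,
# the values are invented for illustration only.
df = pd.DataFrame({
    "Customer no": [101, 102, 101, 102, 101],
    "Date": ["01 Jan 17", "03 Jan 17", "05 Jan 17", "03 Jan 17", "15 Jan 17"],
})
df["Date"] = pd.to_datetime(df["Date"], format="%d %b %y")

# Sort so that diff() compares each visit with the previous one in time.
df = df.sort_values(["Customer no", "Date"])

# Days since the same customer's previous visit; NaN marks a first visit.
# "days_since_last" is a hypothetical column name.
df["days_since_last"] = df.groupby("Customer no")["Date"].diff().dt.days

print(df)
```

No explicit loop over the groups is needed, since `diff()` is applied within each group. `.dt.days` gives whole days; dividing by `np.timedelta64(1, 'D')` as above keeps fractional days if the timestamps include a time of day.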
Upvotes: 1