Osca
Osca

Reputation: 1694

Subtract dates row by row if the ids are the same

I'd like to subtract dates if the next row's id is the same. I'm able to subtract dates, but stuck on creating the condition to check if the next row has the same id.

d = {'date':['2021-01', '2020-01', '2020-05', '2021-01'], 'id':['a', 'a', 'b', 'b']}
df = pd.DataFrame(data=d)
date    id
2021-01 a
2020-01 a
2020-05 b
2021-01 b

My code

df = df.sort_values(by=['id', 'date'])
df['date_diff'] = pd.to_datetime(df['date']) - pd.to_datetime(df['date'].shift())

result

date    id  date_diff
2020-01 a   NaT
2021-01 a   366 days
2020-05 b   -245 days
2021-01 b   245 days

Expected result should as below, which the dates only be subtracted when the ids are the same.

enter image description here

Upvotes: 1

Views: 527

Answers (2)

BENY
BENY

Reputation: 323236

Chain with groupby

df['date'] = pd.to_datetime(df['date'])
df['date_diff'] = df.groupby('id')['date'].diff()

Upvotes: 2

wwnde
wwnde

Reputation: 26676

df['date']=pd.to_datetime(df['date'])
df['date_diff']=df.groupby('id')['date'].diff()

Upvotes: 2

Related Questions