Taimur Islam
Taimur Islam

Reputation: 990

How can I groupby and calculate the date difference from a dataframe?

I have a dataframe like

df = id -    date
     ps1  - 2020-01-2
     ps1  - 2020-01-3
     ps2  - 2020-01-3
     ps2  - 2020-01-5

I would like to get output like

output = id - date_diff
         ps1  - 1
         ps2  - 2

How can i group the ids and calculate the data differences ?

Upvotes: 0

Views: 32

Answers (2)

wwnde
wwnde

Reputation: 26676

convert date to datetime, diff and drop all the nulls.

df["date"] = pd.to_datetime(df["date"])
    df["date_diff"] = df.groupby("id")["date"].diff().astype('timedelta64[D]')
    df[df.date_diff.notnull()]

Upvotes: 2

Henry Yik
Henry Yik

Reputation: 22503

First coerce your date column to datetime, and then groupby and diff:

df["date"] = pd.to_datetime(df["date"])
df["date_diff"] = df.groupby("id")["date"].diff()
print (df.drop_duplicates("id",keep="last"))

    id       date date_diff
1  ps1 2020-01-03    1 days
3  ps2 2020-01-05    2 days

Upvotes: 2

Related Questions