Reputation: 338

Obtain number of rows with the same ID but with an earlier date

I have a dataframe like this:

ID	Date
01	2020-01-02
01	2020-01-03
02	2020-01-02

I need to create a new column, that for each specific ID and Date, gives me the number of rows that have the same ID but of an earlier date.

So the output of the example df will look like this

ID	Date	Count
01	2020-01-02	0
01	2020-01-03	1
02	2020-01-02	0

I have tried working with aux tables, and also with group by using a lambda function, but with no real idea how to continue

Upvotes: 0

Answers (3)

Sanjay Pant

Reputation: 150

This will create a new column with the count.

df['Date'] = pd.to_datetime(df['Date'])
df['Count'] = df.groupby('ID')['Date'].rank(ascending=True).astype(int) - 1

Upvotes: 5

BENY

Reputation: 323226

Let us try factorize

df['new'] = df.sort_values('Date').groupby('ID')['Date'].transform(lambda x : x.factorize()[0])
df
   ID        Date  new
0   1  2020-01-02    0
1   1  2020-01-03    1
2   2  2020-01-02    0

Upvotes: 0

Cristobal Ignacio

Reputation: 45

First you need to be sure that you are comparing dates.

df["Date"] = pd.to_datetime(df['Date'], format="%Y-%m-%d")

Then you can create new column called 'Count' iterating over each row using df.apply.

def count_earlier_dates(row):
    return df[df['Date'] < row['Date']].count()['ID']

df['Count'] = df.apply(lambda row: count_earlier_dates(row), axis=1)

Upvotes: 2

Obtain number of rows with the same ID but with an earlier date

Answers (3)

Related Questions