dorien
dorien

Reputation: 5387

How to select the next date in a Pandas dataframe with date index and missing dates

I have a dataframe indexed on date with some dates missing (that's ok, they are non-trading data and this is stock data).

How do I access the next row, when I know the previous date, e.g.

date  Open
01-01-2021  501
02-01-2021  508
04-01-2021  511
05-01-2021  518

I would like a function that, when I input '02-01-2021', it outputs the values for 04-01-2021 (without knowing how many days in between might be missing. I am assuming there may be some iterator, or index number I can access?

Upvotes: 1

Views: 2408

Answers (2)

Dani Mesejo
Dani Mesejo

Reputation: 61900

If the rows are sorted by date, use searchsorted:

idx = df['date'].searchsorted(pd.to_datetime('02-01-2021'), side='right')
print(df.loc[idx, 'date'])

Output

2021-04-01 00:00:00

The time complexity of searchsorted is O(logN). Otherwise, use:

idx = df['date'].gt(pd.to_datetime('02-01-2021')).idxmax()
print(df.loc[idx, 'date'])

Output

2021-04-01 00:00:00

A third alternative is to use next:

date = pd.to_datetime('02-01-2021')
idx = next(i for i, x in zip(df.index, df['date']) if x > date)
print(df.loc[idx, 'date'])

Although it needs benchmarking, the last alternative could be faster for unordered data according to this answer.

Upvotes: 3

anky
anky

Reputation: 75080

You can do a compare with the date column and return head(1) for the immediate next row:

inp = '02-01-2021'
df[df['date']>'02-01-2021'].head(1)

        date  Open
2 2021-04-01   511

Note: this is assuming date is a column and the dtype is a datetime and the date column is sorted by date.

Upvotes: 3

Related Questions