anguyen1210

Reputation: 543

How to move the timestamp bounds for datetime in pandas (working with historical data)?

I'm working with historical data and have some very old dates that fall outside the timestamp bounds for pandas. I've consulted the pandas Time series/date functionality documentation, which has some information on out-of-bounds spans, but it still wasn't clear to me what, if anything, I could do to convert my data into a datetime type.

I've also seen a few threads on Stack Overflow about this, but they either just point out the problem (timestamps are stored in nanoseconds, so the maximum range is roughly 584 years) or suggest setting errors='coerce', which turns 80% of my data into NaT.
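For reference, the range limit mentioned above can be estimated with plain arithmetic. This is only a rough sketch, assuming the default nanosecond resolution, where a timestamp is a signed 64-bit count of nanoseconds relative to 1970-01-01; the constant names are illustrative and not part of pandas.

```python
# Assumption: timestamps are a signed 64-bit count of nanoseconds
# relative to 1970-01-01 (the default datetime64[ns] resolution).
MAX_NS = 2**63 - 1
NS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365.2425 * 24 * 3600  # average Gregorian year

# How many years fit on each side of 1970
years_each_side = MAX_NS / NS_PER_SECOND / SECONDS_PER_YEAR
earliest_year = int(1970 - years_each_side)
latest_year = int(1970 + years_each_side)

print(round(years_each_side, 1))   # 292.3
print(earliest_year, latest_year)  # 1677 2262
```

Every date in the sample below is from 1678 or earlier, and most fall before late 1677, which would explain why so many values get coerced to NaT.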

Is it possible to convert dates earlier than the default pandas lower bound into a date type? Here's a sample of my data:

import pandas as pd

df = pd.DataFrame({'id': ['836', '655', '508', '793', '970', '1075', '1119', '969', '1166', '893'], 
                   'date': ['1671-11-25', '1669-11-22', '1666-05-15','1673-01-18','1675-05-07','1677-02-08','1678-02-08', '1675-02-15', '1678-11-28', '1673-12-23']})

Upvotes: 6

Views: 615

Answers (1)

jezrael

Reputation: 863166

You can create daily periods with a lambda function:

df['date'] = df['date'].apply(lambda x: pd.Period(x, freq='D'))

Or, as @Erfan mentioned in a comment (thank you):

df['date'] = df['date'].apply(pd.Period)

print(df)
     id        date
0   836  1671-11-25
1   655  1669-11-22
2   508  1666-05-15
3   793  1673-01-18
4   970  1675-05-07
5  1075  1677-02-08
6  1119  1678-02-08
7   969  1675-02-15
8  1166  1678-11-28
9   893  1673-12-23
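As a possible follow-up, here is a minimal sketch of working with the converted column, using a three-row subset of the question's data. It assumes only that each value is a pd.Period, as produced by the apply call above; the `years` variable is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': ['836', '655', '508'],
                   'date': ['1671-11-25', '1669-11-22', '1666-05-15']})

# Convert each string to a daily Period, as in the answer above
df['date'] = df['date'].apply(lambda x: pd.Period(x, freq='D'))

# Period values compare chronologically, so sorting works
df = df.sort_values('date')

# Each Period exposes calendar fields such as .year
years = [p.year for p in df['date']]
print(years)
```

Reading the fields from the individual Period objects keeps the sketch independent of whether the column ends up with a period dtype or an object dtype.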

Upvotes: 3
