Reputation: 43
Recently I received some data with epoch timestamps. After converting them to datetimes with pandas, I noticed that the year returned is 1970, but the data is videogame stats from around 2018.
I tried
df['date'] = pd.to_datetime(df.creationTime, infer_datetime_format=True)
df['date'].describe()
count 51490
unique 51052
top 1970-01-01 00:25:04.380431622
freq 3
first 1970-01-01 00:24:56.891694922
last 1970-01-01 00:25:04.707332198
Name: date, dtype: object
The provider says the time unit is seconds, but, for example, for
1504279457970
pd.to_datetime(1504279457970, infer_datetime_format=True)
Timestamp('1970-01-01 00:25:04.279457970')
and
pd.to_datetime(1504279457970, unit = 's')
...
OutOfBoundsDatetime: cannot convert input with unit 's'
Am I doing something wrong?
I'm new to Python, so I don't know if I'm being naive.
Thanks!
Upvotes: 1
Views: 1339
Reputation: 308
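It is likely that the timestamp was given to you in millisecond precision. As you've shown, trying to convert the timestamp to a datetime using second precision results in the OutOfBoundsDatetime error: 1504279457970 seconds is roughly 47,000 years after 1970, far past the year 2262 where nanosecond-resolution pandas timestamps end. If you assume that the timestamp has millisecond precision, you get a date in 2017, which is much more plausible.
When no unit is given, pandas treats integer input as nanoseconds (the default unit is 'ns'), and 1504279457970 ns is only about 25 minutes after the 1970 epoch, which is the result you saw. As far as I know, infer_datetime_format=True only affects how strings are parsed, so it makes no difference for integer input.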
>>> pd.to_datetime(1504279457970, unit = 's')
Traceback (most recent call last):
...
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: cannot convert input with unit 's'
>>> pd.to_datetime(1504279457970, unit = 'ms')
Timestamp('2017-09-01 15:24:17.970000')
>>> pd.to_datetime(1504279457970, unit = 'ns')
Timestamp('1970-01-01 00:25:04.279457970')
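To apply the same conversion to a whole column, something along these lines should work. This is only a sketch: the column name creationTime is taken from the question, the two sample values are made up for illustration, and it assumes every value in the column is in epoch milliseconds.

```python
import pandas as pd

# Hypothetical sample data; the real frame would come from the provider.
# Values are assumed to be milliseconds since 1970-01-01 00:00:00 UTC.
df = pd.DataFrame({"creationTime": [1504279457970, 1504279458970]})

# Tell pandas explicitly that the integers are milliseconds,
# instead of letting it fall back to the nanosecond default.
df["date"] = pd.to_datetime(df["creationTime"], unit="ms")

# Quick sanity check: the years should match what you expect from the data.
print(df["date"].dt.year.unique())
print(df)
```

If the years printed by the sanity check still look wrong, it is worth asking the provider again which unit they actually use.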
Upvotes: 1