roschach

Reputation: 9396

pandas convert from datetime to integer timestamp

Given a pandas DataFrame in Python with an integer column named time, I can convert that column to datetime with the following instruction:

df['time'] = pandas.to_datetime(df['time'], unit='s')

Now the column has entries like 2019-01-15 13:25:43.

What is the command to convert the datetime back to an integer timestamp (the number of seconds elapsed since 1970-01-01 00:00:00)?

I checked pandas.Timestamp but could not find a conversion utility and I was not able to use pandas.to_timedelta for this.

Is there any utility for this conversion?

Upvotes: 50

Views: 147155

Answers (5)

Ignacio Peletier

Reputation: 2206

The easiest and fastest way is to use .view(int):

df['time'] = df['time'].view(int) // 10**9

Other options:

df['time'] = df['time'].apply(lambda x: x.value) // 10**9
df['time'] = df['time'].astype(int) // 10**9

Using %%timeit on 1000 dates I measured:

  • .view: 119 µs ± 998 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
  • .astype: 129 µs ± 676 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
  • .apply: 629 µs ± 5.38 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
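For anyone who wants to reproduce the comparison outside a notebook, here is a minimal sketch using the standard timeit module. The helper names (via_astype, via_apply) and the loop count are my own choices, and the input is pinned to datetime64[ns] so that dividing by 10**9 is valid. .view is left out only because some pandas versions warn on it. Absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

# 1000 consecutive seconds, built the same way as in the question
# (integers -> datetime), then pinned to nanosecond resolution.
start = 1547558743  # 2019-01-15 13:25:43 UTC
s = pd.to_datetime(pd.Series(range(start, start + 1000)), unit="s")
s = s.astype("datetime64[ns]")


def via_astype(series):
    # Reinterpret nanoseconds since the epoch as int64, then floor to seconds
    return series.astype("int64") // 10**9


def via_apply(series):
    # Timestamp.value is nanoseconds since the epoch; runs element by element
    return series.apply(lambda ts: ts.value) // 10**9


for name, func in [("astype", via_astype), ("apply", via_apply)]:
    total = timeit.timeit(lambda: func(s), number=200)
    print(f"{name}: {total / 200 * 1e6:.1f} us per call")
```

Both helpers return the same integer seconds, so the only thing being compared is speed.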


Upvotes: 41

Grigory Sizov

Reputation: 99

One can also use .view(...):

import pandas as pd
df = pd.DataFrame({'time': [pd.to_datetime('2019-01-15 13:25:43')]})
df_unix_sec = pd.to_datetime(df['time']).view(int) // 10 ** 9
print(df_unix_sec)

Casting with .astype(int), recommended in other answers, was deprecated in pandas 1.3.0 and emits a warning:

FutureWarning: casting datetime64[ns] values to int64 with .astype(...) is deprecated and will raise in a future version. Use .view(...) instead.
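Since the recommended cast has shifted between pandas releases (the warning above is from 1.3, and as far as I know Series.view itself is deprecated from pandas 2.2 on), a sketch that avoids casting altogether may be useful: subtract the epoch and floor-divide by a one-second Timedelta. The function name to_unix_seconds is hypothetical, and the input is assumed to be timezone-naive and free of NaT.

```python
import pandas as pd


def to_unix_seconds(series):
    """Whole seconds since 1970-01-01 for a timezone-naive datetime Series."""
    # Hypothetical helper: datetime - Timestamp gives a timedelta Series,
    # and floor-dividing that by a Timedelta gives plain integers.
    return (series - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")


df = pd.DataFrame({"time": [pd.to_datetime("2019-01-15 13:25:43")]})
df_unix_sec = to_unix_seconds(df["time"])
print(df_unix_sec)
```

Because no dtype is reinterpreted, this does not depend on the column being stored in nanoseconds.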

Upvotes: 9

Jared M

Reputation: 977

As @Ignacio recommends, this is what I am using to cast to an integer (x.value is the number of nanoseconds since the epoch, not seconds):

df['time'] = df['time'].apply(lambda x: x.value)

Then, to get it back:

df['time'] = df['time'].apply(pd.Timestamp)
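A vectorized way back, as a sketch: pd.to_datetime accepts a unit argument, so the same integers (or a whole-second version of them) can be converted without apply. The variable names and the two sample timestamps are illustrative only.

```python
import pandas as pd

original = pd.Series(
    pd.to_datetime(["2019-01-15 13:25:43", "2019-01-15 13:25:44"])
)

# Timestamp.value is the number of nanoseconds since the Unix epoch
as_ns = original.apply(lambda ts: ts.value)

# Inverse: tell to_datetime that the integers are nanoseconds
restored = pd.to_datetime(as_ns, unit="ns")

# The same round trip in whole seconds, as asked in the question
as_seconds = as_ns // 10**9
restored_from_seconds = pd.to_datetime(as_seconds, unit="s")
```

The unit passed to pd.to_datetime has to match the unit of the stored integers, otherwise the dates come back wrong by a factor of 10**9.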

Upvotes: 3

ALollz

Reputation: 59579

Use .dt.total_seconds() on a timedelta64:

import pandas as pd
df = pd.DataFrame({'time': [pd.to_datetime('2019-01-15 13:25:43')]})

# pd.to_timedelta(df.time).dt.total_seconds() # Is deprecated
(df.time - pd.to_datetime('1970-01-01')).dt.total_seconds()

Output

0    1.547559e+09
Name: time, dtype: float64
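If an integer result is needed and the column may contain missing timestamps, one option (a sketch, not the only way) is to cast the float seconds to pandas' nullable Int64 dtype, which keeps NaT as <NA>. This assumes whole-second timestamps; fractional seconds would need rounding first. The sample data with a missing entry is made up.

```python
import pandas as pd

# Made-up sample: one real timestamp and one missing value
s = pd.Series(pd.to_datetime(["2019-01-15 13:25:43", None]))
epoch = pd.Timestamp("1970-01-01")

# Float seconds, as in the answer; NaT becomes NaN
float_seconds = (s - epoch).dt.total_seconds()

# Nullable integer dtype: whole numbers stay exact, NaN becomes <NA>
int_seconds = float_seconds.astype("Int64")
print(int_seconds)
```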

Upvotes: 9

A l w a y s S u n n y

Reputation: 38552

You can cast to int using astype(int) and floor-divide by 10**9 to get the number of whole seconds since the Unix epoch.

import pandas as pd
df = pd.DataFrame({'time': [pd.to_datetime('2019-01-15 13:25:43')]})
df_unix_sec = pd.to_datetime(df['time']).astype(int) // 10**9
print(df_unix_sec)

Upvotes: 45
