Ayy

Reputation: 21

timestamp parsing

I have a huge dataset and I am trying to parse a timestamp column into the format YYYY-MM-DD HH:MM:SS:DEC, but the values in the dataframe look like this: 210309_131046520_11.

So I want to split that column into the following columns:

date,  day,  month,  year, dec,  hh,  mm, ss

eg: 210309_131046520_11

date 21-03-09
day 09
month 03
year 2021
dec 520
hh 13
mm 10
ss 46

I tried using the datetime module, but nothing I tried worked. Can someone help?

Example of my timestamp column:

0      210309_131046520_11.sgy
1      210309_131046520_05.sgy
2      210309_131046528_02.sgy
3      210309_131046528_12.sgy
4      210309_131049712_07.sgy
...
162    210309_132658584_07.sgy
163    210309_132659632_10.sgy
164    210309_132701232_05.sgy
165    210309_132704256_02.sgy
166    210309_132707136_12.sgy

Example of my date column:

0      131046520
1      131046520
2      131046528
3      131046528
4      131049712
...
162    132658584
163    132659632
164    132701232
165    132704256
166    132707136

One thing I attempted:

Using split:

s_df["date"] = s_df["timestamp"].apply(lambda row: row.split("_")[1])

Output:

TypeError: list indices must be integers or slices, not str

Upvotes: 2

Views: 541

Answers (1)

rgk

Reputation: 1015

You can use strptime from the standard library's datetime module to parse your timestamps:

from datetime import datetime as dt
dt.strptime('210309_131045620', '%y%m%d_%H%M%S%f')

Out: datetime.datetime(2021, 3, 9, 13, 10, 45, 620000)
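The values in the question carry a trailing channel number and extension (for example "_11.sgy"), which the format string above does not cover. A minimal sketch, assuming every name starts with the date and time fields separated by an underscore; parse_name is a hypothetical helper, not part of any library:

```python
from datetime import datetime


def parse_name(name: str) -> datetime:
    """Parse the leading 'YYMMDD_HHMMSSmmm' part of a file name.

    Assumption: the first two underscore-separated fields are the date
    and the time with milliseconds; anything after them is ignored.
    """
    date_part, time_part = name.split("_")[:2]
    return datetime.strptime(f"{date_part}_{time_part}", "%y%m%d_%H%M%S%f")


ts = parse_name("210309_131046520_11.sgy")

# Individual fields as requested in the question
year, month, day = ts.year, ts.month, ts.day
hh, mm, ss = ts.hour, ts.minute, ts.second
dec = ts.microsecond // 1000  # milliseconds
```

Note that %f reads the three trailing digits as a fraction of a second, so 520 becomes 520000 microseconds; dividing by 1000 recovers the original value.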

In a pandas DataFrame, the built-in pd.to_datetime does this more efficiently and accepts the same format syntax:

import pandas as pd
pd.to_datetime(df['date'], format='%y%m%d_%H%M%S%f')

For details about customizing the format string, see the strftime reference: https://strftime.org
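To get the separate columns the question asks for, one option is to extract the leading date and time fields first and then read the components off the .dt accessor. This is a sketch under assumptions: the sample frame below is made up from two rows of the question, and the regular expression assumes six date digits, an underscore, and nine time digits at the start of each value. The TypeError in the question suggests s_df may have been a list rather than a DataFrame at that point, so it is worth checking type(s_df) first.

```python
import pandas as pd

# Hypothetical sample built from values shown in the question
s_df = pd.DataFrame({"timestamp": [
    "210309_131046520_11.sgy",
    "210309_132707136_12.sgy",
]})

# Keep only "YYMMDD_HHMMSSmmm"; the trailing "_NN.sgy" is ignored
stamp = s_df["timestamp"].str.extract(r"^(\d{6}_\d{9})", expand=False)
parsed = pd.to_datetime(stamp, format="%y%m%d_%H%M%S%f")

s_df["date"] = parsed.dt.strftime("%y-%m-%d")
s_df["day"] = parsed.dt.day
s_df["month"] = parsed.dt.month
s_df["year"] = parsed.dt.year
s_df["dec"] = parsed.dt.microsecond // 1000  # milliseconds
s_df["hh"] = parsed.dt.hour
s_df["mm"] = parsed.dt.minute
s_df["ss"] = parsed.dt.second
```

Keeping the parsed datetime Series around (here as parsed) is usually more convenient than the split columns, since sorting and time arithmetic work on it directly.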

Upvotes: 3
