Reputation: 47

How to extract timestamp from filename list in python and convert to Timestamp format?

Here is a list of filenames that contain timestamps. I need to loop through the list, extract only the timestamp from each filename, and convert it to a timestamp.

import re

s = ['Asbdnfe_20200404_000101.csv',
     'sdndvd_20200404_010202.csv',
     'vdfvdfvdfvd_20190303_030303.csv']

length = len(s)
for i in range(length):
    match = re.search(r"_((\d+)_(\d+))", s[i])
    print(match.group(1))

Result: 20200404_000101, 20200404_010202, 20190303_030303

But what I want is:

[2020-04-04 00:01:01.000,
2020-04-04 01:02:02.000,
2019-03-03 03:03:03.000]

Upvotes: 0

Answers (4)

paxton4416

Reputation: 565

Whenever you need to do the same thing to a bunch of similar inputs, look for a common pattern and start there. In this case, the pattern is pretty simple, so the regex is actually overkill.

import datetime as dt
from pathlib import Path

s = ['Asbdnfe_20200404_000101.csv',
     'sdndvd_20200404_010202.csv',
     'vdfvdfvdfvd_20190303_030303.csv']

datetimes = []
for filename in s:
    name = Path(filename).stem    # or os.path.splitext(filename)[0]
    timestamp_str = name[-15:]
    file_dt = dt.datetime.strptime(timestamp_str, '%Y%m%d_%H%M%S')
    datetimes.append(file_dt)

All your file names have the form <some_prefix>_<YYYYMMDD>_<HHMMSS>.csv. So no matter what <some_prefix> is, you can index the string from the right and pull out the date and time information the same way every time: the last 15 characters of the name without its extension. And as others have noted, once you have that string, datetime.strptime exists for exactly this purpose.

Even if you have a case where the inputs aren't as clean and regular as the few file names you posted, just look for a slightly more abstract pattern and write code around that.
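As one possible sketch of that idea: if some file names might lack a timestamp or contain digits that are not a real date, a stricter regex plus a guard keeps the loop from crashing. The helper name parse_filename_timestamp and the sample names in mixed are made up for illustration; they are not from the question.

```python
import re
from datetime import datetime

# Exactly 8 digits (date), an underscore, and 6 digits (time),
# sitting right before the file extension.
TIMESTAMP_RE = re.compile(r"_(\d{8}_\d{6})\.[^.]+$")

def parse_filename_timestamp(filename):
    """Return a datetime parsed from filename, or None if absent or invalid."""
    match = TIMESTAMP_RE.search(filename)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
    except ValueError:
        # The digits were there but do not form a real date/time (e.g. month 13).
        return None

# Hypothetical inputs: one good name, one without a timestamp, one with a bad date.
mixed = ["Asbdnfe_20200404_000101.csv", "notes.txt", "bad_20201399_000000.csv"]
parsed = [parse_filename_timestamp(name) for name in mixed]
```

Returning None for the names that do not fit lets the caller decide whether to skip them or report them.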

Upvotes: 1

Bincy

Reputation: 73

You can use datetime parsing and formatting as follows:

from datetime import datetime 
import re

s = ['Asbdnfe_20200404_000101.csv',
     'sdndvd_20200404_010202.csv',
     'vdfvdfvdfvd_20190303_030303.csv']

length = len(s)
for i in range(length):
    match = re.search(r"_((\d+)_(\d+))", s[i])
    #print(match.group(1))
    print(datetime.strptime(match.group(1), '%Y%m%d_%H%M%S').strftime('%Y-%m-%d %H:%M:%S.%f')[:-3])

You will get the output as

2020-04-04 00:01:01.000
2020-04-04 01:02:02.000
2019-03-03 03:03:03.000


Upvotes: 1

Nick

Reputation: 147146

You can use datetime.strptime to convert the extracted strings into datetime objects:

from datetime import datetime
import re

s = ['Asbdnfe_20200404_000101.csv','sdndvd_20200404_010202.csv','vdfvdfvdfvd_20190303_030303.csv']

for f in s:
    match = re.search(r"_((\d+)_(\d+))", f)
    d = datetime.strptime(match.group(1), '%Y%m%d_%H%M%S')
    print(d)

Output:

2020-04-04 00:01:01
2020-04-04 01:02:02
2019-03-03 03:03:03

If you want to print the dates with milliseconds, use datetime.strftime:

print(d.strftime('%Y-%m-%d %H:%M:%S.%f')[:-3])

The %f specifier prints microseconds, so we use [:-3] to strip it back to a millisecond value.
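To see the truncation with a non-zero fractional part, here is a small sketch. The sample value is invented for illustration, since the filenames in the question carry no sub-second information. On Python 3.6+, datetime.isoformat with timespec='milliseconds' is an alternative that avoids the slicing.

```python
from datetime import datetime

# Hypothetical value with 123456 microseconds, chosen only to show the truncation.
sample = datetime(2020, 4, 4, 0, 1, 1, 123456)

# %f yields six digits ("123456"); dropping the last three leaves milliseconds.
sliced = sample.strftime('%Y-%m-%d %H:%M:%S.%f')[:-3]

# isoformat truncates the fractional part to milliseconds as well.
iso = sample.isoformat(sep=' ', timespec='milliseconds')

print(sliced)  # 2020-04-04 00:01:01.123
print(iso)     # 2020-04-04 00:01:01.123
```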

To produce a list of results, just append them to a list rather than printing them:

d = []
for f in s:
    match = re.search(r"_((\d+)_(\d+))", f)
    dt = datetime.strptime(match.group(1), '%Y%m%d_%H%M%S')
    d.append(dt.strftime('%Y-%m-%d %H:%M:%S.%f')[:-3])
    
print(d)

Or you can use a list comprehension:

d = [datetime.strptime(re.search(r"_((\d+)_(\d+))", f).group(1), '%Y%m%d_%H%M%S').strftime('%Y-%m-%d %H:%M:%S.%f')[:-3] for f in s]

The output is the same:

['2020-04-04 00:01:01.000', '2020-04-04 01:02:02.000', '2019-03-03 03:03:03.000']

Upvotes: 5

Narendra Prasath

Reputation: 1531

You can use datetime.strptime to parse the string and strftime to format it:

from datetime import datetime
import re

s = ['Asbdnfe_20200404_000101.csv',
     'sdndvd_20200404_010202.csv',
     'vdfvdfvdfvd_20190303_030303.csv']

length = len(s)
for i in range(length):
    match = re.search(r"_((\d+)_(\d+))", s[i])
    time_str = match.group(1)
    print(datetime.strptime(time_str, "%Y%m%d_%H%M%S").strftime("%Y-%m-%d %H:%M:%S"))

Upvotes: 0
