Ilumtics

Reputation: 107

Extract strings that follow changing time strings

I'm trying to extract the strings that follow the "dot" character (•) in a text file, but only for lines that match the pattern below, where the dot comes after a date and time:

09 May 2018 10:37AM • 6PR, Perth (Mornings)

The problem is that the date and time change on each of those lines, so the only common pattern is that AM or PM appears right before the "dot".

However, if I search for "AM" or "PM", the lines aren't recognized because "AM" and "PM" are attached to the time.

This is my current code:

for i, s in enumerate(open(file)):
    for words in ['PM', 'AM']:
        if re.findall(r'\b' + words + r'\b', s):
            source = s.split('•')[0]

Any idea how to get around this problem? Thank you.

Upvotes: 0

Views: 52

Answers (2)

BcK

Reputation: 2821

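The problem is your regex: \b needs a word boundary, but in "10:37AM" the "AM" comes straight after a digit, so there is no boundary before it and the pattern never matches. Match the digits together with AM/PM instead: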

import re

with open(file) as f:
    for s in f:
        # Two digits directly followed by AM or PM, e.g. "37AM"
        if re.search(r'\d{2}[AP]M', s):
            source = s.split('•')[0].strip()

# 09 May 2018 10:37AM
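
Since the question asks for the text after the bullet rather than before it, here is a minimal sketch of that variant, along with a quick check of why the original \b pattern fails. The helper name text_after_time and the sample line are only for illustration, and it assumes the bullet comes right after the time on every line of interest.

```python
import re

line = "09 May 2018 10:37AM • 6PR, Perth (Mornings)"

# \b only matches between a word character and a non-word character.
# In "10:37AM" both "7" and "A" are word characters, so there is no
# boundary in front of "AM" and the original pattern finds nothing.
print(bool(re.search(r'\bAM\b', line)))        # False
print(bool(re.search(r'\d{2}[AP]M\b', line)))  # True

# Hypothetical helper: capture everything after "<time>AM/PM •".
TIME_THEN_BULLET = re.compile(r'\d{1,2}:\d{2}[AP]M\s*•\s*(.*)')

def text_after_time(line):
    """Return the text following the bullet, or None if the line has no time stamp."""
    m = TIME_THEN_BULLET.search(line)
    return m.group(1).strip() if m else None

print(text_after_time(line))  # 6PR, Perth (Mornings)
```

Anchoring the match on the time plus the bullet means a line that merely contains "•" somewhere else is skipped.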

Upvotes: 1

Rakesh

Reputation: 82785

If you are trying to extract the datetime, try using a regex.

Example:

import re

s = "09 May 2018 10:37AM • 6PR, Perth (Mornings)"
# Raw string so the backslashes reach the regex engine unchanged
m = re.search(r"(?P<datetime>\d{2}\s+(January|February|March|April|May|June|July|August|September|October|November|December)\s+\d{4}\s+\d{2}:\d{2}(AM|PM))", s)
if m:
    print(m.group("datetime"))

Output:

09 May 2018 10:37AM
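
If you also need the part after the bullet, or a real datetime object, the same idea can be extended with a second named group and datetime.strptime. This is only a sketch: parse_line and the group name source are made up for illustration, and the %B and %p directives assume English month names and AM/PM in the current locale.

```python
import re
from datetime import datetime

# Named groups: the time stamp before the bullet and the text after it.
PATTERN = re.compile(
    r'(?P<datetime>\d{2}\s+[A-Za-z]+\s+\d{4}\s+\d{1,2}:\d{2}[AP]M)'
    r'\s*•\s*(?P<source>.+)'
)

def parse_line(line):
    """Return (datetime, source text), or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    try:
        # Format matches strings such as "09 May 2018 10:37AM"
        when = datetime.strptime(m.group('datetime'), '%d %B %Y %I:%M%p')
    except ValueError:
        # e.g. the word in the month position is not a month name
        return None
    return when, m.group('source').strip()

print(parse_line("09 May 2018 10:37AM • 6PR, Perth (Mornings)"))
```

Letting strptime validate the month keeps the regex short, at the cost of depending on the locale.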

Upvotes: 1
