atr1

Reputation: 25

How to grab dates from a string of conjoined dates

This is the string I am dealing with: '5Nov20217Dec202110Jan2022'

The string could also be:

'5Nov2021 7Dec2021 10Jan2022'

I would like to obtain a list like:

['5Nov2021','7Dec2021','10Jan2022']

I am currently trying a regex, but to no avail:

re.findall('^\d{1,2}[a-zA-Z]{3}\d{4}$','5Nov20217Dec202110Jan2022')

A regex solution is not a must.

Upvotes: 2

Views: 55

Answers (1)

Ajax1234

Reputation: 71461

Based on the variability of your input, I suggest combining re with string slicing in a while loop:

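import re

def extract_dates(d):
    while d:
        # Try a four-digit year first.
        if (m := re.match(r'\d{1,2}[a-zA-Z]{3}\d{4}', d)):
            rest = d[m.end():]
            # Reject the match only if a letter follows: in that case the last
            # two "year" digits are the day of the next date ('7Dec21' + '10Jan22').
            if not rest or not rest[0].isalpha():
                yield m.group()
                d = rest
                continue
        # Otherwise fall back to a two-digit year.
        if (m := re.match(r'\d{1,2}[a-zA-Z]{3}\d{2}', d)):
            yield m.group()
            d = d[m.end():]
        else:
            # No date starts here; skip one character (e.g. a space).
            d = d[1:]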

dates = ['5Nov20217Dec202110Jan2022', '5Nov217Dec2110Jan22', '5Nov21 7Dec21 10Jan22']
results = [list(extract_dates(i)) for i in dates]

Output:

[['5Nov2021', '7Dec2021', '10Jan2022'], ['5Nov21', '7Dec21', '10Jan22'], ['5Nov21', '7Dec21', '10Jan22']]
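For comparison, the same idea can probably be expressed as a single pattern. The attempt in the question returns an empty list because `^` and `$` require the whole string to be one date. The sketch below drops the anchors and uses a negative lookahead so that a four-digit year is only accepted when no letter follows it. The names `DATE_RE` and `find_dates` are hypothetical, and it assumes days are one or two digits, months are three letters, and years are two or four digits:

```python
import re

# Hypothetical names, for illustration only.
# \d{4}(?![A-Za-z]) : four-digit year, unless a letter follows (then the
#                     last two digits belong to the next date's day)
# \d{2}             : fallback to a two-digit year
DATE_RE = re.compile(r'\d{1,2}[A-Za-z]{3}(?:\d{4}(?![A-Za-z])|\d{2})')

def find_dates(s):
    """Return every date-like token found in s, left to right."""
    return DATE_RE.findall(s)

print(find_dates('5Nov20217Dec202110Jan2022'))    # ['5Nov2021', '7Dec2021', '10Jan2022']
print(find_dates('5Nov2021 7Dec2021 10Jan2022'))  # ['5Nov2021', '7Dec2021', '10Jan2022']
print(find_dates('5Nov217Dec2110Jan22'))          # ['5Nov21', '7Dec21', '10Jan22']
```

Some inputs are inherently ambiguous (a two-digit year followed by a two-digit day looks like a four-digit year until the month letters appear), so this should be checked against real data before relying on it.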

Upvotes: 4
