Andy Thompson

Reputation: 97

Possible bug in Python Regex... or maybe I have missed something - failure to match

So I have a string s:

'MakeMoney EURUSD,M1 2021.08.06-2021.08.10'

and a piece of code:

import re
pat1 = re.compile(r'''
(?P<robot>[^\s]+) 
\s 
(?P<asset>[^","]+)
"," 
(?P<tf>[^\s]+)
\s 
(?P<datefrom>[0-9]{4}[.][0-9]{2}[.][0-9]{2})
"-" 
(?P<dateto>[0-9]{4}[.][0-9]{2}[.][0-9]{2})
''', re.VERBOSE)

m = pat1.search(s)
m.groupdict()

which fails with the following error (ignore the line numbers; they come from other code that does not affect this):

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_32277/1686568363.py in <module>
     50 
     51 m = pat1.search(s)
---> 52 m.groupdict()
     53 

AttributeError: 'NoneType' object has no attribute 'groupdict'

What I am after is:

{'robot': 'MakeMoney', 'asset': 'EURUSD', 'tf': 'M1', 'datefrom': '2021.08.06', 'dateto': '2021.08.10'}

I cannot work out why it is not matching. I have tried regex101 and pythex (which seemed buggy) without finding a solution. What am I missing? Adding \A and \Z to the beginning and end of the pattern does not help.

The really annoying thing is that I am using the same method elsewhere with no issue, just with a different pattern and string structure.

Upvotes: 0

Views: 82

Answers (1)

Niel Godfrey P. Ponciano

Reputation: 10699

Your original regex has only 2 mistakes, namely:

  1. ","
  2. "-"

In the original string, there are no quotes:

MakeMoney EURUSD,M1 2021.08.06-2021.08.10
                ^             ^

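The quoted pattern would only match if the string itself contained those quotes, like this:

MakeMoney EURUSD","M1 2021.08.06"-"2021.08.10
                ^^^             ^^^

Since it does not, remove the quotes from the pattern. With re.VERBOSE, whitespace in the pattern is ignored, but quote characters are still matched literally.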

...
pat1 = re.compile(r'''
(?P<robot>[^\s]+) 
\s 
(?P<asset>[^","]+)
, 
(?P<tf>[^\s]+)
\s 
(?P<datefrom>[0-9]{4}[.][0-9]{2}[.][0-9]{2})
- 
(?P<dateto>[0-9]{4}[.][0-9]{2}[.][0-9]{2})
''', re.VERBOSE)
...

With that change, the pattern matches and m.groupdict() returns the dictionary you are after.
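For completeness, here is a self-contained sketch of the corrected pattern run against the string from the question. The only changes are the unquoted `,` and `-`, and the character class written as `[^",]` (equivalent to the original `[^","]`, since repeating a character inside a class has no effect). The variable name `result` is mine, not from the question.

```python
import re

s = 'MakeMoney EURUSD,M1 2021.08.06-2021.08.10'

# In VERBOSE mode, whitespace and newlines in the pattern are ignored,
# but every other character (including a quote) is matched literally.
pat1 = re.compile(r'''
(?P<robot>[^\s]+)
\s
(?P<asset>[^",]+)
,
(?P<tf>[^\s]+)
\s
(?P<datefrom>[0-9]{4}[.][0-9]{2}[.][0-9]{2})
-
(?P<dateto>[0-9]{4}[.][0-9]{2}[.][0-9]{2})
''', re.VERBOSE)

m = pat1.search(s)
# search() returns None when nothing matches, so guard before using it.
result = m.groupdict() if m else None
print(result)
```

Checking `m` for None before calling `groupdict()` avoids the AttributeError from the question if a future input does not match.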

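Side note: If your worry is escaping characters that are special in a regex, note that quotes are not an escaping mechanism; they are matched literally. To match a literal period, use a backslash (\.) or a character class ([.], as you already do in the date groups), but not quotes ("."). Neither , nor - is special outside a character class, so they need no escaping here.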
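A quick way to check how each spelling of a literal period behaves is to test them with `re.fullmatch`. This is only an illustrative sketch; the `checks` dictionary and its keys are names I made up for the demonstration.

```python
import re

checks = {
    # A backslash escapes the period, so it matches only a literal '.'.
    'backslash_matches_dot': bool(re.fullmatch(r'\.', '.')),
    'backslash_matches_x': bool(re.fullmatch(r'\.', 'x')),
    # Inside a character class the period is already literal.
    'class_matches_dot': bool(re.fullmatch(r'[.]', '.')),
    'class_matches_x': bool(re.fullmatch(r'[.]', 'x')),
    # Quotes are ordinary characters: this pattern means
    # quote, any character, quote.
    'quoted_matches_dot': bool(re.fullmatch(r'"."', '.')),
    'quoted_matches_quoted_x': bool(re.fullmatch(r'"."', '"x"')),
}

for name, outcome in checks.items():
    print(name, outcome)
```

The two escaped forms accept only `.`, while the quoted form requires the quote characters to be present in the input.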

Upvotes: 1
