Neil Walker

Reputation: 6848

Regex fails to match in Python

With a given string: Surname,MM,Forename,JTA19 R <[email protected]>

I can match all the groups with this:

([A-Za-z]+),([A-Z]+),([A-Za-z]+),([A-Z0-9]+)\s([A-Z])\s<([A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4})

However, when I use the same pattern in Python, it never finds a match:

regex=re.compile(r"(?P<lastname>[A-Za-z]+),"
                 r"(?P<initials>[A-Z]+)"
                 r",(?P<firstname>[A-Za-z]+),"
                 r"(?P<ouc1>[A-Z0-9]+)\s"
                 r"(?P<ouc2>[A-Z])\s<"
                 r"(?P<email>[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4})"                       
                 )

I think I've narrowed it down to this part of the email group:

[A-Z0-9._%+-]

What is wrong?

Upvotes: 0

Views: 76

Answers (2)

Burhan Khalid

Reputation: 174624

Python joins adjacent string literals into a single string, so the compile method does receive one pattern. It is still easier to read and check when written as one whole regular expression in verbose mode:

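import re

# Raw string so that \s and \. reach the regex engine unchanged.
exp = r'''
         (?P<lastname>[A-Za-z]+),
         (?P<initials>[A-Z]+),
         (?P<firstname>[A-Za-z]+),
         (?P<ouc1>[A-Z0-9]+)\s
         (?P<ouc2>[A-Z])\s<
         (?P<email>[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4})'''

# re.VERBOSE ignores the whitespace and newlines used for layout.
regex = re.compile(exp, re.VERBOSE)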
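
As a quick check, here is a minimal sketch that runs a verbose pattern of this shape against a sample line and reads the named groups back with groupdict(). The address is a made-up placeholder, because the real one is obscured in the question, and re.IGNORECASE is an assumption added here so that a lowercase address can match.

```python
import re

# Verbose pattern: whitespace and newlines inside it are ignored.
pattern = re.compile(r"""
    (?P<lastname>[A-Za-z]+),
    (?P<initials>[A-Z]+),
    (?P<firstname>[A-Za-z]+),
    (?P<ouc1>[A-Z0-9]+)\s
    (?P<ouc2>[A-Z])\s<
    (?P<email>[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4})
    """, re.VERBOSE | re.IGNORECASE)  # IGNORECASE is an assumption, see above

# Placeholder address; the real one is obscured in the question.
sample = "Surname,MM,Forename,JTA19 R <forename.surname@example.com>"

match = pattern.match(sample)
fields = match.groupdict() if match else {}
print(fields)
```

Note that re.IGNORECASE applies to the whole pattern, so groups such as initials would also accept lowercase letters.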

That said, your string is just comma-separated, so splitting it might be a bit easier:

>>> s = "Surname,MM,Forename,JTA19 R <[email protected]>"
>>> lastname,initials,firstname,rest = s.split(',')
>>> ouc1,ouc2,email = rest.split(' ')
>>> lastname,initials,firstname,ouc1,ouc2,email[1:-1]
('Surname', 'MM', 'Forename', 'JTA19', 'R', '[email protected]')
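
The same splitting idea can be wrapped in a small function. This is only a sketch: parse_record is a hypothetical name, the address is a placeholder, and the code assumes every line has exactly the shape shown in the question. A line with a different number of fields raises ValueError at the unpacking step.

```python
def parse_record(line):
    """Split one 'Surname,MM,Forename,OUC1 OUC2 <email>' line into a dict."""
    # At most three splits: everything after the third comma stays together.
    lastname, initials, firstname, rest = line.split(",", 3)
    # Whitespace split of the tail; assumes the address contains no spaces.
    ouc1, ouc2, email = rest.split()
    return {
        "lastname": lastname,
        "initials": initials,
        "firstname": firstname,
        "ouc1": ouc1,
        "ouc2": ouc2,
        "email": email.strip("<>"),  # drop the surrounding angle brackets
    }


# Placeholder address; the real one is obscured in the question.
record = parse_record("Surname,MM,Forename,JTA19 R <forename.surname@example.com>")
print(record)
```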

Upvotes: 1

Ionut Hulub

Reputation: 6326

Replace

r"(?P<email>[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4})"

with

r"(?P<email>[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4})"

to allow for lowercase letters too.
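
To see the effect of case, the sketch below compares three variants of the email fragment on a lowercase address: the original uppercase-only classes, the widened classes, and the original classes compiled with re.IGNORECASE. The address is a made-up placeholder, and the re.IGNORECASE variant is an alternative that was not part of the suggestion above.

```python
import re

# Placeholder address; the one in the question is obscured.
address = "forename.surname@example.com"

# Original classes: uppercase letters only.
upper_only = re.compile(r"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}")
# Widened classes: lowercase letters added explicitly.
widened = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}")
# Original classes again, made case-insensitive with a flag.
flagged = re.compile(r"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}", re.IGNORECASE)

results = {
    "upper_only": upper_only.search(address) is not None,
    "widened": widened.search(address) is not None,
    "flagged": flagged.search(address) is not None,
}
print(results)
```

Only the uppercase-only variant fails to find the address. An online regex tester with a case-insensitive option switched on would hide this difference.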

Upvotes: 1
