Reputation: 117
I'm having a hard time grasping regex no matter how much documentation I read. I'm trying to match everything between a given string and the first occurrence of &.
This is what I have:
link = "group.do?sys_id=69adb887157e450051e85118b6ff533c&&"
rex = re.compile("group\.do\?sys_id=(.?)&")
sysid = rex.search(link).groups()[0]
I'm using https://regex101.com/#python to help me validate my regex, and I can kind of get rex = re.compile("user_group.do?sys_id=(.*)&") to work, but the .* is greedy and matches up to the last &, whereas I want to match up to the first &.
I thought .? matches zero to one time.
Upvotes: 2
Views: 142
Reputation: 21
.* is greedy, but .*? is not (it is lazy).
.? matches any single character zero or one time, while .*? matches as few characters as possible, so it stops at the earliest point where the rest of the pattern can match. I hope that explains it.
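A minimal sketch of the difference, using the link from the question. The variable names greedy, lazy and optional are only for illustration:

```python
import re

link = "group.do?sys_id=69adb887157e450051e85118b6ff533c&&"

# Greedy: .* takes as much as it can, so the capture runs up to the last &
greedy = re.search(r"group\.do\?sys_id=(.*)&", link).group(1)

# Lazy: .*? takes as little as it can, so the capture stops at the first &
lazy = re.search(r"group\.do\?sys_id=(.*?)&", link).group(1)

# .? allows at most one character before the &, so this pattern finds no match here
optional = re.search(r"group\.do\?sys_id=(.?)&", link)

print(greedy)
print(lazy)
print(optional)
```

With this input the greedy capture should still include one trailing &, the lazy capture should be just the id, and the .? pattern should return None, which is why calling .groups() on it fails.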
Upvotes: 2
Reputation: 1667
You can simply match up to the && instead of the final &, like so:
import re
link = "user_group.do?sys_id=69adb887157e450051e85118b6ff533c&&"
rex = re.compile("user_group\.do\?sys_id=(.*)&&")
sysid = rex.search(link).groups()[0]
print(sysid)
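If the link does not always end in &&, a different technique can be used: a negated character class, which is not what the snippet above does. This is only a sketch, assuming the id itself never contains an &:

```python
import re

link = "user_group.do?sys_id=69adb887157e450051e85118b6ff533c&&"

# [^&]* matches a run of characters that are not &, so the capture cannot pass the first &
rex = re.compile(r"user_group\.do\?sys_id=([^&]*)&")
sysid = rex.search(link).group(1)
print(sysid)
```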
Upvotes: 2
Reputation: 474161
You don't necessarily need regular expressions here. Use urlparse
instead:
>>> from urlparse import urlparse, parse_qs
>>> parse_qs(urlparse(link).query)['sys_id'][0]
'69adb887157e450051e85118b6ff533c'
For Python 3, change the import to:
from urllib.parse import urlparse, parse_qs
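Put together for Python 3, this might look like the following sketch. It assumes link is the string from the question; the names query and params are only for illustration:

```python
from urllib.parse import urlparse, parse_qs

link = "group.do?sys_id=69adb887157e450051e85118b6ff533c&&"

# urlparse splits off the part after the ?, parse_qs turns it into a dict of lists
query = urlparse(link).query
params = parse_qs(query)

# Each key maps to a list of values, so take the first one
sysid = params["sys_id"][0]
print(sysid)
```

By default parse_qs skips the empty fields produced by the trailing &&, so they do not need to be stripped first.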
Upvotes: 7