Reputation: 127
So currently I've got the string
'JavaScript:doEdit('41228', '', '', 2);'
I'd like to use a regex in Python to extract ONLY the 41228. I've tried two methods and ran into issues with both. The first was trying to find things that aren't digits of length 5 by using
re.sub('^\d{5}', string )
Then I tried re.match and re.compile, both of which give me the error TypeError: unsupported operand type(s) for &: 'str' and 'int'.
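One plausible cause of that TypeError, assuming the subject string was passed as the second argument of re.compile (the question does not show the exact call): re.compile takes a pattern and optional flags, so a string in the second position is treated as flags. A minimal sketch of the working form, where the names compiled, found and at_start are only illustrative:

```python
import re

s = "JavaScript:doEdit('41228', '', '', 2);"

# re.compile(pattern, flags=0): only the pattern goes here, not the subject string.
compiled = re.compile(r"\d{5}")

# The subject string is passed to a method of the compiled pattern instead.
# search() scans the whole string; match() only tries at position 0.
m = compiled.search(s)
found = m.group() if m else None

# match() returns None here because the string starts with "JavaScript", not digits.
at_start = compiled.match(s)

print(found)
```

This also suggests why re.match is a poor fit here even when called correctly: it is anchored at the start of the string.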
The closest I've come is re.sub('\D', '', string), but that also keeps the extra 2 that I don't want.
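To illustrate why stripping non-digits keeps the unwanted 2, here is a small sketch using the string from the question. The variable names digits_only and first_five are assumptions made for the example:

```python
import re

s = "JavaScript:doEdit('41228', '', '', 2);"

# re.sub(pattern, replacement, string): remove every non-digit character.
# The 2 from the last argument of doEdit is a digit too, so it survives.
digits_only = re.sub(r"\D", "", s)
print(digits_only)

# Searching for the first run of five consecutive digits avoids that.
m = re.search(r"\d{5}", s)
first_five = m.group() if m else None
print(first_five)
```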
I guess it wouldn't be an issue to just take the 19th through 24th characters of the string, since the string's composition should never change when I generate a new id.
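The fixed-position idea can be sketched with a plain slice, assuming the prefix before the id never changes and the id is always five characters long. The isdigit check is an added safeguard, not something from the original post:

```python
s = "JavaScript:doEdit('41228', '', '', 2);"

# Indices 19..23 (the slice end 24 is exclusive) hold the five-digit id,
# because "JavaScript:doEdit('" is 19 characters long.
screen_id = s[19:24]

# Guard against a changed layout: fail loudly instead of returning garbage.
if not screen_id.isdigit():
    raise ValueError("unexpected string layout: " + s)

print(screen_id)
```

This breaks silently less often than an unchecked slice, but a regex or split is still more tolerant of ids with a different length.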
SOLVED: working code is
import re
screen_id = "JavaScript:doEdit('41228', '', '', 2);"
reduced_screen_id = re.search(r'\d{5}', screen_id)
print(reduced_screen_id.group())
Upvotes: 0
Views: 2952
Reputation: 626728
If your string is always in this format, you can simply split it on the ' character:
s = "JavaScript:doEdit('41228', '', '', 2);"
print(s.split("'")[1])
# => 41228
If you plan to study regex, you may use re.search with either the \d{5} or the doEdit\('(\d{5})' regex:
import re
s = "JavaScript:doEdit('41228', '', '', 2);"
res = re.search(r"\d{5}", s)
if res:
    print(res.group()) # Here, we need the whole match
res = re.search(r"doEdit\('(\d{5})'", s)
if res:
    print(res.group(1)) # Grabbing Group 1 value only
Since you need only the first match, there is no point in using re.findall.
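As a rough illustration of the difference, the sketch below runs both functions on a made-up string with two five-digit values. The string two_ids is hypothetical and not from the question:

```python
import re

# Hypothetical input containing two five-digit runs.
two_ids = "doEdit('41228', '99999')"

# re.search stops at the first match and returns a match object (or None).
first = re.search(r"\d{5}", two_ids)
first_value = first.group() if first else None

# re.findall scans the whole string and builds a list of every match.
all_values = re.findall(r"\d{5}", two_ids)

print(first_value)
print(all_values)
```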
Upvotes: 0
Reputation: 27273
Use re.findall to find all (non-overlapping) instances of the pattern in your string:
>>> import re
>>> string = "JavaScript:doEdit('41228', '', '', 2);"
>>> pattern = r'\d{5}' # 5 digits (raw string, so \d is not treated as a string escape)
>>> number = re.findall(pattern, string)[0]
>>> number
'41228'
You might then want to convert the "number" to an actual integer using number = int(number).
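Indexing the result of re.findall with [0] raises an IndexError when nothing matches, so a guarded version may be useful. This is only a sketch, and the helper name first_five_digit_number is made up for the example:

```python
import re

def first_five_digit_number(text):
    """Return the first run of five digits in text as an int, or None if absent."""
    matches = re.findall(r"\d{5}", text)
    # An empty list means no match; avoid indexing into it.
    return int(matches[0]) if matches else None

number = first_five_digit_number("JavaScript:doEdit('41228', '', '', 2);")
missing = first_five_digit_number("no digits here")
print(number, missing)
```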
Upvotes: 3