Reputation: 115
I have a string like this:
var hours_tdate = ['22','23','<span style="color:#1d953f;">0</span>','<span style="color:#1d953f;">1</span>'];
This is part of a JS file. I want to use a regex to extract the numbers from the string above and get output like this:
[22,23,0,1]
I have tried:
re.findall('var hours_tdate = \[(.*)\];', string)
And it gives me:
'22','23','<span style="color:#1d953f;">0</span>','<span style="color:#1d953f;">1</span>'
I don't understand why I get no match when I try:
re.findall('var hours_tdate = \[(\d*)\];', string)
Upvotes: 2
Views: 74
Reputation: 43169
To provide another example:
['>](\d+)['<]
# one of ' or >
# followed by digits
# followed by one of ' or <
In Python
Code:
import re
rx = r"['>](\d+)['<]"
matches = [match.group(1) for match in re.finditer(rx, string)]
Or use lookarounds to match only the digits themselves, so no capture group is needed:
(?<=[>'])\d+(?=[<'])
Again, in Python
Code:
import re
rx = r"(?<=[>'])\d+(?=[<'])"
matches = re.findall(rx, string)
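Both patterns return the digits as strings, while the question asks for [22,23,0,1]. Here is a minimal sketch of the two variants side by side with an int() conversion added. The `string` value is the sample line from the question, and the names `grouped` and `looked` are only illustrative.

```python
import re

# Sample line copied from the question.
string = (
    "var hours_tdate = ['22','23',"
    "'<span style=\"color:#1d953f;\">0</span>',"
    "'<span style=\"color:#1d953f;\">1</span>'];"
)

# Capture-group variant: the delimiters are consumed, the digits sit in group 1.
grouped = [int(m.group(1)) for m in re.finditer(r"['>](\d+)['<]", string)]

# Lookaround variant: the delimiters are only asserted, the whole match is the digits.
looked = [int(s) for s in re.findall(r"(?<=[>'])\d+(?=[<'])", string)]

print(grouped)  # [22, 23, 0, 1]
print(looked)   # [22, 23, 0, 1]
```

The digits in the colour value #1d953f are skipped by both patterns because none of them sits between a ' or > on the left and a ' or < on the right.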
Upvotes: 0
Reputation: 11032
Use \d+ along with word boundaries to extract only the numbers:
\b\d+\b
Python Code
import re

# \b on both sides keeps only digit runs that are not glued to a letter, digit or underscore
p = re.compile(r'\b\d+\b')
test_str = "var hours_tdate = ['22','23','<span style=\"color:#1d953f;\">0</span>','<span style=\"color:#1d953f;\">1</span>'];"
print(p.findall(test_str))
NOTE: Digits inside the variable name won't be matched, as long as the name is an ordinary identifier in which the digits follow a letter, digit, or underscore.
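One caveat with the word-boundary pattern is that it matches any standalone digit run in the text it is given, including other arrays in the file or a colour value made only of digits. Here is a minimal sketch of one way to narrow it: isolate the hours_tdate array first, drop the tags, then apply \b\d+\b. The input `js`, the variable `other`, and the helper `extract_hours` are hypothetical and not from the question. The tag-stripping regex assumes no attribute value contains a > character.

```python
import re

# Hypothetical input: an extra variable and an all-digit colour value,
# neither of which appears in the question.
js = (
    "var other = ['7'];\n"
    "var hours_tdate = ['22','<span style=\"color:#123456;\">0</span>'];"
)

# Unscoped, the pattern also picks up the other array and the colour code.
print(re.findall(r"\b\d+\b", js))  # ['7', '22', '123456', '0']


def extract_hours(source):
    """Return the numbers in the hours_tdate array as ints (empty list if absent)."""
    # Step 1: grab only the text between the brackets of the wanted array.
    m = re.search(r"var hours_tdate = \[(.*?)\];", source)
    if m is None:
        return []
    # Step 2: remove tags so attribute values cannot produce matches.
    text = re.sub(r"<[^>]*>", "", m.group(1))
    # Step 3: collect the remaining standalone digit runs.
    return [int(n) for n in re.findall(r"\b\d+\b", text)]


print(extract_hours(js))  # [22, 0]
```

The first step also shows why the `(\d*)` attempt in the question finds nothing: the text between the brackets contains quotes, commas and tags, which `\d*` cannot span, so the closing `\];` is never reached.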
Upvotes: 1