Reputation: 63
Could you please correct my regex? I need to match all <img> tags which have a ?contextId inside of src. For instance, the following string should be matched:
<img xmlns="http://www.w3.org/1999/xhtml" src="http://10.3.34.34:8080/Bilder/pic.png?contextId=qualifier123" alt="Bild" />
I wrote this regular expression, and it does what I need:
(?i)<img[^>]+? src\s*?=\s*?"(.*?\?contextId.*?)"[^\/]+?\/>
But it seems to take too many steps to match (380 in the regex demo).
The input string can be up to 30,000 characters long, and I worry that the Java regex engine may fail with my non-optimized expression.
Upvotes: 0
Views: 648
Reputation: 4504
I made some changes to your regex:
<img.*?src\s*=\s*"([^"]*\?contextId[^"]*)
1) .*? changed to [^"]* # match any character except " (double quote) instead of . (dot)
2) "[^\/]+?\/> removed # no need to match this part
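For reference, here is a minimal Java sketch of how this regex might be used. The class name ContextIdImages and the method findSrcs are made up for illustration; only the pattern itself comes from this answer. The pattern is compiled once and reused, which is usually advisable when the same expression runs over large inputs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; only the regex is taken from the answer above.
class ContextIdImages {
    // Compiled once and reused. Group 1 is the src value containing ?contextId.
    private static final Pattern IMG_WITH_CONTEXT_ID = Pattern.compile(
            "<img.*?src\\s*=\\s*\"([^\"]*\\?contextId[^\"]*)");

    // Returns the src value of every match found in the input.
    static List<String> findSrcs(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = IMG_WITH_CONTEXT_ID.matcher(html);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<img xmlns=\"http://www.w3.org/1999/xhtml\" "
                + "src=\"http://10.3.34.34:8080/Bilder/pic.png?contextId=qualifier123\" "
                + "alt=\"Bild\" />";
        System.out.println(findSrcs(html));
    }
}
```

Note that . does not match line breaks by default in Java, so a tag split across lines would need Pattern.DOTALL or a character class instead.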
Upvotes: 1
Reputation: 1233
98 steps (regex demo):
<img.*?src="[^"]+\?contextId[^>]+>
This regex assumes that the HTML is not malformed and, in particular, that each img tag has a src attribute.
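To illustrate why that assumption matters, here is a small Java sketch (the class name MissingSrcDemo and the sample input are made up). When the first img tag has no src, .*? is free to run past the end of that tag and into the next one, so the match covers both tags. The second pattern, which replaces .*? with [^>]*?, is my own variation rather than part of the regex above; it is one possible way to keep the match inside a single tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the input string is invented for illustration.
class MissingSrcDemo {
    // The regex from this answer.
    static final Pattern DOT_LAZY =
            Pattern.compile("<img.*?src=\"[^\"]+\\?contextId[^>]+>");
    // Assumed variation: [^>]*? cannot cross the end of the tag.
    static final Pattern TAG_BOUND =
            Pattern.compile("<img[^>]*?src=\"[^\"]+\\?contextId[^>]+>");

    // Returns the first full match, or null if there is none.
    static String firstMatch(Pattern p, String html) {
        Matcher m = p.matcher(html);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<img alt=\"x\" /><img src=\"pic.png?contextId=1\" />";
        // Spans both tags, because the first tag has no src.
        System.out.println(firstMatch(DOT_LAZY, html));
        // Matches only the second tag.
        System.out.println(firstMatch(TAG_BOUND, html));
    }
}
```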
EDIT: 104 steps to capture both the img tag and the src link (regex demo):
(<img.*?src="([^"]+\?contextId[^"]+)"[^>]+>)
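A possible way to read both groups in Java is sketched below. The class name ImgAndSrc and the method find are hypothetical; group 1 is the whole img tag and group 2 is the src link, as in the regex above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; only the regex is taken from the answer above.
class ImgAndSrc {
    private static final Pattern IMG = Pattern.compile(
            "(<img.*?src=\"([^\"]+\\?contextId[^\"]+)\"[^>]+>)");

    // Returns one {tag, src} pair per match.
    static List<String[]> find(String html) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = IMG.matcher(html);
        while (m.find()) {
            pairs.add(new String[] { m.group(1), m.group(2) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        String html = "<img xmlns=\"http://www.w3.org/1999/xhtml\" "
                + "src=\"http://10.3.34.34:8080/Bilder/pic.png?contextId=qualifier123\" "
                + "alt=\"Bild\" />";
        for (String[] pair : find(html)) {
            System.out.println("tag: " + pair[0]);
            System.out.println("src: " + pair[1]);
        }
    }
}
```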
Upvotes: 1