Reputation: 723
I am using this regex ("http:|"https:)\/\/.*\/content\/amc\/tdd\/.*?"
to find all the URLs that start with http or https and contain /content/amc/tdd.
But for this text:
"<a id='cdq_element_175_link' href='http://google.com' data-href='edit' >
<img src=\"http://localhost:8080/content/amc/tdd/abc/download_1.jpeg?ch_ck=1548843340209\" alt=\"\" id=\"element_175\" style=\"height: 135.575px; width: 135.575px;\" data-href=\"edit\">
<img src=\"http://localhost:8080/content/amc/tdd/abc/download_1.jpeg?ch_ck=1548843340209\" alt=\"\" id=\"element_175\" style=\"height: 135.575px; width: 135.575px;\" data-href=\"edit\">
</a>"
I am not getting two separate strings that match the pattern; instead I am getting one complete string that runs from the first instance to the last.
What am I doing wrong?
Upvotes: 0
Views: 52
Reputation: 10360
Try this Regex:
"https?:\/\/(?:[^\/]*\/)*?content\/amc\/tdd[^"]*"
Explanation:
"https?:\/\/
- matches "http:// or "https://
(?:[^\/]*\/)*?
- matches 0+ occurrences of any character that is not a /, followed by a /. This subpattern is repeated 0 or more times, as few times as possible.
content\/amc\/tdd
- matches content/amc/tdd
[^"]*"
- matches 0+ occurrences of any character that is not a ", followed by a "
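To check the pattern against the sample from the question, here is a minimal sketch in Python. The question does not say which regex engine is in use, so Python is only an assumption here. The sample text is also abbreviated (the style attributes are dropped) and written with plain " characters, on the assumption that the \" in the question are string escapes.

```python
import re

# Abbreviated sample from the question; plain quotes are an assumption.
text = (
    "<a id='cdq_element_175_link' href='http://google.com' data-href='edit' >"
    '<img src="http://localhost:8080/content/amc/tdd/abc/download_1.jpeg?ch_ck=1548843340209" alt="" id="element_175">'
    '<img src="http://localhost:8080/content/amc/tdd/abc/download_1.jpeg?ch_ck=1548843340209" alt="" id="element_175">'
    "</a>"
)

# The pattern from this answer, unchanged.
pattern = re.compile(r'"https?:\/\/(?:[^\/]*\/)*?content\/amc\/tdd[^"]*"')

# The pattern has no capturing groups, so findall returns the full matches.
matches = pattern.findall(text)

for m in matches:
    print(m)
print(len(matches))  # 2
```

Each match includes the surrounding quotes, because the pattern itself starts and ends with ".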
Upvotes: 2
Reputation: 343
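Because the first .* inside your regex is a greedy match, it consumes as much of the string as it can, so the match runs up to the last /content/amc/tdd/ in the text.
You should change it to the lazy form .*?
Like this: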
("http:|"https:)\/\/.*?\/content\/amc\/tdd\/.*?"
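The difference between the two quantifiers can be seen with a small comparison. This is only a sketch in Python (the engine used in the question is not stated), with the sample text abbreviated and written with plain " characters as an assumption.

```python
import re

# Abbreviated sample from the question; plain quotes are an assumption.
text = (
    "<a id='cdq_element_175_link' href='http://google.com' data-href='edit' >"
    '<img src="http://localhost:8080/content/amc/tdd/abc/download_1.jpeg?ch_ck=1548843340209" alt="" id="element_175">'
    '<img src="http://localhost:8080/content/amc/tdd/abc/download_1.jpeg?ch_ck=1548843340209" alt="" id="element_175">'
    "</a>"
)

# Original pattern: the first .* is greedy.
greedy_re = re.compile(r'("http:|"https:)\/\/.*\/content\/amc\/tdd\/.*?"')
# Fixed pattern: the first .* is replaced by the lazy .*?
lazy_re = re.compile(r'("http:|"https:)\/\/.*?\/content\/amc\/tdd\/.*?"')

# The patterns contain a capturing group, so use finditer and group(0)
# to get the whole match instead of just the group.
greedy = [m.group(0) for m in greedy_re.finditer(text)]
lazy = [m.group(0) for m in lazy_re.finditer(text)]

print(len(greedy))  # 1: one long match spanning both <img> tags
print(len(lazy))    # 2: one match per URL
```

Note that the lazy version can still start too early: if an unrelated URL such as the href were written with double quotes, the match would begin there and run on to the first /content/amc/tdd/. Replacing the first .*? with [^"]*? is one way to keep each match inside a single quoted value.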
Upvotes: 2