Reputation: 33
Is there a way to fix the following regex? I have included an example on regex101. It captures too much, including the wrong part between the []() tags. It roughly does what it's supposed to, but in the process I lose text and another tag.
https://regex101.com/r/OPRCuh/1
regex:
\[(.+?)\]\((https.+?)\)
sample text
_“[Developer Interview](/blog/tags/developer_interview.html)” is a new series here at Semaphore blog. We’ll interview developers from some of the companies using [text text text](https://textapp.com) to find out how they work and share their insights with you.
Upvotes: 1
Views: 113
Reputation: 626709
The `.` pattern matches any char other than a line break char, so it can also match `[`, `]`, `(` and `)` on the way to a valid match. Since the regex engine parses the string from left to right, it starts at the first `[`, finds the `]` after `Interview` and the `(` before `/blog`, but gives that position up because the `(` is not followed by `https`. The lazy `.+?` then keeps consuming chars, brackets included, until it reaches `](https`, and only there returns a valid match. That is why the first group runs from `Developer Interview` all the way to `text text text`.
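The overmatch described above can be reproduced with a small Python sketch. The `text` string is a shortened stand-in for the sample in the question, and the variable names are illustrative assumptions, not part of the original post.

```python
import re

# Shortened stand-in for the sample text from the question (assumption).
text = (
    "[Developer Interview](/blog/tags/developer_interview.html) is a new series. "
    "Companies using [text text text](https://textapp.com) share their insights."
)

# The pattern from the question: lazy dots that may still cross brackets.
original = re.compile(r"\[(.+?)\]\((https.+?)\)")

m = original.search(text)
overmatched_label = m.group(1)  # starts at the first "[" and swallows the first tag
overmatched_url = m.group(2)

print(overmatched_label)
print(overmatched_url)
```

The first group begins inside the first tag and ends inside the second one, which is the lost text and lost tag the question mentions.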
You may use
r'\[([^][]*)]\((https[^()]*)\)'
See the regex demo
The `[^][]*` pattern matches 0+ chars other than `[` and `]`, and `[^()]*` matches 0+ chars other than `(` and `)`.
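As a quick check of the negated character classes, here is a minimal Python sketch that runs the suggested pattern with `re.findall`. The `text` string is again a shortened stand-in for the question's sample, and the names `fixed` and `links` are hypothetical.

```python
import re

# Shortened stand-in for the sample text from the question (assumption).
text = (
    "[Developer Interview](/blog/tags/developer_interview.html) is a new series. "
    "Companies using [text text text](https://textapp.com) share their insights."
)

# Suggested pattern: the label cannot contain brackets, the URL cannot contain parentheses.
fixed = re.compile(r"\[([^][]*)]\((https[^()]*)\)")

# findall returns one (label, url) tuple per match.
links = fixed.findall(text)
print(links)
```

Because the label group can no longer cross a `]`, the attempt at the first tag fails outright (its URL does not start with `https`), and only the second tag is returned.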
Upvotes: 1