Matthew Chambers

Reputation: 877

regular expression to find word after a character and before another one if included

I have a URL like:

image/media-group/rugby-league-programme-covers-3436?sort=title

or

image/media-group/rugby-league-programme-covers-3436

I need to get everything after media-group/, up to but not including the ? and anything that follows it.

So in both instances, rugby-league-programme-covers-3436 is what I need to return.

I used the regular expression /media-group/(.*)\? which works when there is a query string but not when there is none.

I am using the code below:

var patt=new RegExp('/media-group/(.*)\?');
return patt.exec(url)[1];

Your help on this would be most appreciated.

Upvotes: 4

Views: 4096

Answers (1)

Rob Raisch

Reputation: 17357

I believe the best pattern would be:

/^[^\#\?]+\/media-group\/([^\?]+).*$/

which breaks out as:

^                 - start of string
[^\#\?]+          - one or more chars that are neither a hash nor a question mark
\/                - literal forward slash
media-group       - literal chars
\/                - literal forward slash
(                 - start capture group
  [^\?]+          - one or more chars that are not a question mark
)                 - end of capture group
.*                - zero or more of any char
$                 - end of string

This works because [^\?]+ is "greedy": it attempts the longest possible match, so it consumes every character up to the first question mark or, if there is none, up to the end of the string. The trailing .* then matches whatever is left, which is either a question mark followed by arbitrary chars, or nothing at all, because every char up to the end of the string has already been captured by the non-question-mark capture group.
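To make that division of the string visible, here is a minimal sketch that wraps the trailing .* in a second capture group. The second group and the variable names demo, withQuery and withoutQuery are assumptions added purely for illustration; they are not part of the pattern above.

```javascript
// Illustrative variant of the pattern: the trailing .* is wrapped in a
// second capture group (an assumption, for demonstration only).
var demo = /^[^#?]+\/media-group\/([^?]+)(.*)$/;

var withQuery = "image/media-group/rugby-league-programme-covers-3436?sort=title".match(demo);
var withoutQuery = "image/media-group/rugby-league-programme-covers-3436".match(demo);

console.log(withQuery[1]);    // rugby-league-programme-covers-3436
console.log(withQuery[2]);    // ?sort=title
console.log(withoutQuery[1]); // rugby-league-programme-covers-3436
console.log(withoutQuery[2]); // (empty string)
```

In both cases the first group stops at the question mark or at the end of the string, and the second group picks up the remainder, if any.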

So, using

var RE = /^[^\#\?]+\/media-group\/([^\?]+).*$/,
    url = "image/media-group/rugby-league-programme-covers-3436?sort=title";

console.log(url.match(RE)[1]);

prints rugby-league-programme-covers-3436. Changing url to image/media-group/rugby-league-programme-covers-3436 produces the same result.
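Note that match() and exec() return null when the pattern does not match, so indexing [1] directly throws on a URL without a /media-group/ segment. A minimal sketch of a guarded wrapper follows; the function name getMediaGroupSlug and the URL image/about are hypothetical, chosen only for this example.

```javascript
// Hypothetical helper (the name is an assumption, not from the original post).
function getMediaGroupSlug(url) {
  var match = /^[^#?]+\/media-group\/([^?]+).*$/.exec(url);
  // exec() returns null when the pattern does not match, so guard before indexing.
  return match ? match[1] : null;
}

console.log(getMediaGroupSlug("image/media-group/rugby-league-programme-covers-3436?sort=title"));
console.log(getMediaGroupSlug("image/media-group/rugby-league-programme-covers-3436"));
console.log(getMediaGroupSlug("image/about")); // no /media-group/ segment, so null
```

Returning null instead of throwing lets the caller decide how to handle URLs that are not media-group links.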

Update

Modified the pattern in response to David Foerster's comment.

Upvotes: 5
