aperture

Reputation: 2895

Optional non-capturing group for matching substring

Given the following example strings

//www.youtube.com/embed/OYb_N_XEYas?rel=0&showinfo=0
//www.youtube.com/embed/STH9ZpeFH2o

I need to capture the substring after 'embed/', up to the end of the string or the first '?' or '/' character, whichever comes first.

How do I specify the optional second non-capturing group?

Using (embed\/)(.*) works for the second string, and (embed\/)(.*)(\?|\/) works for the first, but neither works in both cases.

Upvotes: 0

Views: 52

Answers (3)

nu11p01n73R

Reputation: 26667

You can use a negated character class:

/embed\/[^?\/]*/
  • [^?\/] matches any character other than ? or /

  • * Quantifier. Matches zero or more occurrences of the preceding expression

Regex Demo

Test

preg_match("/embed\/[^?\/]*/", "//www.youtube.com/embed/OYb_N_XEYas?rel=0&showinfo=0", $matches);
=> Array ( [0] => embed/OYb_N_XEYas )

preg_match("/embed\/[^?\/]*/", "//www.youtube.com/embed/STH9ZpeFH2o", $matches);
=> Array ( [0] => embed/STH9ZpeFH2o )

You can also use a lookahead after a non-greedy .*?

/embed\/.*?(?=(?:\?|\/|$))/
  • (?=(?:\?|\/|$)) Positive lookahead. Checks that the matched string is followed by ?, /, or $ (end of string). This is an assertion and won't consume those characters; as the test shows, the output doesn't include the ?.

Regex Demo

Test

preg_match("/embed\/.*?(?=(?:\?|\/|$))/", "//www.youtube.com/embed/OYb_N_XEYas?rel=0&showinfo=0", $matches);
=> Array ( [0] => embed/OYb_N_XEYas )
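
As a minimal sketch, the lookahead pattern can also be run against the second sample string, plus a trailing-slash URL that is hypothetical (it is not in the question) and only there to exercise the / alternative. The patterns are single-quoted here so the backslashes and the $ reach the regex engine unchanged.

```php
<?php
// Lookahead pattern from the answer above.
$pattern = '/embed\/.*?(?=(?:\?|\/|$))/';

$urls = [
    '//www.youtube.com/embed/OYb_N_XEYas?rel=0&showinfo=0',
    '//www.youtube.com/embed/STH9ZpeFH2o',
    '//www.youtube.com/embed/STH9ZpeFH2o/', // hypothetical trailing-slash variant
];

$results = [];
foreach ($urls as $url) {
    // preg_match returns 1 on a match and fills $m[0] with the matched text.
    if (preg_match($pattern, $url, $m) === 1) {
        $results[] = $m[0];
    }
}

print_r($results);
```

In all three cases the match is embed/ followed by the video ID only, because the lazy .*? stops as soon as the lookahead sees ?, /, or the end of the string.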

Upvotes: 1

Biffen

Reputation: 6355

You could use this regex:

embed\/([^?\/]*)

In short, it'll match all characters after embed/ that aren't ? or /, and put them in group 1. Thus it'll work regardless of the presence of such characters, i.e. it'll work for both sample strings.

Note that this will also match an empty string; substitute + for * if you don't want it to.

I also removed the group around embed\/ as I saw no reason to have it.
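
A short sketch of how group 1 could be read in PHP, assuming preg_match as used elsewhere in this thread. The URL with nothing after embed/ is a made-up input, included only to show the difference between * and +.

```php
<?php
// Pattern from the answer above: group 1 holds the ID without the "embed/" prefix.
$pattern = '/embed\/([^?\/]*)/';

preg_match($pattern, '//www.youtube.com/embed/OYb_N_XEYas?rel=0&showinfo=0', $first);
preg_match($pattern, '//www.youtube.com/embed/STH9ZpeFH2o', $second);

// Hypothetical input with nothing after "embed/".
$bare = '//www.youtube.com/embed/';

// With *, the bare URL still matches and group 1 is an empty string.
$starMatched = preg_match('/embed\/([^?\/]*)/', $bare, $star);
// With +, at least one ID character is required, so there is no match.
$plusMatched = preg_match('/embed\/([^?\/]+)/', $bare, $plus);

var_dump($first[1], $second[1], $starMatched, $plusMatched);
```

Reading $first[1] rather than $first[0] avoids having to strip the embed/ prefix afterwards.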

Upvotes: 1

Lorenz Meyer

Reputation: 19915

This will work for both strings:

(embed\/)([^\/?]*)

While .* matches any run of characters (including / and ?), [^\/?]* matches any run of characters except / and ?, so it stops at the first of those.
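
To make the contrast concrete, here is a small sketch that runs both patterns against the first sample string from the question, assuming PHP's preg_match as in the other answer. The variable names are illustrative only.

```php
<?php
$url = '//www.youtube.com/embed/OYb_N_XEYas?rel=0&showinfo=0';

// Greedy .* runs to the end of the string, so group 2 also contains the query string.
preg_match('/(embed\/)(.*)/', $url, $greedy);

// The negated class stops at the first / or ?, leaving only the ID in group 2.
preg_match('/(embed\/)([^\/?]*)/', $url, $negated);

echo $greedy[2], "\n";
echo $negated[2], "\n";
```

The first line printed includes ?rel=0&showinfo=0 after the ID; the second is the ID alone.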

Upvotes: 1
