trainoasis

Reputation: 6720

Regex: using alternatives

Let's say I would like to get all the 'href' values from HTML. I could run a regex like this on the content:

a[\s]+href[\s]*=("|')(.)+("|')

which would match

a href="something" 

OR

a href = 'something' // quotes, spaces ... 

which is OK, but with ("|') I get extra captured groups, which I do not want.

How does one use alternation in a regex without creating capturing groups as well?

The question could also be stated as: how do I delimit the alternatives to match (where they start and stop)? I used parentheses since that is all that worked...

(I know that the given regex is not perfect or very good, I'm just trying to figure this alternating with two values thing since it is not perfectly clear to me)

Thanks for any tips

Upvotes: 0

Views: 82

Answers (1)

Patrik Oldsberg

Reputation: 1550

Use non-capturing groups, like this: (?:"|'), the key part being the ?: at the beginning. They group the alternatives just like ordinary parentheses but do not produce a captured group in the match result.
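
For illustration, here is a minimal Python sketch comparing the two forms. The sample HTML string and the variable names are made up for this example, and the pattern is a slightly adjusted version of the one in the question (it allows whitespace after the = sign and uses a lazy .+? so each match stops at the first closing quote); treat it as a demonstration of (?:...), not as a robust way to parse HTML.

```python
import re

# Hypothetical sample input, not taken from the question.
html = '<a href="one.html">1</a> <a href = \'two.html\'>2</a>'

# Capturing alternation: each ("|') adds its own group to the result.
capturing = re.compile(r'a\s+href\s*=\s*("|\')(.+?)("|\')')

# Non-capturing alternation: (?:"|') still groups the alternatives,
# but only the href value in (.+?) is captured.
non_capturing = re.compile(r'a\s+href\s*=\s*(?:"|\')(.+?)(?:"|\')')

capturing_groups = [m.groups() for m in capturing.finditer(html)]
hrefs = non_capturing.findall(html)

print(capturing_groups)  # three groups per match: quote, value, quote
print(hrefs)             # one group per match: just the value
```

With a single capturing group left in the pattern, findall returns a flat list of the href values instead of tuples that include the quote characters.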

Upvotes: 2
