Fredrik

Reputation: 4171

Trying to pick out a specific part of a string with regex

I've tried and tried again to find a regex for this pattern. I have a string like this picked from HTML source.

<!-- TAG=Something / Something else -->

And sometimes it's just:

<!-- TAG=Something -->

In both cases I want the regex to just match "Something", i.e. everything between TAG= and an optional /.

My first attempt was:

TAG=(.*)[/]?(.*) -->

But the first parenthesis matches everything between TAG= and --> no matter what. So what is the correct way here?

Upvotes: 2

Views: 184

Answers (3)

Alin P.

Reputation: 44376

Try this:

TAG=([^/]*)(?:/(.*))?-->

Group 1 will contain "Something" (plus the trailing space before the / or the -->).
Group 2 will contain "Something else" (with its surrounding spaces) or null.
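
As a rough illustration, here is how that pattern could be applied in Python. The helper name extract_tag and the whitespace trimming are assumptions added for this sketch, not part of the answer; in Python an unmatched optional group comes back as None rather than null.

```python
import re

# Pattern from the answer; the second group is optional.
PATTERN = re.compile(r"TAG=([^/]*)(?:/(.*))?-->")

def extract_tag(comment):
    """Return (first, second) with whitespace trimmed, or None if no match.

    `second` is None when the comment has no " / ..." part.
    """
    m = PATTERN.search(comment)
    if m is None:
        return None
    first = m.group(1).strip()  # raw group 1 keeps a trailing space
    second = m.group(2).strip() if m.group(2) is not None else None
    return first, second

print(extract_tag("<!-- TAG=Something / Something else -->"))  # ('Something', 'Something else')
print(extract_tag("<!-- TAG=Something -->"))                   # ('Something', None)
```

The strip() calls are there because the raw groups include the spaces around the slash and before the closing -->.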

Upvotes: 2

Mark Byers

Reputation: 839074

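Use a non-greedy quantifier (add ? after the *), and make the slash part an optional group so that the lazy match has to reach either the / or the closing -->:

TAG=(.*?)\s*(?:/.*)? -->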

Also, your use of [/] is unusual: you don't need a character class to match a single character. The most likely explanation is that you are using / as the regular expression delimiter, which makes / a special character that has to be escaped. In many (not all) regex dialects you can avoid this by using a different delimiter, such as #, so that the slashes no longer need escaping.
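
A small Python sketch of how a lazy group behaves here, under a few assumptions: the names too_lazy, anchored and first_part are made up for the illustration, and the \s*(?:/.*)? tail is just one possible way to write the optional slash part. The point it tries to show is that a lazy group stops as early as the rest of the pattern allows, so if everything after it can swallow the remainder, the group ends up empty.

```python
import re

samples = [
    "<!-- TAG=Something / Something else -->",
    "<!-- TAG=Something -->",
]

# Pitfall: [/]? may match nothing and .* may match the whole remainder,
# so the lazy group is already satisfied with zero characters.
too_lazy = re.compile(r"TAG=(.*?)[/]?.* -->")

# The slash part is an optional non-capturing group, so the lazy group
# has to grow until either "/" or " -->" follows it.
anchored = re.compile(r"TAG=(.*?)\s*(?:/.*)? -->")

def first_part(pattern, text):
    """Return group 1 of the first match, or None if there is no match."""
    m = pattern.search(text)
    return m.group(1) if m else None

for s in samples:
    # prints: '' 'Something'   (for both samples)
    print(repr(first_part(too_lazy, s)), repr(first_part(anchored, s)))
```

Python has no regex delimiters, so the / needs no escaping or character class in this sketch.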

Upvotes: 1

Gabi Purcaru

Reputation: 31564

<!--.*?=(.*?)(-->|/)

Group 1 captures the text between the first = and the first / or -->, whichever comes first (including the trailing space).

Upvotes: 2
