LynchDev

Reputation: 813

Match Beginning and Ending of a String that can start with " or ' with Regex

I have two strings:

string a = "text 'text'"
string b = 'text "text"'

In this language, either " or ' can start and end a string literal, and a literal can contain the other symbol (quotation marks are valid inside apostrophes, and vice versa).

I need a regex that can deal with both. Currently I have:

(?:\"|')(?<content>[^\"']*)(?:\"|')

But applied to string a, this matches only "text ' and not the full string.
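For reference, the behaviour can be reproduced with a short sketch. Python is an assumption here (the language and regex flavor are not stated above), and its `re` module writes named groups as `(?P<name>...)`, so the pattern is adapted accordingly. The helper name `first_match` is made up for the illustration.

```python
import re

# The pattern from above, adapted to Python's (?P<name>...) syntax.
# The backslashes before the double quotes are dropped; they are optional.
PATTERN = re.compile(r"""(?:"|')(?P<content>[^"']*)(?:"|')""")

a = "\"text 'text'\""   # the source text  "text 'text'"
b = '\'text "text"\''   # the source text  'text "text"'

def first_match(text):
    """Return (whole match, content group) for the first match in text."""
    m = PATTERN.search(text)
    return m.group(0), m.group("content")

print(first_match(a))
print(first_match(b))
```

In both cases the match stops at the first quote character of either kind, because the character class excludes both and the closing alternative accepts both.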

Upvotes: 2

Views: 205

Answers (2)

Alan Moore

Reputation: 75252

The basic technique is:

(["'])((?:(?!\1).)*)\1

The opening quote is captured in group #1, and (?:(?!\1).)* matches zero or more of any character except the one that was captured. That's enclosed in another set of capturing parens, so the contents are captured in group #2. Then the final \1 matches the closing quote.
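As a quick check of the numbered-group version, here is a minimal sketch using Python's `re` module. Python is only an assumed flavor, and the function name `string_contents` is hypothetical.

```python
import re

# Group 1: the opening quote.
# Group 2: every character that is not that same quote.
# The final \1 then requires the matching closing quote.
QUOTED = re.compile(r"""(["'])((?:(?!\1).)*)\1""")

def string_contents(source):
    """Return the contents of every quoted literal found in source."""
    return [m.group(2) for m in QUOTED.finditer(source)]

a = "\"text 'text'\""   # the source text  "text 'text'"
b = '\'text "text"\''   # the source text  'text "text"'

print(string_contents(a))            # ["text 'text'"]
print(string_contents(b))            # ['text "text"']
print(string_contents(a + " " + b))  # both contents, in order
```

Note that `.` does not match a newline by default, so this sketch only handles literals that sit on one line.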

But you're using a named group to capture the contents, so it's probably best to use named groups throughout, especially since you don't say which flavor you're using and the interaction between named and numbered groups is not consistent from one flavor to the next. This should work in .NET or PHP:

(?<quote>["'])(?<content>(?:(?!\k<quote>).)*)\k<quote>
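Named-group syntax is one of the things that varies between flavors. As an illustration only, here is how the same idea might be written for Python's `re` module, which uses `(?P<name>...)` to define a group and `(?P=name)` to refer back to it. The helper `parse_literal` is a hypothetical name.

```python
import re

# Same structure as the named-group regex above, in Python's spelling:
# (?P<quote>...) captures the opening quote and (?P=quote) refers back to it.
NAMED = re.compile(
    r"""(?P<quote>["'])(?P<content>(?:(?!(?P=quote)).)*)(?P=quote)"""
)

def parse_literal(text):
    """Return (quote character, content) for the first literal, or None."""
    m = NAMED.search(text)
    if m is None:
        return None
    return m.group("quote"), m.group("content")

print(parse_literal("\"text 'text'\""))
print(parse_literal('\'text "text"\''))
```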

But if you're using .NET, I recommend this instead:

(?:"(?<content>[^"]*)"|'(?<content>[^']*)')

Most flavors make it difficult or impossible to reuse group names within the same regex, but in .NET anything goes.
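Python's `re` module is one flavor that rejects a repeated group name, so the alternation approach needs a separate name per branch there. A rough sketch, assuming Python and using the made-up group names `dq` and `sq`:

```python
import re

# One branch per quote style. Each branch has its own group name because
# Python's re refuses to compile a pattern that defines a name twice.
EITHER = re.compile(r'"(?P<dq>[^"]*)"' r"|'(?P<sq>[^']*)'")

def content_of(match):
    """Return the content from whichever branch took part in the match."""
    dq = match.group("dq")
    return dq if dq is not None else match.group("sq")

def string_contents(source):
    """Return the contents of every quoted literal found in source."""
    return [content_of(m) for m in EITHER.finditer(source)]

line = "\"text 'text'\"" + " " + '\'text "text"\''
print(string_contents(line))
```

The extra `content_of` step is the price of not being able to reuse one name; in .NET the single `content` group makes it unnecessary.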

Upvotes: 0

Toto

Reputation: 91508

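How about:

('|")(?<content>(?:(?!\1).)*)\1

Note that a backreference has no special meaning inside a character class, so [^\1] would not exclude the captured quote. The negative lookahead (?!\1) does that job instead.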
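One caveat worth checking in your own flavor: in Python's `re` module (an assumed flavor, others may differ), `\1` inside a character class such as `[^\1]` is read as an octal escape for the character U+0001, not as a backreference. The sketch below compares a character-class form with a negative-lookahead form on a line that holds two literals. The names `CLASS_FORM`, `LOOKAHEAD_FORM` and `contents` are hypothetical.

```python
import re

# Inside [...], Python treats \1 as the character U+0001, so this class
# excludes almost nothing and the greedy * runs to the last matching quote.
CLASS_FORM = re.compile(r"""('|")(?P<content>[^\1]*)\1""")

# A negative lookahead is one way to say
# "any character except the quote captured in group 1".
LOOKAHEAD_FORM = re.compile(r"""('|")(?P<content>(?:(?!\1).)*)\1""")

def contents(pattern, source):
    """Return the content group of every match of pattern in source."""
    return [m.group("content") for m in pattern.finditer(source)]

two_literals = '"a" "b"'
print(contents(CLASS_FORM, two_literals))      # ['a" "b']
print(contents(LOOKAHEAD_FORM, two_literals))  # ['a', 'b']
```

On a single literal both forms give the same answer, which is why the difference is easy to miss.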

Upvotes: 1
