Matt

Reputation: 26971

Regular expression works everywhere except in .NET

I am trying to match an expression of type

field:"value"

but not

field:value

I have written

([a-z]+)\s*?:\s*?"(.+)"\s*?

and this works except in .NET.

Is there any reason why this might be? Something I am missing?

Edit -- I typed the question wrong (figures)

I'm trying to match

field:value 

but not

field:"value"

This is the regex I have

(?<field>[a-z]+)\s*?:\s*?(?<value>[^"].+[^"])\s*?

But that's not working; it also matches values that are in quotes.

Edit 2:

Below is the code -- SearchResources.StringRegexNotQuotedText is ([a-z]+)\s*?:\s*?([^"]+)\s*? stringQuery is (what: "Hello_bob")

All relevant things are escaped....

Regex regexNotQuoted = new Regex(SearchResources.StringRegexNotQuotedText, RegexOptions.IgnoreCase);
MatchCollection matchesNotQuoted = regexNotQuoted.Matches(stringQuery);

// It should go into this block only if the value is not in quotes
if (regexNotQuoted.IsMatch(stringQuery))
{
}

Upvotes: 0

Views: 149

Answers (3)

R. Martinho Fernandes

Reputation: 234504

The part [^"].+ means: one character other than a double quote, followed by one or more characters of any kind. So it only matches sequences of at least two characters that start with a non-quote (and with the trailing [^"] in your pattern, at least three). You probably want [^"].*, which matches a sequence of at least one character whose first character is not a double quote.

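Now, to the issue at hand. You're getting a match because of this part: \s*?([^"].+[^"]). The \s*? matches no spaces, because it is non-greedy (that's what the question mark means here) and so tries the empty match first. Then the first [^"] matches the space, .+ matches "Hello_bo, and the last [^"] matches the final b. The closing " is simply left outside the match, which is allowed because nothing anchors the pattern to the end of the line. (A greedy \s* would end up in the same place by backtracking, so the missing anchor is the real culprit.)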
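To make the trace above concrete, here is a minimal sketch that runs the pattern from the question against the sample input. It assumes the input is exactly what: "Hello_bob" with no surrounding parentheses; the class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyMatchDemo
{
    // Pattern from the question; "" is an escaped quote inside a verbatim string.
    public const string Pattern = @"(?<field>[a-z]+)\s*?:\s*?(?<value>[^""].+[^""])\s*?";

    // Assumed sample input: what: "Hello_bob"
    public const string Input = "what: \"Hello_bob\"";

    public static Match Run() => Regex.Match(Input, Pattern);

    public static void Main()
    {
        Match m = Run();
        Console.WriteLine(m.Success);                            // True
        // Brackets make the leading space in the capture visible.
        Console.WriteLine("[" + m.Groups["value"].Value + "]");  // [ "Hello_bob]
    }
}
```

The captured value starts with the space and the opening quote and stops just before the closing quote, which is why IsMatch reports success on quoted input.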

You need to stop using the non-greedy operators (that's almost never what you want), and to make sure you capture everything to the end of the line, using the end-of-line anchor: $.

So, this is what I recommend you use:

(?<field>[a-z]+)\s*:\s*(?<value>[^"].*[^"])\s*$

With this C# code:

Regex re = new Regex(@"(?<field>[a-z]+)\s*:\s*(?<value>[^""].*[^""])\s*$");
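As a quick check of the anchored pattern, the sketch below runs it against a few inputs. The sample strings and the helper name ValueOf are assumptions for illustration, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchoredPatternDemo
{
    // The recommended pattern, unchanged.
    private static readonly Regex Re =
        new Regex(@"(?<field>[a-z]+)\s*:\s*(?<value>[^""].*[^""])\s*$");

    // Returns the captured value, or null when the input does not match.
    public static string ValueOf(string input)
    {
        Match m = Re.Match(input);
        return m.Success ? m.Groups["value"].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ValueOf("what: Hello_bob") ?? "(no match)");      // Hello_bob
        Console.WriteLine(ValueOf("what: \"Hello_bob\"") ?? "(no match)");  // (no match)
        Console.WriteLine(ValueOf("field:value") ?? "(no match)");          // value
    }
}
```

One thing to keep in mind: because the value is written as [^"].*[^"], it needs at least two characters, so an input such as a:b would not match this pattern.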

Upvotes: 1

Alan Moore

Reputation: 75242

It sounds like you want to match one or more characters other than double-quotes or whitespace, so write it just like that:

[^"\s]+

The full regex would be:

(?<field>[a-z]+)\s*:\s*(?<value>[^"\s]+)
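A small sketch of how this pattern behaves in C#, assuming the sample inputs below (they are illustrative, as is the helper name ValueOf):

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnquotedValueDemo
{
    // The full regex from above; "" escapes the quote in a verbatim string.
    private static readonly Regex Re =
        new Regex(@"(?<field>[a-z]+)\s*:\s*(?<value>[^""\s]+)");

    // Returns the captured value, or null when the input does not match.
    public static string ValueOf(string input)
    {
        Match m = Re.Match(input);
        return m.Success ? m.Groups["value"].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ValueOf("field:value") ?? "(no match)");          // value
        Console.WriteLine(ValueOf("what: Hello_bob") ?? "(no match)");      // Hello_bob
        Console.WriteLine(ValueOf("field:\"value\"") ?? "(no match)");      // (no match)
        Console.WriteLine(ValueOf("what: \"Hello_bob\"") ?? "(no match)");  // (no match)
    }
}
```

Since the pattern has no end anchor, a quote in the middle of a value just ends the capture early (for example, field:val"ue yields val); add an anchor or a lookahead if that matters for your input.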

Upvotes: 1

Hun1Ahpu

Reputation: 3355

Works for me. Here is the code I used:

resultString = Regex.Match(subjectString, @"([a-z]+)\s*?:\s*?""(.+)""\s*?", RegexOptions.Singleline | RegexOptions.IgnoreCase).Value;
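For anyone who wants to reproduce this, here is the same call wrapped in a small runnable program. The wrapper name MatchQuoted and the two sample inputs are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedValueDemo
{
    // Same pattern and options as the line above; returns "" when nothing matches.
    public static string MatchQuoted(string subjectString)
    {
        return Regex.Match(subjectString, @"([a-z]+)\s*?:\s*?""(.+)""\s*?",
            RegexOptions.Singleline | RegexOptions.IgnoreCase).Value;
    }

    public static void Main()
    {
        Console.WriteLine(MatchQuoted("field:\"value\""));     // field:"value"
        Console.WriteLine(MatchQuoted("field:value").Length);  // 0
    }
}
```

Note that (.+) is greedy, so with several quoted values on one line it runs to the last closing quote; use ([^""]+) in the verbatim string if each value should be captured separately.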

Upvotes: 0
