Umer Mehmood
Reputation: 87

Get numbers between quotes and backslash with regex in C#

The following is my string:

"\"26201\",7\0\"

Quotes are part of the string. I want to get the number 26201 out of it using regular expressions.

Similarly, I want to extract the number 123 from "\"123\",7111\0000\".

This is actually where I am stuck:

string input = "\"\\\"123\\\",7\\0\\0\"";
System.Console.WriteLine($"input is {input}");
string firstPattern = @"[0-9]{0,10}\\""";
Regex extractRelevantRgx = new Regex(firstPattern);
MatchCollection matches = extractRelevantRgx.Matches(input);
            
// Here I get 2 matches: the first is \" and the second is 26201\"
if (matches.Count > 0)
{
    // Now trying to get only the numbers from the second match
    Regex numberPatternRgx = new Regex(@"[0-9]{0,10}"); // to separate out the numbers
    foreach (Match match in matches)
    {
        string substring = match.Value;
        MatchCollection results = numberPatternRgx.Matches(substring);
        // I expect exactly 1 match, and only for the second match; instead I get 3 matches in the first iteration alone
        if (results.Count > 0)
        {
            string resultfinal = results[0].Value;
            System.Console.WriteLine($"final result {resultfinal}");
            break;
        }
    }
}

Upvotes: 0

Views: 1160

Answers (3)

Wiktor Stribiżew
Reputation: 626893

If you needed to match a digit sequence between two double quotes, you could use var resultfinal = Regex.Match(input, @"(?<="")[0-9]+(?="")")?.Value;. However, your original string contains escaped quotation marks, so you need to extract a digit sequence in between two \" substrings.

You can use

var input = "\\\"26201\\\",7\\0\\0";
var firstPattern = @"(?<=\\"")[0-9]+(?=\\"")";
var resultfinal = Regex.Match(input, firstPattern)?.Value;
Console.WriteLine($"final result: '{resultfinal}'");
// => final result: '26201'

The pattern is (?<=\\")[0-9]+(?=\\"). Details:

  • (?<=\\") - a positive lookbehind that requires a \" substring to occur immediately to the left of the current location
  • [0-9]+ - one or more ASCII digits (note that \d+ matches one or more Unicode digits of any script, including Hindi and Persian digits, unless the RegexOptions.ECMAScript option is used)
  • (?=\\") - a positive lookahead that requires a \" substring to occur immediately to the right of the current location.
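The pattern described above can be tried against both sample strings from the question. This is only a minimal sketch: the class name QuotedNumber and the helper Extract are hypothetical names chosen for illustration, and the inputs assume the string really contains backslash-quote pairs around the number.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedNumber
{
    // Hypothetical helper: returns the first run of ASCII digits that sits
    // between two \" substrings, or an empty string when there is none.
    public static string Extract(string input)
    {
        // In a verbatim string, "" is one literal quote and \\ is the
        // regex escape for a single literal backslash.
        Match m = Regex.Match(input, @"(?<=\\"")[0-9]+(?=\\"")");
        return m.Success ? m.Value : string.Empty;
    }

    public static void Main()
    {
        // Assumed inputs, written as regular C# string literals.
        Console.WriteLine(Extract("\\\"26201\\\",7\\0\\0"));   // 26201
        Console.WriteLine(Extract("\\\"123\\\",7111\\0000"));  // 123
    }
}
```

Checking m.Success is a design choice for this sketch; Regex.Match returns Match.Empty rather than null when nothing matches, so the Value of a failed match is already an empty string.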

Upvotes: 1

Ann
Reputation: 1

Did you try Trim('"').Split('"')[0]?

Upvotes: 0

Andy
Reputation: 658

I would use a regular expression like ^"(?<someGroupName>\d+)". You'll have to escape the quotes to get it into a C# string.

This matches the start of the string with ^ followed by a literal " character followed by one or more digits followed by another literal " character. The digits are captured in a group called someGroupName, which you would want to change to be something more meaningful.
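In C#, the captured digits can be read back through Match.Groups. The following is a rough sketch rather than the code from the linked fiddle: the group name number, the class NamedGroupDemo, and the helper ExtractLeadingNumber are illustrative names, and the input is assumed to begin with a plain double quote (no leading backslash), since that is what the ^" anchor requires.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Hypothetical helper: returns the digits between the first pair of
    // double quotes at the start of the input, or an empty string.
    public static string ExtractLeadingNumber(string input)
    {
        // "" inside a verbatim string stands for one literal quote.
        Match m = Regex.Match(input, @"^""(?<number>\d+)""");
        return m.Success ? m.Groups["number"].Value : string.Empty;
    }

    public static void Main()
    {
        // Assumed input: "26201",7\0\  (starts with a bare quote)
        Console.WriteLine(ExtractLeadingNumber("\"26201\",7\\0\\")); // 26201
    }
}
```

If the string starts with a backslash before the quote, as in the question's escaped form, this pattern does not match and the helper returns an empty string.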

Here is a link to this regular expression running on regex101.com. The site has a searchable quick reference section and gives a live-updating explanation of what your regex is doing.

https://regex101.com/r/TovYcQ/1

Here is a link to a dotnetfiddle where you can see this running in the context of a C# test bed: https://dotnetfiddle.net/Pb09Vm

Upvotes: 0
