Ezzy

Reputation: 1483

Regex to match multiple number groups between two characters

I have a string that looks like the following:

<@399969178745962506> hello to <@!104729417217032192>

I have a dictionary containing both IDs that looks like the following:

{"399969178745962506", "One"},
{"104729417217032192", "Two"}

My goal is to replace <@399969178745962506> with the value stored under that number key, which in this case would be One.

Regex.Replace(arg.Content, "(?<=<)(.*?)(?=>)", m => userDic.ContainsKey(m.Value) ? userDic[m.Value] : m.Value);

My current regex is (?<=<)(.*?)(?=>), which matches everything between < and >. In this case that leaves both @399969178745962506 and @!104729417217032192.

I can't just skip the @ sign, because the ! sign is not there every time. So it would be best to get only the numbers, with something like \d+

I need to figure out how to get only the numbers between < and >, but I can't for the life of me figure out how.

Very grateful for any help!

Upvotes: 4

Views: 5963

Answers (4)

Wiktor Stribiżew

Reputation: 627607

In C#, you may use two approaches: a lookaround-based one (possible because .NET lookbehind patterns can be variable-width) and a capturing group approach.

Lookaround-based approach

The pattern that will easily help you get the digits in the right context is

(?<=<@!?)\d+(?=>)

See the regex demo

The (?<=<@!?) is a positive lookbehind that requires <@ or <@! immediately to the left of the current location, and (?=>) is a positive lookahead that requires a > char immediately to the right of the current location.
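
A minimal sketch of the lookaround pattern in use, assuming the dictionary from the question. The class and method names (LookaroundDemo, ReplaceIds) are illustrative only. Because lookarounds consume no characters, m.Value holds just the digits, so the surrounding <@, <@! and > stay in the output:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper: swaps each ID inside <@...> or <@!...> for its mapped name.
    // IDs missing from the dictionary are left unchanged.
    public static string ReplaceIds(string input, IDictionary<string, string> names)
    {
        return Regex.Replace(input, @"(?<=<@!?)\d+(?=>)",
            m => names.TryGetValue(m.Value, out var name) ? name : m.Value);
    }

    public static void Main()
    {
        var userDic = new Dictionary<string, string>
        {
            { "399969178745962506", "One" },
            { "104729417217032192", "Two" }
        };
        var s = "<@399969178745962506> hello to <@!104729417217032192>";
        Console.WriteLine(ReplaceIds(s, userDic)); // => <@One> hello to <@!Two>
    }
}
```

To drop the <@, <@! and > as well, they have to be part of the match itself, which is what the capturing approach does.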

Capturing approach

You may use the following pattern that will capture the digits inside the expected <...> substrings:

<@!?(\d+)>

Details

  • <@ - a literal <@ substring
  • !? - an optional exclamation sign
  • (\d+) - capturing group 1 that matches one or more digits
  • > - a literal > sign.

Note that the values you need can be accessed via match.Groups[1].Value as shown in the snippet below.

Usage:

var userDic = new Dictionary<string, string> {
        {"399969178745962506", "One"},
        {"104729417217032192", "Two"}
    };
var p =  @"<@!?(\d+)>";
var s = "<@399969178745962506> hello to <@!104729417217032192>";
Console.WriteLine(
    Regex.Replace(s, p, m => userDic.ContainsKey(m.Groups[1].Value) ?
        userDic[m.Groups[1].Value] : m.Value
    )
); // => One hello to Two
// Or, if you need to keep <@, <@! and >
Console.WriteLine(
    Regex.Replace(s, @"(<@!?)(\d+)>", m => userDic.ContainsKey(m.Groups[2].Value) ?
        $"{m.Groups[1].Value}{userDic[m.Groups[2].Value]}>" : m.Value
    )
); // => <@One> hello to <@!Two>

See the C# demo.

Upvotes: 3

Srdjan M.

Reputation: 3405

Regex: (?:<@!?(\d+)>)

Details:

(?:...) Non-capturing group

<@ matches the characters <@ literally

!? matches the character ! between zero and one times

(\d+) 1st Capturing Group: \d+ matches one or more digits

> matches the character > literally

Regex demo

string text = "<@399969178745962506> hello to <@!104729417217032192>";
Dictionary<string, string> list = new Dictionary<string, string>() { { "399969178745962506", "One" }, { "104729417217032192", "Two" } };

text = Regex.Replace(text, @"(?:<@!?(\d+)>)", m => list.ContainsKey(m.Groups[1].Value) ? list[m.Groups[1].Value] : m.Value);

Console.WriteLine(text); // One hello to Two
Console.ReadLine();

Upvotes: 0

Irvingz

Reputation: 119

To extract just the numbers from your given format, use this regex pattern:

(?<=<@|<@!)(\d+)(?=>)

See it in action: https://regexr.com/3j6ia
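
A short sketch of how this pattern could be used in C# to pull out the IDs, assuming the input string from the question. The names ExtractDemo and ExtractIds are made up for illustration. Because the lookbehind and lookahead consume nothing, each match value is the bare number:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ExtractDemo
{
    // Hypothetical helper: collects every ID found inside <@...> or <@!...>.
    public static List<string> ExtractIds(string input)
    {
        var ids = new List<string>();
        foreach (Match m in Regex.Matches(input, @"(?<=<@|<@!)(\d+)(?=>)"))
        {
            ids.Add(m.Value); // the lookarounds are not part of the match
        }
        return ids;
    }

    public static void Main()
    {
        var s = "<@399969178745962506> hello to <@!104729417217032192>";
        Console.WriteLine(string.Join(", ", ExtractIds(s)));
        // => 399969178745962506, 104729417217032192
    }
}
```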

Upvotes: 2

Patrick Artner

Reputation: 51683

You can use a non-capturing group to consume the parts of the pattern that should stay outside the capturing group:

(?<=<)(?:@?!?)(.*?)(?=>)

Alternatively, you could name the inner group and use the named group to get it:

(?<=<)(?:@?!?)(?<yourgroupname>.*?)(?=>)

Access it via m.Groups["yourgroupname"].Value. For more, see e.g. How do I access named capturing groups in a .NET Regex?
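
A rough sketch of the named-group variant, assuming the dictionary from the question and a group name of id chosen for illustration (NamedGroupDemo and ReplaceIds are likewise hypothetical). With this pattern the overall match still contains the @ or @!, so the dictionary lookup has to use the group rather than m.Value. The < and > sit in lookarounds and therefore remain in the result:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Hypothetical helper: looks up the named group "id", not the whole match.
    public static string ReplaceIds(string input, IDictionary<string, string> names)
    {
        return Regex.Replace(input, @"(?<=<)(?:@?!?)(?<id>.*?)(?=>)",
            m => names.TryGetValue(m.Groups["id"].Value, out var name) ? name : m.Value);
    }

    public static void Main()
    {
        var userDic = new Dictionary<string, string>
        {
            { "399969178745962506", "One" },
            { "104729417217032192", "Two" }
        };
        var s = "<@399969178745962506> hello to <@!104729417217032192>";
        Console.WriteLine(ReplaceIds(s, userDic)); // => <One> hello to <Two>
    }
}
```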

Upvotes: 0
