bugfixr

Reputation: 8077

C# RegEx - get only first match in string

I've got an input string that looks like this:

level=<device[195].level>&name=<device[195].name>

I want to create a RegEx that will parse out each of the <device> tags. For example, I'd expect two items to be matched from my input string: <device[195].level> and <device[195].name>.

So far I've had some luck with this pattern and code, but it always finds both of the device tags as a single match:

var pattern = "<device\\[[0-9]*\\]\\.\\S*>";
Regex rgx = new Regex(pattern);
var matches = rgx.Matches(httpData);

The result is that matches contains a single result with the value <device[195].level>&name=<device[195].name>.

I'm guessing there must be a way to 'terminate' the pattern, but I'm not sure what it is.

Upvotes: 6

Views: 8286

Answers (5)

ΩmegaMan

Reputation: 31616

Use named match groups and project each match into an anonymous type with LINQ. There will be two matches, one per tag, so the individual items are kept separate:

string data = "level=<device[195].level>&name=<device[195].name>";

string pattern = @"
(?<variable>[^=]+)     # get the variable name
(?:=<device\[)         # static '=<device['
(?<index>[^\]]+)       # device number index
(?:]\.)                # static ].
(?<sub>[^>]+)          # Get the sub command
(?:>&?)                # Match but don't capture the > and possible &  
";

// IgnorePatternWhitespace lets the pattern carry whitespace and # comments; it does not change what is matched.
var items = Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace)
                .OfType<Match>()
                .Select (mt => new
                  {
                     Variable = mt.Groups["variable"].Value,
                     Index    = mt.Groups["index"].Value,
                     Sub      = mt.Groups["sub"].Value
                  })
                 .ToList();

items.ForEach(itm => Console.WriteLine ("{0}:{1}:{2}", itm.Variable, itm.Index, itm.Sub));

/* Output
level:195:level
name:195:name
*/

Upvotes: 1

hwnd

Reputation: 70732

Change your repetition operator from * to + and use \w instead of \S. \S also matches > and &, so the greedy \S* runs on to the last > in the string.

var pattern = @"<device\[[0-9]+\]\.\w+>";

String s = @"level=<device[195].level>&name=<device[195].name>";
foreach (Match m in Regex.Matches(s, pattern))
    Console.WriteLine(m.Value);

Output

<device[195].level>
<device[195].name>

Upvotes: 2

Braj

Reputation: 46841

I want to create a RegEx that will parse out each of the <device> tags

I'd expect two items to be matched from my input string: 
   1. <device[195].level>
   2. <device[195].name>

This should work. Read the matched text from capture group 1:

(<device[^>]*>)


As a C# verbatim string literal:

@"(<device[^>]*>)"
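
A minimal sketch of how this pattern might be used to read group 1 from each match. The class and method names (DeviceTagDemo, ExtractTags) are hypothetical and chosen only for illustration; the input is the string from the question. Because [^>]* cannot cross a >, each match has to end at the first closing bracket.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class DeviceTagDemo
{
    // Collects the text captured by group 1 for every match.
    public static List<string> ExtractTags(string input)
    {
        var tags = new List<string>();
        foreach (Match m in Regex.Matches(input, @"(<device[^>]*>)"))
        {
            tags.Add(m.Groups[1].Value);
        }
        return tags;
    }

    public static void Main()
    {
        string input = "level=<device[195].level>&name=<device[195].name>";
        foreach (string tag in ExtractTags(input))
        {
            Console.WriteLine(tag);
        }
        // Prints:
        // <device[195].level>
        // <device[195].name>
    }
}
```

Since the parentheses wrap the whole pattern, m.Value would return the same text as m.Groups[1].Value here.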

Upvotes: 3

Keith Nicholas

Reputation: 44298

It depends on how much of the structure of the angle blocks you need to match, but you can use:

"\\<device.+?\\>"
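
To illustrate the trade-off, here is a rough sketch showing that this looser pattern accepts anything that starts with <device and ends at the next >. The class name LoosePatternDemo and the second tag in the sample input (<deviceXYZ>) are made up for the example and do not come from the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original answer.
public static class LoosePatternDemo
{
    // Returns every substring matched by the loose, lazy pattern.
    public static string[] FindTags(string input)
    {
        return Regex.Matches(input, @"\<device.+?\>")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // Made-up input: the second tag lacks the [index].name structure.
        string input = "a=<device[1].x>&b=<deviceXYZ>";
        foreach (string tag in FindTags(input))
        {
            Console.WriteLine(tag);
        }
        // Prints:
        // <device[1].x>
        // <deviceXYZ>
    }
}
```

If only well-formed tags should match, a stricter pattern that spells out the bracketed index and the dot is the safer choice.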

Upvotes: 3

Lucas Trzesniewski

Reputation: 51330

Use non-greedy quantifiers:

<device\[\d+\]\.\S+?>

Also, use verbatim strings for regex patterns. They avoid doubled backslashes and make the pattern much more readable:

var pattern = @"<device\[\d+\]\.\S+?>";

As a side note, I guess in your case using \w instead of \S would be more in line with what you intended, but I left the \S because I can't know that.
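
The difference between the greedy and the non-greedy quantifier can be checked side by side. The following is a small sketch, assuming the input string from the question; the class and method names (GreedyVsLazyDemo, FindAll) are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original answer.
public static class GreedyVsLazyDemo
{
    // Returns every match value for the given pattern.
    public static string[] FindAll(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string input = "level=<device[195].level>&name=<device[195].name>";

        // Greedy: \S* consumes '>' and '&' too, then backtracks to the last '>'.
        string[] greedy = FindAll(input, @"<device\[\d+\]\.\S*>");

        // Lazy: \S+? stops at the first '>' that completes the match.
        string[] lazy = FindAll(input, @"<device\[\d+\]\.\S+?>");

        Console.WriteLine(greedy.Length); // 1
        Console.WriteLine(lazy.Length);   // 2
        foreach (string tag in lazy)
        {
            Console.WriteLine(tag);
        }
    }
}
```

The greedy version yields the single long match described in the question, while the lazy version yields the two separate tags.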

Upvotes: 8
