Reputation: 3665
I have some text of the form "item number - item description", e.g. "13-40 - Computer Keyboard", that I want to split into an item number and an item description.
Is this possible with one regular expression, or would I need two (one for the item number and one for the description)?
I can't work out how to "group" it, i.e. say that the item number can be this and the description can be that, without it treating everything as the item number. For example:
(\w(\w|-|/)*\w)-.*
matches everything as 1 match.
This is the code I'm using:
Regex rx = new Regex(RegExString, RegexOptions.Compiled | RegexOptions.IgnoreCase);
MatchCollection matches = rx.Matches("13-40 - Computer Keyboard");
Assert.AreEqual("13-40", matches[0].Value);
Assert.AreEqual("Computer Keyboard", matches[1].Value);
Upvotes: 0
Views: 211
Reputation: 825
This isn't as elegant as CaffeineFueled's answer, but it may be easier to read for a regex beginner.
// Verbatim string (@"...") so that \d and \s reach the regex engine unchanged.
String RegExString = @"(\d*-\d*)\s*-\s*(.*)";
Regex rx = new Regex(RegExString, RegexOptions.Compiled | RegexOptions.IgnoreCase);
// One match for the whole string; the two parts are capture groups 1 and 2.
Match match = rx.Match("13-40 - Computer Keyboard");
Assert.AreEqual("13-40", match.Groups[1].Value);
Assert.AreEqual("Computer Keyboard", match.Groups[2].Value);
or even more readable:
String RegExString = @"(\d*-\d*) - (.*)";
Upvotes: 0
Reputation: 90012
You don't seem to want capture groups, but rather multiple matches.
Maybe this will do what you want?
(?:^.+(?=( - ))|(?<=( - )).+$)
Split up:
(?: Non-capturing group, used to provide two possible matches
^.+ Match item ID text
(?=( - )) Text must be before " - "
| OR
(?<=( - )) Text must be after " - "
.+$ Match description text
)
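To see the two-matches behaviour in C#, here is a minimal sketch. It assumes the outer group is written as the non-capturing `(?: ... )`; the class and method names (`TwoMatchesDemo`, `FindParts`) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class TwoMatchesDemo
{
    // Pattern from this answer, assuming the outer group is non-capturing.
    const string Pattern = @"(?:^.+(?=( - ))|(?<=( - )).+$)";

    // Hypothetical helper: returns the value of every match, in order.
    public static string[] FindParts(string text)
    {
        MatchCollection matches = Regex.Matches(text, Pattern);
        string[] parts = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            parts[i] = matches[i].Value;
        }
        return parts;
    }
}
```

For "13-40 - Computer Keyboard" this should give two matches, "13-40" and "Computer Keyboard". Note that `.+` is greedy, so if the description itself contains " - ", the first match extends up to the last " - " in the string.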
Upvotes: 0
Reputation: 30699
CaffeineFueled's answer is correct for C#.
Match match = Regex.Match("13-40 - Computer Keyboard", @"^([\d\-]+) \- (.+)$");
Console.WriteLine(match.Groups[1]);
Console.WriteLine(match.Groups[2]);
Results:
13-40
Computer Keyboard
Upvotes: 1
Reputation: 969
([0-9-]+)\s-\s(.*)
Group 1 contains the item number, and group 2 contains the description.
Upvotes: 1
Reputation: 38346
From the code you posted, you are using the regex incorrectly. Use a single pattern that matches the whole product, then use the capture groups within that one match to extract the number and the description.
string RegExString = @"(?<number>[\d-]+)\s-\s(?<description>.*)";
Regex rx = new Regex(RegExString, RegexOptions.Compiled | RegexOptions.IgnoreCase);
Match match = rx.Match("13-40 - Computer Keyboard");
Debug.Assert("13-40" == match.Groups["number"].Value);
Debug.Assert("Computer Keyboard" == match.Groups["description"].Value);
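If the input might not always follow the "number - description" shape, it is probably worth checking `match.Success` before reading the groups. A rough sketch, assuming you want a Try-style helper; `ItemParser` and `TryParse` are hypothetical names, and the pattern is the one above with `^` and `$` anchors added:

```csharp
using System;
using System.Text.RegularExpressions;

static class ItemParser
{
    // Named-group pattern, anchored so the whole string has to match.
    private static readonly Regex ItemPattern =
        new Regex(@"^(?<number>[\d-]+)\s-\s(?<description>.*)$");

    // Hypothetical helper: returns false (and nulls) when the text does not match.
    public static bool TryParse(string text, out string number, out string description)
    {
        Match match = ItemPattern.Match(text);
        number = match.Success ? match.Groups["number"].Value : null;
        description = match.Success ? match.Groups["description"].Value : null;
        return match.Success;
    }
}
```

Calling `ItemParser.TryParse("13-40 - Computer Keyboard", out var number, out var description)` should return true with the two parts filled in, while a string without the " - " separator should return false.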
Upvotes: 4
Reputation: 38346
If your text is always divided by a dash and you don't have to handle dashes within the data, you don't need a regex at all.
// itemText is the input string (renamed so it doesn't clash with the Item variable below).
// Select/ToArray need "using System.Linq;".
string[] itemProperties = itemText.Split('-').Select(p => p.Trim()).ToArray();
Item item = new Item()
{
Number = itemProperties[0],
Name = itemProperties[1],
Description = itemProperties[2]
};
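When the item number itself contains a dash, as in "13-40 - Computer Keyboard", splitting on a bare "-" would cut the number apart. A minimal sketch of one way around that, assuming the separator is always the three characters " - " and that only the first occurrence should split; `ItemSplitter` and `SplitItem` are hypothetical names:

```csharp
using System;

static class ItemSplitter
{
    // Split on the first " - " only: count = 2 caps the result at two pieces,
    // so any later " - " stays inside the description.
    public static string[] SplitItem(string text)
    {
        return text.Split(new[] { " - " }, 2, StringSplitOptions.None);
    }
}
```

`ItemSplitter.SplitItem("13-40 - Computer Keyboard")` should return "13-40" and "Computer Keyboard". If the separator is missing, the array has a single element, so check its length before indexing.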
Upvotes: 0
Reputation: 1603
Here is a regexp that works in Ruby; I'm not sure whether C# regexps differ in any way:
/^([\d\-]+) \- (.+)$/
Upvotes: 1