Reputation: 217
I'm having trouble parsing some text. Here's an example of the text:
201 BBQ 0.000 9.000 0.099 0.891 9.000 0.000 0.000 0.000
705 W 1 PC 0.000 135.000 0.295 39.825 0.000 0.000 135.000 0.000
2106 ONL 9.99 41.141 3.000 4.110 12.330 3.000 0.000 0.000 29.970
Here's the latest incarnation of the code I've been trying:
objInfo = System.Text.RegularExpressions.Regex.Split(
newLine,"(\d{3,5})|([0-9]+[.]+[0-9]+)|(\w*)")
My problem is that I keep getting many blank entries in the array after splitting. I've tried to avoid the alternation character (|), but I get no results when I set the pattern up without it!
I've spent much of the evening reviewing regular expressions and I've downloaded the following programs:
RegEx Designer.NET, Antix RegEx Tester, and Expresso.
The difficulty is that the description SOMETIMES contains a decimal point and sometimes doesn't. Likewise, the description sometimes contains a whole number and sometimes doesn't.
My friend recommended I use awk to divide it into columns. The thing is...I teach a Community Education class with Visual Basic .Net and I need to improve my RegEx skills. Perhaps someone can give me some guidance so I can better help my students.
Upvotes: 1
Views: 1139
Reputation: 138007
Since you know the number of columns, you can start by reading 8 decimals from the end of the line, and take the rest as the title. You can avoid a regex, but here's a simple solution with one:
Match match = Regex.Match(line, @"^(.*?)((?:\d+\.\d+\s*){8})$");
string title = match.Groups[1].Value.Trim();
IEnumerable<decimal> numbers = match.Groups[2].Value
    .Split(" \t".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
    .Select(Convert.ToDecimal);
The regex captures two groups: ((?:\d+\.\d+\s*){8})$ is eight decimals at the end, and (.*?) is the start of the string up to them. The first group is lazy because a greedy (.*) can begin the match partway through a number: in your third example it would take "2106 ONL 9.99 4" as the title and 1.141 as the first number. If you have extra decimals, as in your third example, they will be added to the title.
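To check the pattern against the three sample lines from the question, here is a minimal sketch. The class RegexLineDemo and the helper TryParseLine are hypothetical names introduced only for illustration. The sketch makes the first group lazy, (.*?), so that the match cannot start in the middle of a number, and it parses with CultureInfo.InvariantCulture on the assumption that the input always uses "." as the decimal separator.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexLineDemo
{
    // Hypothetical helper (not part of the answer): returns false when the
    // line does not end in eight decimals, instead of reading empty groups.
    public static bool TryParseLine(string line, out string title, out decimal[] numbers)
    {
        title = null;
        numbers = null;

        // Lazy first group: take the shortest prefix that leaves exactly
        // eight decimals before the end of the line.
        Match match = Regex.Match(line, @"^(.*?)((?:\d+\.\d+\s*){8})$");
        if (!match.Success)
            return false;

        title = match.Groups[1].Value.Trim();
        numbers = match.Groups[2].Value
            .Split(" \t".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
            // Assumption: "." is always the decimal separator in the input.
            .Select(s => decimal.Parse(s, CultureInfo.InvariantCulture))
            .ToArray();
        return true;
    }

    public static void Main()
    {
        string[] lines =
        {
            "201 BBQ 0.000 9.000 0.099 0.891 9.000 0.000 0.000 0.000",
            "705 W 1 PC 0.000 135.000 0.295 39.825 0.000 0.000 135.000 0.000",
            "2106 ONL 9.99 41.141 3.000 4.110 12.330 3.000 0.000 0.000 29.970"
        };

        foreach (string line in lines)
        {
            string title;
            decimal[] numbers;
            if (TryParseLine(line, out title, out numbers))
                Console.WriteLine("{0} -> {1} numbers", title, numbers.Length);
            else
                Console.WriteLine("No match: " + line);
        }
    }
}
```

With these inputs the titles come out as "201 BBQ", "705 W 1 PC" and "2106 ONL 9.99", each followed by eight numbers. Checking match.Success matters because a failed match still returns a Match object whose groups are empty strings.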
Similarly, you may choose a non-regex solution (actually, this one is better if you don't mind losing a few spaces in the title):
string[] words = line.Split(" ".ToCharArray(),
StringSplitOptions.RemoveEmptyEntries);
int position = words.Count() - 8;
IEnumerable<decimal> numbers = words.Skip(position).Select(Convert.ToDecimal);
string title = String.Join(" ", words.Take(position));
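One caveat with the split approach: if a line has fewer than eight fields, position becomes negative, Skip returns every word, and the conversion throws on the first non-numeric one. A rough sketch of a guarded version follows. SplitLineDemo, TrySplitLine and NumberCount are hypothetical names, and the use of InvariantCulture assumes the input always uses "." as the decimal separator.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class SplitLineDemo
{
    // Assumption taken from the answer: every line ends in eight decimals.
    private const int NumberCount = 8;

    // Hypothetical helper: returns false for lines that are too short or
    // whose trailing fields are not all numeric.
    public static bool TrySplitLine(string line, out string title, out decimal[] numbers)
    {
        title = null;
        numbers = null;

        string[] words = line.Split(" \t".ToCharArray(),
            StringSplitOptions.RemoveEmptyEntries);
        if (words.Length < NumberCount)
            return false;

        int position = words.Length - NumberCount;
        decimal[] parsed = new decimal[NumberCount];
        for (int i = 0; i < NumberCount; i++)
        {
            // TryParse avoids an exception on a malformed field.
            if (!decimal.TryParse(words[position + i], NumberStyles.Number,
                    CultureInfo.InvariantCulture, out parsed[i]))
                return false;
        }

        title = string.Join(" ", words.Take(position));
        numbers = parsed;
        return true;
    }

    public static void Main()
    {
        string title;
        decimal[] numbers;
        string line = "705 W 1 PC 0.000 135.000 0.295 39.825 0.000 0.000 135.000 0.000";

        if (TrySplitLine(line, out title, out numbers))
            Console.WriteLine("{0} -> {1} numbers", title, numbers.Length);
        else
            Console.WriteLine("Could not split: " + line);
    }
}
```

As in the answer's version, runs of spaces inside the title collapse to a single space because the words are rejoined with String.Join.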
Upvotes: 2