I want to match any whitespace (if present) before each field.
Regex re = new Regex(@"(\d+);([\d\.]+);([\d\.]+);([\w-\(\)\.,\/]+);(\d+);(\d+);([\d,]+);(\d+)", RegexOptions.Compiled);
The above regex is working for Example-1 but not for Example-2. Where do I need to change the regex for Example-2?
Example-1:
44;52.93; 8.24;GROSSENKNETEN;201902;28;408.7;28;509.86
71;48.22; 8.98;ALBSTADT-BADKAP;201902;28;475.3;28;-999.9
73;48.62;13.05;ALDERSBACH-KRIESTORF;201902;28;519.8;28;561.76
Example-2:
00044;52.93; 8.24; GROSSENKNETEN;201907;31; 53.4;9; 28.6
00071;48.22; 8.98; ALBSTADT-BADKAP;201907;31; 49.0;8;-999.9
00073;48.62;13.05; ALDERSBACH-KRIESTORF;201907;31; 0.0;0; 15.7
If you have normal access to full C# functionality, just read the file line by line and split each line on ; to get all the fields.
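A minimal sketch of the split-based approach, assuming the padding around each field should simply be trimmed. The class name SplitDemo, the helper names, and the path parameter are made up for illustration and are not part of the question:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SplitDemo
{
    // Splits one semicolon-separated record and trims the padding around each field.
    public static string[] SplitFields(string line) =>
        line.Split(';').Select(f => f.Trim()).ToArray();

    // Streams a file line by line, skipping blank lines.
    // "path" is a placeholder supplied by the caller.
    public static IEnumerable<string[]> ReadRecords(string path) =>
        File.ReadLines(path)
            .Where(l => !string.IsNullOrWhiteSpace(l))
            .Select(SplitFields);

    public static void Main()
    {
        var fields = SplitFields("00044;52.93; 8.24; GROSSENKNETEN;201907;31; 53.4;9; 28.6");
        Console.WriteLine(string.Join("|", fields));
        // prints: 00044|52.93|8.24|GROSSENKNETEN|201907|31|53.4|9|28.6
    }
}
```

Trimming each field after the split handles both examples from the question, since the only difference between them is the padding.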
If you are using a .NET regex based tool and need to extract specific data from lines of text, you may use
(?m)^(\d+);\s*([\d.]+);\s*([\d.]+);\s*([\w-().,\/]+);\s*(\d+);\s*(\d+);\s*([\d.]+);\s*(\d+);\s*([-+]?\d*\.?\d+)\r?$
See the regex demo
In multiline mode, $ in a .NET regex matches only before an LF (or at the very end of the string), not before a CR; that is why there is a \r? before it.
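The effect of the optional CR can be checked with a small experiment. This is only a sketch: the class name MultilineAnchorDemo, the helper CountMatches, and the two-line sample text with CRLF line endings are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineAnchorDemo
{
    // Counts runs of digits that sit directly before the given end-of-line pattern.
    public static int CountMatches(string text, string endOfLine) =>
        Regex.Matches(text, @"\d+" + endOfLine, RegexOptions.Multiline).Count;

    public static void Main()
    {
        // Two lines with Windows-style CRLF line endings.
        string crlfText = "28.6\r\n15.7\r\n";

        // $ alone only matches before LF, and a CR sits between the digits and the LF.
        Console.WriteLine(CountMatches(crlfText, "$"));      // prints: 0

        // Consuming the optional CR first lets $ match before the LF.
        Console.WriteLine(CountMatches(crlfText, @"\r?$"));  // prints: 2
    }
}
```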
Pattern details
(?m) - multiline mode on
^ - start of a line
(\d+) - Group 1: one or more digits
; - a semi-colon
\s* - 0+ whitespaces
([\d.]+) - Group 2: 1+ digits or dots
;\s*([\d.]+);\s* - ;, 0+ whitespaces, Group 3: 1+ digits/dots, ;, 0+ whitespaces
([\w-().,/]+) - Group 4: 1+ word, -, (, ), ., ,, / chars
;\s*(\d+);\s*(\d+);\s* - ;, 0+ whitespaces, Group 5: 1+ digits, ;, 0+ whitespaces, Group 6: 1+ digits, ;, 0+ whitespaces
([\d.]+) - Group 7: 1+ digits/dots
;\s*(\d+) - ;, 0+ whitespaces, Group 8: 1+ digits
;\s* - ; and 0+ whitespaces
([-+]?\d*\.?\d+) - Group 9: - or + optionally, then 0+ digits, an optional ., 1+ digits
\r?$ - an optional CR char and the end of the line.
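Putting the pattern to work in C# might look like the sketch below. The class name StationLineParser and the Parse helper are illustrative only, and the sample text mixes lines from both examples in the question, joined with CRLF.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StationLineParser
{
    // The pattern from the answer, kept verbatim (including the inline (?m) flag).
    private static readonly Regex LineRegex = new Regex(
        @"(?m)^(\d+);\s*([\d.]+);\s*([\d.]+);\s*([\w-().,\/]+);\s*(\d+);\s*(\d+);\s*([\d.]+);\s*(\d+);\s*([-+]?\d*\.?\d+)\r?$",
        RegexOptions.Compiled);

    // Returns one match per line that fits the nine-field layout.
    public static MatchCollection Parse(string text) => LineRegex.Matches(text);

    public static void Main()
    {
        string text =
            "00044;52.93; 8.24; GROSSENKNETEN;201907;31; 53.4;9; 28.6\r\n" +
            "00071;48.22; 8.98; ALBSTADT-BADKAP;201907;31; 49.0;8;-999.9\r\n" +
            "73;48.62;13.05;ALDERSBACH-KRIESTORF;201902;28;519.8;28;561.76\r\n";

        foreach (Match m in Parse(text))
        {
            // Groups 1, 4 and 9: station id, station name, last value.
            Console.WriteLine($"{m.Groups[1].Value} | {m.Groups[4].Value} | {m.Groups[9].Value}");
        }
        // prints:
        // 00044 | GROSSENKNETEN | 28.6
        // 00071 | ALBSTADT-BADKAP | -999.9
        // 73 | ALDERSBACH-KRIESTORF | 561.76
    }
}
```

Because the whitespace is consumed by \s* outside the capturing groups, the captured values come back without padding in both input layouts.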