Reputation: 643
I have a .txt file with this format:
content-length: 20
blahblahblah
-stop-
content-length: 10
bum
-step-
content-length: 0
<---empty space--->
-step-
content-length: 10
huba
-step-
I use a regex to split the file into sections, one per content-length header, where -step- or -stop- marks the end of each section. My regex is:
((content-length:)\s(\d)[\r\n]+([\s\S]+?)(-stop-|-step-))*
However, if the content length is zero, meaning there is only whitespace before -step- or -stop-, the match also captures the next content-length section. Any idea how to prevent this?
Upvotes: 0
Views: 122
Reputation: 2557
Try this
(?:(?:content-length):\s(?<length>\d+)\n+(?<content>.*?)\n*(?:-stop-|-step-))
Input:
content-length: 20
blahblahblah
-stop-
content-length: 10
bum
-step-
content-length: 0
-step-
content-length: 10
huba
-step-
Output:
MATCH 1
length [16-18] `20`
content [20-32] `blahblahblah`
MATCH 2
length [56-58] `10`
content [60-63] `bum`
MATCH 3
length [87-88] `0`
content [91-91] ``
MATCH 4
length [114-116] `10`
content [118-122] `huba`
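For anyone who wants to run this outside a regex tester, here is a minimal sketch in Python. Python is my assumption, since the question does not say which engine is in use. Python's `re` module spells named groups `(?P<name>...)` rather than `(?<name>...)`; the rest of the pattern is unchanged. The variable names `text`, `pattern` and `sections` are only illustrative.

```python
import re

# Sample input from this answer.
text = (
    "content-length: 20\nblahblahblah\n-stop-\n"
    "content-length: 10\nbum\n-step-\n"
    "content-length: 0\n-step-\n"
    "content-length: 10\nhuba\n-step-"
)

# Same pattern as above, with Python's (?P<name>...) spelling for named groups.
pattern = re.compile(
    r"content-length:\s(?P<length>\d+)\n+(?P<content>.*?)\n*(?:-stop-|-step-)"
)

# Collect (length, content) pairs; an empty body gives an empty string.
sections = [(m.group("length"), m.group("content")) for m in pattern.finditer(text)]

for length, content in sections:
    print(f"length={length!r} content={content!r}")
```

Note that `.` does not match a newline by default, so this only handles bodies that fit on one line. For multi-line bodies you would probably want `[\s\S]*?` or the DOTALL flag instead.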
Upvotes: 0
Reputation: 4504
I came up with the following regex; not sure if it is what you want:
// Requires: using System.Text.RegularExpressions;
var pattern = @"(content-length:\s\d+(?:[\s\S]*?)?-(?:stop|step)-)";
var input = @"content-length: 20
blahblahblah
-stop-
content-length: 10
bum
-step-
content-length: 0
-step-
content-length: 10
huba
-step-";
// The pattern is wrapped in a capturing group, so Split also returns each matched section.
var result = Regex.Split(input, pattern);
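To illustrate what splitting on a capturing pattern gives you, here is a rough sketch in Python. Python is an assumption on my part; its `re.split` keeps captured text in the result, much like .NET's `Regex.Split` does. I collapsed the redundant optional group to a plain `[\s\S]*?`, which should match the same text. The names `pieces`, `sections` and `direct` are only illustrative.

```python
import re

text = (
    "content-length: 20\nblahblahblah\n-stop-\n"
    "content-length: 10\nbum\n-step-\n"
    "content-length: 0\n-step-\n"
    "content-length: 10\nhuba\n-step-"
)

# Same idea as the pattern above, with (?:[\s\S]*?)? simplified to [\s\S]*?.
pattern = re.compile(r"(content-length:\s\d+[\s\S]*?-(?:stop|step)-)")

# Splitting on a capturing pattern keeps the captured sections, interleaved
# with whatever sits between matches (here only newlines and empty strings).
pieces = pattern.split(text)

# Drop the leftovers to keep just the sections.
sections = [p for p in pieces if p.strip()]

# Matching directly returns the sections without any leftovers to filter.
direct = pattern.findall(text)

for section in direct:
    print(repr(section))
```

If all you need is the sections themselves, matching directly is probably simpler than splitting and filtering.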
Upvotes: 1
Reputation: 1
Try this code:
((content-length:)\s(\d)[\r\n]*([\s\S]*?)(-stop-|-step-))
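The reason this helps, as far as I can tell: `[\s\S]+?` in the question's pattern needs at least one character, so with an empty body it swallows the `-` of the section's own terminator and keeps going until the next `-step-`. Switching to `*?` lets the body be empty. Below is a small Python sketch comparing the two (Python is my assumption, not something the question states). I also used `\d+` instead of `\d` so that multi-digit lengths are captured whole; that change is mine, and the variable names are illustrative.

```python
import re

text = (
    "content-length: 20\nblahblahblah\n-stop-\n"
    "content-length: 10\nbum\n-step-\n"
    "content-length: 0\n-step-\n"
    "content-length: 10\nhuba\n-step-"
)

# Pattern from the question (outer repetition dropped): +? needs one or more characters.
original = re.compile(r"content-length:\s(\d)[\r\n]+([\s\S]+?)(-stop-|-step-)")

# Relaxed variant: \d+ for multi-digit lengths, * quantifiers so the body may be empty.
relaxed = re.compile(r"content-length:\s(\d+)[\r\n]*([\s\S]*?)(-stop-|-step-)")

# With this sample the original matches only once, starting at the zero-length
# section, and its body runs into the following section.
original_bodies = [m.group(2) for m in original.finditer(text)]

# The relaxed pattern yields one (length, body) pair per section.
relaxed_sections = [(m.group(1), m.group(2).strip()) for m in relaxed.finditer(text)]

print(original_bodies)
print(relaxed_sections)
```

The `strip()` is there because the lazy body still includes the newline just before the terminator.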
Upvotes: 0
Reputation: 99
((content-length:)\s(\d+)[\r\n]+(.*)\n*(-stop-|-step-))
Check out the regex here: https://regex101.com/r/wU9uA4/1
Upvotes: 0
Reputation: 7361
(?:(?:content-length:))\s(\d+)[\r\n]+(.*)?[\r\n]+(?:-stop-|-step-)
Upvotes: 0