Reputation: 2918
Given a string like Foo: Some Text Bar: Some Other Text FooBar: Even More Text
I want to split it into:
Foo: Some Text
Bar: Some Other Text
FooBar: Even More Text
I can't figure out the regex for this at all. I can split on the labels themselves with (Foo:)|(Bar:)|(FooBar:)
but I can't figure out how to capture everything from the start of one label up to the start of the next one (or to the end of the text for the last one).
Upvotes: 2
Views: 81
Reputation: 626784
You can use Regex.Split to split the string with
(?<!^)\s+(?=\b(?:Bar|Foo(?:Bar)?):)
See the regex demo. Details:
(?<!^) - not at the start of the string
\s+ - 1 or more whitespaces
(?=\b(?:Bar|Foo(?:Bar)?):) - immediately to the right, there must be
  \b - a word boundary
  (?:Bar|Foo(?:Bar)?) - Bar, Foo or FooBar
  : - a colon.
C# demo:
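using System;
using System.Text.RegularExpressions;

var s = "Foo: Some Text Bar: Some Other Text FooBar: Even More Text";
// Split on whitespace that is immediately followed by one of the known labels and a colon
var res = Regex.Split(s, @"(?<!^)\s+(?=\b(?:Bar|Foo(?:Bar)?):)");
Console.WriteLine(string.Join("\n", res));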
Output:
Foo: Some Text
Bar: Some Other Text
FooBar: Even More Text
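If the set of labels is not fixed at compile time, the same split pattern can be assembled from a list. The following is only a sketch: SplitOnLabels is a hypothetical helper name (not part of .NET or of the answer above), and it assumes every label starts with a word character so that \b still applies.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: builds the lookahead-based split pattern from arbitrary labels.
static string[] SplitOnLabels(string input, params string[] labels)
{
    // Escape each label so any regex metacharacters in it are matched literally.
    var alternation = string.Join("|", labels.Select(l => Regex.Escape(l)));
    // Same shape as the pattern above: whitespace, not at the string start,
    // followed by one of the labels and a colon.
    var pattern = @"(?<!^)\s+(?=\b(?:" + alternation + "):)";
    return Regex.Split(input, pattern);
}

var parts = SplitOnLabels(
    "Foo: Some Text Bar: Some Other Text FooBar: Even More Text",
    "Foo", "Bar", "FooBar");
Console.WriteLine(string.Join("\n", parts));
```

The order of the labels should not matter here, because the colon after the alternation forces the regex engine to backtrack from Foo to FooBar when needed.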
Another idea: match any word followed by a colon, and then everything up to the next word that is followed by a colon (or up to the end of the string):
// Needs using System.Linq; and using System.Text.RegularExpressions; (s is the input string)
var matches = Regex.Matches(s, @"\w+(?:-\w+)*:.*?(?=\s*(?:\w+(?:-\w+)*:|$))", RegexOptions.Singleline)
    .Cast<Match>()
    .Select(x => x.Value)
    .ToList();
See this regex demo.
Details
\w+(?:-\w+)*: - 1 or more word chars (letters/digits/underscores), then 0 or more repetitions of - and 1+ word chars, and then a colon
.*? - any 0 or more chars (including line breaks, due to RegexOptions.Singleline), as few as possible
(?=\s*(?:\w+(?:-\w+)*:|$)) - up to the first occurrence of
  \s* - 0 or more whitespaces
  (?:\w+(?:-\w+)*: - either 1 or more word chars (letters/digits/underscores), then 0 or more repetitions of - and 1+ word chars, and then a colon
  | - or
  $ - end of string
  ) - end of the group
See the C# demo.
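If the label and the text are needed separately, the matching approach can be extended with named groups. This is a sketch under assumptions: the group names key and value and the variable fields are made up for illustration, and it assumes each label occurs only once (ToDictionary throws on duplicate keys).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var s = "Foo: Some Text Bar: Some Other Text FooBar: Even More Text";

// Same idea as the pattern above, with the label and its text captured separately.
var pattern = @"(?<key>\w+(?:-\w+)*):\s*(?<value>.*?)(?=\s*(?:\w+(?:-\w+)*:|$))";

Dictionary<string, string> fields = Regex.Matches(s, pattern, RegexOptions.Singleline)
    .Cast<Match>()
    .ToDictionary(m => m.Groups["key"].Value, m => m.Groups["value"].Value);

foreach (var pair in fields)
    Console.WriteLine(pair.Key + " => " + pair.Value);
```

Note that this variant treats any word followed by a colon as a new label, so a colon inside the text itself (for example a time like 10:30) would start a new entry.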
Upvotes: 1