Reputation: 6233
I have been trying for hours now and have also read the Regex wiki here on Stack Overflow, but I can't seem to get this regex right. This is the string I have:
Lorem: 8 FB / Ipsum-/Dolor: Some Text / Dolor: Some text with (brackets) / Sit amet.: Some Text with/slash / foobar / Last one: 36 foos
What I would like to extract is: Lorem, Ipsum-/Dolor, Dolor, Sit amet., Last one. So basically everything from the beginning of the sentence or after a slash until the colon.
Whatever I try, the problem is always the foobar, since it always sticks together with Last one. What I tried for example so far is: ( \/ |\A)([^(?!.* \/ )].*?): which I hoped would extract everything starting from a slash going until a colon but not if there is  /  (empty space, slash, empty space). That way I wanted to make sure not to get foobar / Last one returned.
Could someone provide me with a hint?
Upvotes: 1
Views: 1484
Reputation: 626690
Note that you make a common mistake placing a sequence of patterns into a character class ([...]), thus making the regex engine match a single character from the defined set. [^(?!.* \/ )] matches a single character other than (, ?, !, ., etc.
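To make the character-class point concrete, here is a minimal sketch in Python (the question does not name a regex flavor, so using Python's re module is an assumption). It checks the bracket expression from the question on its own; the variable names are only illustrative.

```python
import re

# The bracket expression copied from the question's attempt.
char_class = re.compile(r"[^(?!.* \/ )]")

# It consumes exactly one character that is not in the listed set;
# the "(?!" inside the brackets is three literal characters, not a lookahead.
matches_letter = char_class.fullmatch("a") is not None      # 'a' is not in the set
matches_paren = char_class.fullmatch("(") is not None       # '(' is in the set
matches_two_chars = char_class.fullmatch("ab") is not None  # two characters, not one

print(matches_letter, matches_paren, matches_two_chars)  # True False False
```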
You may use a tempered greedy token:
(?: \/ |\A)((?:(?! \/ )[^:])+):
            ^^^^^^^^^^^^^^^^
See the regex demo. The literal spaces may be replaced with \s (if any whitespace may be matched) or \h (to match only horizontal whitespace).
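As a quick check of the pattern above, here is a hedged sketch in Python (again an assumption, since the question does not state the flavor). The forward slash needs no escaping in Python, so `\/` is written as `/`. As far as I know Python's re module does not support `\h`, so only the `\s` variant is shown. The names `text`, `LABEL_RE` and `LABEL_WS_RE` are illustrative.

```python
import re

text = ("Lorem: 8 FB / Ipsum-/Dolor: Some Text / Dolor: Some text with (brackets) "
        "/ Sit amet.: Some Text with/slash / foobar / Last one: 36 foos")

# Start at " / " or at the start of the string, then capture characters that are
# not ':' and do not begin a " / " sequence, followed by a literal colon.
LABEL_RE = re.compile(r"(?: / |\A)((?:(?! / )[^:])+):")

# Same idea with \s in place of the literal spaces.
LABEL_WS_RE = re.compile(r"(?:\s/\s|\A)((?:(?!\s/\s)[^:])+):")

# With a single capturing group, findall returns the list of Group 1 values.
labels = LABEL_RE.findall(text)
labels_ws = LABEL_WS_RE.findall(text)

print(labels)  # ['Lorem', 'Ipsum-/Dolor', 'Dolor', 'Sit amet.', 'Last one']
```

The attempt that starts at " / foobar" fails because the capture has to stop before the next " / " and no colon follows there, so foobar is never glued to Last one.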
Details:
(?: \/ |\A) - either space + / + space or start of string
((?:(?! \/ )[^:])+) - Group 1 capturing one or more symbols other than : ([^:]), each of which is not the starting point of a space + / + space sequence
: - a literal colon.
Upvotes: 6