Reputation: 215
I have two variations of a text string:
10.09.2016 | 45 Min. | SWR Fernsehen | UT
or
07.09.2016 | 57 Min. WDR Fernsehen
I am looking to end up with:
SWR Fernsehen | UT
and
WDR Fernsehen
This is what I have tried to get the capturing group:
\.\s(.*)
This returns:
| SWR Fernsehen | UT
WDR Fernsehen
I can't work out how to say "take everything after the last "." but ignore the leading "|"".
Any ideas?
Upvotes: 1
Views: 54
Reputation: 626861
You may use the following regular expression:
.*\.(?:\s*\|)?\s*(.*)
See the regex demo
The .*\. will match up to and including the last . (because * is a greedy quantifier), (?:\s*\|)? will match one or zero sequences of 0+ whitespaces followed by |, \s* will match zero or more whitespaces, and (.*) will grab the rest into Group 1. Just access this group's contents with your tool/language features.
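As a minimal sketch of reading Group 1, here is the pattern applied in JavaScript to the two sample strings from the question. The variable names (samples, afterLastDot, results) are made up for illustration, and JavaScript is only one possible host language.

```javascript
// Sample inputs copied from the question.
const samples = [
  "10.09.2016 | 45 Min. | SWR Fernsehen | UT",
  "07.09.2016 | 57 Min. WDR Fernsehen",
];

// Greedy .* runs to the last dot; the optional group skips a leading "|".
const afterLastDot = /.*\.(?:\s*\|)?\s*(.*)/;

// Group 1 holds the wanted text; use null when the string has no dot at all.
const results = samples.map((s) => {
  const m = s.match(afterLastDot);
  return m ? m[1] : null;
});

console.log(results);
```

Because only one "|" is made optional right after the dot, any later "|" (as in "SWR Fernsehen | UT") stays inside the captured text.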
Upvotes: 1
Reputation: 382150
You can use \.[|\s]*([^.]+)$ to get everything after the last dot, omitting any spaces and | at the start.
For example, in JavaScript:
str.match(/\.[|\s]*([^.]+)$/)[1]
gives you the desired part. If you're unsure whether it matches, first check that str.match(/\.[|\s]*([^.]+)$/) isn't null.
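The null check can be folded into a small wrapper. This is only a sketch: extractChannel is a hypothetical helper name, and the third input is an invented string used to show the no-match case.

```javascript
// extractChannel is a hypothetical helper, not part of any library.
// It returns the text after the last dot, or null when the pattern does not match.
function extractChannel(str) {
  const m = str.match(/\.[|\s]*([^.]+)$/);
  return m === null ? null : m[1];
}

// The first two inputs are the samples from the question.
console.log(extractChannel("10.09.2016 | 45 Min. | SWR Fernsehen | UT"));
console.log(extractChannel("07.09.2016 | 57 Min. WDR Fernsehen"));

// Invented input without any dot: match() returns null, so the helper does too.
console.log(extractChannel("no dot here"));
```

Indexing [1] directly on a null result would throw a TypeError, which is why the check comes first.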
Upvotes: 1
Reputation: 43169
What about:
\b((?:SWR|WDR).+)
This returns both SWR Fernsehen | UT and WDR Fernsehen, see a demo on regex101.com.
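A short JavaScript sketch of this approach follows. Note that it relies on the station prefixes being listed in the pattern, so a string with a different station does not match. The names stationPattern and fromStation, and the NDR input, are assumptions added for illustration only.

```javascript
// The pattern from the answer: capture from a known station prefix to the end.
const stationPattern = /\b((?:SWR|WDR).+)/;

// fromStation is a hypothetical helper; it returns null when no listed prefix occurs.
function fromStation(str) {
  const m = str.match(stationPattern);
  return m ? m[1] : null;
}

// Samples from the question.
console.log(fromStation("10.09.2016 | 45 Min. | SWR Fernsehen | UT"));
console.log(fromStation("07.09.2016 | 57 Min. WDR Fernsehen"));

// Invented input with a station that is not in the alternation: no match.
console.log(fromStation("11.09.2016 | 30 Min. | NDR Fernsehen"));
```

If the set of stations is open-ended, the dot-based patterns in the other answers avoid having to maintain that list.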
Upvotes: 0