Alex Bamford

Reputation: 215

Regex: everything after the "." but ignore "|"

I have two variations of a text string:

10.09.2016 | 45 Min. | SWR Fernsehen | UT

or

07.09.2016 | 57 Min. WDR Fernsehen

I am looking to end up with:

SWR Fernsehen | UT 

and

WDR Fernsehen

This is what I have tried to get the capturing group:

\.\s(.*)

This returns:

| SWR Fernsehen | UT
WDR Fernsehen

I can't work out how to say "take everything after the "." but ignore the "|"".

Any ideas?

Upvotes: 1

Views: 54

Answers (4)

Justin Steele

Reputation: 2250

\.(\s\|\s|\s)(.*)

Or

\.(\s\|\s(.*)|\s(.*))
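
Alternation order matters for patterns of this shape. A backtracking regex engine tries alternatives left to right and keeps the first one that lets the overall match succeed, so a bare \s placed before \s\|\s wins and leaves the | in the capture. A minimal JavaScript sketch of the difference, using the sample strings from the question (the variable names are only illustrative):

```javascript
const text = "10.09.2016 | 45 Min. | SWR Fernsehen | UT";
const plain = "07.09.2016 | 57 Min. WDR Fernsehen";

// Shorter alternative first: "\s" succeeds immediately,
// so the "| " stays inside the second capturing group.
const shortFirst = /\.(\s|\s\|\s)(.*)/.exec(text)[2];

// Longer alternative first: " | " is consumed before the capture starts.
const longFirst = /\.(\s\|\s|\s)(.*)/.exec(text)[2];

// The string without a "|" after the dot falls back to the bare "\s".
const longFirstPlain = /\.(\s\|\s|\s)(.*)/.exec(plain)[2];

console.log(shortFirst);     // "| SWR Fernsehen | UT"
console.log(longFirst);      // "SWR Fernsehen | UT"
console.log(longFirstPlain); // "WDR Fernsehen"
```

In both orderings the wanted text ends up in group 2, because group 1 holds the separator.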

Upvotes: 0

Wiktor Stribiżew

Reputation: 626861

You may use the following regular expression:

.*\.(?:\s*\|)?\s*(.*)

See the regex demo

Here, .*\. matches up to and including the last . on the line (because * is a greedy quantifier), (?:\s*\|)? matches an optional sequence of zero or more whitespace characters followed by |, \s* matches zero or more whitespace characters, and (.*) captures the rest of the line into Group 1. Read the contents of that group with whatever your tool or language provides.
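
As a rough illustration of reading Group 1, here is a small JavaScript sketch that runs this pattern against the two sample strings from the question. The helper name extractChannel is hypothetical and chosen only for this example.

```javascript
// Pattern from the answer above; group 1 holds the text after the last ".".
const pattern = /.*\.(?:\s*\|)?\s*(.*)/;

const samples = [
  "10.09.2016 | 45 Min. | SWR Fernsehen | UT",
  "07.09.2016 | 57 Min. WDR Fernsehen",
];

// extractChannel is a hypothetical helper, not part of any library.
// It returns group 1, or null when the input contains no "." at all.
function extractChannel(text) {
  const match = pattern.exec(text);
  return match ? match[1] : null;
}

const results = samples.map(extractChannel);
console.log(results); // [ 'SWR Fernsehen | UT', 'WDR Fernsehen' ]
```

Returning null for non-matching input keeps the caller from indexing into a failed match.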

Upvotes: 1

Denys Séguret

Reputation: 382150

You can use \.[|\s]*([^.]+)$ to get everything after the last dot, omitting the possible spaces and | at the start.

For example, in JavaScript:

str.match(/\.[|\s]*([^.]+)$/)[1]

gives you the desired part. If you're unsure whether it matches, start by checking str.match(/\.[|\s]*([^.]+)$/) isn't null.
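
The null check can be folded into a small wrapper. This is only a sketch; afterLastDot is a hypothetical name, and returning null on no match is one possible convention.

```javascript
// afterLastDot is a hypothetical helper name chosen for this sketch.
// It returns the captured text, or null when the pattern does not match.
function afterLastDot(str) {
  const match = str.match(/\.[|\s]*([^.]+)$/);
  return match === null ? null : match[1];
}

console.log(afterLastDot("10.09.2016 | 45 Min. | SWR Fernsehen | UT")); // "SWR Fernsehen | UT"
console.log(afterLastDot("07.09.2016 | 57 Min. WDR Fernsehen"));        // "WDR Fernsehen"
console.log(afterLastDot("text ending with a dot."));                   // null
```

The last call returns null because [^.]+ needs at least one character after the final dot.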

Upvotes: 1

Jan

Reputation: 43169

What about:

\b((?:SWR|WDR).+)

This returns both SWR Fernsehen | UT and WDR Fernsehen. See a demo on regex101.com.

Upvotes: 0
