Python Regex Get Text Either side of Specific Characters

Question

I have blocks of text that contain strings like the one below. I need to get the text on either side of "rt", including "rt" itself, but excluding any text or numbers on other lines.

Example:

1.99

  Jim Smith rt Tom Ross

Random

So, here the desired result would be "Jim Smith rt Tom Ross".

I am new to regex and cannot get close. I think I need to lookahead and lookbehind then bound the result in some way but I'm struggling.

Any help would be appreciated.

RavinderSingh13 · Accepted Answer

With your shown samples, please try the following regex. An online demo of this regex was linked in the original answer.

^\d+(?:\.\d+)?\n+\s+(.*?rt[^\n]+)\n+\s*\S+$

Python 3 code: written and tested in Python 3. It uses the re module's findall function with the re.M flag enabled, so that ^ and $ match at the start and end of each line of the multi-line value.

import re
var = """1.99

  Jim Smith rt Tom Ross

Random"""

re.findall(r'^\d+(?:\.\d+)?\n+\s+(.*?rt[^\n]+)\n+\s*\S+$', var, re.M)
['Jim Smith rt Tom Ross']
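
Because findall returns every non-overlapping match, the same pattern should also cope with several blocks in one string. This is only a minimal sketch, assuming every block follows the same layout as the sample (a number line, the "rt" line, then a trailing word). The second block and the variable names are made up for illustration.

```python
import re

# Hypothetical input: two blocks in the same layout as the sample above.
blocks = """1.99

  Jim Smith rt Tom Ross

Random

2.50

  Ann Lee rt Bob Gray

Other"""

pattern = re.compile(r'^\d+(?:\.\d+)?\n+\s+(.*?rt[^\n]+)\n+\s*\S+$', re.M)

# With one capturing group, findall returns that group's text for each match.
names = pattern.findall(blocks)
print(names)
```

If a block deviates from this layout (for example, no number line before the names), this pattern will likely skip it.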

Explanation of regex:

^\d+          ##From the start of a line, match 1 or more digits.
(?:\.\d+)?    ##In an optional non-capturing group, match a literal dot followed by 1 or more digits.
\n+\s+        ##Followed by 1 or more newlines, then 1 or more whitespace characters.
(.*?rt[^\n]+) ##In a CAPTURING GROUP, lazily match up to the string rt, then everything up to (but not including) the next newline.
\n+\s*\S+$    ##Followed by newline(s), then 0 or more whitespace characters and 1 or more non-whitespace characters at the end of a line.
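
The breakdown above can also be kept next to the pattern itself by compiling it with re.VERBOSE (re.X), which ignores whitespace in the pattern and allows # comments. This is a sketch of the same regex in that form, not a different technique. The name `pattern` is just for illustration.

```python
import re

var = """1.99

  Jim Smith rt Tom Ross

Random"""

# Same regex as above, spread over several lines with inline comments.
pattern = re.compile(r"""
    ^\d+(?:\.\d+)?   # a number such as 1.99 at the start of a line
    \n+\s+           # one or more newlines, then leading whitespace
    (.*?rt[^\n]+)    # capture the rest of the line that contains rt
    \n+\s*\S+$       # newline(s), optional whitespace, then a final word
""", re.M | re.X)

print(pattern.findall(var))  # ['Jim Smith rt Tom Ross']
```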
