Reputation: 153
I am trying to come up with a regex that matches a keyword at the end of a line together with the first word at the beginning of the next line (if present).
I have tried the regex below, but it does not return the desired result:
re.compile(fr"\s(?!^)(keyword1|keyword2|keyword3)\s*\$\n\r\((\w+\W+|W+\w+))", re.MULTILINE | re.IGNORECASE)
My input, for example, is:
sentence = """ This is my keyword
/n value"""
The output in the above case should be keyword value
Thanks in advance
Upvotes: 1
Views: 1763
Reputation: 163352
You could match the keyword (or use an alternation to match more keywords) and take trailing tabs and spaces into account, both after the keyword and after the newline.
Using 2 capturing groups as in the pattern you tried:
(?<!\S)(keyword)[\t ]*\r?\n[\t ]*(\w+)(?!\S)
Explanation
(?<!\S) Negative lookbehind, assert that what is directly on the left is not a non-whitespace char
(keyword) Capture in group 1, matching the keyword
[\t ]* Match 0+ tabs or spaces
\r?\n Match a newline
[\t ]* Match 0+ tabs or spaces
(\w+) Capture in group 2, matching 1+ word chars
(?!\S) Negative lookahead, assert that what is directly on the right is not a non-whitespace char
For example:
import re
regex = r"(?<!\S)(keyword)[\t ]*\r?\n[\t ]*(\w+)(?!\S)"
test_str = (" This is my keyword\n"
            " value")
matches = re.search(regex, test_str)
if matches:
    print('{} {}'.format(matches.group(1), matches.group(2)))
Output
keyword value
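To cover the alternation mentioned above, here is a minimal sketch that builds the same pattern for several keywords. The keyword list, the `find_pairs` helper, and the sample string are assumptions made for illustration; `re.IGNORECASE` is carried over from the flags in the question.

```python
import re

# Hypothetical keyword list, using the names from the question.
KEYWORDS = ["keyword1", "keyword2", "keyword3"]

# re.escape guards against keywords that contain regex metacharacters.
alternation = "|".join(re.escape(k) for k in KEYWORDS)
pattern = re.compile(
    rf"(?<!\S)({alternation})[\t ]*\r?\n[\t ]*(\w+)(?!\S)",
    re.IGNORECASE,
)

def find_pairs(text):
    """Return (keyword, next_word) tuples for every keyword that ends a line."""
    return [(m.group(1), m.group(2)) for m in pattern.finditer(text)]

# The first keyword ends its line (with trailing spaces and a CRLF),
# the second one sits in the middle of a line and is skipped.
sample = "This is my Keyword2  \r\n\tvalue and more\nno match for keyword1 here\n"
print(find_pairs(sample))  # [('Keyword2', 'value')]
```

Using `finditer` instead of `search` returns every keyword/value pair in the text rather than only the first one.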
Upvotes: 1
Reputation: 27723
My guess is that, depending on the number of newlines that you might have, an expression similar to:
\b(keyword1|keyword2|keyword3)\b[\r\n]{1,2}(\S+)
might be somewhat close, and the value is in \2. You can also make the first group non-capturing:
\b(?:keyword1|keyword2|keyword3)\b[\r\n]{1,2}(\S+)
and then \1 is the value.
If you wish to explore, simplify, or modify the expression, it is explained in the top-right panel of regex101.com, where you can also see how it matches against some sample inputs.
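As a rough illustration of the non-capturing variant in Python, the sketch below assumes the character class is meant to be `[\r\n]` (carriage return or line feed). The `value_after_keyword` helper and the sample strings are made up for this example.

```python
import re

# Assumption: the class is [\r\n], i.e. one or two CR/LF characters.
pattern = re.compile(r"\b(?:keyword1|keyword2|keyword3)\b[\r\n]{1,2}(\S+)")

def value_after_keyword(text):
    """Return the run of non-whitespace chars after a line-ending keyword, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None

print(value_after_keyword("This is my keyword2\r\nvalue"))  # value
print(value_after_keyword("This is my keyword2\n value"))   # None
```

The second call returns `None` because this expression does not allow spaces or tabs between the newline and the value, so indented continuation lines would need something like `[\t ]*` added before the group.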
Upvotes: 0
Reputation: 5059
How about \b(keyword)\n(\w+)\b?
\b(keyword)\n(\w+)\b
\b match a word boundary
(keyword) capture keyword (replace with whatever you want)
\n match a newline
(\w+) capture one or more word characters
\b match a word boundary
Because keyword and \w+ are in capture groups, you can reference them as you wish later in your code.
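One possible way to reference the groups from Python is sketched below, once through the match object and once through backreferences in a replacement string. The sample text is an assumption, and it has no spaces around the newline because this pattern does not allow any.

```python
import re

pattern = re.compile(r"\b(keyword)\n(\w+)\b")
text = "This is my keyword\nvalue"  # hypothetical input without padding around the newline

# Read the groups from the match object.
match = pattern.search(text)
joined = f"{match.group(1)} {match.group(2)}" if match else None
print(joined)  # keyword value

# Or reuse them as \1 and \2 in a replacement string.
rejoined = pattern.sub(r"\1 \2", text)
print(rejoined)  # This is my keyword value
```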
Upvotes: 0