Johannes Walter

Reputation: 1179

Python Regex: Multiple "start" terms, but it should only match from the last "start" term before the "end" term

From the following string:

s = "ABCD {DB_any_alphanumeric_character\} ABCD {DB_any_alphanumeric_character}.TABLE ABCD"

I would like to match only {DB_any_alphanumeric_character}.TABLE. So the starting term is {DB_ and the end term is .TABLE. My difficulties arise since there are two {DB_ in the string.

How can I make it match only from the second {DB_ until .TABLE?

I feel like this can't be a very complicated regex, but despite reading through dozens of regex-related Stack Overflow questions and online tutorials (e.g. https://regex101.com, https://regexone.com/lesson/line_beginning_end, ...), I have not found a solution.

Here are just two of my unsuccessful attempts:

exp = re.search(r"^{DB\w*TABLE$", s)

It returns None. The way I see it, it should return a string that starts with {DB, followed by zero or more repetitions of any alphanumeric character, and ends with TABLE.

Another attempt:

test = re.search(r"{DB(.+?).TABLE", s)

This returns {DB_ABCD\\} ABCD {DB_ABCD}.TABLE, which is exactly what I don't want.

Upvotes: 2

Views: 161

Answers (1)

Leo Arad

Reputation: 4472

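You can use the regex "\{DB_\w*\}\.TABLE", which matches only a {DB_...} name that is immediately followed by .TABLE. Since \w matches only letters, digits and underscores, the match cannot run past the "\}" and the space that follow the first {DB_.

import re

# Raw string, so the backslash before "}" is kept literally.
s = r"ABCD {DB_any_alphanumeric_character\} ABCD {DB_any_alphanumeric_character}.TABLE ABCD"
# The braces and the dot are escaped so that they are matched literally.
exp = re.search(r"\{DB_\w*\}\.TABLE", s)
print(exp.group(0))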

Output

{DB_any_alphanumeric_character}.TABLE
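
To see why the two attempts from the question behave the way they do, here is a small sketch that runs them next to the \w*-based pattern above, written here with the braces and the dot escaped. The last pattern is not part of the question: it is an assumption on my side, for the case where the name may contain characters other than letters, digits and underscores, and it uses a negative lookahead so that the match can never run across another {DB_.

```python
import re

# Raw string, so the backslash before "}" is kept literally.
s = r"ABCD {DB_any_alphanumeric_character\} ABCD {DB_any_alphanumeric_character}.TABLE ABCD"

# Attempt 1: "^" and "$" anchor the pattern to the whole string,
# but the string starts with "ABCD", so nothing matches.
anchored = re.search(r"^{DB\w*TABLE$", s)
print(anchored)

# Attempt 2: the search starts at the leftmost "{DB"; the lazy ".+?"
# only makes the match end as early as possible, it does not move the start.
lazy = re.search(r"{DB(.+?).TABLE", s)
print(lazy.group(0))

# \w* cannot cross "\", "}" or a space, so the first "{DB_" fails
# and only the second one can be followed by "}.TABLE".
strict = re.search(r"\{DB_\w*\}\.TABLE", s)
print(strict.group(0))

# Assumed generalisation: allow any character in the name, but refuse to
# step over another "{DB_" (negative lookahead before every character).
tempered = re.search(r"\{DB_(?:(?!\{DB_).)*?\.TABLE", s)
print(tempered.group(0))
```

With the example string, the last two patterns should find the same substring. If the names really are limited to letters, digits and underscores, the simpler \w* version is probably the better choice.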

Upvotes: 1
