Jithin
Jithin

Reputation: 23

Regex: Extract string with words surrounded by curly braces, if it contains a text

Input

I've a string like this:

string_ = "abc {def {ghi} { {jkl} {mno}} } \n abc { lmn {ghi} { {jkl} {mno}} }"

I need to extract the values in curly braces, if it contains the string 'def', using python regex.

Output

Expected output:

 "{def {ghi} { {jkl} {mno} } }"   

Could someone help me on this?

Upvotes: 2

Views: 583

Answers (2)

Dimensity
Dimensity

Reputation: 46

import re
string_ = "abc {def {ghi} { {jkl} {mno}} }  abc { lmn {ghi} { {jkl} {mno}} }"
result = re.search(r'{\s*def(\s*{.*?}\s*)*}(\s*})*',string_)
print(result.group())

This simple yet a bit lengthy regex works with or without '\n' delimiter.

The way it works is:

{\s*def : this searches a curly brace to start with and then searches for def with 0+ whitespace in between.

(\s*{.*?}\s*)*} : this searches for multiple instance of {...} pair, however two problems arise here first, we need to disable greedy with ? else it will go till the end of the string. Second, it can't keep a track of proper bracket pairs, (that's a job of recursion), so for it { {abc} is a valid bracket pair. How to fix this ?

(\s*})* : This fixes that. It captures all the rest left closing braces. (here I have done an assumption but until it doesn't do a problem in your case I won't mention it)

Look I assume a pattern in your type of string you gave. And if this doesn't work for any string let me know.

Upvotes: 1

basckerwil
basckerwil

Reputation: 388

import re

pattern = re.compile(r'({def(?:.*)[{].*[}])')
string_ = "abc {def {ghi} { {jkl} {mno}} } \n abc { lmn {ghi} { {jkl} {mno}} }"

result = pattern.search(string_)
print(result.group(1))

Upvotes: 2

Related Questions