Find text in innermost curly brackets starting with word with given substring

Question

Consider the following text:

{\Largefont\it Hello world!} Some text. {   \Hugefont \sl Thanks.}

I am trying to write a regular expression which will:

identify innermost curly brackets in the full text, and
check if the first word in the identified block of text starts with '\' and has a substring 'font' in it.

The regex

re.compile(r'\{\s*[^{}]+\}')

does the first part of the job. How do I accomplish the second part? In particular, I do not want \Largefont\it to be treated as a single word but rather as two separate words \Largefont and \it. The expected output is:

{\Largefont\it Hello world!}
{   \Hugefont \sl Thanks.}

Thank you.

Pushpesh Kumar Rajwanshi · Accepted Answer

You need a positive lookahead to check that the content inside the brackets follows the required pattern. Here is a regex you can use:

(?<=\{)(?=\s*\\[^{}\\]*font)[^{}]+(?=\})

Explanation:

(?<=\{) - Positive lookbehind to ensure the text is preceded by a { character
(?=\s*\\[^{}\\]*font) - Positive lookahead to ensure the content inside the curly brackets starts with optional whitespace, then \, then any characters other than {, } or \ (so the check stops before the next command such as \it), followed by font
[^{}]+ - Matches the intended text; excluding { and } keeps the match inside an innermost pair of brackets
(?=\}) - Positive lookahead to ensure the matched content is followed by a closing curly bracket
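
As a rough sketch of how the pattern above behaves on the sample text from the question, the snippet below runs it with Python's re module. The second pattern, with_braces, is not part of the answer: it is an assumed variant that consumes the braces (so the output matches the expected output in the question) and adds \s to the excluded characters so the first word also ends at whitespace.

```python
import re

text = r"{\Largefont\it Hello world!} Some text. {   \Hugefont \sl Thanks.}"

# Pattern from the answer: the lookarounds keep the braces out of the match.
inner = re.compile(r"(?<=\{)(?=\s*\\[^{}\\]*font)[^{}]+(?=\})")

# Assumed variant (not from the answer): match the braces themselves and
# stop the first word at whitespace as well as at the next backslash.
with_braces = re.compile(r"\{(?=\s*\\[^{}\\\s]*font)[^{}]+\}")

contents = inner.findall(text)
blocks = with_braces.findall(text)

print(contents)  # ['\\Largefont\\it Hello world!', '   \\Hugefont \\sl Thanks.']
print(blocks)    # ['{\\Largefont\\it Hello world!}', '{   \\Hugefont \\sl Thanks.}']
```

One difference worth knowing: because the answer's lookahead lets spaces through, a block such as {\it no font here} is also matched by inner, while with_braces rejects it since the first word \it does not contain font.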
