Regex that matches chapter name

Question

I'm modifying a text e-book that has sequences like this:

                           Chapter I
             PREHISTORIC MAN COMES TO NORTH AMERICA

with a newline both before and after the mentioned sequence.

I am trying to find a regex that matches the chapter name (in order to remove it), so that the result looks like this:

                           Chapter I
                           [nothing]

I've come up with:

\n( *)(Chapter(.*?))\n(.*?)\n

but it seems that it's not recognizing the sequence. What am I missing?

Wiktor Stribiżew · Accepted Answer

You can use

re.sub(r'(\n *Chapter.*\n *).*\S', r'\1[nothing]', text)

See the regex demo.

Details:

(\n *Chapter.*\n *) - Group 1 (\1 refers to this text from the replacement pattern): a newline, zero or more spaces, Chapter, then zero or more chars other than line break chars, as many as possible, a newline and then zero or more spaces
.* - zero or more chars other than line break chars, as many as possible
\S - a non-whitespace char.
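
To see the pattern in action, here is a minimal, self-contained sketch. The sample text is an assumption: it reuses the excerpt from the question, and the sentences before and after it are made up for illustration. The variable names (sample, pattern, replaced, emptied) are likewise hypothetical.

```python
import re

# Hypothetical sample: the excerpt from the question, wrapped in
# made-up surrounding paragraphs so the blank lines are present.
sample = (
    "End of the previous section.\n"
    "\n"
    "                           Chapter I\n"
    "             PREHISTORIC MAN COMES TO NORTH AMERICA\n"
    "\n"
    "First paragraph of the chapter.\n"
)

# Group 1 keeps the newline, the "Chapter ..." line and the indentation
# of the following line; `.*\S` then consumes the chapter name itself.
pattern = r'(\n *Chapter.*\n *).*\S'

# Replace the chapter name with the literal placeholder from the answer.
replaced = re.sub(pattern, r'\1[nothing]', sample)

# Variant (an assumption about the asker's intent): drop the name
# entirely and keep only what Group 1 captured.
emptied = re.sub(pattern, r'\1', sample)

print(replaced)
print(emptied)
```

Note that the placeholder inherits the indentation of the chapter-name line, because those leading spaces are part of Group 1. If the file uses Windows line endings (\r\n), the pattern may need adjusting; that case is not covered here.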
