Regex that matches chapter name

I'm modifying a text e-book that has sequences like this:

                           Chapter I
             PREHISTORIC MAN COMES TO NORTH AMERICA

with a newline both before and after the mentioned sequence.

I am trying to find a regex that matches the chapter name so that I can remove it, leaving:

                           Chapter I
                           [nothing]

I've come up with:

\\n( *)(Chapter(.*?))\\n(.*?)\\n

but it seems that it's not recognizing the sequence. What am I missing?

Answers (1)

Wiktor Stribiżew

You can use

re.sub(r'(\n *Chapter.*\n *).*\S', r'\1[nothing]', text)

See the regex demo.

Details:

  • (\n *Chapter.*\n *) - Group 1 (\1 refers to this text from the replacement pattern): a newline, zero or more spaces, Chapter, then zero or more chars other than line break chars, as many as possible, a newline and then zero or more spaces
  • .* - zero or more chars other than line break chars, as many as possible
  • \S - a non-whitespace char.
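
As a quick check, here is a minimal sketch that runs the same pattern on the sample from the question. The function name replace_chapter_titles, the placeholder parameter, and the sample body line are made up for illustration. It assumes every chapter heading is preceded by a newline, so a heading on the very first line of the file would not match.

```python
import re

# Same pattern as in the answer, compiled once for reuse.
CHAPTER_TITLE = re.compile(r'(\n *Chapter.*\n *).*\S')

def replace_chapter_titles(text, placeholder="[nothing]"):
    """Swap the line after each 'Chapter ...' line for a placeholder.

    Group 1 (the 'Chapter ...' line plus the title's leading spaces)
    is kept; only the title text itself is replaced.
    """
    return CHAPTER_TITLE.sub(lambda m: m.group(1) + placeholder, text)

# Hypothetical sample built from the excerpt in the question.
sample = (
    "\n"
    "                           Chapter I\n"
    "             PREHISTORIC MAN COMES TO NORTH AMERICA\n"
    "\n"
    "Some body text.\n"
)

result = replace_chapter_titles(sample)
print(result)
```

Passing placeholder="" drops the title text but keeps the title line's indentation and line break, so an empty-looking line remains where the title was.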

