Reputation: 420
I'm modifying a text e-book that has sequences like this:
Chapter I
PREHISTORIC MAN COMES TO NORTH AMERICA
with a newline both before and after the mentioned sequence.
I am trying to find a regex that matches the chapter name (in order to remove it), so that the result is:
Chapter I
[nothing]
I've come up with:
\\n( *)(Chapter(.*?))\\n(.*?)\\n
but it seems that it's not recognizing the sequence. What am I missing?
Upvotes: 1
Views: 575
Reputation: 626738
You can use
re.sub(r'(\n *Chapter.*\n *).*\S', r'\1[nothing]', text)
See the regex demo.
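For a quick sanity check, here is a minimal sketch of that call on a made-up sample. The surrounding lines and the three-space indentation are my assumptions, since the actual e-book text isn't shown in the question:

```python
import re

# Hypothetical sample; the real e-book text is not shown in the question.
text = (
    "Some preceding paragraph.\n"
    "\n"
    "   Chapter I\n"
    "   PREHISTORIC MAN COMES TO NORTH AMERICA\n"
    "\n"
    "Body text starts here.\n"
)

# Group 1 keeps the "Chapter ..." line plus the indentation of the next line;
# the rest of that next line (up to its last non-whitespace char) is replaced.
pattern = r'(\n *Chapter.*\n *).*\S'
result = re.sub(pattern, r'\1[nothing]', text)

print(result)
```

If the title should simply be dropped rather than replaced with a placeholder, passing r'\1' as the replacement should leave just the chapter heading followed by the (now empty) indented line.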
Details:
- (\n *Chapter.*\n *) - Group 1 (\1 refers to this text from the replacement pattern): a newline, zero or more spaces, Chapter, then zero or more chars other than line break chars, as many as possible, a newline and then zero or more spaces
- .* - zero or more chars other than line break chars, as many as possible
- \S - a non-whitespace char.
Upvotes: 1