user40739

Reputation: 91

Regular expression splitting by a specific pattern

I have a string str='\n1. AA \n2. BB\n3.\n4. CC'. I want to split it on the following pattern: a newline character, followed by a digit and a period, followed by one or more spaces.

I am hoping to get the answer ['','AA ', 'BB\n3.', 'CC'].

If I use re.split('\n[0-9]\.\s+',str), I get the result:

['', 'AA ', 'BB', '4. CC']

What am I doing wrong?

Upvotes: 1

Views: 28

Answers (1)

John Kugelman

Reputation: 361730

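The \s+ at the end matches any whitespace, including newline characters, so it consumes the \n after "3." and treats '\n3.\n' as a separator. If you don't want newlines to match, change it to [^\S\n]+, which matches any whitespace character except a newline: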

>>> import re
>>> s = '\n1. AA \n2. BB\n3.\n4. CC'
>>> re.split(r'\n[0-9]\.[^\S\n]+', s)
['', 'AA ', 'BB\n3.', 'CC']
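
For comparison, here is a small sketch that runs the original pattern and two alternatives side by side. The variable names (text, greedy, strict, spaces_only) are only illustrative, and the ' +' variant assumes that literal spaces are the only separator you care about (no tabs).

```python
import re

# The string from the question (named `text` here to avoid shadowing the built-in `str`).
text = '\n1. AA \n2. BB\n3.\n4. CC'

# \s+ also matches '\n', so '\n3.\n' is consumed as a separator.
greedy = re.split(r'\n[0-9]\.\s+', text)

# [^\S\n]+ means "not non-whitespace and not newline",
# i.e. any whitespace character except '\n'.
strict = re.split(r'\n[0-9]\.[^\S\n]+', text)

# Assumption: only literal spaces should follow the period, so ' +' is enough.
spaces_only = re.split(r'\n[0-9]\. +', text)

print(greedy)       # ['', 'AA ', 'BB', '4. CC']
print(strict)       # ['', 'AA ', 'BB\n3.', 'CC']
print(spaces_only)  # ['', 'AA ', 'BB\n3.', 'CC']
```

The raw-string prefix (r'...') keeps Python from interpreting the backslashes itself, so \. and \S reach the regex engine unchanged.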

Upvotes: 1
