Find 2 or more Newlines

Question

My string looks like:

'I saw a little hermit crab\r\nHis coloring was oh so drab\r\n\r\nIt\u2019s hard to see the butterfly\r\nBecause he flies across the sky\r\n\r\nHear the honking of the goose\r\nI think he\u2019s angry at the moose\r\n\r\n\'

And I need to split it wherever there are two or more newlines.

I'm using the re module, of course.

On this particular string re.split(r'\r\n\r\n+', text) works, but it wouldn't catch \r\n\r\n\r\n, right?

I have tried re.split(r'(\r\n){2,}', text), which splits at every line, and re.split(r'\r\n{2,}', text), which creates a list of len() 1.

Shouldn't re.split(r'(\r\n){2,}', text) == re.split(r'\r\n\r\n', text) be True for a string in which \r\n never occurs more than 2 times in a row?

Aran-Fey · Accepted Answer

re.split(r'(\r\n){2,}', text) doesn't split at every line. It does exactly what you want, except it preserves one occurrence of \r\n because you've enclosed it in a capturing group. Use a non-capturing group instead:

(?:\r\n){2,}

Here you can see what the difference is:

>>> re.split(r'(?:\r\n){2,}', 'foo\r\n\r\nbar')
['foo', 'bar']
>>> re.split(r'(\r\n){2,}', 'foo\r\n\r\nbar')
['foo', '\r\n', 'bar']
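
For completeness, here is a minimal sketch of the same fix applied to the poem from the question, assuming the text uses \r\n line endings throughout. It also shows why the ungrouped pattern \r\n{2,} gives a list of length 1: without a group, {2,} applies only to the \n, so the pattern looks for \r followed by two or more \n. The (?:\r?\n){2,} variant is an addition that is not part of the answer above; it is meant for input that may mix \r\n with bare \n.

```python
import re

# Sample text from the question, assuming Windows-style \r\n line endings.
text = (
    "I saw a little hermit crab\r\n"
    "His coloring was oh so drab\r\n"
    "\r\n"
    "It\u2019s hard to see the butterfly\r\n"
    "Because he flies across the sky\r\n"
    "\r\n"
    "Hear the honking of the goose\r\n"
    "I think he\u2019s angry at the moose\r\n"
    "\r\n"
)

# Non-capturing group: {2,} applies to the whole \r\n pair.
# The text ends with a separator, so re.split returns a trailing
# empty string; the comprehension filters it out.
stanzas = [s for s in re.split(r'(?:\r\n){2,}', text) if s]

# No group: {2,} binds only to \n, so this needs \r\n\n to match.
# That sequence never occurs here, so nothing is split.
unsplit = re.split(r'\r\n{2,}', text)

# Variant (an assumption, not from the answer): \r?\n also accepts bare \n.
stanzas_any = [s for s in re.split(r'(?:\r?\n){2,}', text) if s]

print(len(stanzas), len(unsplit), len(stanzas_any))
```

If the empty trailing element matters to you, drop the filter and strip the text before splitting instead.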
