François M.

Reputation: 4278

Matching any combination of space AND newline

I'm trying to find a regexp that matches every run made up of at least one \n and any number of spaces, in any order. For instance (with spaces denoted by _), all of these should be matched by the regexp:

\n
\n\n\n\n
\n\n\n_\n\n
_\n
\n_
_\n_
_\n\n
\n\n_
_\n\n_
_\n\n_\n
\n_\n_
_\n\n_\n_
___\n__\n and so on...

However, it must not catch spaces that do not border a \n.

In other words, I'd like to reduce all of this (assuming I haven't made a mistake) to a single line:

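import re
mystring = re.sub(r'(\n)+', '\n', mystring)   # collapse runs of newlines
mystring = re.sub(r'( )+', ' ', mystring)     # collapse runs of spaces
mystring = re.sub(r'\n ', '\n', mystring)     # drop a space after a newline
mystring = re.sub(r' \n', '\n', mystring)     # drop a space before a newline
mystring = re.sub(r'(\n)+', '\n', mystring)   # collapse newlines again
mystring = re.sub(r'(\n)+', ' | ', mystring)  # replace each newline run with ' | '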

Upvotes: 0

Views: 834

Answers (2)

Juan Beleño

Reputation: 1

You can use the following regular expression:

(( )*\n+( )*)+
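As a quick sanity check, here is a minimal sketch that runs this pattern against the examples listed in the question. The names PATTERN, CASES and matches_whole are made up for illustration, and underscores stand in for spaces as in the question.

```python
import re

# The pattern from this answer, compiled once.
PATTERN = re.compile(r'(( )*\n+( )*)+')

# Examples from the question; '_' denotes a space.
CASES = [
    '\n', '\n\n\n\n', '\n\n\n_\n\n', '_\n', '\n_', '_\n_', '_\n\n',
    '\n\n_', '_\n\n_', '_\n\n_\n', '\n_\n_', '_\n\n_\n_', '___\n__\n',
]

def matches_whole(case):
    """Return True if the pattern matches the entire example string."""
    text = case.replace('_', ' ')  # turn the placeholder back into real spaces
    return PATTERN.fullmatch(text) is not None

for case in CASES:
    print(repr(case), matches_whole(case))
```

Spaces that do not border a newline should be left alone, so PATTERN.search('a b') is expected to find nothing.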

Upvotes: 0

Toto

Reputation: 91518

[ ]*(?:\n[ ]*)+

or, if you also want to match tabs:

[ \t]*(?:\n[ \t]*)+

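A minimal sketch of how the first pattern could replace the chain of re.sub calls from the question in a single pass. The names NEWLINE_RUN and join_lines and the sample string are assumptions for illustration, and the ' | ' separator is taken from the last line of the question's code.

```python
import re

# Optional spaces, then one or more groups of (newline + optional spaces).
NEWLINE_RUN = re.compile(r'[ ]*(?:\n[ ]*)+')

def join_lines(text, sep=' | '):
    """Replace each run of newlines, plus any spaces bordering it, with sep."""
    return NEWLINE_RUN.sub(sep, text)

sample = "first  \n\n \n second line\nthird"
print(join_lines(sample))  # first | second line | third
```

One difference from the multi-step version: this leaves runs of spaces that do not touch a newline untouched, whereas the second re.sub in the question collapses them to a single space.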

Upvotes: 1
