Akashdeep Saluja

Reputation: 3089

How to match anything except space and new line?

I have a string, and I want to match substrings made up of any characters except space and newline. What regular expression should I use for this?

I know the regular expression for anything but a space, i.e. [^ ]+, and the one for anything but a newline, [^\n]+ (I'm on Windows). I can't figure out how to combine them.

Upvotes: 40

Views: 62173

Answers (3)

Nitish770

Reputation: 677

Try this

[^\s]+

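\s is shorthand for any whitespace character, i.e. space ( ), newline (\n), tab (\t), carriage return (\r), form feed (\f) and vertical tab (\v). Note that this excludes more than just space and newline.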
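
To see how [^\s]+ differs from a class that excludes only space and newline, here is a minimal sketch using Python's re module. The sample text and variable names are made up for illustration; the Windows-style line ending is an assumption based on the question.

```python
import re

# Hypothetical sample text containing a space, a tab, and a Windows line ending (\r\n).
text = "foo bar\tbaz\r\nqux"

# [^\s]+ matches runs of non-whitespace, so every kind of whitespace acts as a separator.
tokens_any_ws = re.findall(r"[^\s]+", text)

# [^ \n]+ excludes only space and newline, so the tab and the carriage return
# stay inside the matched substrings.
tokens_space_nl = re.findall(r"[^ \n]+", text)

print(tokens_any_ws)    # ['foo', 'bar', 'baz', 'qux']
print(tokens_space_nl)  # ['foo', 'bar\tbaz\r', 'qux']
```

On Windows input the second pattern leaves a stray \r at the end of each line, so [^ \r\n]+ may be closer to what is wanted there.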

Upvotes: 12

thefourtheye

Reputation: 239573

If you want to exclude just space and newline characters, then you might want to use

r'^[^ \n]*$'

For example,

import re

print(re.match(r'^[^ \n]*$', """WelcometoStackoverflow"""))
# <re.Match object; span=(0, 22), match='WelcometoStackoverflow'>
print(re.match(r'^[^ \n]*$', """Welcome toStackoverflow"""))
# None
print(re.match(r'^[^ \n]*$', """Welcome
toStackoverflow"""))
# None

Note that this does not exclude the other whitespace characters, such as tabs and carriage returns:

print(re.match(r'^[^ \n]*$', """Welcome\ttoStackoverflow"""))
# <re.Match object; span=(0, 23), match='Welcome\ttoStackoverflow'>

So if you want to exclude all the whitespace characters then you can use

r'^[^\s]*$'

Or

r'^\S*$'

For example,

print(re.match(r'^[^\s]*$', """WelcometoStackoverflow"""))
# <re.Match object; span=(0, 22), match='WelcometoStackoverflow'>
print(re.match(r'^[^\s]*$', """Welcome toStackoverflow"""))
# None
print(re.match(r'^[^\s]*$', """Welcome
toStackoverflow"""))
# None
print(re.match(r'^[^\s]*$', """Welcome\ttoStackoverflow"""))
# None

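\S is the same as [^\s]. Quoting from the Python 2 docs (in Python 3, str patterns use Unicode matching by default, so \s also covers Unicode whitespace unless the ASCII flag is set),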

\s

When the UNICODE flag is not specified, it matches any whitespace character, this is equivalent to the set [ \t\n\r\f\v]. The LOCALE flag has no extra effect on matching of the space. If UNICODE is set, this will match the characters [ \t\n\r\f\v] plus whatever is classified as space in the Unicode character properties database.

\S

When the UNICODE flags is not specified, matches any non-whitespace character; this is equivalent to the set [^ \t\n\r\f\v] The LOCALE flag has no extra effect on non-whitespace match. If UNICODE is set, then any character not marked as space in the Unicode character properties database is matched.
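
The difference between the two modes described above can be checked with a short sketch. It assumes Python 3 rather than the Python 2 version quoted, so Unicode matching is the default for str patterns and re.ASCII restores the narrower [ \t\n\r\f\v] set; the variable names are illustrative only.

```python
import re

# U+00A0 (no-break space) is classified as whitespace by Unicode,
# but it is not one of [ \t\n\r\f\v].
nbsp = "\u00a0"

# Python 3 str patterns use Unicode matching by default, so \s matches it.
unicode_hit = re.search(r"\s", nbsp)

# re.ASCII restricts \s to [ \t\n\r\f\v], so there is no match here.
ascii_hit = re.search(r"\s", nbsp, flags=re.ASCII)

print(unicode_hit is not None)  # True
print(ascii_hit is not None)    # False
```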

Upvotes: 5

hwnd

Reputation: 70732

You can add the space character to your character class to be excluded.

^[^\n ]*$

Regular expression

^              # the beginning of the string
 [^\n ]*       # any character except: '\n' (newline), ' ' (0 or more times)
$              # before an optional \n, and the end of the string
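
The "before an optional \n" behaviour of $ noted above is easy to overlook. Here is a minimal sketch in Python, assuming the re module and made-up sample strings, that shows it alongside an unanchored variant for pulling out substrings:

```python
import re

pattern = r"^[^\n ]*$"

# "$" also matches just before a trailing newline, so this string is accepted
# even though it contains "\n".
trailing_newline = re.match(pattern, "abc\n")

# re.fullmatch requires the entire string to match, so the trailing newline is rejected.
strict = re.fullmatch(r"[^\n ]*", "abc\n")

# To extract substrings rather than validate the whole string,
# drop the anchors and use "+" instead of "*".
parts = re.findall(r"[^\n ]+", "one two\nthree")

print(trailing_newline is not None)  # True
print(strict)                        # None
print(parts)                         # ['one', 'two', 'three']
```

Whether the anchored or unanchored form fits depends on whether the goal is to validate a whole string or to extract the matching pieces from it.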

Upvotes: 50
