Reputation: 3089
I have a string, and I want to match substrings made up of any character(s) except space and newline. What regular expression should I use?
I know the regular expression for anything but a space, i.e. [^ ]+
and the regular expression for anything but a newline, [^\n]+
(I'm on Windows). I can't figure out how to combine them.
Upvotes: 40
Views: 62173
Reputation: 677
Try this
[^\s]+
\s is shorthand for whitespace characters, i.e. space ( ), newline (\n), tab (\t), and a few others such as carriage return (\r).
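As a rough illustration of how this differs from excluding only space and newline, here is a minimal Python sketch using re.findall. The sample string and variable names are made up for the example. Note that [^\s]+ also splits on tabs and carriage returns, which is more than the question strictly asked for:

```python
import re

# Hypothetical sample text with a space, a tab, and a Windows line ending.
text = "foo bar\tbaz\r\nqux"

# Excludes only space and newline: the tab and the carriage return stay
# inside the matched substrings.
only_space_newline = re.findall(r"[^ \n]+", text)

# Excludes every whitespace character.
all_whitespace = re.findall(r"[^\s]+", text)

print(only_space_newline)  # ['foo', 'bar\tbaz\r', 'qux']
print(all_whitespace)      # ['foo', 'bar', 'baz', 'qux']
```

On Windows, where lines usually end in \r\n, the stray \r in the first result is often a reason to prefer the second form.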
Upvotes: 12
Reputation: 239573
If you want to exclude just space and newline characters, then you might want to use
r'^[^ \n]*$'
For example,
import re
print(re.match(r'^[^ \n]*$', """WelcometoStackoverflow"""))
# <re.Match object; span=(0, 22), match='WelcometoStackoverflow'>
print(re.match(r'^[^ \n]*$', """Welcome toStackoverflow"""))
# None
print(re.match(r'^[^ \n]*$', """Welcome
toStackoverflow"""))
# None
Note that this does not exclude the other whitespace characters, such as tabs or carriage returns (\r):
print(re.match(r'^[^ \n]*$', """Welcome\ttoStackoverflow"""))
# <re.Match object; span=(0, 23), match='Welcome\ttoStackoverflow'>
So if you want to exclude all the whitespace characters then you can use
r'^[^\s]*$'
Or
r'^\S*$'
For example,
print(re.match(r'^[^\s]*$', """WelcometoStackoverflow"""))
# <re.Match object; span=(0, 22), match='WelcometoStackoverflow'>
print(re.match(r'^[^\s]*$', """Welcome toStackoverflow"""))
# None
print(re.match(r'^[^\s]*$', """Welcome
toStackoverflow"""))
# None
print(re.match(r'^[^\s]*$', """Welcome\ttoStackoverflow"""))
# None
\S is the same as [^\s]. Quoting from the docs,
\s
When the UNICODE flag is not specified, it matches any whitespace character, this is equivalent to the set [ \t\n\r\f\v]. The LOCALE flag has no extra effect on matching of the space. If UNICODE is set, this will match the characters [ \t\n\r\f\v] plus whatever is classified as space in the Unicode character properties database.
\S
When the UNICODE flags is not specified, matches any non-whitespace character; this is equivalent to the set [^ \t\n\r\f\v] The LOCALE flag has no extra effect on non-whitespace match. If UNICODE is set, then any character not marked as space in the Unicode character properties database is matched.
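The passage quoted above describes Python 2. As a small sketch of the analogous behaviour on Python 3, where str patterns use Unicode matching by default and the re.ASCII flag restricts \s to [ \t\n\r\f\v], consider a non-breaking space (the sample string is only an illustration):

```python
import re

# A non-breaking space (U+00A0) is whitespace in Unicode but is not in the
# ASCII set [ \t\n\r\f\v].
sample = "Welcome\u00a0toStackoverflow"

# Default (Unicode) matching on Python 3: \S rejects the non-breaking space.
unicode_result = re.match(r"^\S*$", sample)

# With re.ASCII, \S only rejects [ \t\n\r\f\v], so the whole string matches.
ascii_result = re.match(r"^\S*$", sample, re.ASCII)

print(unicode_result is None)    # True
print(ascii_result is not None)  # True
```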
Upvotes: 5
Reputation: 70732
You can add the space character to your character class to be excluded.
^[^\n ]*$
Regular expression
^ # the beginning of the string
[^\n ]* # any character except: '\n' (newline), ' ' (0 or more times)
$ # before an optional \n, and the end of the string
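The last comment in that breakdown matters in practice: in Python's re module, $ also matches just before a single trailing newline, so a string that ends in "\n" still passes. A minimal sketch, assuming Python 3 (re.fullmatch is one way to require that the whole string be consumed; it is not part of the answer above):

```python
import re

pattern = r"^[^\n ]*$"

# '$' matches before a single trailing newline, so this still matches.
trailing = re.match(pattern, "Welcome\n")

# re.fullmatch requires the character class to consume the entire string,
# so the trailing newline is rejected.
strict = re.fullmatch(r"[^\n ]*", "Welcome\n")

print(trailing is not None)  # True
print(strict is None)        # True
```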
Upvotes: 50