Reputation: 37
I'm trying to match the following strings:
this\test_
_thistes\t
_t\histest
In other words, the allowed strings contain exactly one backslash, splitting two substrings which can contain only numbers, letters and _ characters.
I tried the following regex, testing it on http://regexhero.net/tester/:
^[a-zA-Z_][\\\]?[a-zA-Z0-9_]+$
Unfortunately, it also matches the following strings, which should not be allowed:
this\\
_\
_\w\s\x
Any help please?
Upvotes: 1
Views: 24230
Reputation: 1132
Pretty sure this should work, if I understood everything you wanted. Note the $ at the end: without it, a string such as _\w\s\x would still match on its first part.
^([a-zA-Z0-9_]+\\[a-zA-Z0-9_]+)$
Upvotes: 0
Reputation: 174696
Don't make the \ optional. The regex below won't allow two or more backslashes, and it requires at least one word character both before and after the \ symbol.
@"^\w+\\\w+$"
OR
@"^[A-Za-z0-9_]+\\[A-Za-z0-9_]+$"
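A quick way to check the second pattern against the strings from the question. This is only a minimal sketch, written in Python rather than .NET on the assumption that Python's `re` module treats this particular pattern the same way; the names `PATTERN` and `is_valid` are made up for illustration.

```python
import re

# Second pattern from the answer above. The r"..." raw string plays the
# role of the C# @"..." verbatim string (assumption: same regex semantics).
PATTERN = re.compile(r"^[A-Za-z0-9_]+\\[A-Za-z0-9_]+$")


def is_valid(s: str) -> bool:
    """True when s is word characters, one backslash, word characters."""
    return PATTERN.match(s) is not None


# Strings the question wants to accept (each contains one backslash).
allowed = [r"this\test_", r"_thistes\t", r"_t\histest"]

# Strings the question wants to reject. The first two are written as
# normal literals because a raw string cannot end in a single backslash.
rejected = ["this\\\\", "_\\", r"_\w\s\x"]

for s in allowed + rejected:
    print(f"{s!r}: {is_valid(s)}")
```

The first three strings should be reported as valid and the last three as invalid.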
Upvotes: 2
Reputation: 71538
Your regex can mean two things, depending on whether you are declaring it as a raw string or as a normal string.
Using:
"^[a-zA-Z_][\\\]?[a-zA-Z0-9_]+$"
Will not match any of your test examples, since this will match, in order:
^ beginning of string,
[a-zA-Z_] 1 alpha character or underscore,
[\\\]? 1 optional backslash,
[a-zA-Z0-9_]+ at least 1 alphanumeric and/or underscore character,
$ end of string.
If you use it as a raw string (which is how regexhero interpreted it, and which is indicated by the @ sign before the string starts), it is:
@"^[a-zA-Z_][\\\]?[a-zA-Z0-9_]+$"
which will match, in order:
^ beginning of string,
[a-zA-Z_] 1 alpha character or underscore,
[\\\]?[a-zA-Z0-9_]+ one or more characters, each being a backslash, ], ?, [, an alphanumeric or an underscore (the whole thing is a single character class),
$ end of string.
So what you actually need is either:
"^[a-zA-Z0-9_]+\\\\[a-zA-Z0-9_]+$"
(Two pairs of backslashes become two literal backslashes, which will be interpreted by the regex engine as an escaped backslash; hence 1 literal backslash)
Or
@"^[a-zA-Z0-9_]+\\[a-zA-Z0-9_]+$"
(No backslash substitution performed, so the regex engine directly interprets the escaped backslash)
Note that I added the numbers in the first character class to allow it to match numbers like you requested, and added the + quantifier to allow it to match more than one character before the backslash.
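The difference between the two spellings can be checked directly. The following is a rough sketch in Python, whose r"..." raw strings are assumed here to be a fair stand-in for C# @"..." verbatim strings; the variable names are illustrative only.

```python
import re

# Normal string literal: the language collapses each "\\" to one backslash
# before the regex engine sees the text, so four are needed in the source.
normal = "^[a-zA-Z0-9_]+\\\\[a-zA-Z0-9_]+$"

# Raw string literal: no backslash substitution is performed.
raw = r"^[a-zA-Z0-9_]+\\[a-zA-Z0-9_]+$"

# Both literals produce exactly the same pattern text.
assert normal == raw

# The pattern from the question, read as a raw string. The character class
# opened by "[\\" only closes at the final "]", which is why it is so lax.
original = r"^[a-zA-Z_][\\\]?[a-zA-Z0-9_]+$"

# One allowed string followed by the three strings that should be rejected.
samples = ["this\\test_", "this\\\\", "_\\", "_\\w\\s\\x"]

for s in samples:
    print(repr(s), bool(re.match(original, s)), bool(re.match(raw, s)))
```

The original pattern accepts all four samples, while the corrected one accepts only the first.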
Upvotes: 0
Reputation: 27599
The best way to fix up your regex is the following:
^[a-zA-Z0-9_]+\\[a-zA-Z0-9_]+$
This breaks down to:
NODE                     EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  [a-zA-Z0-9_]+            any character of: 'a' to 'z', 'A' to 'Z',
                           '0' to '9', '_' (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  \\                       '\'
--------------------------------------------------------------------------------
  [a-zA-Z0-9_]+            any character of: 'a' to 'z', 'A' to 'Z',
                           '0' to '9', '_' (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
Explanation courtesy of http://rick.measham.id.au/paste/explain.pl
As you can see, we have the same pattern before and after the backslash (since you indicated they should both be letters, numbers and underscores), with the + quantifier meaning at least one. In the middle there is just the backslash, which is compulsory.
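One detail from the breakdown is easy to miss: $ also matches just before a single trailing \n. If that matters for your input, a stricter end anchor can be used instead. Here is a small sketch in Python, where the strict anchor is \Z; in .NET the equivalent is \z (lowercase), and .NET's \Z still tolerates a final newline. The names `loose` and `strict` are made up for this example.

```python
import re

# Pattern from the answer: $ tolerates one trailing newline.
loose = re.compile(r"^[a-zA-Z0-9_]+\\[a-zA-Z0-9_]+$")

# Same pattern with Python's \Z, which matches only at the very end
# of the string (the .NET spelling of this anchor is \z).
strict = re.compile(r"^[a-zA-Z0-9_]+\\[a-zA-Z0-9_]+\Z")

s = "this\\test_\n"  # an otherwise valid string followed by a newline

print(bool(loose.match(s)))   # accepted: $ matches before the final \n
print(bool(strict.match(s)))  # rejected: \Z requires the true end
```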
Since it is unclear whether by "letters" you meant the basic alphabet or anything letter-like (most obviously accented characters, but also other alphabets, etc.), you may want to expand your set of characters by using something like \w, as Avinash Raj suggests. See http://msdn.microsoft.com/en-us/library/20bw873z(v=vs.110).aspx#WordCharacter for more info on what the "word character" covers.
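To see the practical difference, here is a short sketch in Python, where \w on text strings is Unicode-aware by default, much like the .NET behaviour described on the linked page. The sample string and variable names are made up, and `re.ASCII` is Python's own flag for restricting \w, not a .NET option.

```python
import re

# Explicit ASCII class versus the \w shorthand.
ascii_only = re.compile(r"^[a-zA-Z0-9_]+\\[a-zA-Z0-9_]+$")
word_chars = re.compile(r"^\w+\\\w+$")

# \w restricted to ASCII, which brings it back in line with the explicit class.
ascii_word = re.compile(r"^\w+\\\w+$", re.ASCII)

# Two accented words separated by one backslash.
s = "caf\u00e9\\th\u00e9"

print(bool(ascii_only.match(s)))  # accented letters are outside a-z
print(bool(word_chars.match(s)))  # \w covers accented letters
print(bool(ascii_word.match(s)))  # ASCII-only \w rejects them again
```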
Upvotes: 1