Zanzi

Reputation: 37

Regex to match backslash inside a string

I'm trying to match the following strings:

In other words, the allowed strings contain exactly one backslash, which separates two substrings made up of digits, letters and _ characters.

I tried the following regex, testing it on http://regexhero.net/tester/: ^[a-zA-Z_][\\\]?[a-zA-Z0-9_]+$

Unfortunately, it also matches the following strings, which should not be allowed:

Any help please?

Upvotes: 1

Views: 24230

Answers (4)

Vajura

Reputation: 1132

Pretty sure this should work, if I understood everything you wanted.

^([a-zA-Z0-9_]+\\[a-zA-Z0-9_]+)$

Upvotes: 0

Avinash Raj

Reputation: 174696

Don't make the \ optional. The regex below won't allow two or more backslashes, and it requires at least one word character both before and after the \ symbol.

@"^\w+\\\w+$"

OR

@"^[A-Za-z0-9_]+\\[A-Za-z0-9_]+$"
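
As a quick check of either pattern above, here is a minimal C# sketch. The class name, method name and sample strings are made up for illustration and do not come from the question.

```csharp
using System.Text.RegularExpressions;

public static class WordBackslashWord
{
    // Verbatim (@"...") literal: \\ reaches the regex engine unchanged
    // and is read there as one escaped, literal backslash.
    private static readonly Regex Pattern = new Regex(@"^\w+\\\w+$");

    // True only for: word characters, exactly one backslash, word characters.
    public static bool IsAllowed(string input) => Pattern.IsMatch(input);
}
```

With this sketch, `WordBackslashWord.IsAllowed(@"abc\def_1")` returns true, while `"abcdef"` (no backslash), `@"abc\\def"` (two backslashes) and `@"\def"` (nothing before the backslash) are all rejected.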


Upvotes: 2

Jerry

Reputation: 71538

Your regex can mean two things, depending on whether you declare it as a raw (verbatim) string or as a normal string.

Using:

"^[a-zA-Z_][\\\]?[a-zA-Z0-9_]+$"

will not match any of your test examples, since it matches, in order:

  • ^ beginning of string,
  • [a-zA-Z_] 1 alpha character or underscore,
  • [\\\]? 1 optional backslash,
  • [a-zA-Z0-9_]+ at least 1 alphanumeric and/or underscore characters,
  • $ end of string

If you use it as a raw string (which is how regexhero interprets it, as indicated by the @ sign before the string starts), it is read as:

@"^[a-zA-Z_][\\\]?[a-zA-Z0-9_]+$"
  • ^ beginning of string,
  • [a-zA-Z_] 1 alpha character or underscore,
  • [\\\]?[a-zA-Z0-9_]+ one or more characters, each being a backslash, ], ?, [, an alphanumeric character or an underscore,
  • $ end of string.
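
The verbatim reading explains the unwanted matches. Here is a small sketch using .NET's `Regex` class. The class name and the sample inputs are hypothetical, since the original test strings are not shown here.

```csharp
using System.Text.RegularExpressions;

public static class OriginalPatternCheck
{
    // The pattern from the question, written as a verbatim string.
    // Everything from the second '[' up to the final ']' forms ONE
    // character class, because \] is an escaped, literal ']'.
    private const string Original = @"^[a-zA-Z_][\\\]?[a-zA-Z0-9_]+$";

    public static bool Matches(string input) => Regex.IsMatch(input, Original);
}
```

Under this reading, `Matches("ab")` (no backslash at all), `Matches(@"a\\b")` (two backslashes) and `Matches("a]?")` all return true, whereas `Matches("1a")` returns false because the first character must be a letter or an underscore.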

So what you actually need is either:

"^[a-zA-Z0-9_]+\\\\[a-zA-Z0-9_]+$"

(Two pairs of backslashes become two literal backslashes, which will be interpreted by the regex engine as an escaped backslash; hence 1 literal backslash)

Or

@"^[a-zA-Z0-9_]+\\[a-zA-Z0-9_]+$"

(No backslash substitution performed, so the regex engine directly interprets the escaped backslash)

Note that I added the numbers in the first character class to allow it to match numbers like you requested and added the + quantifier to allow it to match more than one character before the backslash.
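
To see that the two spellings describe the same pattern, they can be compared directly. This is a minimal sketch, and the class and member names are illustrative only.

```csharp
using System.Text.RegularExpressions;

public static class LiteralForms
{
    // Regular literal: the compiler turns each \\ into one backslash,
    // so the regex engine receives two backslashes (an escaped backslash).
    public const string Regular = "^[a-zA-Z0-9_]+\\\\[a-zA-Z0-9_]+$";

    // Verbatim literal: backslashes are passed through untouched.
    public const string Verbatim = @"^[a-zA-Z0-9_]+\\[a-zA-Z0-9_]+$";

    // Both constants hold the same characters, so this evaluates to true.
    public static bool SamePattern => Regular == Verbatim;

    public static bool IsAllowed(string input) => Regex.IsMatch(input, Regular);
}
```

Here `LiteralForms.SamePattern` is true, and `LiteralForms.IsAllowed(@"user_1\name2")` returns true with either spelling.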

Upvotes: 0

Chris

Reputation: 27599

The best way to fix up your regex is the following:

^[a-zA-Z0-9_]+\\[a-zA-Z0-9_]+$

This breaks down to:

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  [a-zA-Z0-9_]+            any character of: 'a' to 'z', 'A' to 'Z',
                           '0' to '9', '_' (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  \\                       '\'
--------------------------------------------------------------------------------
  [a-zA-Z0-9_]+            any character of: 'a' to 'z', 'A' to 'Z',
                           '0' to '9', '_' (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string

Explanation courtesy of http://rick.measham.id.au/paste/explain.pl

As you can see, we have the same pattern before and after the backslash (since you indicated that both parts should consist of letters, numbers and underscores), with the + quantifier meaning at least one. In the middle there is just the backslash, which is compulsory.
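
One detail from the table is easy to miss: `$` also matches just before a trailing newline. If that matters for your input, .NET offers `\z`, which matches only at the very end of the string. Here is a hedged sketch comparing the two, with hypothetical names and inputs.

```csharp
using System.Text.RegularExpressions;

public static class EndAnchorCheck
{
    // '$' tolerates one final "\n" after the matched text.
    public static bool WithDollar(string input) =>
        Regex.IsMatch(input, @"^[a-zA-Z0-9_]+\\[a-zA-Z0-9_]+$");

    // '\z' only matches at the true end of the input.
    public static bool WithStrictEnd(string input) =>
        Regex.IsMatch(input, @"^[a-zA-Z0-9_]+\\[a-zA-Z0-9_]+\z");
}
```

For the input `"abc\\def\n"` (that is, abc\def followed by a newline), `WithDollar` returns true and `WithStrictEnd` returns false. For `@"abc\def"` both return true.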

It is unclear whether by "letters" you meant only the basic alphabet or anything letter-like (most obviously accented characters, but also other alphabets, etc.). In the latter case you may want to expand your set of characters by using something like \w, as Avinash Raj suggests. See http://msdn.microsoft.com/en-us/library/20bw873z(v=vs.110).aspx#WordCharacter for more info on what the "word character" class covers.
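
The difference is easy to observe. Here is a sketch that relies on .NET's default, Unicode-aware `\w`. The class name and the sample string are invented for illustration.

```csharp
using System.Text.RegularExpressions;

public static class WordCharScope
{
    // Explicit ASCII-only character class.
    public static bool AsciiOnly(string input) =>
        Regex.IsMatch(input, @"^[a-zA-Z0-9_]+\\[a-zA-Z0-9_]+$");

    // By default \w also accepts letters and digits from other scripts.
    public static bool UnicodeWord(string input) =>
        Regex.IsMatch(input, @"^\w+\\\w+$");

    // RegexOptions.ECMAScript narrows \w back to [a-zA-Z_0-9].
    public static bool EcmaWord(string input) =>
        Regex.IsMatch(input, @"^\w+\\\w+$", RegexOptions.ECMAScript);
}
```

For the input `@"café\bar"`, `UnicodeWord` returns true, while `AsciiOnly` and `EcmaWord` both return false because of the accented é. For `@"cafe\bar"` all three return true.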

Upvotes: 1
