Parth Bhoiwala

Reputation: 1322

Regex to check for repeating characters inside a string

I have a username string that can contain a period (.) or an underscore (_). However, I don't want the string to contain two or more consecutive . or _ characters. For example:

alpha.beta.gamma and alpha.beta_gamma are acceptable.

alpha..beta, alpha__beta and alpha._beta etc. are not acceptable.

Here is the regex that I am using: ".*([._])\\1{1,}.*"

It works when the repeated characters are the same, so it returns true for alpha..beta and alpha__beta. However, it returns false for alpha._beta.

How can I modify the regex so that it also matches a mix of consecutive . and _ characters?

Upvotes: 1

Views: 68

Answers (4)

achAmháin

Reputation: 4266

A non-regex alternative is a simple method that uses s.contains(...) to check for each of the four forbidden pairs (.., __, ._, _.):

return !(s.contains("..") || s.contains("__") || s.contains("._") || s.contains("_."));
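The same check can also be written as a single pass over the characters, which avoids listing every pair by hand if more separator characters are added later. This is only a minimal sketch; the class and method names are hypothetical and not part of the answer above.

```java
// Hypothetical helper; the names are illustrative only.
class SeparatorScan {
    // Returns true when no two separators ('.' or '_') are adjacent.
    static boolean hasNoAdjacentSeparators(String s) {
        boolean previousWasSeparator = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean isSeparator = (c == '.' || c == '_');
            if (isSeparator && previousWasSeparator) {
                return false; // two separators in a row
            }
            previousWasSeparator = isSeparator;
        }
        return true;
    }
}
```

Like the contains version, this only looks at adjacent separators; it does not reject a leading or trailing . or _.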

Upvotes: 0

Andrey Tyukin

Reputation: 44918

Recall that in RegEx, groups and */+ operators can be arbitrarily nested.

The following matches all strings that start with one or more letters, followed by any number of blocks, where each block starts with a '.' or a '_' and continues with one or more letters.

"[a-zA-Z]+([._][a-zA-Z]+)*"

Example (in Scala, but that is irrelevant here: the string literals and the underlying java.util.regex library are the same as in Java):

for (example <- List(
  "aa.bb_c", "aa_.bb", "abc..cb", "a.b.c_d",
  "uidh.dh_thh.ths", "_a", "a_", "a_b_c",
  "tdh_ins_utu", "ghs.tah..hua",
  "kqkz..wkmqjk", ".wkqj", "..wkqm")
) println(
  ("'" + example + "' : ").padTo(20, ' ') + 
  example.matches("[a-zA-Z]+([._][a-zA-Z]+)*")
)

Output:

'aa.bb_c' :         true
'aa_.bb' :          false
'abc..cb' :         false
'a.b.c_d' :         true
'uidh.dh_thh.ths' : true
'_a' :              false
'a_' :              false
'a_b_c' :           true
'tdh_ins_utu' :     true
'ghs.tah..hua' :    false
'kqkz..wkmqjk' :    false
'.wkqj' :           false
'..wkqm' :          false
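For readers who prefer plain Java, the same pattern can be used roughly as follows. This is a sketch under the assumption that usernames consist of letters only, as in the pattern above; the class name UsernamePattern and the method isValid are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern from this answer.
class UsernamePattern {
    // One or more letters, then zero or more blocks of
    // a single separator followed by one or more letters.
    private static final Pattern VALID =
            Pattern.compile("[a-zA-Z]+([._][a-zA-Z]+)*");

    static boolean isValid(String s) {
        // matches() requires the whole string to fit the pattern.
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String example : new String[] {"aa.bb_c", "aa_.bb", "a_", "a_b_c"}) {
            System.out.println(example + " : " + isValid(example));
        }
    }
}
```

Because the character class is [a-zA-Z], digits are rejected as well; widening it to [a-zA-Z0-9] would allow them.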

Upvotes: 2

treyhakanson

Reputation: 4911

The following should do the trick, by ensuring that the username does NOT match an undesired format:

// should not match this
/.*[._]{2,}.*/

Rejecting any username that matches this pattern ensures that no more than one _ or . can appear consecutively.

You may also want to consider being more specific than .*, potentially something like [A-Za-z0-9]*.
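In Java, the "should not match" idea might look like the sketch below. It uses Matcher.find(), which searches anywhere in the input, so the surrounding .* is not needed (it is only required with String.matches, which tests the whole string). The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper illustrating the negated check.
class ConsecutiveSeparators {
    // Two or more separators ('.' or '_') in a row, in any combination.
    private static final Pattern TWO_OR_MORE = Pattern.compile("[._]{2,}");

    static boolean isAcceptable(String username) {
        // Acceptable only if the undesired run is NOT found anywhere.
        return !TWO_OR_MORE.matcher(username).find();
    }
}
```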

Upvotes: 2

Mohammad Javad Noori

Reputation: 1217

Use this: .*([._]){2,}.*

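To see why this differs from the regex in the question: the backreference \1 only matches the exact character that the group captured, so "._" never qualifies, whereas repeating the group itself lets each repetition pick either character. A small sketch comparing the two (the class and method names are hypothetical):

```java
// Hypothetical comparison of the question's regex and the one above.
class BackreferenceDemo {
    // From the question: group 1, then the SAME character again via \1.
    static final String ORIGINAL = ".*([._])\\1{1,}.*";
    // From this answer: the group itself repeated, so '.' and '_' may mix.
    static final String FIXED = ".*([._]){2,}.*";

    // Each method returns true when the input contains a forbidden run.
    static boolean originalFlags(String s) {
        return s.matches(ORIGINAL);
    }

    static boolean fixedFlags(String s) {
        return s.matches(FIXED);
    }
}
```

With these, originalFlags("alpha._beta") is false while fixedFlags("alpha._beta") is true; both return true for alpha..beta.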

Upvotes: 4
