KennethC

Reputation: 756

Kotlin / Regex - Replace a group of pattern with a repeating character

I would like to mask the email passed to the maskEmail function. The problem I'm facing is that the asterisk * is not repeated when I replace groups 2 and 4 of my pattern: each group is replaced by a single *, regardless of how many characters it matched.

Here is my code:

fun maskEmail(email: String): String {
    return email.replace(Regex("(\\w)(\\w*)\\.(\\w)(\\w*)(@.*\\..*)$"), "$1*.$3*$5")
}

Here is the input:

[email protected]
[email protected]
[email protected]

Here is the current output of that code:

t*.c*@email.com
c*.p*@email.com
c*.a*@email.com

Expected output:

t**.c**@email.com
c****.p**@email.com
c******.a******@email.com

Edit: I know this could easily be done with a for loop, but I need it to be done with a regex. Thank you in advance.

Upvotes: 3

Views: 3521

Answers (2)

Wiktor Stribiżew

Reputation: 627335

I suggest keeping the character at the start of the string and any character that directly follows a dot, and replacing every other character with *, as long as it is followed by an @ somewhere later in the string:

((?:\.|^).)?.(?=.*@)

Replace with $1*. See the regex demo. This will handle emails that happen to contain chars other than just word (letter/digit/underscore) and . chars.

Details

  • ((?:\.|^).)? - an optional capturing group matching a dot or start of string position and then any char other than a line break char
  • . - any char other than a line break char...
  • (?=.*@) - if followed with any 0 or more chars other than line break chars as many as possible and then @.

Kotlin code (with a raw string literal used to define the regex pattern so as not to have to double escape the backslash):

fun maskEmail(email: String): String {
    return email.replace(Regex("""((?:\.|^).)?.(?=.*@)"""), "$1*")
}

See a Kotlin test online:

val emails = arrayOf<String>("[email protected]","[email protected]","[email protected]","[email protected]","[email protected]")
for (email in emails) {
    val masked = maskEmail(email)
    println("$email: $masked")
}

Output:

[email protected]: c******.a*********@email.com
[email protected]: m*******.p*******@email.com
[email protected]: t**.c**@email.com
[email protected]: c****.p**@email.com
[email protected]: c******.a******@email.com
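To illustrate the point about characters other than word characters, here is a minimal sketch using the same pattern. The function name maskLocalPart and the sample address are made up for illustration and are not taken from the question:

```kotlin
// Hypothetical helper; the pattern is the one shown in this answer.
fun maskLocalPart(email: String): String {
    // ((?:\.|^).)?  optionally keep the first char of the string, or a dot plus the char after it
    // .             the char to be replaced with *
    // (?=.*@)       only while an @ still follows later on the line
    val pattern = Regex("""((?:\.|^).)?.(?=.*@)""")
    return pattern.replace(email, "$1*")
}
```

With an invented address such as mary-ann.o'neil@email.com, the hyphen and the apostrophe are masked like any other character, so maskLocalPart should return m*******.o*****@email.com. A pattern built only from \w would not match this address at all.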

Upvotes: 4

Nick

Reputation: 147216

For your problem, you need to match each character in the email address that is not the first character of a word and that occurs before the @. You can do that with a negative lookbehind for a word boundary and a positive lookahead for the @ symbol:

(?<!\b)\w(?=.*?@)

The matched characters can then be replaced with *.
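In Kotlin, this could look roughly like the sketch below. The function name maskEmailLookbehind is hypothetical, and a raw string is used so the backslashes do not need to be doubled:

```kotlin
// Hypothetical helper; the pattern is the one shown above.
fun maskEmailLookbehind(email: String): String {
    // (?<!\b)   the position before the char is not a word boundary,
    //           so the first char of each word is left alone
    // \w        one word char to mask
    // (?=.*?@)  an @ must still follow later on the same line
    val pattern = Regex("""(?<!\b)\w(?=.*?@)""")
    // Every match is a single char, so replacing each one with "*"
    // produces one asterisk per masked char.
    return pattern.replace(email, "*")
}
```

For a made-up input such as tom.cat@email.com, this should give t**.c**@email.com. Because each match is exactly one character, the repetition comes from the number of matches rather than from the replacement string.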

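Note that we use a lazy quantifier (?) on the .* to improve efficiency: the lookahead can stop at the first @ instead of scanning to the end of the line and backtracking.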

Demo on regex101

As pointed out by @CarySwoveland, you can also replace (?<!\b) with \B, i.e.

\B\w(?=.*?@)

Demo on regex101

As pointed out by @Thefourthbird, efficiency can be improved further by replacing the .*? with [^\r\n@]*, i.e.

\B\w(?=[^\r\n@]*@)

Demo on regex101

Or, if you're only matching single strings, just [^@]*:

\B\w(?=[^@]*@)

Demo on regex101
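The difference between the last two variants only shows up when the input contains line breaks. Here is a small Kotlin sketch of both; the function names and the sample text are invented for illustration:

```kotlin
// Hypothetical helpers; both patterns are taken from this answer.

// The lookahead stops at a line break, so a line without an @ is left untouched.
fun maskPerLine(text: String): String =
    Regex("""\B\w(?=[^\r\n@]*@)""").replace(text, "*")

// The lookahead may cross line breaks, so this is only suitable
// when the input is a single email address.
fun maskSingleString(text: String): String =
    Regex("""\B\w(?=[^@]*@)""").replace(text, "*")
```

For the two-line text "no address here\ntom.cat@email.com", maskPerLine should leave the first line as it is and mask only the address, while maskSingleString would also turn the first line into "n* a****** h***", because its lookahead finds the @ on the following line.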

Upvotes: 5
