Albobz

Reputation: 53

Regular Expression to Anonymize Names

I am using Notepad++ and its Find and Replace dialog in regular-expression mode to alter usernames so that only the first and last characters of the screen name are shown, separated by exactly four asterisks (*). For example, "albobz" would become "a****z".

Usernames are listed directly after the cue "screen_name: " and I know I can find all the usernames using the regular expression:

screen_name:\s([^\s]+)

However, this expression captures the whole name rather than storing its first and last characters separately, and I am not sure how to do that.

Here is a sample line:

February 3, 2018    screen_name: FR33Q  location: Europe    verified: false lang: en

Upvotes: 1

Views: 2122

Answers (1)

revo

Reputation: 48761

Method 1

You have to work with the \G anchor, which matches at the position where the previous match ended. Using \G in Notepad++ is a bit tricky.

Regex to find:

(?>(screen_name:\s+\S)|\G(?!^))\S(?=\S)

Breakdown:

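  • (?> Start of an atomic group (non-capturing)
    • ( Start of the first capturing group
      • screen_name:\s+\S Match the cue, the whitespace after it, and the first character of the name
    • ) End of the first capturing group
    • | Or
    • \G(?!^) Continue from the end of the previous match; (?!^) stops \G from matching at the very beginning of the input
  • ) End of the atomic group
  • \S Match one non-whitespace character (the one that gets replaced)
  • (?=\S) Only if another non-whitespace character follows, so the last character of the name is never consumed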

Replace with (\1 is empty when the match continues from \G, so only the * is inserted):

\1*
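
For anyone doing the same thing outside Notepad++, here is a rough Python sketch of the length-preserving behaviour. Python's standard re module has no \G anchor, so this swaps in a different technique: one match per name plus a replacement callback that counts the inner characters. The function name mask_inner is made up for illustration.

```python
import re

# Hypothetical helper: not the \G technique itself, but the same visible
# result (every inner character of the name becomes one asterisk).
NAME = re.compile(r"(screen_name:\s+\S)(\S*)(\S)")

def mask_inner(text: str) -> str:
    # group 1: cue + first character, group 2: inner characters,
    # group 3: last character
    return NAME.sub(
        lambda m: m.group(1) + "*" * len(m.group(2)) + m.group(3),
        text,
    )

line = "February 3, 2018    screen_name: FR33Q  location: Europe    verified: false lang: en"
print(mask_inner(line))  # the name becomes F***Q
```

Names of one or two characters have no inner characters and are left unchanged, which matches what the \G pattern does.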


Method 2

The solution above replaces each inner character with a *, so the length of the name stays intact. If you want exactly four asterisks regardless of the name's length, search for:

(screen_name:\s+\S)(\S*)(\S)

and replace with: \1****\3
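
As a quick check outside Notepad++, the same pattern and replacement can be tried with Python's re.sub, which uses the same \1 and \3 backreference syntax in the replacement string. This is only a sketch; the function name mask_fixed is hypothetical.

```python
import re

def mask_fixed(text: str) -> str:
    # \1 = cue plus first character, \3 = last character;
    # whatever group 2 matched is dropped and replaced by four asterisks.
    return re.sub(r"(screen_name:\s+\S)(\S*)(\S)", r"\1****\3", text)

line = "February 3, 2018    screen_name: FR33Q  location: Europe    verified: false lang: en"
print(mask_fixed(line))  # the name becomes F****Q
```

Because \S* may match nothing, a two-character name such as "ab" also receives four asterisks and becomes "a****b"; a one-character name does not match and is left as it is.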


Upvotes: 1
