Paolo M

Reputation: 12757

Matching text patterns with relaxed constraints that admit character substitutions

Suppose I have to match some patterns in an input text. Let the pattern be

password

I want to match all inputs that match my pattern while admitting a predefined set of character substitutions, say:

1. a -> @
2. e -> 3
3. o -> 0
4. i -> !
5. s -> $

Sticking with my example, I'd like all the following inputs to be successfully matched against my password pattern: p@ssword, p@$sword, pa$$w0rd, and so on.

My main question is how to do it, but narrowing it down:

  1. is regex the right instrument to do that?
  2. how can I define an admitted set of substitutions that regex matching should take into account?
  3. is this a pretty common question that I've overlooked?
  4. what is a concise way (i.e. just defining once the substitutions without having to repeat them for every admitted pattern) to achieve this for multiple patterns?

Note that password was just a convenient word containing several possible replacements; my problem has nothing to do with security.

Upvotes: 0

Views: 66

Answers (1)

Aaron

Reputation: 24812

If the substitution is always a single character (or one character chosen from several alternatives), you can use a character class:

p[a@][s$][s$]w[o0]rd

If a substitution can span multiple characters, you have to use alternation:

pass(w|\/\/)ord

You could define a map/dictionary from original characters to their possible substitutions in the language of your choice, and use it to transform an input word into a pattern. The substitutions are then defined only once and reused for every word:

1. a -> [a@]
2. e -> [e3]
3. o -> [o0]
4. i -> [i!]
5. s -> [s$]
6. w -> (w|\/\/)

password -> p[a@][s$][s$](w|\/\/)[o0]rd
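
The transformation above can be sketched in Python as follows. This is only an illustration: the names `SUBSTITUTIONS` and `to_pattern` are made up for the example, the table simply mirrors the list above, and the multi-character replacement for w is assumed to be the literal text `\/\/`. Every alternative goes through `re.escape`, so characters such as `$` or `\` are matched literally rather than treated as regex syntax.

```python
import re

# Hypothetical substitution table, defined once and reused for every word.
# Each original character maps to the alternatives admitted in its place;
# an alternative may be longer than one character.
SUBSTITUTIONS = {
    "a": ["@"],
    "e": ["3"],
    "o": ["0"],
    "i": ["!"],
    "s": ["$"],
    "w": ["\\/\\/"],  # the four literal characters \/\/ (an assumption)
}


def to_pattern(word):
    """Compile a regex matching `word` and all of its admitted variants."""
    parts = []
    for ch in word:
        options = [ch] + SUBSTITUTIONS.get(ch, [])
        # A non-capturing group of escaped alternatives, e.g. (?:s|\$)
        parts.append("(?:" + "|".join(re.escape(o) for o in options) + ")")
    return re.compile("".join(parts))


password = to_pattern("password")
for candidate in ["p@ssword", "p@$sword", "pa$$w0rd", "pbssword"]:
    print(candidate, bool(password.fullmatch(candidate)))
```

`fullmatch` requires the whole input to be a variant of the word; to find variants inside a longer text, `search` or `finditer` on the same compiled pattern would be the natural choice. The sketch is case-sensitive; passing `re.IGNORECASE` to `re.compile` is one way to relax that.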

I think regex can be a good tool for that, but there are already existing tools that will test the strength of a password if that's what you're looking for. They account for common substitutions.

The Levenshtein distance might also be useful if you want to forbid people from reusing a password that is close to a previous one.
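
For reference, a minimal sketch of the Levenshtein distance using the standard dynamic-programming recurrence is given below. The function name `levenshtein` is just illustrative, and what counts as "too close" is an application-specific threshold not covered here.

```python
def levenshtein(a, b):
    """Return the number of single-character edits turning a into b."""
    # previous[j] is the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or keep)
            ))
        previous = current
    return previous[-1]


print(levenshtein("password", "p@ssw0rd"))
```

Unlike the regex approach, this treats every character change equally, so it does not know which substitutions are admitted; it only measures how far apart two strings are.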

Upvotes: 1
