Reputation: 627
I have a Java toString() on code generated from XML. As a company we write the toString() output to our logs, and I am having trouble writing a regex that masks all the data effectively. Here is a sample toString() output:
String input="com.example.sensitive.info.UserInfo@15b1534[name=User1, clientName=HARVARD LAW SCHOOL, THE, clientId=12345]";
Expected output:
com.example.sensitive.info.UserInfo@15b1534[name=User1, clientName=****************, clientId=12345]
Can someone help me with a regex that masks everything up to the last comma (,) before the next equals sign (=)?
Here is what I tried:
maskPatterns.add("clientName=(.*?)=");
This ends up masking everything up to the next =. I can't figure out how to make it backtrack to the last comma (,) before the next equals sign (=).
Also, if anyone has a better regex for this, I am all ears.
Upvotes: 4
Views: 202
Reputation: 163362
Based on your example, maskPatterns.add("clientName=(.*?)=");, I assume that you want the value in capture group 1.
If it should be agnostic of the square brackets for marking the end of the value, but you don't want to match them either, you might use:
\bclientName=([^\r\n,=\[\]]+(?:,(?!\h*\w+=)[^\r\n,=\[\]]*)*)
Explanation
\bclientName= - a word boundary, then match clientName=
( - capture group 1
[^\r\n,=\[\]]+ - match 1+ times any char except , = [ ] or a newline
(?: - non-capture group
,(?!\h*\w+=) - match a comma, asserting that what is directly to the right is not 0+ horizontal whitespace chars, 1+ word chars and an = sign
[^\r\n,=\[\]]* - optionally match any char except a newline , = [ ]
)* - close the non-capture group and repeat it 0+ times to get all occurrences of a comma
) - close group 1
If [ and ] can also be part of the clientName, you can omit them from the character classes.
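As a rough sketch of how this pattern could be used from Java, the snippet below compiles it once and replaces each match with a fixed-length mask. The class name ClientNameMasker, the mask method, and the 16-asterisk mask are assumptions made for illustration and are not part of the original logging setup. Note that \h needs Java 8 or later.

```java
import java.util.regex.Pattern;

class ClientNameMasker {
    // Pattern from the answer above; the sensitive value is capture group 1.
    private static final Pattern CLIENT_NAME = Pattern.compile(
        "\\bclientName=([^\\r\\n,=\\[\\]]+(?:,(?!\\h*\\w+=)[^\\r\\n,=\\[\\]]*)*)");

    // Hypothetical helper: swaps every clientName value for a fixed-length mask,
    // so the original length of the value is not revealed.
    static String mask(String input) {
        return CLIENT_NAME.matcher(input).replaceAll("clientName=****************");
    }

    public static void main(String[] args) {
        String input = "com.example.sensitive.info.UserInfo@15b1534"
            + "[name=User1, clientName=HARVARD LAW SCHOOL, THE, clientId=12345]";
        System.out.println(mask(input));
    }
}
```

Because the match stops before the comma that precedes clientId=, the separator between the fields is left in place.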
Upvotes: 0
Reputation: 521379
Use String#replaceAll here:
String input = "com.example.sensitive.info.UserInfo@15b1534[name=User1, clientName=HARVARD LAW SCHOOL, THE, clientId=12345]";
String output = input.replaceAll("\\bclientName=.*?(?=,\\s*\\w+=|\\])", "clientName=****************");
System.out.println(input);
System.out.println(output);
This prints:
com.example.sensitive.info.UserInfo@15b1534[name=User1, clientName=HARVARD LAW SCHOOL, THE, clientId=12345]
com.example.sensitive.info.UserInfo@15b1534[name=User1, clientName=****************, clientId=12345]
Note that the number of asterisks probably should not exactly match the number of original characters in the clientName. Doing so would partially reveal the original content, since it would expose at least the original length of the clientName string.
Upvotes: 1
Reputation: 626893
You can use
clientName=(.*?)(?=\s*,\s*\w+=|\])
See the regex demo
Details
clientName= - a literal string
(.*?) - Group 1: any zero or more chars other than line break chars, as few as possible
(?=\s*,\s*\w+=|\]) - a positive lookahead that requires either a comma enclosed with zero or more whitespaces on both ends (\s*,\s*) followed by one or more word chars and =, or (|) a ] (\]), immediately to the right of the current location.
Or, if you need the same amount of asterisks, use
String result = text.replaceAll("(\\G(?!^)|clientName=).(?=.*?,\\s*\\w+=|\\])", "$1*");
See this regex demo.
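Both replacements in this answer can be tried against the sample string from the question. The following is only a minimal sketch: the class name LookaheadMaskDemo, the two method names, and the 16-asterisk fixed mask are hypothetical choices for illustration.

```java
class LookaheadMaskDemo {
    // First pattern: lazily match the value up to ", key=" or "]" and
    // replace the whole match with a fixed-length mask.
    static String maskFixed(String text) {
        return text.replaceAll(
            "clientName=(.*?)(?=\\s*,\\s*\\w+=|\\])",
            "clientName=****************");
    }

    // Second pattern: replace the value one character at a time, so the
    // number of asterisks equals the number of masked characters.
    static String maskSameLength(String text) {
        return text.replaceAll(
            "(\\G(?!^)|clientName=).(?=.*?,\\s*\\w+=|\\])",
            "$1*");
    }

    public static void main(String[] args) {
        String text = "com.example.sensitive.info.UserInfo@15b1534"
            + "[name=User1, clientName=HARVARD LAW SCHOOL, THE, clientId=12345]";
        System.out.println(maskFixed(text));
        System.out.println(maskSameLength(text));
    }
}
```

For the sample string, the first method leaves 16 asterisks after clientName=, while the second leaves 23, one per character of HARVARD LAW SCHOOL, THE (including the inner comma and spaces).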
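Details
(\\G(?!^)|clientName=) - Group 1: either the end of the previous successful match (\G(?!^)) or the literal clientName=
. - any char but a line break char
(?=.*?,\s*\w+=|\]) - a positive lookahead that requires, to the right of the current location, either
.*?,\s*\w+= - any zero or more chars other than line break chars, as few as possible, then a comma, zero or more whitespaces, one or more word chars and a =
| - or
\] - a ] char.
Upvotes: 1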