Reputation: 2896
Consider the following expression:
((password|secret)(=|%3D%22))+([^&|\"|%22]*)
And value:
http://host?foo=bar&xml=%3C%3Fxml+id%3D%220abc987%22+password%3D%22secreT12aa5%22+binds%3D%222%22
The xml parameter contains the encoded value <?xml id="0abc987" password="secreT12aa5" binds="2"
What I would like to achieve is to match password="secreT12aa5" and then replace it with, e.g., password="****"
The issue is that the given regular expression only matches the string up to the 2; this is because of the %22 in the negated set. The percent sign is being ignored.
How can I change the expression to match password%3D%22secreT12aa5 (the whole password value)?
The expression should also match http://host?password=value, which it currently does.
I would also like to use this regular expression for replacements, with the replaceAll() method, to actually strip a matching parameter value.
So the regex ((password)(=|%3D%22))([^&|\\"]*)(%22)? with the replacement $1[PROTECTED]$5 automatically replaces:
password=VALUE
to =>
password=[PROTECTED]
password=VALUE&secret=VALUE
to =>
password=[PROTECTED]&secret=[PROTECTED]
http://host?foo=bar&xml=%3C%3Fxml+id%3D%220abc987%22+password%3D%22secreT12345%22+binds%3D%222%22
to =>
http://host?foo=bar&xml=%3C%3Fxml+id%3D%220abc987%22+password%3D%22[PROTECTED]%22+binds%3D%222%22
Upvotes: 1
Views: 475
Reputation: 626950
Note that [^&|\"|%22] is a negated character class that matches any char but &, | (yes, a pipe), ", % and 2, since inside the character class all the chars are treated separately, not as sequences.
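To see this on the sample URL from the question, here is a minimal Java sketch (Java is assumed because the question mentions replaceAll(); the class and method names are illustrative only). It runs the original pattern unchanged and prints the first match, which stops at the first 2 in the password value:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassDemo {
    // The pattern from the question, unchanged. [^&|\"|%22] excludes the
    // single characters &, |, ", % and 2, not the sequence %22.
    static final Pattern ORIGINAL =
            Pattern.compile("((password|secret)(=|%3D%22))+([^&|\\\"|%22]*)");

    // Returns the first match of the original pattern, or null if there is none.
    static String firstMatch(String input) {
        Matcher m = ORIGINAL.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String url = "http://host?foo=bar&xml=%3C%3Fxml+id%3D%220abc987%22"
                + "+password%3D%22secreT12aa5%22+binds%3D%222%22";
        // prints password%3D%22secreT1
        System.out.println(firstMatch(url));
    }
}
```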
You may use
password(?:="?|%3D%22)(?:(?!%22)[^&\"])*"?
See the regex demo
Details
password - a literal substring
(?:="?|%3D%22) - either = followed with an optional " or %3D%22
(?:(?!%22)[^&\"])* - any char but & and " ([^&\"]), 0 or more occurrences, as many as possible (*), that does not start a %22 char sequence (a so-called tempered greedy token)
"? - an optional "
You may re-write the pattern using the "unroll-the-loop" principle as
password(?:="?|%3D%22)[^&\"%]*(?:%(?!22)[^%&\"]*)*"?
See another demo.
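A possible way to use either pattern for masking in Java is sketched below. The two capturing groups (around the prefix and around the optional closing quote) are an addition made here so that the replacement can keep them; they are not part of the patterns above, and the class name, method name and [PROTECTED] placeholder are illustrative only. In a Java string literal, \" produces a plain " in the regex, which matches the same character as \".

```java
import java.util.regex.Pattern;

class MaskPassword {
    // Tempered greedy token: consume any char except & and ", as long as
    // it does not start the sequence %22. Groups 1 and 2 are assumptions
    // added for the replacement.
    static final Pattern TEMPERED = Pattern.compile(
            "(password(?:=\"?|%3D%22))(?:(?!%22)[^&\"])*(\"?)");

    // Unrolled variant: plain chars first, then each % that is not
    // followed by 22, followed by more plain chars.
    static final Pattern UNROLLED = Pattern.compile(
            "(password(?:=\"?|%3D%22))[^&\"%]*(?:%(?!22)[^%&\"]*)*(\"?)");

    // Keeps the prefix (group 1) and the optional closing quote (group 2),
    // and replaces only the value in between.
    static String mask(Pattern p, String input) {
        return p.matcher(input).replaceAll("$1[PROTECTED]$2");
    }

    public static void main(String[] args) {
        String url = "http://host?foo=bar&xml=%3C%3Fxml+id%3D%220abc987%22"
                + "+password%3D%22secreT12aa5%22+binds%3D%222%22";
        System.out.println(mask(TEMPERED, url));
        System.out.println(mask(UNROLLED, url));
        System.out.println(mask(TEMPERED, "http://host?password=value"));
    }
}
```

Both variants leave the closing %22 in place because the value part stops right before it, so no extra group is needed for it.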
Also, others would prefer a lazy pattern + lookahead with alternation approach:
password(?:="?|%3D%22)[^&\"]*?(?:(?=%22)|\"|$)
See yet another regex demo.
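For comparison, here is a small Java sketch of the lazy variant (class and method names are illustrative only). It returns the first match, so the boundaries the pattern picks are easy to inspect:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyPasswordMatch {
    // Lazy value part, ended by a lookahead for %22, a literal ",
    // or the end of the input.
    static final Pattern LAZY = Pattern.compile(
            "password(?:=\"?|%3D%22)[^&\"]*?(?:(?=%22)|\"|$)");

    // Returns the first match, or null if the pattern does not match.
    static String firstMatch(String input) {
        Matcher m = LAZY.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String url = "http://host?foo=bar&xml=%3C%3Fxml+id%3D%220abc987%22"
                + "+password%3D%22secreT12aa5%22+binds%3D%222%22";
        System.out.println(firstMatch(url));
        System.out.println(firstMatch("http://host?password=value"));
        System.out.println(firstMatch("password=VALUE&secret=VALUE"));
    }
}
```

One thing to watch with this variant: the value has to end at %22, a " or the end of the string. A value followed directly by &, as in password=VALUE&secret=VALUE, is not matched, because & is excluded from the character class and is not one of the accepted terminators.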
Upvotes: 2