Reputation: 3185
I am writing a function that extracts the parameters in a URL, and I am using regex to identify if an email is present in the URL in order to remove it.
Here is my function:
function redactEmail() {
  var emailRedacted = "";
  try {
    var urlparams = window.location.search.replace("?", "");
    var urlparamsdecoded = decodeURIComponent(urlparams);
    emailRedacted = urlparamsdecoded;
    var emailRegex = /\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+/;
    if (emailRegex.test(urlparamsdecoded)) {
      emailRedacted = urlparamsdecoded.replace(emailRegex, '[REDACTED EMAIL]');
    }
  } catch (e) {}
  return emailRedacted;
}
This works as expected and returns this:
email=[REDACTED EMAIL]
from this URL:
https://www.test.com/[email protected]
But in some cases, this function is stopping the whole website from working.
I am using this function in a tag on a website in GTM so I don't have access to the source code of the website.
An example of a URL where the website stopped working is this:
https://www.test.com/?token=_JxY5kgHdKMkO8uSYf77sEl9mJhD7NHwAlrsMfJ-1zg
The website stopped working completely.
I debugged the function and the problem is with this call:
ow_emailRegex.test(ow_urlparamsdecoded)
Using match() instead of test() did not work either. Thank you.
Upvotes: 1
Views: 374
Reputation: 626950
Make the dot or hyphen pattern inside the groups obligatory to avoid having consecutive +/*-quantified patterns match the same chars, which is what causes catastrophic backtracking when no @ follows:
\w+(?:[.-]\w+)*@\w+(?:[.-]\w+)*(?:\.\w{2,3})+
This regex fails gracefully against your string.
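As a quick sanity check, the sketch below runs the regex above against the token query string from the question and against a query string containing an email. The variable names and the address john.doe@example.com are made up for illustration; only the pattern and the token come from the posts.

```javascript
// Pattern from the answer: the separator ([.-]) is mandatory inside each
// repeated group, so the quantified parts cannot match the same characters.
var fixedRegex = /\w+(?:[.-]\w+)*@\w+(?:[.-]\w+)*(?:\.\w{2,3})+/;

// Query string from the question (no "@" anywhere in it).
var tokenQuery = "token=_JxY5kgHdKMkO8uSYf77sEl9mJhD7NHwAlrsMfJ-1zg";

// Hypothetical query string with an email address, for comparison.
var emailQuery = "email=john.doe@example.com";

var tokenHasEmail = fixedRegex.test(tokenQuery); // no match expected
var emailHasEmail = fixedRegex.test(emailQuery); // match expected

console.log(tokenHasEmail, emailHasEmail);
```

Both calls should return promptly. With the original pattern, the first call is the one that hangs the page.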
Note that all [\.-]? are turned to [.-], and the whole [.-]\w+ group is still optional as * matches 0 or more occurrences. The dot is not special inside a character class, that's why I removed the backslash.
Also, you may use non-capturing groups since you are not interested in getting those submatches (and you actually can't get all of them in JavaScript, as a repeated capturing group only keeps its last iteration).
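Putting it together, the function from the question could be adapted roughly as follows. This is only a sketch: the name redactEmailInQuery and its search parameter are hypothetical (taking the query string as an argument makes it testable outside a browser), and the separate test() call is dropped because replace() leaves the string unchanged when nothing matches.

```javascript
// Regex from the answer, with non-capturing groups and mandatory separators.
var EMAIL_REGEX = /\w+(?:[.-]\w+)*@\w+(?:[.-]\w+)*(?:\.\w{2,3})+/;

// Hypothetical helper: takes a query string such as window.location.search
// and returns it decoded, with the first email address redacted.
function redactEmailInQuery(search) {
  var redacted = "";
  try {
    var params = search.replace("?", "");
    var decoded = decodeURIComponent(params);
    // replace() returns the input unchanged when there is no match,
    // so no prior test() call is needed.
    redacted = decoded.replace(EMAIL_REGEX, "[REDACTED EMAIL]");
  } catch (e) {
    // decodeURIComponent throws a URIError on malformed escape sequences;
    // like the original function, fall back to an empty string.
  }
  return redacted;
}
```

In the GTM tag it would be called as redactEmailInQuery(window.location.search).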
Upvotes: 2