Carven
Carven

Reputation: 15648

Negating a Regular Expression for string replacement

I have the following code that can replace the email address in a String in Java:

addressStr.replaceFirst("([a-zA-Z0-9_\\-\\.]+)@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.)|(([a-zA-Z0-9\\-]+\\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})", "")

So, a string with John Smith <[email protected]> would become John Smith <>. How do I negate it so that it will instead replace all that doesn't match the email address and have the final result as just [email protected]?

I tried to put in the ^ and ?<= at the front but it doesn't work.

Upvotes: 1

Views: 366

Answers (4)

Tim Pietzcker
Tim Pietzcker

Reputation: 336128

Well, it's not the regex you need to change but the calling code. Your regex matches the e-mail address (in a weird way), and the replace() removes it from the string.

So just use

Pattern regex = Pattern.compile("([a-zA-Z0-9_\\-\\.]+)@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.)|(([a-zA-Z0-9\\-]+\\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})");
Matcher regexMatcher = regex.matcher(addressStr);
if (regexMatcher.find()) {
    address = regexMatcher.group();
}

Upvotes: 6

Alex Garcia
Alex Garcia

Reputation: 823

The complete Java regex for catching e-mails would be as follows:

"(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])"

Take a look at https://www.rfc-editor.org/rfc/rfc2822#section-3.4.1 for more info on this.

A bit complicated but it is valid for all known and valid emails formats (yours do not allows mails like [email protected] which are valid).

For your problem, as stated multiple times, just find (stealing Tim Pietzcker piece of code):

Pattern regex = Pattern.compile("(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])");
Matcher regexMatcher = regex.matcher(addressStr);
foundMatch = regexMatcher.find();

Upvotes: 3

Steen
Steen

Reputation: 2877

Just don't replace.... use match(es) instead.

Upvotes: 0

Ashwinee K Jha
Ashwinee K Jha

Reputation: 9307

You can try:

String mailId = Pattern.compile(regexp, Pattern.LITERAL).matcher(addressStr).group();

Idea here is to get the matched string rather than trying to replace everything else with blank. You can extract the pattern into a field if this operation is repetitive.

Upvotes: 0

Related Questions