Amit

Reputation: 34735

Java - Regex problem

I want to remove the ) character from the end of a string using a regex.

E.g. if the string is UK(Great Britain), then I want to remove the last ) symbol.

Note:

1). The regex should remove only the last ) symbol, no matter how many ) symbols are present in the string.

Upvotes: 3

Views: 1522

Answers (5)

polygenelubricants

Reputation: 383746

If only ) at the end of the string is to be removed, then this works:

str.replaceFirst("\\)$", "");

This matches exactly what it says: a literal ) (escaped because it's also a regex metacharacter) followed by $, the end-of-string boundary anchor. The match is replaced with the empty string, effectively deleting any terminating ).

If there is no match, it means that there is no ) at the end of the string (even though there may be occurrences elsewhere), and there is no replacement made and the string is unchanged.
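As a minimal sketch of the behaviour described above (the class and method names here are hypothetical, chosen only for illustration), the end-anchored pattern can be wrapped in a small helper and tried on a few inputs:

```java
final class TrailingParenDemo {
    // Removes a single ')' only when it is the last character of the input.
    // Inputs that do not end in ')' are returned unchanged.
    static String removeTrailingParen(String str) {
        return str.replaceFirst("\\)$", "");
    }

    public static void main(String[] args) {
        System.out.println(removeTrailingParen("UK(Great Britain)")); // UK(Great Britain
        System.out.println(removeTrailingParen("f(x)) + 1"));          // f(x)) + 1 (no match, unchanged)
        System.out.println(removeTrailingParen("a))"));                // a)
    }
}
```

Only one ) is removed even when the string ends in several, because the pattern matches a single ) immediately before the end.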


If you generally want to remove the last occurrence of ) which may not be at the end of the string, you can use greedy .* matching:

str.replaceFirst("(.*)\\)", "$1");

Here the greedy .* captures into group 1 (\1, written $1 in the replacement string). If the whole pattern matches, group 1 is as long as it can possibly be, which means that the literal ) following it must be the last occurrence: if there were another occurrence to its right, group 1 could have captured a longer string instead, which is a contradiction.
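To make the greedy argument concrete, here is a small sketch that exposes what group 1 captures. The class name and the helper capturedPrefix are hypothetical and exist only for this illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyLastParenDemo {
    private static final Pattern LAST_PAREN = Pattern.compile("(.*)\\)");

    // Returns the text captured by group 1, or null when the input has no ')'.
    static String capturedPrefix(String str) {
        Matcher m = LAST_PAREN.matcher(str);
        return m.find() ? m.group(1) : null;
    }

    // Replaces the matched text (group 1 plus the ')') with group 1 alone.
    static String removeLastParen(String str) {
        return str.replaceFirst("(.*)\\)", "$1");
    }

    public static void main(String[] args) {
        System.out.println(capturedPrefix("a)b)c"));  // a)b
        System.out.println(removeLastParen("a)b)c")); // a)bc
    }
}
```

Group 1 swallows the earlier ) because .* backtracks only as far as it must, so the literal \\) ends up on the rightmost occurrence.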


Performance

Matching the first regex should be optimizable to an O(1) operation, thanks to the end-of-string $ anchor. The actual replacement will be O(N), because the new string has to be copied to a new buffer if there is a match. If there is no match, it should be optimizable to return the original string, and would therefore be O(1) overall. This is as optimal as it gets.

The second regex needs O(N) to match because of the repetition. This is no worse than a linear search for the last ) using lastIndexOf, which is also O(N).

If you're doing this a lot, then you should know the standard compiled Pattern equivalent of replaceFirst. From the API:

An invocation of this method of the form

str.replaceFirst(regex, repl)

yields exactly the same result as the expression

Pattern.compile(regex).matcher(str).replaceFirst(repl)
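A minimal sketch of using that equivalence to compile the pattern once and reuse it across calls (the class and constant names are hypothetical, not from the API documentation):

```java
import java.util.regex.Pattern;

final class CompiledParenRemover {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern TRAILING_PAREN = Pattern.compile("\\)$");

    // Same result as str.replaceFirst("\\)$", ""), without recompiling per call.
    static String removeTrailingParen(String str) {
        return TRAILING_PAREN.matcher(str).replaceFirst("");
    }

    public static void main(String[] args) {
        System.out.println(removeTrailingParen("UK(Great Britain)")); // UK(Great Britain
    }
}
```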

Readability

"Calling a replaceFirst method that's been hacked to actually replace last is just confusing."

It should be pointed out here that you can in fact use replaceAll with these exact patterns and the solution would still work! All you really need is a regex replace; whether it is replaceAll or replaceFirst doesn't matter, because the pattern is that simple.
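A quick way to convince yourself of this is to compare the two methods directly. The sketch below uses a hypothetical helper, sameResult, and assumes single-line input: by default . does not match line terminators, so for strings containing line breaks the greedy pattern can match once per line and the two methods may then differ.

```java
final class ReplaceAllVsFirstDemo {
    // True when replaceFirst and replaceAll produce the same string.
    static boolean sameResult(String str, String regex, String repl) {
        return str.replaceFirst(regex, repl).equals(str.replaceAll(regex, repl));
    }

    public static void main(String[] args) {
        String[] inputs = {"UK(Great Britain)", "a)b)c", "))", "no parens"};
        for (String in : inputs) {
            // Both patterns from this answer, applied to single-line input.
            System.out.println(in + " -> "
                + sameResult(in, "\\)$", "") + ", "
                + sameResult(in, "(.*)\\)", "$1"));
        }
    }
}
```

After the first match, neither pattern can match again on a single line: the end-anchored one has consumed the only ) at the end, and the greedy one has already consumed everything up to the last ).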

The needle$ pattern to match at the end of the string and the greedy (.*)needle pattern to match the last occurrence are basic idioms that are very readable and understandable to anyone with a basic understanding of regex. Neither would really qualify as a "hack".

Using a method called replaceFirst to replace the last occurrence of something may seem misleading at first, but this is shortsighted: it is the first match of the pattern that is replaced; what that pattern matches can be anything, be it the sixth "Sense", or the last "Mohican"!

As an analogy, let's take another simple string manipulation example: deleting all "spam" substrings from a string. I would argue that the most readable solution is to use replace:

str.replace("spam", "");

"But wait! The name replace is misleading! You're not replacing it with something else! You should call a method called delete or something!"

That's silly-talk, of course! You are indeed replacing it with something else -- the empty string! Its effect is deletion, but the operation is still string replace-ment!

Just like the replaceFirst in my solution: you may want to replace the last occurrence of something, but it's still a first match of the overall pattern!

Now it's true that a regex pattern out of nowhere will be confusing, but it can be clear from context, e.g:

public static String removeLastCloseParenthesis(String str) {
   return str.replaceFirst("(.*)\\)", "$1");
}

You can always just name the thing, and you can always add comments as necessary. These are general code readability techniques, and they apply to regex just as they do to everything else.

Upvotes: 3

Jason S

Reputation: 189676

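If you do want to use a regex (even though it's doable without one):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "UK(Great Britain)"; // ... your string here (example from the question) ...
String parenReplacement = "!!!"; // whatever the replacement is
// Group 1: everything before the last ')'; group 2: everything after it (contains no ')').
Pattern p = Pattern.compile("^(.*)\\)([^\\)]*)$");
Matcher m = p.matcher(s);
if (m.find())
{
   s = m.group(1) + parenReplacement + m.group(2);
}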

Upvotes: 2

Ben S

Reputation: 69342

Please don't use a regex for this simple task.

// If the last ) might not be the last character of the String
String s = "Your String with) multiple).";
int lastParen = s.lastIndexOf(')');
if (lastParen >= 0) { // lastIndexOf returns -1 when there is no ')'
    s = new StringBuilder(s).deleteCharAt(lastParen).toString();
}
// s = "Your String with) multiple."

// If the last ) will always be the last character of the String
s = "Your String with))";
if (s.endsWith(")")) {
    s = s.substring(0, s.length() - 1);
}
// s = "Your String with)"

Upvotes: 9

Syntactic

Reputation: 10961

You don't really need a regex for this. The String class has a lastIndexOf() method that you can use to find the index of the last ) in the String; see the String.lastIndexOf documentation.
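One possible way to turn that index into the result, sketched here with substring (the class and method names are hypothetical):

```java
final class LastIndexOfDemo {
    // Removes the last ')' wherever it occurs; returns the input unchanged if there is none.
    static String removeLastParen(String str) {
        int idx = str.lastIndexOf(')');
        if (idx < 0) {
            return str; // lastIndexOf returns -1 when the character is absent
        }
        return str.substring(0, idx) + str.substring(idx + 1);
    }

    public static void main(String[] args) {
        System.out.println(removeLastParen("UK(Great Britain)")); // UK(Great Britain
        System.out.println(removeLastParen("a)b)c"));             // a)bc
    }
}
```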

Upvotes: 0

Guillaume

Reputation: 14656

Why would you use a regex for that? Just use String.charAt(...) and substring(...)!
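A minimal sketch of that idea, assuming the goal is to drop a ) only when it is the final character (the class and method names are hypothetical):

```java
final class CharAtDemo {
    // Drops the final character only when it is ')'; guards against the empty string.
    static String removeTrailingParen(String str) {
        if (!str.isEmpty() && str.charAt(str.length() - 1) == ')') {
            return str.substring(0, str.length() - 1);
        }
        return str;
    }

    public static void main(String[] args) {
        System.out.println(removeTrailingParen("UK(Great Britain)")); // UK(Great Britain
        System.out.println(removeTrailingParen("x) y"));              // x) y (unchanged)
    }
}
```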

Upvotes: 1
