Reputation: 396
I have the following string, `(a.1) (b.2) (c.3) (d.4)`. I want to change it to `(1) (2) (3) (4)`. I use the following method: `str.replaceAll("\(.*[.](.*)\)","($1)")`. And I only get `(4)`. What is the correct method?
Thanks
Upvotes: 3
Views: 3205
Reputation: 626747
Root cause
You want to match `()`-delimited substrings, but are using the greedy dot pattern `.*`, which can match any 0 or more chars (other than line break chars). The `\(.*[.](.*)\)` pattern will match the first `(` in `(a.1) (b.2) (c.3) (d.4)`, then `.*` will grab the whole string, and backtracking will start trying to accommodate text for the subsequent obligatory subpatterns. `[.]` will find the last `.` in the string, the one before the last digit, `4`. Then, `(.*)` will again grab all the rest of the string, but since the `)` is required right after, backtracking leaves the last `(.*)` capturing only `4`.
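The greedy behaviour described above can be observed directly with `java.util.regex`. The following is only a minimal sketch: the class name `GreedyDemo` and the helper `firstMatch` are made up for illustration and are not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Hypothetical helper: returns {whole match, group 1} for the first match,
    // or null when the regex does not match at all.
    static String[] firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? new String[] { m.group(), m.group(1) } : null;
    }

    public static void main(String[] args) {
        String input = "(a.1) (b.2) (c.3) (d.4)";
        // The question's pattern, with the backslashes doubled for a Java literal.
        String[] r = firstMatch("\\(.*[.](.*)\\)", input);
        System.out.println("match:   " + r[0]); // the entire input string
        System.out.println("group 1: " + r[1]); // 4
    }
}
```

Since the single match covers the whole input, replacing it with `($1)` leaves just `(4)`.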
Why is lazy / reluctant `.*?` not a solution?
Even if you use `\(.*?[.](.*?)\)`, if there are `(xxx)`-like substrings inside the string, they will get matched together with the expected matches, as `.` matches any char but line break chars.
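A small sketch of that failure mode, assuming an input that contains a dot-less `(xxxx)` group. The class name `LazyDemo` and the method `lazyReplace` are hypothetical names used only for this illustration.

```java
class LazyDemo {
    // Applies the lazy variant of the pattern to the input.
    static String lazyReplace(String s) {
        return s.replaceAll("\\(.*?[.](.*?)\\)", "($1)");
    }

    public static void main(String[] args) {
        // "(xxxx)" has no dot, so the lazy .*? keeps expanding past its ")"
        // until it reaches the dot inside "(b.2)".
        System.out.println(lazyReplace("(a.1) (xxxx) (b.2) (c.3) (d.4)"));
        // prints: (1) (2) (3) (4)   -- the (xxxx) group has been swallowed
    }
}
```

The second match here is `(xxxx) (b.2)` as a whole, which is why `(xxxx)` disappears from the result.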
Solution
.replaceAll("\\([^()]*\\.([^()]*)\\)", "($1)")
The `[^()]` will match any char BUT a `(` and `)`, so a match cannot span more than one pair of parentheses.
Details
`\(` - a `(` char
`[^()]*` - a negated character class matching 0 or more chars other than `(` and `)`
`\.` - a dot
`([^()]*)` - Group 1 (its value is later referred to with `$1` from the replacement pattern): any 0+ chars other than `(` and `)`
`\)` - a `)` char.

// requires java.util.Arrays and java.util.List
List<String> strs = Arrays.asList("(a.1) (b.2) (c.3) (d.4)", "(a.1) (xxxx) (b.2) (c.3) (d.4)");
for (String str : strs)
    System.out.println("\"" + str.replaceAll("\\([^()]*\\.([^()]*)\\)", "($1)") + "\"");
Output:
"(1) (2) (3) (4)"
"(1) (xxxx) (2) (3) (4)"
Upvotes: 4
Reputation: 2436
Try this one. It matches every letter, `.` and `"`, and replaces them all with the empty string `""`:
str.replaceAll("[a-zA-Z\\.\"]", "")
Edit:
You can also use `[^\\d)(\\s]` to match all characters that are not a digit, whitespace, `(` or `)`, and replace them all with the empty string `""`:
String str = "(a.1) (b.2) (c.3) (d.4)";
System.out.println(str.replaceAll("[^\\d)(\\s]",""));
Upvotes: 2
Reputation: 81074
A couple of things here. First, your escapes for the parentheses are incorrect. In Java string literals, backslash itself is an escape character, meaning you need to use `\\(` to represent `\(` in regex.
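To make the escaping point concrete, here is a minimal sketch. The class `EscapeDemo` and the constant `OPEN_PAREN` are illustrative names, not something from the question.

```java
class EscapeDemo {
    // Two backslashes in Java source become one backslash in the actual string,
    // so the regex engine receives \( which is a literal opening parenthesis.
    static final String OPEN_PAREN = "\\(";

    public static void main(String[] args) {
        System.out.println(OPEN_PAREN);              // prints \(
        System.out.println(OPEN_PAREN.length());     // prints 2
        System.out.println("(".matches(OPEN_PAREN)); // prints true
    }
}
```

Writing `"\("` instead is rejected by the compiler, because `\(` is not a valid escape sequence in a Java string literal.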
I think your question is how to do non-greedy matches in regex. Use `?` to specify non-greedy matching; e.g. `*?` means "zero or more times, but as few times as possible".
This doesn't negate other answers, but they depend on your test input being as simple as it is in your question. This gives me the correct output without changing the spirit of your original regex (that only the parentheses and dot delimiter are known to be present):
String test = "(a.1) (b.2) (c.3) (d.4)";
String replaced = test.replaceAll("\\(.*?[.](.*?)\\)", "($1)");
System.out.println(replaced); // "(1) (2) (3) (4)"
Upvotes: 3
Reputation: 1091
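Try this:
str.replaceAll("[A-Za-z0-9]+\\.", "");
`[A-Za-z0-9]` matches upper case letters, lower case letters and digits. Note that the dot must be written as `\\.` in a Java string literal. If you want to match anything before the dot, use a negated class such as `[^().]+` in place of `[A-Za-z0-9]+`; a plain `.+` or `.*` is greedy and would consume everything up to the last dot in the string.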
Upvotes: 0