Reputation: 67
I'm trying to split this string:
aba(2)bb(52)gc(4)d(2)fe(14)f(6)g(8)h(4)5(6)
so that it looks like this array:
[ a, b, a(2), b, b(52), g, c(4), d(2), f, e(14), f(6), g(8) ]
Here are the rules: only the letters a to g are accepted. A letter can stand alone, but if parentheses follow it, the token has to include them and their content. The content of the parentheses must be a numeric value.
This is what I tried:
String content = "aba(2)bb(52)gc(4)d(2)fe(14)f(6)g(8)h(4)5(6)";
String[] a = content.split("[a-g]|[a-g]\\([0-9]*\\)");
for (String s : a) {
    System.out.println(s);
}
And here's the output
(2)
(52)
(4) (2)
(14) (6) (8)h(4)5(6)
Thanks.
Upvotes: 4
Views: 2783
Reputation: 1410
If you want to use only the split method, here is an approach you could follow:
import java.util.Arrays;

public class Test
{
    public static void main(String[] args)
    {
        String content = "aba(2)bb(52)gc(4)d(2)fe(14)f(6)g(8)h(4)5(6)";
        String[] a = content.replaceAll("[a-g](\\([0-9]*\\))?|[a-g]", "$0:").split(":");
        // $0 is the string which matched the regex
        System.out.println(Arrays.toString(a));
    }
}
Regex: [a-g](\\([0-9]*\\))?|[a-g]
matches the strings you want (i.e. a, b, a(5) and so on). The trailing |[a-g] alternative is redundant, because the parenthesized part is already optional.
Using this regex, I first append a colon to every matched string. Then I split the result on the colon using the split method.
Output of the above code is,
[a, b, a(2), b, b(52), g, c(4), d(2), f, e(14), f(6), g(8), h(4)5(6)]
NOTE: This approach would only work with a delimiter that is known to not be present in the input string. For example, I chose a colon because I assumed it won't be a part of the input string.
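To make the delimiter caveat concrete, here is a minimal sketch that refuses to run when the chosen delimiter already occurs in the input. The class name SafeDelimiterSplit, the tokenize helper, and the choice of the ASCII unit separator (U+001F) as delimiter are assumptions made for illustration; they are not part of the answer above.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SafeDelimiterSplit {
    // Assumption: the ASCII "unit separator" control character is unlikely in real input.
    private static final String DELIMITER = "\u001F";

    // Hypothetical helper: appends the delimiter after every match, then splits on it.
    static String[] tokenize(String content) {
        if (content.contains(DELIMITER)) {
            // The approach breaks down if the delimiter is already present, so fail loudly.
            throw new IllegalArgumentException("input already contains the delimiter");
        }
        String marked = content.replaceAll("[a-g](\\([0-9]*\\))?", "$0" + DELIMITER);
        // Pattern.quote makes sure the delimiter is treated as a literal, not as regex syntax.
        return marked.split(Pattern.quote(DELIMITER));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokenize("aba(2)bb(52)gc(4)")));
    }
}
```

Like the colon version, this still leaves any unmatched trailing text in the last array element, so it only guards against the delimiter clash, not against invalid input.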
Upvotes: 1
Reputation: 2863
You can try the following regex: [a-g](\(.*?\))?
[a-g]: a letter from a to g, required
(\(.*?\))?: any amount of characters between ( and ), matching as few times as possible
You can view the expected output here.
This answer is based on Pattern; an example:
String input = "aba(2)bb(52)gc(4)d(2)fe(14)f(6)g(8)h(4)5(6)";
Pattern pattern = Pattern.compile("[a-g](?:\\(\\d+\\))?");
Matcher matcher = pattern.matcher(input);
List<String> tokens = new ArrayList<>();
while (matcher.find()) {
    tokens.add(matcher.group());
}
tokens.forEach(System.out::println);
Resulting output:
a
b
a(2)
b
b(52)
g
c(4)
d(2)
f
e(14)
f(6)
g(8)
Edit: Using [a-g](?:\((.*?)\))? you can also easily extract the inner value of a bracket:
while (matcher.find()) {
    tokens.add(matcher.group());
    tokens.add(matcher.group(1)); // the inner value or null if no () are present
}
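A self-contained sketch of the capture-group idea follows. It assumes the stricter \d+ inside the capturing group instead of .*?, and the class name InnerValueDemo and the describe helper are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InnerValueDemo {
    // Group 1 captures the digits between the parentheses;
    // it is null when the letter stands alone.
    private static final Pattern TOKEN = Pattern.compile("[a-g](?:\\((\\d+)\\))?");

    // Hypothetical helper: pairs every token with its inner value, or "none".
    static List<String> describe(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            String inner = m.group(1);
            out.add(m.group() + " -> " + (inner == null ? "none" : inner));
        }
        return out;
    }

    public static void main(String[] args) {
        describe("aba(2)bb(52)").forEach(System.out::println);
    }
}
```

Checking group(1) for null before use avoids storing null entries in the result list.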
Upvotes: 0
Reputation: 20889
Split is the wrong approach for this, as it is hard to eliminate invalid entries.
Just match whatever is valid and process the resulting list of matches:
[a-g](?:\(\d+\))?
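One way this could look in Java is sketched below, assuming Java 9 or later for Matcher.results(). The class name MatchInsteadOfSplit and the helpers tokens and isFullyValid are hypothetical; the second helper is an optional extra that reports whether the input contains anything besides valid tokens.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class MatchInsteadOfSplit {
    private static final Pattern TOKEN = Pattern.compile("[a-g](?:\\(\\d+\\))?");
    // The same token repeated zero or more times, used to test the whole input.
    private static final Pattern ONLY_TOKENS = Pattern.compile("(?:[a-g](?:\\(\\d+\\))?)*");

    // Collects every valid token; anything that does not match is skipped.
    static List<String> tokens(String input) {
        return TOKEN.matcher(input).results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    // True only if the input consists of valid tokens and nothing else.
    static boolean isFullyValid(String input) {
        return ONLY_TOKENS.matcher(input).matches();
    }

    public static void main(String[] args) {
        String content = "aba(2)bb(52)gc(4)d(2)fe(14)f(6)g(8)h(4)5(6)";
        System.out.println(tokens(content));
        System.out.println(isFullyValid(content));
    }
}
```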
Upvotes: 0
Reputation: 626689
It is easier to match these substrings:
String content = "aba(2)bb(52)gc(4)d(2)fe(14)f(6)g(8)h(4)5(6)";
Pattern pattern = Pattern.compile("[a-g](?:\\(\\d+\\))?");
List<String> res = new ArrayList<>();
Matcher matcher = pattern.matcher(content);
while (matcher.find()) {
    res.add(matcher.group(0));
}
System.out.println(res);
Output:
[a, b, a(2), b, b(52), g, c(4), d(2), f, e(14), f(6), g(8)]
See the Java demo and a regex demo.
Pattern details
[a-g] - a letter from a to g
(?:\(\d+\))? - an optional non-capturing group matching 1 or 0 occurrences of
    \( - a ( char
    \d+ - 1+ digits
    \) - a ) char.
Upvotes: 1