Reputation: 365
I have a multiline string that is delimited by a set of different delimiters:
A Z DelimiterB B X DelimiterA (C DelimiterA D) DelimiterB (E DelimiterA F) DelimiterB G DelimiterA H
I need to split that string by the delimiters, but if some words are inside brackets, the whole bracketed group should be extracted as a single token, even if it contains a delimiter. I need the tokens to be extracted as follows:
A Z
DelimiterB
B X
DelimiterA
(C DelimiterA D) (extract with brackets)
DelimiterB
(E DelimiterA F)
DelimiterB
G
DelimiterA
H
Currently I am using this expression to split by delimiters,
(((?<=DelimiterA)|(?=DelimiterA))|((?<=DelimiterB)|(?=DelimiterB)))
I tried the following, but it does not work. How can I make it work?
((?=\()|(?<=\))|(((?<=DelimiterA)|(?=DelimiterA))|((?<=DelimiterB)|(?=DelimiterB))))
Java code:
String txt = "A DelimiterB B DelimiterA (C DelimiterA D) DelimiterB (E DelimiterA F) DelimiterB G DelimiterA H";
// Backslashes must be doubled inside a Java string literal
String[] texts = txt.split("((?=\\()|(?<=\\))|(((?<=DelimiterA)|(?=DelimiterA))|((?<=DelimiterB)|(?=DelimiterB))))");
for (String word : texts) {
    System.out.println(word);
}
Upvotes: 3
Views: 277
Reputation: 8114
Since the delimiters themselves are also needed in the output, I suggest matching the tokens we want instead of splitting. Based on the example given, there are three patterns to capture:
(C DelimiterA D)
- A bracket containing a word, a delimiter, and a word: "\\(\\w+ (DelimiterA|DelimiterB) \\w+\\)"
DelimiterB
- A whole delimiter: "(DelimiterA|DelimiterB)"
B, B X
- One or more words that are not delimiters: "\\w+((?<!(DelimiterA|DelimiterB))\\s(?!(DelimiterA|DelimiterB))\\w+)*"
import java.util.Scanner;
public class SplitWithCustomDelimiter {
    public static void main(String[] args) {
        String txt = "A Z DelimiterB B X DelimiterA (C DelimiterA D) DelimiterB (E DelimiterA F) DelimiterB G DelimiterA H";
        // Scanner can read from other sources too; findAll requires Java 9+
        try (Scanner scanner = new Scanner(txt)) {
            scanner.findAll(
                    "\\(\\w+ (DelimiterA|DelimiterB) \\w+\\)" +   // bracketed group
                    "|(DelimiterA|DelimiterB)" +                   // delimiter on its own
                    "|\\w+((?<!(DelimiterA|DelimiterB))\\s(?!(DelimiterA|DelimiterB))\\w+)*" // plain words
                )
                .map(matchResult -> matchResult.group())
                .forEach(System.out::println);
        }
    }
}
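For anyone on Java 8, where Scanner.findAll is not available, the same match-instead-of-split idea can be sketched with Pattern and Matcher. This is only a sketch under a few assumptions: the class name TokenizeSketch and the tokenize helper are made up for illustration, a bracket group is taken to be anything between ( and ) without nesting (looser than the pattern above), and delimiters are assumed to appear as whole words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenizeSketch {
    // Delimiter literals taken from the example in the question.
    private static final String DELIM = "DelimiterA|DelimiterB";

    // Alternatives are tried left to right, so a bracket group wins over
    // the delimiter it contains.
    private static final Pattern TOKEN = Pattern.compile(
            "\\([^()]*\\)"                                       // whole bracket group (assumed non-nested)
            + "|\\b(?:" + DELIM + ")\\b"                         // a delimiter as a whole word
            + "|\\w+(?:\\s+(?!(?:" + DELIM + ")\\b)\\w+)*");     // words up to the next delimiter

    // Hypothetical helper: returns the tokens in order of appearance.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String txt = "A Z DelimiterB B X DelimiterA (C DelimiterA D) DelimiterB (E DelimiterA F) DelimiterB G DelimiterA H";
        tokenize(txt).forEach(System.out::println);
    }
}
```

Run on the sample string, this should give the same eleven tokens listed in the question. Compiling the pattern once and collecting into a list also makes it easier to reuse the tokens than printing them straight from the stream.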
Upvotes: 1