Reputation: 21368
I need to split a String on the dot character '.', with one catch, explained below. For example, if the String is
String str = "A.B.C";
then splitting on the dot gives A, B and C.
But if some part of the String is enclosed in single quotes, the split should ignore the dots inside it:
String str = "A.B.'C.D'";
Here, splitting on the dot should give A, B and C.D.
How can I achieve this?
Upvotes: 0
Views: 114
Reputation: 12952
I don't know of a method in the standard library that does this. It is not too difficult to write yourself, though:
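import java.util.ArrayList;
import java.util.List;

public static String[] splitByDots(String s)
{
    List<String> ss = new ArrayList<>();
    boolean inString = false;  // true while inside a single-quoted part
    int start = 0;             // index where the current token begins
    for (int p = 0; p < s.length(); p++) {
        char ch = s.charAt(p);
        if (ch == '\'') {
            inString = !inString;
        }
        else if (ch == '.') {
            if (!inString) {
                // Dot outside quotes: close the current token
                ss.add(s.substring(start, p));
                start = p + 1;
            }
        }
    }
    ss.add(s.substring(start));  // last token (quote characters are kept)
    return ss.toArray(new String[ss.size()]);
}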
If you want to trim whitespace or remove the quote characters, you will have to tweak the above code a bit, but otherwise it does what you asked for.
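As a rough illustration of that tweak, here is a minimal sketch that drops the quote characters and trims each token. The class name QuoteAwareSplitter and the choice to return a List are assumptions made for this example, not part of the code above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named only for this sketch.
class QuoteAwareSplitter {

    // Splits on dots that are outside single quotes.
    // Quote characters are dropped and each token is trimmed.
    static List<String> splitByDots(String s) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch == '\'') {
                inQuotes = !inQuotes;          // toggle state, do not copy the quote
            } else if (ch == '.' && !inQuotes) {
                tokens.add(current.toString().trim());
                current.setLength(0);          // start a new token
            } else {
                current.append(ch);            // includes dots inside quotes
            }
        }
        tokens.add(current.toString().trim()); // last token
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(splitByDots("A.B.'C.D'")); // prints [A, B, C.D]
    }
}
```

Building each token in a StringBuilder instead of using substring makes it easy to skip the quote characters while scanning.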
Upvotes: 0
Reputation: 18173
First, split at ' and afterwards, if any of the split results ends in ., split that result at . as well:
"A.B.'C.D'" => "A.B.", "C.D" => "A", "B", "C.D"
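import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public static void main(String[] args) {
    final String str = "A.B.'C.D'";
    final List<String> result = new ArrayList<>();
    // Splitting at the quote separates the unquoted and quoted pieces
    for (String singleQuoteSplitResultArrayElement : str.split("'")) {
        if (singleQuoteSplitResultArrayElement.endsWith(".")) {
            // Unquoted piece such as "A.B.": split it at the dots
            Collections.addAll(result, singleQuoteSplitResultArrayElement.split("\\."));
        } else {
            // Quoted piece such as "C.D": keep it whole
            result.add(singleQuoteSplitResultArrayElement);
        }
    }
    System.out.println(result.stream().collect(Collectors.joining(", ")));
}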
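Note that the endsWith(".") check only recognises an unquoted piece that comes before a quoted one. For an input such as "A.B.'C.D'.E", the piece ".E" would be added unsplit. A possible variant, sketched below, decides by position instead: after splitting at ', the pieces at odd indices are the quoted ones. The class name ParitySplit is hypothetical, and the sketch assumes balanced quotes and drops empty tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class ParitySplit {

    // Assumes quotes are balanced; empty tokens are dropped.
    static List<String> split(String str) {
        List<String> result = new ArrayList<>();
        String[] pieces = str.split("'");
        for (int i = 0; i < pieces.length; i++) {
            if (i % 2 == 1) {
                // Odd index: the piece was between two quotes, keep it whole
                result.add(pieces[i]);
            } else {
                // Even index: the piece was outside quotes, split it at the dots
                for (String token : pieces[i].split("\\.")) {
                    if (!token.isEmpty()) {
                        result.add(token);
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", split("A.B.'C.D'.E"))); // prints A, B, C.D, E
    }
}
```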
Upvotes: 2
Reputation: 48444
You can do the following, which works with both single-letter and multi-letter tokens:
String input = "A.B.'C.D'";
// (?<![A-Z]+?')  not preceded by capital letter(s) and '
// \\.            dot (escaped)
// (?![A-Z]+?')   not followed by capital letter(s) and '
System.out.println(Arrays.toString(input.split("(?<![A-Z]+?')\\.(?![A-Z]+?')")));
Output
[A, B, 'C.D']
Note
If you want it to be case-insensitive, prepend (?i) to the pattern: (?i)(?<![A-Z]+?')\\.(?![A-Z]+?')
Upvotes: 0
Reputation: 36304
If the String is always in the given format, you could try \\.(?![A-Za-z]') as the regex.
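To make the "given format" caveat concrete, the sketch below applies that regex and, next to it, a more general alternative that is an assumption of this example rather than part of the answer: split at a dot only if an even number of quote characters follows it up to the end of the input. The class and method names are hypothetical, and the alternative assumes balanced quotes.

```java
import java.util.Arrays;

// Hypothetical class and method names, used only for this sketch.
class RegexSplit {

    // The regex suggested above: a dot that is not followed by
    // a single letter and a quote.
    static String[] splitSimple(String s) {
        return s.split("\\.(?![A-Za-z]')");
    }

    // Assumed generalization: a dot followed by an even number of quotes
    // up to the end of the input, i.e. a dot outside any quoted part.
    static String[] splitOutsideQuotes(String s) {
        return s.split("\\.(?=(?:[^']*'[^']*')*[^']*$)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitSimple("A.B.'C.D'")));
        // prints [A, B, 'C.D']
        System.out.println(Arrays.toString(splitSimple("A.B.'CC.DD'")));
        // prints [A, B, 'CC, DD'] because the quoted part has more than one letter after the dot
        System.out.println(Arrays.toString(splitOutsideQuotes("A.B.'CC.DD'.E")));
        // prints [A, B, 'CC.DD', E]
    }
}
```

Both variants keep the quote characters in the result. The second one rescans the rest of the input at every dot, so it is better suited to short strings.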
Upvotes: 2