Reputation: 649
I'm reading text and want to find the end of the first sentence, which for now means the first index of '.', '?', or '!' in a string. Here's my Java code:
int next = -1;
int nextQ = text.indexOf("? ");
int nextE = text.indexOf("! ");
int nextDot = text.indexOf(". ");
if (nextDot > 0) {
next = nextDot;
if (nextQ > 0){
if (nextQ < next) {next = nextQ;}
if (nextE > 0) {
if (nextE < next) {next = nextE;}
}
} else if (nextE > 0){
if (nextE < next) {next = nextE;}
}
} else if (nextQ > 0){
next = nextQ;
if (nextE > 0 && nextE < next){next = nextE;}
} else if (nextE > 0) { next = nextE;}
I believe the code works, but that's a total of 10 if statements, which doesn't look too neat. I might want to add more sentence delimiters, and I don't think this approach is very flexible. Is there a better or shorter way of achieving the same result? Or should I try some other programming language for this sort of problem? Which one?
Upvotes: 3
Views: 139
Reputation: 7461
You could simply filter out the values that are not valid (== -1) and take the minimum of the rest (Java 8):
int nextQ = text.indexOf("? ");
int nextE = text.indexOf("! ");
int nextDot = text.indexOf(". ");
OptionalInt res = IntStream.of(nextQ, nextE, nextDot).filter(i -> i != -1).min();
if (res.isPresent()) {
    // ok, use res.getAsInt()
} else {
    // none of these substrings found
}
It's more of a joke than a real answer; in real life, gandaliter's answer should be used.
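If more delimiters are likely to be added later, the same filtering idea can take the delimiters as arguments instead of one variable per delimiter. This is only a minimal sketch, assuming Java 8 or later; the class and method names (StreamDelimiters, firstIndexOfAny) are made up for illustration:

```java
import java.util.OptionalInt;
import java.util.stream.Stream;

final class StreamDelimiters {
    // Hypothetical helper: returns the smallest index at which any of the
    // given delimiters occurs in text, or an empty OptionalInt if none does.
    static OptionalInt firstIndexOfAny(String text, String... delimiters) {
        return Stream.of(delimiters)
                .mapToInt(text::indexOf)   // -1 when a delimiter is absent
                .filter(i -> i != -1)      // drop the "not found" results
                .min();
    }
}
```

A call such as StreamDelimiters.firstIndexOfAny(text, "? ", "! ", ". ") then replaces the three separate indexOf variables, and adding a delimiter is one more argument.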
Upvotes: 2
Reputation: 311050
I would suggest just looping through the string character by character and stopping when you encounter any of those characters. What you're doing now is many times less efficient.
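For illustration, a single pass over the string could look roughly like this. It is a sketch under a couple of assumptions: the delimiter must be followed by a space (as in the question's code), and the names CharScan and firstSentenceEnd are hypothetical:

```java
final class CharScan {
    // Hypothetical helper: scans text once and returns the index of the first
    // character that appears in delimiters and is followed by a space,
    // or -1 if there is no such character.
    static int firstSentenceEnd(String text, String delimiters) {
        // Stop one short of the end: the last character has no follower.
        for (int i = 0; i + 1 < text.length(); i++) {
            if (delimiters.indexOf(text.charAt(i)) >= 0 && text.charAt(i + 1) == ' ') {
                return i;
            }
        }
        return -1;
    }
}
```

Passing the delimiters as a string, e.g. CharScan.firstSentenceEnd(text, ".?!"), means a new delimiter is a one-character change.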
Upvotes: 0
Reputation: 10111
I'd suggest using a regular expression to search for all of those delimiters at once.
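import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = <TEXT>;
int next;
Pattern p = Pattern.compile("\\? |! |\\. ");
Matcher m = p.matcher(text);
if (m.find()) {
    next = m.start(); // index of the first delimiter
} else {
    next = -1; // no delimiter found
}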
You can change the regex to adjust exactly what is matched. For example, instead of requiring exactly a space after the delimiter, I'd suggest requiring any whitespace character, so that a line break or tab will also work: "\\?\\s|!\\s|\\.\\s". You can add extra delimiters in the same way and, with a little extra work, detect which delimiter was matched.
The documentation for Java regular expressions in the Pattern class is here and a useful tutorial here.
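One possible way to do the "little extra work" is to put the delimiters in a character class and capture it, so the matched delimiter can be read back from the group. This is only a sketch; the class name, the constant and both helper methods are hypothetical, and the character class is a variation on the alternation above rather than the exact pattern from this answer:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexDelimiters {
    // One of '.', '?' or '!' (captured as group 1), followed by any whitespace.
    private static final Pattern SENTENCE_END = Pattern.compile("([.?!])\\s");

    // Hypothetical helper: index of the first sentence-ending delimiter, or -1.
    static int firstSentenceEnd(String text) {
        Matcher m = SENTENCE_END.matcher(text);
        return m.find() ? m.start() : -1;
    }

    // Hypothetical helper: the delimiter that matched, or '\0' if none did.
    static char firstDelimiter(String text) {
        Matcher m = SENTENCE_END.matcher(text);
        return m.find() ? m.group(1).charAt(0) : '\0';
    }
}
```

Inside a character class the '.' and '?' need no escaping, so adding a delimiter is usually just one more character between the brackets.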
Upvotes: 8
Reputation: 70584
Use methods to keep DRY:
int firstDelimiterIndex(String s) {
return minIndex(s.indexOf(". "), minIndex(s.indexOf("? "), s.indexOf("! ")));
}
int minIndex(int a, int b) {
if (a == -1) return b;
if (b == -1) return a;
return Math.min(a, b);
}
Or choose a faster algorithm:
for (int i = 0; i < s.length(); i++) {
    switch (s.charAt(i)) {
        case '.':
        case '?':
        case '!':
            // only count the delimiter if a space follows it
            if (i + 1 < s.length() && s.charAt(i + 1) == ' ')
                return i;
    }
}
return -1; // no delimiter found
Upvotes: 5
Reputation: 39059
Use Math.min and a small modification.
First, turn -1 into large positive integers:
int largeMinusOne(int a)
{
    // -1 means "not found"; replace it with a value larger than any valid index
    return a == -1 ? Integer.MAX_VALUE : a;
}
int nextQ = largeMinusOne(text.indexOf("? "));
int nextE = largeMinusOne(...);
int nextDot = largeMinusOne(...);
And now:
int next = Math.min(Math.min(nextQ, nextE), nextDot);
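Note that with this trick next ends up as the large sentinel value when none of the delimiters is present, so the caller still has to check for that. A minimal sketch of the whole thing, assuming Integer.MAX_VALUE as the sentinel and mapping it back to -1 at the end; the names SentinelMin, orMax and firstSentenceEnd are made up for illustration:

```java
final class SentinelMin {
    // Maps the "not found" result (-1) to a value larger than any valid index.
    static int orMax(int index) {
        return index == -1 ? Integer.MAX_VALUE : index;
    }

    // Hypothetical helper: smallest index of "? ", "! " or ". " in text,
    // or -1 if none of them occurs.
    static int firstSentenceEnd(String text) {
        int next = Math.min(
                Math.min(orMax(text.indexOf("? ")), orMax(text.indexOf("! "))),
                orMax(text.indexOf(". ")));
        // Translate the sentinel back to the usual "not found" value.
        return next == Integer.MAX_VALUE ? -1 : next;
    }
}
```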
Upvotes: 3