Miro Lehtonen

Reputation: 649

How to find the smallest positive int efficiently?

I'm reading text and want to find the end of the first sentence, which for now means the first index of '.', '?', or '!' (each followed by a space) in a string. Here's my Java code:

int next = -1;
int nextQ = text.indexOf("? ");
int nextE = text.indexOf("! ");
int nextDot = text.indexOf(". ");
if (nextDot > 0) {
    next = nextDot;
    if (nextQ > 0){
        if (nextQ < next) {next = nextQ;}
        if (nextE > 0) {
            if (nextE < next) {next = nextE;}
        }
    } else if (nextE > 0){
        if (nextE < next) {next = nextE;}
    }
} else if (nextQ > 0){
    next = nextQ;
    if (nextE > 0 && nextE < next){next = nextE;}
} else if (nextE > 0) { next = nextE;}

I believe the code works, but that's a total of 10 if statements, which doesn't look very neat. I might want to add more sentence delimiters, but I don't think this approach is very flexible. Is there a better or shorter way of achieving the same result? Or should I try some other programming language for this sort of problem? If so, which one?

Upvotes: 3

Views: 139

Answers (5)

Dmitry Ginzburg

Reputation: 7461

You could simply filter out the values that are not valid (== -1), using Java 8 streams:

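import java.util.OptionalInt;
import java.util.stream.IntStream;

int nextQ = text.indexOf("? ");
int nextE = text.indexOf("! ");
int nextDot = text.indexOf(". ");
// Drop the "not found" results (-1) and take the smallest remaining index.
OptionalInt res = IntStream.of(nextQ, nextE, nextDot).filter(i -> i != -1).min();
if (res.isPresent()) {
    // ok, using res.getAsInt()
} else {
    // none of these substrings found
}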

This is more of a joke than a real answer; in real life, gandaliter's answer should be used.

Upvotes: 2

user207421

Reputation: 311050

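I would suggest just looping through the string character by character and stopping when you encounter any of those characters. What you're doing now scans the text once per delimiter, and each scan may run to the end of the text even when another delimiter occurs early, so it can be many times less efficient.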
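A minimal sketch of that single-pass loop might look like the following. The class name `SentenceScanner` and the `DELIMITERS` constant are made up for illustration, and unlike the code in the question this version does not require a space after the delimiter.

```java
final class SentenceScanner {
    // Hypothetical delimiter set; append characters here to support more delimiters.
    private static final String DELIMITERS = ".?!";

    /** Returns the index of the first delimiter character in text, or -1 if there is none. */
    static int firstDelimiterIndex(String text) {
        for (int i = 0; i < text.length(); i++) {
            // String.indexOf(int) returns -1 when the character is not a delimiter
            if (DELIMITERS.indexOf(text.charAt(i)) >= 0) {
                return i; // stop at the first hit; the rest of the text is never read
            }
        }
        return -1;
    }
}
```

Calling `SentenceScanner.firstDelimiterIndex(text)` reads the text once, so adding a delimiter only changes the constant rather than adding another scan.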

Upvotes: 0

gandaliter

Reputation: 10111

I'd suggest using a regular expression to search for any of those delimiters at once.

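import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = <TEXT>;  // placeholder for the input text
int next;
// '?' and '.' are regex metacharacters, so they are escaped
Pattern p = Pattern.compile("\\? |! |\\. ");
Matcher m = p.matcher(text);
if (m.find()) {
    next = m.start();  // index of the first delimiter found
} else {
    next = -1;         // no delimiter found
}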

You can change the regex to adjust exactly what is matched. For example, rather than requiring exactly a space after the delimiter, I'd suggest requiring any whitespace character, so that a line break or tab will also work: "\\?\\s|!\\s|\\.\\s". You can add extra delimiters in a similar manner and, with a little extra work, detect which delimiter was matched.

The documentation for Java regular expressions in the Pattern class is here and a useful tutorial here.
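One way to detect which delimiter was matched is to put the delimiters in a capturing group and read it back with `Matcher.group`. The sketch below assumes a character class is acceptable in place of the alternation; the class and method names (`DelimiterFinder`, `indexOfFirst`, `firstDelimiter`) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DelimiterFinder {
    // Group 1 captures the delimiter itself; \s requires a whitespace character after it.
    private static final Pattern SENTENCE_END = Pattern.compile("([.?!])\\s");

    /** Returns the index of the first delimiter followed by whitespace, or -1 if none. */
    static int indexOfFirst(String text) {
        Matcher m = SENTENCE_END.matcher(text);
        return m.find() ? m.start() : -1;
    }

    /** Returns the first delimiter followed by whitespace, or null if none. */
    static String firstDelimiter(String text) {
        Matcher m = SENTENCE_END.matcher(text);
        return m.find() ? m.group(1) : null;
    }
}
```

Inside a character class, '.' and '?' are literal, so no escaping is needed there; adding a delimiter means adding one character between the brackets.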

Upvotes: 8

meriton

Reputation: 70584

Use methods to keep DRY:

int firstDelimiterIndex(String s) {
    return minIndex(s.indexOf(". "), minIndex(s.indexOf("? "), s.indexOf("! ")));
}

int minIndex(int a, int b) {
    if (a == -1) return b;
    if (b == -1) return a;
    return Math.min(a, b);
}

Or choose a faster algorithm:

for (int i = 0; i < s.length(); i++) {
    switch (s.charAt(i)) {
    case '.':
    case '?':
    case '!':
        // only count a delimiter that is followed by a space
        if (i + 1 < s.length() && s.charAt(i + 1) == ' ')
            return i;
    }
}
return -1; // no delimiter found

Upvotes: 5

zmbq

Reputation: 39059

Use Math.min and a small modification.

First, turn -1 into a large positive integer:

int largeMinusOne(int a)
{
    // Integer.MAX_VALUE is larger than any valid index, so "not found" never wins Math.min
    return a == -1 ? Integer.MAX_VALUE : a;
}

int nextQ = largeMinusOne(text.indexOf("? "));
int nextE = largeMinusOne(...);
int nextDot = largeMinusOne(...);

And now:

int next = Math.min(Math.min(nextQ, nextE), nextDot);
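The same sentinel idea can be folded into one helper that accepts any number of delimiters and maps the sentinel back to -1 when nothing was found. This is only a sketch under that assumption; `FirstIndex` and `firstIndexOfAny` are names invented here.

```java
final class FirstIndex {
    /** Returns the smallest index at which any delimiter occurs in text, or -1 if none does. */
    static int firstIndexOfAny(String text, String... delimiters) {
        int best = Integer.MAX_VALUE; // sentinel meaning "nothing found yet"
        for (String d : delimiters) {
            int i = text.indexOf(d);
            if (i != -1) {
                best = Math.min(best, i);
            }
        }
        // Convert the sentinel back to the conventional "not found" value
        return best == Integer.MAX_VALUE ? -1 : best;
    }
}
```

For example, `FirstIndex.firstIndexOfAny(text, "? ", "! ", ". ")` covers the three delimiters from the question, and further delimiters are just extra arguments.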

Upvotes: 3
