Krzysztof
Krzysztof

Reputation: 2082

algorithm for finding longest sequence of the same element in 1D array-looking for better solution

I need to find algorithm which will find the longest seqeunce of element in one dimension array.

For example:

int[] myArr={1,1,1,3,4,5,5,5,5,5,3,3,4,3,3} 

solution will be 5 because sequnece of 5 is the longest. This is my solution of the problem:

static int findSequence(int [] arr, int arrLength){

    int frequency=1;
    int bestNumber=arr[0];
    int bestFrequency=0;

    for(int n=1;n<arrLength;n++){
        if(arr[n]!=arr[n-1]){ 
            if(frequency>bestFrequency){
                bestNumber=arr[n-1];
                bestFrequency=frequency;
            }
            frequency=1;
        }else { 
            frequency++;
        }
    }

    if( frequency>bestFrequency){
        bestNumber=arr[arrLength-1];
        bestFrequency=frequency;
    }

    return bestNumber;
}

but I'm not satisfied.May be some one know more effective solution?

Upvotes: 4

Views: 9272

Answers (5)

kedark
kedark

Reputation: 246

Try this:

public static void longestSequence(int[] a) {
    int count = 1, max = 1;

    for (int i = 1; i < a.length; i++) {
        if (a[i] == a[i - 1]) {
            count++;
        } else {
            if (count > max) {
                max = count;
            }
            count = 1;
        }
    }

    if (count> max)
        System.out.println(count);
            else
        System.out.println(max);    
}

Upvotes: 2

user1676075
user1676075

Reputation: 3086

I was thinking this must be an O(n) problem, but now I'm wondering if it doesn't have to be, that you could potentially make it O(log n) using a binary search (I don't think what @BlackJack posted actually works quite right, but it was inspiring):

Was thinking something like keep track of first, last element (in a block, probably a recursive algorithm). Do a binary split (so middle element to start). If it matches either first or last, you possibly have a run of at least that length. Check if the total length could exceed the current known max run. If so, continue, if not break.

Then repeat the process - do a binary split of one of those halves to see if the middle item matches. Repeat this process, recursing up and down to get the maximum length of a single run within a branch. Stop searching a branch when it can't possibly exceed the maximum run length.

I think this still comes out to be an O(n) algorithm because the worth-case is still searching every single element (consider a max length of 1 or 2). But you limit to checking each item once, and you search into the most-likely longest branches first (based on start/middle/end matches), it could potentially skip some fairly long runs. A breadth-first rather than depth-first search would also help.

Upvotes: 0

BlackJack
BlackJack

Reputation: 93

You can skip the some number in the array in the following pattern:

Maintain a integer jump_count to maintain the number of elements to skip (which will be bestFrequency/2). The divisor 2 can be changed according to the data set. Update the jump_count every time you update the bestFrequency.

Now, after every jump

  1. If previous element is not equal to current element and frequency <= jump_count, then scan backwards from current element to find number of duplicates and update the frequency.

    e.g. 2 2 2 2 3 3 and frequency = 0 (bold are previous and current elements), then scan backwards to find number of 3's and update the frequency = 2

  2. If previous element is not equal to current element and frequency > jump_count, scan for scan for every element to update the frequency and update the bestFrequency if needed.

    e.g. 2 2 2 2 2 3 3 and frequency = 1 (bold are previous and current elements), scan for number of 2's in this jump and update the frequency = 1 + 4. Now, frequency < bestFrequency, scan backwards to find number of 3's and update the frequency = 2.

  3. If previous element = current element, scan the jump to make sure it is continuous sequence. If yes, update the frequency to frequency + jump_count, else consider this as the same case as step 2.

    Here, we will consider two examples:

    a) 2 2 2 2 2 2 (bold are previous and current elements), check if the jump contains all 2's. Yes in this case, so add the jump_count to frequency.

    b) 2 2 2 2 3 2 (bold are previous and current elements), check if the jump contains all the 2's. No in this case, so considering this as in step 2. So, scan for number of 2's in this jump and update the frequency = 1 + 4. Now, frequency < bestFrequency, scan backwards to find number of 2's(from the current element) and update the frequency = 1.

Optimization: You can save some loops in many cases.

P.S. Since this is my first answer, I hope I am able to convey myself.

Upvotes: 2

Colin D
Colin D

Reputation: 5661

Your algorithm is pretty good.

It touches each array element (except the last) only once. This puts it at O(n) runtime which for this problem seems like the best worst case runtime you can get and is a pretty good worst case runtime as far as algorithms go.

One possible suggestion is when you find a new bestFrequency and n+bestFrequency > arrayLength you can break out of the loops. This is because you know a longer sequence cannot be found.

Upvotes: 1

user1676075
user1676075

Reputation: 3086

The only optimization that seems possible is:

for(int n=1;n<arrLength && frequency + (arrLength - n) >= bestFrequency;n++){

because you don't need to search any further one you can't possible exceed the best frequency with the number of elements remaining (probably possible to simplify that even further given a little more thought).

But as others point out, you're doing a O(n) search on n elements for a sequence - there's really no more efficient algorithm available.

Upvotes: 0

Related Questions