Reputation: 679

Time complexity of a method that checks a sublist in a linked list

I need to write a method public int subList (CharList list) that gets a list and returns how many times this list exist.

For example:

my list is a b c d a b g e

parameter list ita b it will return 2.

my list is b b b b

parameter list is b b it will return 3.

The method should be as efficient as possible.

Currently my problem is I don't know what do they mean when they say as efficient as possible, If I loop through the list n times, and everytime I find the same character I loop on both list and go back to where I was it will be O(n^2)? is there a better way to make is O(n) or below?

Upvotes: 3

Answers (3)

Stephen C

Reputation: 719709

Lets do this with a char[] to simplify the explanation.

The simple approach is as follows:

public int countSublists (char[] list, char[] sublist)
    int count = 0;
    for (i = 0; i < list.length; i++) {
        for (j = 0; j <= sublist.length; j++) {
            if (j = sublist.length) {
               count++;
            } else if (i + j > list.length || list[i + j] != sublist[j]) {
               break;
            }
        }
    }
    return count;
}

This has worst-case complexity of O(N*M) where N is the length of list and M is the length of sublist. And best-case complexity of O(N) ... when there are no instances of the first character of sublist in list.

There are various other algorithms that give better performance ... down to (I think) O(N/M) best-case. The general idea is that you use the value of the character at list[i + j] when there is a mismatch to allow you to skip some characters.

You can find details of various advanced search algorithms are linked from the Wikipedia page on String Searching algorithms ... which also includes a summary of the respective algorithm complexity.

But the thing to note is that the advanced search algorithms all involve some precompution steps, whose complexity is some function of M. If N is small enough, the cost of the precomputation may outweigh the saving at search time. (On the other hand, if you are repeatedly counting the same sublist in different lists, and you can reuse the precomputed tables, then you can amortize the precomputation cost ...)

Upvotes: 1