Franva

Reputation: 7077

How to calculate the time complexity of this odd method

The method is:

List<Book> books = new List<Book>();

public IEnumerable<Book> Shoot()
{
   foreach(var b in books)
   {
       bool find = true;
       foreach(var otherb in books)
       {
           if(otherb != b && otherb.Author == b.Author)
           {
               find = false;
           }
       }

       if(find)
       {
           yield return b;
       } 
   }
}

Normally, the time complexity would be O(books.Count^2), but there is an if(find) statement in the outer loop, and it may change the number of iterations.

So my questions are:

  1. What is the time complexity of this method?
  2. How did you calculate it?

Thank you in advance.

Upvotes: 1

Views: 2188

Answers (2)

Jaime

Reputation: 6814

You go through each book in the outer loop (n iterations), and for each outer book you go through every otherb in the inner loop (another n iterations), so the time complexity is O(n^2).

The yield return does not change the complexity of the algorithm. It creates an iterator, but if you traverse the whole sequence from the calling function, you still go through all the iterations of the algorithm.

What is the yield keyword used for in C#?
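For example, a small sketch of how the iterator defers the work but does not avoid it (assuming books is non-empty):

```csharp
var it = Shoot();        // no work done yet: iterator methods use deferred execution
var first = it.First();  // runs the outer loop only until the first unique book is yielded
var all = it.ToList();   // enumerates from scratch and performs the full O(n^2) scan
```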

To optimize the algorithm, as btilly mentions, you could do two passes over the collection. On the first pass you store the number of books per author in a hash table, and on the second pass you look each author up in the hash table (assuming constant-time lookups) and yield the book if its author has only one book:

public IEnumerable<Book> Shoot()
{
    var authors = new Dictionary<string, int>();
    foreach(var b in books) 
    {
        if(authors.ContainsKey(b.Author))
            authors[b.Author]++;
        else
            authors.Add(b.Author, 1);
    }

    foreach(var b in books) 
    {
        if(authors[b.Author] == 1)
            yield return b;
    }
}

This way you get a linear time complexity of O(n); note that you need O(n) extra space for the dictionary in this case.
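As a side note, the same O(n) grouping idea can be written with LINQ (a sketch; it needs `using System.Linq;` and assumes Author is a string property of Book):

```csharp
public IEnumerable<Book> Shoot()
{
    return books
        .GroupBy(b => b.Author)       // O(n): builds a hash-based grouping by author
        .Where(g => g.Count() == 1)   // keep only authors with exactly one book
        .Select(g => g.First());      // yield that author's single book
}
```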

Upvotes: 3

btilly

Reputation: 46409

Your worst case performance per yield is O(n * n). Your best case is O(n). If you assume that the authors are randomly ordered, and that a fixed proportion of them wrote only one book, then the average case between yields is O(n), because the probability of reaching m iterations of the outer loop decreases exponentially as m increases. (Insert standard geometric series argument here.)
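The geometric-series argument can be made concrete. Assuming each book is independently unique with some fixed probability p > 0, the number of outer-loop iterations M until the next yield is geometrically distributed:

```latex
% Expected number of outer-loop iterations until the next unique book:
E[M] = \sum_{m=1}^{\infty} m\,(1-p)^{m-1}\,p = \frac{1}{p} = O(1)
% Each outer iteration costs O(n) for the inner scan over all books,
% so the expected cost between yields is O(1) \cdot O(n) = O(n).
```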

Generally (but not always!) people are most interested in the average case.

Incidentally, the standard way of handling this problem would be to build a dictionary up front mapping each author to the count of how many books they wrote. That takes O(n) time. Each yield after that would just scan through the entries of that dictionary looking for the next author with only 1 book. The average time of subsequent yields would be O(1), the worst case O(n), and the amortized average time across all yields (assuming that a fixed proportion of authors wrote only one book) will be O(1) per yield, as opposed to the current implementation, which is O(n) per yield.

Upvotes: 1
