Babi
Babi

Reputation: 15

Improve performance of List<T>

I have a class called Product that contains some properties like Id (as Guid) and Messages (as List), and a Message class that also contains Id and other properties. I have all messages in Message table and all products in product table. After getting data of both tables, I want to join them regarding on Id property. If I use the below code as it is linear search the performance is terrible.

foreach (Product product in products)
    product.Messages = messages.Where(n => n.Id == product.Id).ToList();

Are there any other ways to do it faster?

Thanks

Upvotes: 1

Views: 581

Answers (3)

As others have indicated, do the join in the database. As you indicated, a linear search is O(n) (and you're actually doing quite a few linear searches in this case); however, most databases use a B-Tree data structure (or something similar) to sort rows by primary key. This means that a database search on a primary key is O(log n), which is obviously dramatically faster. (Assuming, of course, that the Id is the primary key).

Upvotes: 0

You might be able to speed it up by groupding your messages into a lookup table.

messagesDict = messages
    .GroupBy(x => x.Id)
    .ToDictionary(x => x.Id, x.ToList());

or, as John Bustos suggested, you can use ToLookup();

messagesDict = messages
    .ToLookup(x => x.Id);

you use it like this

//you might have to first check if messagesDict 
//actually has any messages for your project.
product.Messages = messagesDict[product.Id];

Your original attempt is O(nm) where n is the number of projects and m is the number of messages.

A Dictionary uses hashing, so from a practical standpoint, you can usually assume that it has close to O(1) inserts, and O(1) searches. Under ideal circumstances, List<T>.Add is also O(1). This means that if you were to manually create your lookup dictionary, then, you could do it in O(m). I would hope that a built-in function like ToLookup, achieves the same efficiency.

Once you do that, your algorthim becomes O(n + m)

Upvotes: 2

Charles
Charles

Reputation: 640

You should be doing the join in the database. That'll yield the best performance. If you insist on doing this in C# sort product by Id and sort messages by ID first.

Upvotes: 0

Related Questions