Reputation: 15
I have a class called Product that contains some properties like Id (as Guid) and Messages (as List), and a Message class that also contains Id and other properties. I have all messages in Message table and all products in product table. After getting data of both tables, I want to join them regarding on Id property. If I use the below code as it is linear search the performance is terrible.
foreach (Product product in products)
product.Messages = messages.Where(n => n.Id == product.Id).ToList();
Are there any other ways to do it faster?
Thanks
Upvotes: 1
Views: 581
Reputation: 12181
As others have indicated, do the join in the database. As you indicated, a linear search is O(n) (and you're actually doing quite a few linear searches in this case); however, most databases use a B-Tree data structure (or something similar) to sort rows by primary key. This means that a database search on a primary key is O(log n), which is obviously dramatically faster. (Assuming, of course, that the Id is the primary key).
Upvotes: 0
Reputation: 31194
You might be able to speed it up by groupding your messages into a lookup table.
messagesDict = messages
.GroupBy(x => x.Id)
.ToDictionary(x => x.Id, x.ToList());
or, as John Bustos suggested, you can use ToLookup();
messagesDict = messages
.ToLookup(x => x.Id);
you use it like this
//you might have to first check if messagesDict
//actually has any messages for your project.
product.Messages = messagesDict[product.Id];
Your original attempt is O(nm) where n is the number of projects and m is the number of messages.
A Dictionary
uses hashing, so from a practical standpoint, you can usually assume that it has close to O(1) inserts, and O(1) searches. Under ideal circumstances, List<T>.Add
is also O(1). This means that if you were to manually create your lookup dictionary, then, you could do it in O(m). I would hope that a built-in function like ToLookup
, achieves the same efficiency.
Once you do that, your algorthim becomes O(n + m)
Upvotes: 2
Reputation: 640
You should be doing the join in the database. That'll yield the best performance. If you insist on doing this in C# sort product by Id and sort messages by ID first.
Upvotes: 0