pzaj

Reputation: 1092

Combine two lists into one based on property

I would like to ask whether there's an elegant and efficient way to merge two lists of MyClass into one?

MyClass has three properties: ID, Name and ExtID.

The lists are populated from different sources, and objects in both lists share IDs, so the data looks like this:

MyClass instance from List1
ID = someInt
Name = someString
ExtID = null

And MyClass instance from List2

ID = someInt (same as List1)
Name = someString (same as List1)
ExtID = someInt

What I basically need is to combine these two lists, so the outcome is a list containing:

ID = someInt (from List1)
Name = someString (from List1)
ExtID = someInt (null if no corresponding item - based on ID - on List2)

I know I can do this with a simple foreach loop, but I'd love to know if there's a more elegant and maybe preferred (for performance or readability) method.

Upvotes: 5

Views: 10685

Answers (3)

mikus

Reputation: 3215

There are many approaches, depending on what your priority is, e.g. Union + Lookup:

// Build a lookup: ID -> all instances with that ID
var idMap = list1.Union(list2).ToLookup(myClass => myClass.ID);
// For each ID, pick the instance you want, e.g. the one that has an ExtID
var mergedInstances = idMap.Select(row =>
      row.FirstOrDefault(myClass => myClass.ExtID.HasValue) ?? row.First());

The benefit of the above is that it works with any number of lists of any size, even if they contain many duplicated instances, and you can easily modify the merge conditions.

A small improvement would be to extract a method to merge instances:

MyClass MergeInstances(IEnumerable<MyClass> instances)
{
     // Prefer the instance that carries an ExtID; otherwise take the first one
     return instances.FirstOrDefault(myClass => myClass.ExtID.HasValue)
          ?? instances.First(); // or whatever else you imagine
}

and now just use it in the code above

 var mergedInstances = idMap.Select(MergeInstances);

Clean, flexible, simple, no additional conditions. Performance-wise it's not perfect, but who cares.

Edit: since performance is the priority, here are some more options:

  1. Build a lookup like the one above, but only for the smaller list. Then iterate through the bigger list and apply the needed changes. Since the lookup is hash-based, this is roughly O(m) + O(n), where m is the smaller list's size and n the bigger list's. This should be the fastest.

  2. Order both lists by element ID. Then loop over both at once, keeping an index into each list. Advance whichever index points at the smaller ID; when both point at the same ID, merge the pair. O(n log n) + O(m log m) for sorting, plus O(n + m) for the pass.
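Both options could be sketched roughly like this. The shape of MyClass is assumed (int ID, string Name, nullable int ExtID, all settable), since the question doesn't show the real definition, and ListMerger, MergeByLookup and MergeSorted are made-up names. The first method indexes list2 rather than picking the smaller list, because the result the question asks for follows list1.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of MyClass; the question does not show the real definition.
public class MyClass
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int? ExtID { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class ListMerger
{
    // Option 1: index list2 by ID once, then make a single pass over list1.
    public static List<MyClass> MergeByLookup(IEnumerable<MyClass> list1, IEnumerable<MyClass> list2)
    {
        // ID -> ExtID; if an ID repeats in list2, the last entry wins.
        var extIdById = new Dictionary<int, int?>();
        foreach (var item in list2)
            extIdById[item.ID] = item.ExtID;

        return list1.Select(item => new MyClass
        {
            ID = item.ID,
            Name = item.Name,
            // No match in list2: fall back to list1's own ExtID (null in the question's data).
            ExtID = extIdById.TryGetValue(item.ID, out var extId) ? extId : item.ExtID
        }).ToList();
    }

    // Option 2: sort both lists by ID and walk them together.
    // Note that the result comes back ordered by ID, not in list1's original order.
    public static List<MyClass> MergeSorted(IEnumerable<MyClass> list1, IEnumerable<MyClass> list2)
    {
        var a = list1.OrderBy(x => x.ID).ToList();
        var b = list2.OrderBy(x => x.ID).ToList();
        var result = new List<MyClass>(a.Count);
        int j = 0;
        foreach (var item in a)
        {
            // Skip list2 entries whose ID is smaller than the current one.
            while (j < b.Count && b[j].ID < item.ID) j++;
            var match = (j < b.Count && b[j].ID == item.ID) ? b[j] : null;
            result.Add(new MyClass
            {
                ID = item.ID,
                Name = item.Name,
                ExtID = match?.ExtID ?? item.ExtID
            });
        }
        return result;
    }
}
```

Both methods return new instances instead of modifying the inputs. If mutating list1's items in place is acceptable, the Select / Add part can simply assign ExtID on the existing objects.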

Upvotes: 3

Godsent

Reputation: 980

Is this what you want?

var joined = from Item1 in list1
             join Item2 in list2
             on Item1.ID equals Item2.ID // join on the shared ID property
             select new MyClass(Item1.ID, Item1.Name, Item1.ExtID ?? Item2.ExtID);

Edit: If you're looking for an outer join,

var query = from Item1 in list1
            join Item2 in list2 on Item1.ID equals Item2.ID into gj
            from sublist2 in gj.DefaultIfEmpty()
            // sublist2 is null when list2 has no item with this ID, so ExtID stays null
            select new MyClass(Item1.ID, Item1.Name, sublist2?.ExtID);

Readability-wise, using a foreach loop is not a bad idea either.

Upvotes: 2

Paulo Lima

Reputation: 142

I'd suggest putting the foreach loop in a method of that class, so every time you need to do this you'd use something like

instanceList1.MergeLists(instanceList2)

and with this method, you can control everything you want within the merge operation.
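One way to get that call syntax is an extension method on List<MyClass>. The following is only a sketch: the shape of MyClass is assumed (the question doesn't show it), and MyClassListExtensions is a made-up name.

```csharp
using System.Collections.Generic;

// Assumed shape of MyClass; the question does not show the real definition.
public class MyClass
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int? ExtID { get; set; }
}

// Hypothetical extension class, for illustration only.
public static class MyClassListExtensions
{
    // Returns a new list that follows 'first' and takes ExtID from the matching item in 'second'.
    public static List<MyClass> MergeLists(this List<MyClass> first, List<MyClass> second)
    {
        var result = new List<MyClass>(first.Count);
        foreach (var item in first)
        {
            // Default to the item's own ExtID (null in the question's data).
            int? extId = item.ExtID;

            // Plain nested loop: fine for small lists, but O(n * m) overall.
            // For large lists, index 'second' in a Dictionary first.
            foreach (var other in second)
            {
                if (other.ID == item.ID)
                {
                    extId = other.ExtID;
                    break;
                }
            }

            result.Add(new MyClass { ID = item.ID, Name = item.Name, ExtID = extId });
        }
        return result;
    }
}
```

Any extra merge rules (for example, which Name wins when the two lists disagree) would go inside the inner if block.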

Upvotes: -2
