Romain Verdier
Romain Verdier

Reputation: 13011

Best algorithm for synchronizing two IList in C# 2.0

Imagine the following type:

public struct Account
{
    public int Id;
    public double Amount;
}

What is the best algorithm to synchronize two IList<Account> in C# 2.0 ? (No linq) ?

The first list (L1) is the reference list, the second (L2) is the one to synchronize according to the first:

The Id identifies accounts. It's no too hard to find a naive and working algorithm, but I would like to know if there is a smart solution to handle this scenario without ruining readability and perfs.

EDIT :

Upvotes: 11

Views: 5736

Answers (7)

Teneko
Teneko

Reputation: 1491

Introduction

I have implemented two algorithms, one for sorted and one for sequential collections. Both support null values and duplicates, and work in the same way:

They yield return CollectionModification<LeftItemType,RightItemType>s which is similar to CollectionChangedEventArgs<T> (reference) which can be used in return to synchronize a collection.

Regarding yield return:

When using one or the other algorithm where your left items (the reference collection) is compared to right items, you can apply each yield returned CollectionModification as soon as they are yield returned , but this can result in a "collection was modified"-exception (for example when using List<T>.GetEnumerator). To prevent this, both algorithms have the ability implemented to use an indexable collection as reference collection that is about to be mutated. You only have to wrap the reference collection with YieldIteratorInfluencedReadOnlyList<ItemType> (abstract) by using the extensions methods in YieldIteratorInfluencedReadOnlyListExtensions. :)

SortedCollectionModifications

The first algorithm works for ascended or descended ordered lists and uses IComparer<T>.

/// <summary>
/// The algorithm creates modifications that can transform one collection into another collection.
/// The collection modifications may be used to transform <paramref name="leftItems"/>.
/// Assumes <paramref name="leftItems"/> and <paramref name="rightItems"/> to be sorted by that order you specify by <paramref name="collectionOrder"/>.
/// Duplications are allowed but take into account that duplications are yielded as they are appearing.
/// </summary>
/// <typeparam name="LeftItemType">The type of left items.</typeparam>
/// <typeparam name="RightItemType">The type of right items.</typeparam>
/// <typeparam name="ComparablePartType">The type of the comparable part of left item and right item.</typeparam>
/// <param name="leftItems">The collection you want to have transformed.</param>
/// <param name="getComparablePartOfLeftItem">The part of left item that is comparable with part of right item.</param>
/// <param name="rightItems">The collection in which <paramref name="leftItems"/> could be transformed.</param>
/// <param name="getComparablePartOfRightItem">The part of right item that is comparable with part of left item.</param>
/// <param name="collectionOrder">the presumed order of items to be used to determine <see cref="IComparer{T}.Compare(T, T)"/> argument assignment.</param>
/// <param name="comparer">The comparer to be used to compare comparable parts of left and right item.</param>
/// <param name="yieldCapabilities">The yieldCapabilities that regulates how <paramref name="leftItems"/> and <paramref name="rightItems"/> are synchronized.</param>
/// <returns>The collection modifications.</returns>
/// <exception cref="ArgumentNullException">Thrown when non-nullable arguments are null.</exception>
public static IEnumerable<CollectionModification<LeftItemType, RightItemType>> YieldCollectionModifications<LeftItemType, RightItemType, ComparablePartType>(
    IEnumerable<LeftItemType> leftItems,
    Func<LeftItemType, ComparablePartType> getComparablePartOfLeftItem,
    IEnumerable<RightItemType> rightItems,
    Func<RightItemType, ComparablePartType> getComparablePartOfRightItem,
    SortedCollectionOrder collectionOrder,
    IComparer<ComparablePartType> comparer,
    CollectionModificationsYieldCapabilities yieldCapabilities)

Python algorithm inspiration taken from: Efficient synchronization of two instances of an ordered list.

EqualityTrailingCollectionModifications

The second algorithm works for any order and uses IEqualityComparer<T>.

/// <summary>
/// The algorithm creates modifications that can transform one collection into another collection.
/// The collection modifications may be used to transform <paramref name="leftItems"/>.
/// The more the collection is synchronized in an orderly way, the more efficient the algorithm is.
/// Duplications are allowed but take into account that duplications are yielded as they are appearing.
/// </summary>
/// <typeparam name="LeftItemType">The type of left items.</typeparam>
/// <typeparam name="RightItemType">The type of right items.</typeparam>
/// <typeparam name="ComparablePartType">The type of the comparable part of left item and right item.</typeparam>
/// <param name="leftItems">The collection you want to have transformed.</param>
/// <param name="getComparablePartOfLeftItem">The part of left item that is comparable with part of right item.</param>
/// <param name="rightItems">The collection in which <paramref name="leftItems"/> could be transformed.</param>
/// <param name="getComparablePartOfRightItem">The part of right item that is comparable with part of left item.</param>
/// <param name="equalityComparer">The equality comparer to be used to compare comparable parts.</param>
/// <param name="yieldCapabilities">The yield capabilities, e.g. only insert or only remove.</param>
/// <returns>The collection modifications.</returns>
/// <exception cref="ArgumentNullException">Thrown when non-nullable arguments are null.</exception>
public static IEnumerable<CollectionModification<LeftItemType, RightItemType>> YieldCollectionModifications<LeftItemType, RightItemType, ComparablePartType>(
    IEnumerable<LeftItemType> leftItems,
    Func<LeftItemType, ComparablePartType> getComparablePartOfLeftItem,
    IEnumerable<RightItemType> rightItems,
    Func<RightItemType, ComparablePartType> getComparablePartOfRightItem,
    IEqualityComparer<ComparablePartType>? equalityComparer,
    CollectionModificationsYieldCapabilities yieldCapabilities)
    where ComparablePartType : notnull

Requirements

One of the following frameworks is required

  • .NET-Standard 2.0
  • .NET Core 3.1
  • .NET 5.0

Both algorithms are created with custom implemented types (IndexDirectory, NullableKeyDictionary, LinkedBucketList to name a few), so I can not simply copy paste the code here, so I would like to reference you to my following packages:

Implementation

Anitcipated classes

Account:

public class Account
{
    public Account(int id) =>
        Id = id;

    public int Id { get; }
    public double Amount { get; }
}

And the following collection item equality comparer class:

AccountEqualityComparer:

public class AccountEqualityComparer : EqualityComparer<Account>
{
    public new static AccountEqualityComparer Default = new AccountEqualityComparer();

    public override bool Equals([AllowNull] Account x, [AllowNull] Account y) =>
        ReferenceEquals(x, y) || (!(x is null && y is null) && x.Id.Equals(y.Id));

    public override int GetHashCode([DisallowNull] Account obj) =>
        obj.Id;
}

"My" classes

AccountCollectionViewModel:

using Teronis.Collections.Algorithms.Modifications;
using Teronis.Collections.Synchronization;
using Teronis.Collections.Synchronization.Extensions;
using Teronis.Reflection;

public class AccountCollectionViewModel : SyncingCollectionViewModel<Account, Account>
{
    public AccountCollectionViewModel()
        : base(CollectionSynchronizationMethod.Sequential(AccountEqualityComparer.Default))
    {
        // In case of SyncingCollectionViewModel, we have to pass a synchronization method.
        //
        //   Sequential means any order
        //
    }

    protected override Account CreateSubItem(Account superItem) =>
        superItem;

    protected override void ApplyCollectionItemReplace(in ApplyingCollectionModificationBundle modificationBundle)
    {
        foreach (var (oldItem, newItem) in modificationBundle.OldSuperItemsNewSuperItemsModification.YieldTuplesForOldItemNewItemReplace())
        {
            // Implementation detail: update left public property values by right public property values.
            TeronisReflectionUtils.UpdateEntityVariables(oldItem, newItem);
        }
    }
}

Program:

using System.Diagnostics;
using System.Linq;

class Program
{
    static void Main()
    {
        // Arrange
        var collection = new AccountCollectionViewModel();

        var initialData = new Account[] {
            new Account(5) { Amount = 0 },
            new Account(7) { Amount = 0 },
            new Account(3) { Amount = 0 }
        };

        var newData = new Account[] {
            new Account(5) { Amount = 10 }, 
            /* Account by ID 7 got removed .. */ 
            /* but account by ID 8 is new. */ 
            new Account(8) { Amount = 10 },
            new Account(3) { Amount = 10 }
        };

        // Act
        collection.SynchronizeCollection(initialData);

        // Assert
        Debug.Assert(collection.SubItems.ElementAt(1).Id == 7, "The account at index 1 has not the ID 7.");
        Debug.Assert(collection.SubItems.All(x => x.Amount == 0), "Not all accounts have an amount of 0.");

        // Act
        collection.SynchronizeCollection(newData);

        // Assert
        Debug.Assert(collection.SubItems.ElementAt(1).Id == 8, "The account at index 1 has not the ID 8.");
        Debug.Assert(collection.SubItems.All(x => x.Amount == 10), "Not all accounts have an amount of 10.");

        ;
    }
}

You can see that I use SyncingCollectionViewModel, a very "heavy" type. That's because I have not finished the lightweight SynchronizableCollection implementation yet (virtual methods for add, remove, replace and so on are missing).

Upvotes: 0

Mateus Wolkmer
Mateus Wolkmer

Reputation: 735

I had the same problem, and my best solution was the following (adapted to your case), having both lists loaded:

  1. For each Account in L1, verify if it exists in L2:
    • If found, update all values from L1's Account based on L2's. Then, delete the Account from L2.
    • If not found, mark L1's Account as deleted, or delete it from the list, it depends on how your database is structured.
  2. For each Account left in L2, add it into L1.

I recommend implementing the IEquatable<> interface in your Account class (or just override the Equals() method) so it always compares IDs on methods that require comparison between objects:

public struct Account : IEquatable<Account>
{
    public int Id;
    public double Amount;

    public bool Equals(Account other)
    {
        if (other == null) return false;
        return (this.Id.Equals(other.Id));
    }
}

The sync algorithm would be something like this (make sure both lists are initialized so no errors occurr):

L1.ForEach (L1Account =>
{
    var L2Account = L2.Find(a => a.Id == L1Account.id);
    // If found, update values
    if (L2Account != null)
    {
        L1Account.Amount = L2Account.Amount;
        L2.Remove(L2Account);
    }
    // If not found, remove it
    else
    {
        L1.Remove(L1Account);
    }
}
// Add any remaining L2 Account to L1
L1.AddRange(L2);

Upvotes: 1

rboarman
rboarman

Reputation: 8214

I know this is an old post but you should check out AutoMapper. It will do exactly what you want in a very flexible and configurable way.

AutoMapper

Upvotes: 0

Roger Lipscombe
Roger Lipscombe

Reputation: 91895

If your two lists are sorted, then you can simply walk through them in tandem. This is an O(m+n) operation. The following code could help:

class Program
{
    static void Main()
    {
        List<string> left = new List<string> { "Alice", "Charles", "Derek" };
        List<string> right = new List<string> { "Bob", "Charles", "Ernie" };

        EnumerableExtensions.CompareSortedCollections(left, right, StringComparer.CurrentCultureIgnoreCase,
            s => Console.WriteLine("Left: " + s), s => Console.WriteLine("Right: " + s), (x,y) => Console.WriteLine("Both: " + x + y));
    }
}

static class EnumerableExtensions
{
    public static void CompareSortedCollections<T>(IEnumerable<T> source, IEnumerable<T> destination, IComparer<T> comparer, Action<T> onLeftOnly, Action<T> onRightOnly, Action<T, T> onBoth)
    {
        EnumerableIterator<T> sourceIterator = new EnumerableIterator<T>(source);
        EnumerableIterator<T> destinationIterator = new EnumerableIterator<T>(destination);

        while (sourceIterator.HasCurrent && destinationIterator.HasCurrent)
        {
            // While LHS < RHS, the items in LHS aren't in RHS
            while (sourceIterator.HasCurrent && (comparer.Compare(sourceIterator.Current, destinationIterator.Current) < 0))
            {
                onLeftOnly(sourceIterator.Current);
                sourceIterator.MoveNext();
            }

            // While RHS < LHS, the items in RHS aren't in LHS
            while (sourceIterator.HasCurrent && destinationIterator.HasCurrent && (comparer.Compare(sourceIterator.Current, destinationIterator.Current) > 0))
            {
                onRightOnly(destinationIterator.Current);
                destinationIterator.MoveNext();
            }

            // While LHS==RHS, the items are in both
            while (sourceIterator.HasCurrent && destinationIterator.HasCurrent && (comparer.Compare(sourceIterator.Current, destinationIterator.Current) == 0))
            {
                onBoth(sourceIterator.Current, destinationIterator.Current);
                sourceIterator.MoveNext();
                destinationIterator.MoveNext();
            }
        }

        // Mop up.
        while (sourceIterator.HasCurrent)
        {
            onLeftOnly(sourceIterator.Current);
            sourceIterator.MoveNext();
        }

        while (destinationIterator.HasCurrent)
        {
            onRightOnly(destinationIterator.Current);
            destinationIterator.MoveNext();
        }
    }
}

internal class EnumerableIterator<T>
{
    private readonly IEnumerator<T> _enumerator;

    public EnumerableIterator(IEnumerable<T> enumerable)
    {
        _enumerator = enumerable.GetEnumerator();
        MoveNext();
    }

    public bool HasCurrent { get; private set; }

    public T Current
    {
        get { return _enumerator.Current; }
    }

    public void MoveNext()
    {
        HasCurrent = _enumerator.MoveNext();
    }
}

You'll have to be careful about modifying the collections while iterating over them, though.

If they're not sorted, then comparing every element in one with every element in the other is O(mn), which gets painful really quickly.

If you can bear to copy the key values from each collection into a Dictionary or similar (i.e. a collection with acceptable performance when asked "is X present?"), then you could come up with something reasonable.

Upvotes: 2

anon
anon

Reputation:

L2 = L1.clone()?

... but I would guess you forgot to mention something.

Upvotes: 0

VVS
VVS

Reputation: 19612

In addition to Jon Skeet's comment make your Account struct a class and override the Equals() and GetHashCode() method to get nice equality checking.

Upvotes: 0

Jon Skeet
Jon Skeet

Reputation: 1502586

For a start I'd get rid of the mutable struct. Mutable value types are a fundamentally bad thing. (As are public fields, IMO.)

It's probably worth building a Dictionary so you can easily compare the contents of the two lists. Once you've got that easy way of checking for presence/absence, the rest should be straightforward.

To be honest though, it sounds like you basically want L2 to be a complete copy of L1... clear L2 and just call AddRange? Or is the point that you also want to take other actions while you're changing L2?

Upvotes: 5

Related Questions