Netanel Swartz
Netanel Swartz

Reputation: 147

How to Get the longest match found in number of sets, order is important

I need to find a way to return the longest match found in number of sets/lists (values returns only once) when the order of items is important. the list is not cyclic.

A match is a sequence of values that exists in all the lists and maintains the same order of elements in all the lists.

e.g. 1:

    List<int> list1 = new List<int> { 1, 2, 3, 4, 7, 9 };
    List<int> list2 = new List<int> { 1, 2, 5, 6, 3, 4, 7, 9 };
    List<int> list3 = new List<int> { 1, 2, 3, 6, 8, 9 };
    List<int> list4 = new List<int> { 1, 2, 5, 6, 8, 9 };

result { 1, 2 }

e.g. 2:

     List<int> list1 = new List<int> { 2, 3, 6, 8, 1, 18 };
     List<int> list2 = new List<int> { 2, 3, 4, 6, 8, 1, 18, 19, 17, 14 };
     List<int> list3 = new List<int> { 2, 5, 6, 8, 1, 18, 16, 13, 14 };
     List<int> list4 = new List<int> { 2, 6, 8, 1, 18, 19, 17, 14 };

result { 6, 8, 1, 18 }

The match doesn't have to be found at the beginning or at the end and can be on any part of any list.

I hope that I explained my problem good enough :)

Thanks!

Upvotes: 0

Views: 184

Answers (3)

Eric
Eric

Reputation: 5733

  1. First generate an ordered combination of int from the shortest list
  2. Compare the lists other than shortest list with the combination. For easy comparison of lists I just convert to string and use string.Contains()
  3. Return immediately if find the match as the items left are next order or the shorter one.

public static List<int> GetLongestMatch(params List<int>[] all)
{
    var shortest = all.Where(i => i.Count == all.Select(j => j.Count).Min()).First();
    var permutations = (from length in Enumerable.Range(1, shortest.Count)
                        orderby length descending
                        from count in Enumerable.Range(1, shortest.Count - length + 1)
                        select shortest.Skip(count - 1).Take(length).ToList())
                        .ToList();

    Func<List<int>, string> stringfy = (list) => { return string.Join(",", list.Select(i => i.ToString()).ToArray()); };
    foreach (var item in permutations)
    {
        Debug.WriteLine(string.Join(", ", item.Select(i => i.ToString()).ToArray()));
        if (all.All(list => stringfy(list).Contains(stringfy(item))))
        {
            Debug.WriteLine("Matched, skip process and return");
            return item;
        }
    }
    return new List<int>();
}

Usage

var result = GetLongestMatch(list1, list2, list3, list4);

Result

2, 3, 6, 8, 1, 18
2, 3, 6, 8, 1
3, 6, 8, 1, 18
2, 3, 6, 8
3, 6, 8, 1
6, 8, 1, 18
Matched, skip process and return

Upvotes: 0

Enigmativity
Enigmativity

Reputation: 117057

Here's my approach.

First I need a way to compare lists:

public class ListCompare<T> : IEqualityComparer<List<T>>
{
    public bool Equals(List<T> left, List<T> right)
    {
        return left.SequenceEqual(right);
    }

    public int GetHashCode(List<T> list)
    {
        return list.Aggregate(0, (a, t) => a ^ t.GetHashCode());
    }
}

Next a method to produce all subsequences of a source list:

Func<List<int>, IEnumerable<List<int>>> subsequences = xs =>
    from s in Enumerable.Range(0, xs.Count)
    from t in Enumerable.Range(1, xs.Count - s)
    select xs.Skip(s).Take(t).ToList();

Now I can create a list of lists:

var lists = new [] { list1, list2, list3, list4, };

Finally a query that pulls it all together:

var answer =
    lists
        .Skip(1)
        .Aggregate(
            subsequences(lists.First()),
            (a, l) => a.Intersect(subsequences(l), new ListCompare<int>()))
        .OrderByDescending(x => x.Count)
        .FirstOrDefault();

Given the sample data provided in the question this produces the expected results.

Upvotes: 1

Paul Hankin
Paul Hankin

Reputation: 58241

You can build a map from pairs of ints to a count of how many of the lists they appear adjacent in.

Pseudo-code:

For each list L {
    For each adjacent pair (x, y) in L {
         Counts[x, y] += 1
    }
}

Now you can iterate through the first list (or the shortest list), and find the longest run such that each adjacent pair (x, y) in the run with Counts[x, y] showing that the pair appears in every list.

Pseudo-code:

run = []
best_run = []
For x in L[0] {
    if len(run) is zero or Counts[run[len(run)-1], x] == number of lists {
        run = run + x
    } else {
        run = [x]
    }
    if run is longer than best_run {
        best_run = run
    }
}

This works given the assumption in the question that no integer appears twice in the same list.

This algorithm runs in O(N) time, where N is the sum of the lengths of all the lists.

Upvotes: 2

Related Questions