Sankara

Reputation: 1479

Comparing 2 string arrays

I want to compare 2 string arrays in the fastest way possible.
I have come up with something like the code below.

Is this the right way to do it, or is there a better way?

            bool matching=false;
            //say templateArr is the template array and dataArr as array to be compared
            string[] templateArr = {"Dictionary_type","Translation_EN" };
            string[] dataArr = { "Dictionary_type", "Translation_EN" };

            if (templateArr.Union(dataArr).Distinct().Count() == templateArr.Count())
                matching = true;

Upvotes: 2

Views: 5705

Answers (4)

Branko Dimitrijevic

Reputation: 52107

Assuming they should be considered unequal if they have the same elements but in a different order, you can just use SequenceEqual:

if (templateArr.SequenceEqual(dataArr))
    matching = true;

If you want to ignore the order, sort the arrays first:

if (templateArr.OrderBy(x => x).SequenceEqual(dataArr.OrderBy(x => x)))
    matching = true;

And if you want to also ignore duplicates:

if (templateArr.Distinct().OrderBy(x => x).SequenceEqual(dataArr.Distinct().OrderBy(x => x)))
    matching = true;

Or (more concise, and likely to be faster):

if (new HashSet<string>(templateArr).SetEquals(dataArr))
    matching = true;
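To make the difference between these options concrete, here is a minimal sketch that wraps each one in a helper. The class and method names (`ArrayComparison`, `SameOrder`, `SameIgnoringOrder`, `SameIgnoringOrderAndDuplicates`) are hypothetical and exist only for illustration. Passing `StringComparer.Ordinal` to `OrderBy` is an assumption, chosen so that the sort order agrees with the ordinal equality `SequenceEqual` uses for strings by default.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper names, for illustration only.
public static class ArrayComparison
{
    // Same elements, same counts, same order.
    public static bool SameOrder(string[] a, string[] b) =>
        a.SequenceEqual(b);

    // Same elements and counts; order is ignored by sorting both sides first.
    public static bool SameIgnoringOrder(string[] a, string[] b) =>
        a.OrderBy(x => x, StringComparer.Ordinal)
         .SequenceEqual(b.OrderBy(x => x, StringComparer.Ordinal));

    // Same distinct elements; both order and duplicates are ignored.
    public static bool SameIgnoringOrderAndDuplicates(string[] a, string[] b) =>
        new HashSet<string>(a).SetEquals(b);
}
```

For example, with `{ "a", "b" }` and `{ "b", "a" }` only `SameOrder` returns false, while with `{ "a", "b" }` and `{ "a", "b", "b" }` only `SameIgnoringOrderAndDuplicates` returns true.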

BTW, your code is incorrect. It will conclude that the arrays match in this case:

string[] templateArr = { "Dictionary_type", "Translation_EN", "abc" };
string[] dataArr = { "Translation_EN", "Dictionary_type", "Translation_EN" };

if (templateArr.Union(dataArr).Distinct().Count() == templateArr.Count())
    matching = true;
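To see why, it helps to look at what the union contains. The sketch below wraps the question's condition in a method so it can be tried on different inputs; `UnionCheck` and `UnionCountMatches` are hypothetical names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionCheck
{
    // The condition from the question, wrapped in a method (hypothetical name).
    // Union already removes duplicates, so the extra Distinct() changes nothing.
    public static bool UnionCountMatches(string[] templateArr, string[] dataArr) =>
        templateArr.Union(dataArr).Distinct().Count() == templateArr.Count();
}
```

For the two arrays above, the union is Dictionary_type, Translation_EN, abc. That is three elements, the same as `templateArr.Count()`, so the check reports a match even though `dataArr` does not contain "abc". More generally, the check passes whenever `templateArr` has no duplicates and every element of `dataArr` already appears in it.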

Upvotes: 0

MaxDataSol

Reputation: 356

As per the previous comments, the question is slightly ambiguous because you don't clarify what makes two arrays equivalent. Assuming that you treat arrays as equivalent if they contain the same number of identical strings (in any order), then before resorting to new HashSet<string>(array1).SetEquals(array2), I would try to determine whether the arrays are equal using the following simple technique:

  1. Compare the lengths; if they are different, return false
  2. Sort both arrays and set a counter to 0
  3. Compare the elements at the counter index (starting with array[0]); if they are different, return false
  4. Repeat the comparison for each following index, using indexing, not foreach
  5. If no difference was found, return true

With this approach, for large arrays you are likely to find a difference early, often at the length check, instead of processing the arrays in full or hashing them entirely, which can save time and memory. Bear in mind that the sort itself takes O(n log n) time.
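The technique described above could be sketched roughly as follows. `SortedComparison` and `SameElements` are hypothetical names. Sorting copies rather than the originals is an assumption made so the caller's arrays keep their order; it costs extra memory, so sort in place if reordering the inputs is acceptable. The ordinal comparer is also an assumption.

```csharp
using System;

public static class SortedComparison
{
    // Hypothetical helper: true when both arrays hold the same strings
    // with the same multiplicities, in any order.
    public static bool SameElements(string[] first, string[] second)
    {
        // Length check first: a mismatch settles it without touching the elements.
        if (first.Length != second.Length)
            return false;

        // Sort copies so the caller's arrays are left untouched (assumption).
        var a = (string[])first.Clone();
        var b = (string[])second.Clone();
        Array.Sort(a, StringComparer.Ordinal);
        Array.Sort(b, StringComparer.Ordinal);

        // Walk both sorted arrays by index and stop at the first mismatch.
        for (int i = 0; i < a.Length; i++)
        {
            if (!string.Equals(a[i], b[i], StringComparison.Ordinal))
                return false;
        }

        // No mismatch found.
        return true;
    }
}
```

Unlike the HashSet approach, this treats duplicates as significant: `{ "a", "a", "b" }` and `{ "a", "b", "b" }` have equal lengths but are reported as different.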

Upvotes: 0

Steve Guidi

Reputation: 20200

To test for collection equality, you can use Enumerable.SequenceEqual as follows.

using System.Linq;

bool AreEqual()
{
    string[] templateArr = { "Dictionary_type", "Translation_EN" };
    string[] dataArr = { "Dictionary_type", "Translation_EN" };

    return templateArr.SequenceEqual(dataArr);
}

If you want to test for collection equivalence (the order of elements does not matter, and duplicates are ignored), then you can use set equality as follows.

using System.Collections.Generic;

bool AreEquivalent()
{
    string[] templateArr = { "Dictionary_type", "Translation_EN" };
    string[] dataArr = { "Dictionary_type", "Translation_EN" };

    return new HashSet<string>(templateArr).SetEquals(dataArr);
}

Both cases are implemented in linear time, as per the MSDN documentation.

Upvotes: 6

Andre Calil

Reputation: 7692

You're taking the union of both arrays and then comparing its element count against the count of only one of them. Union removes duplicates, but I'm not sure that's the best approach, because it's a relatively expensive operation.

Look at this alternative:

        string[] templateArr = { "Dictionary_type", "Translation_EN" };
        string[] dataArr = { "Dictionary_type", "Translation_EN" };

        bool matching = templateArr.Length == dataArr.Length && templateArr.All(x => dataArr.Contains(x));

Upvotes: 0
