
Reputation: 55

Finding Duplicate String Arrays

I have a large list of string arrays, and within this List<string[]> there can be arrays that contain the same values (possibly at different indexes). I want to find and count these duplicate string arrays and end up with a Dictionary<string[], int>, where the int is the count (if there is a better way than a dictionary, I would be interested in hearing it). Does anyone have advice on how to achieve this? Any input is much appreciated, thanks!

Upvotes: 0

Views: 4397

Answers (3)

Srinath Sridhar

Reputation: 1

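import java.util.Scanner;

public class Q1 {

    public static void main(String[] args) {
        System.out.println("String entry here --> ");
        // try-with-resources closes the scanner even if an exception is thrown
        try (Scanner input = new Scanner(System.in)) {
            // Split on runs of whitespace so repeated spaces do not produce empty words
            String[] words = input.nextLine().trim().split("\\s+");
            System.out.println(words.length);
            for (int i = 0; i < words.length; i++) {
                if (words[i] == null) {
                    continue; // already counted as a duplicate of an earlier word
                }
                int count = 0;
                for (int j = i + 1; j < words.length; j++) {
                    // equals(null) is false, so already-marked entries are skipped
                    if (words[i].equals(words[j])) {
                        words[j] = null; // mark so it is not counted again
                        count++;
                    }
                }
                if (count != 0) {
                    // count is the number of repeats after the first occurrence
                    System.out.println("Count of duplicate " + words[i] + " = " + count);
                }
            }
        }
    }
}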

Upvotes: 0

Eric

Reputation: 5743

You can use LINQ's GroupBy with an IEqualityComparer that compares the string[] values regardless of element order:

var items = new List<string[]>() 
    { 
        new []{"1", "2", "3" ,"4" }, 
        new []{"4","3", "2", "1"},
        new []{"1", "2"}
    };

var results = items
        .GroupBy(i => i, new UnorderedEnumerableComparer<string>())
        .ToDictionary(g => g.Key, g => g.Count());

The IEqualityComparer for the unordered comparison:

public class UnorderedEnumerableComparer<T> : IEqualityComparer<IEnumerable<T>>
{
    public bool Equals(IEnumerable<T> x, IEnumerable<T> y)
    {
        return x.OrderBy(i => i).SequenceEqual(y.OrderBy(i => i));
    }
    // Uses only the element count as the hash. This is valid (equal sequences
    // always have equal counts) but weak: all arrays of the same length collide.
    public int GetHashCode(IEnumerable<T> obj)
    {
        return obj.Count();
    }
}
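
Since a count-only hash puts every array of the same length into one bucket, a variant with an order-independent hash may scale better on a large list. The following is only a sketch under assumptions: the names UnorderedArrayComparer and DuplicateArrayCounter are hypothetical, the comparer is specialised to string[] rather than IEnumerable<T>, and strings are compared ordinally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical name; a string[]-specific variant of the comparer idea above.
public sealed class UnorderedArrayComparer : IEqualityComparer<string[]>
{
    public bool Equals(string[] x, string[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null || x.Length != y.Length) return false;
        // Sort ordinally so the ordering agrees with SequenceEqual's ordinal equality.
        return x.OrderBy(s => s, StringComparer.Ordinal)
                .SequenceEqual(y.OrderBy(s => s, StringComparer.Ordinal));
    }

    public int GetHashCode(string[] obj)
    {
        if (obj == null) return 0;
        int hash = obj.Length;
        unchecked
        {
            // Addition is commutative, so element order does not change the hash.
            foreach (var s in obj)
                hash += s == null ? 0 : StringComparer.Ordinal.GetHashCode(s);
        }
        return hash;
    }
}

// Hypothetical helper that groups the arrays and counts each group.
public static class DuplicateArrayCounter
{
    public static Dictionary<string[], int> Count(IEnumerable<string[]> items)
    {
        var comparer = new UnorderedArrayComparer();
        return items
            .GroupBy(a => a, comparer)
            // Passing the comparer again lets later lookups match by content.
            .ToDictionary(g => g.Key, g => g.Count(), comparer);
    }
}
```

Because the dictionary is built with the same comparer, a lookup such as counts[new[] { "4", "3", "2", "1" }] should find the entry for { "1", "2", "3", "4" } even though it is a different array instance.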

Upvotes: 1

Hari Prasad

Reputation: 16956

You might run into duplicate keys if you use the number of occurrences as the key of a Dictionary. I would suggest using a Dictionary<string, int>, where the key is the string and the value is its number of occurrences. A LINQ statement can then do the counting:

var results = items.SelectMany(item=>item)
                   .GroupBy(item=>item)
                   .ToDictionary(g=>g.Key, g=>g.Count()); 

Another approach is a Lookup, which maps each key to one or more values, so several strings can share the same count:

var lookup = items.SelectMany(item=>item)
                  .GroupBy(item=>item)
                  .ToLookup(c=>c.Count(), c=>c.Key);
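
To make the shape of the lookup concrete, here is a minimal sketch of building and reading it. The class name OccurrenceLookup and the sample data are assumptions made for illustration. Note that because SelectMany flattens the arrays, the counts are per individual string across all arrays, not per array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper around the ToLookup query shown above.
public static class OccurrenceLookup
{
    public static ILookup<int, string> Build(IEnumerable<string[]> items)
    {
        return items.SelectMany(item => item)   // flatten all arrays into one sequence
                    .GroupBy(item => item)      // one group per distinct string
                    .ToLookup(g => g.Count(), g => g.Key); // count -> strings with that count
    }

    public static void Main()
    {
        // Sample data, assumed for illustration only.
        var items = new List<string[]>
        {
            new[] { "1", "2", "3", "4" },
            new[] { "4", "3", "2", "1" },
            new[] { "1", "2" }
        };

        var lookup = Build(items);
        foreach (var group in lookup)
        {
            // group.Key is the occurrence count; the group holds the matching strings.
            Console.WriteLine(group.Key + ": " + string.Join(", ", group));
        }
    }
}
```

Indexing the lookup with a count that no string has, for example lookup[5] here, yields an empty sequence rather than throwing, which is one practical difference from a Dictionary.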

Upvotes: 0
