LearningCSharp

Reputation: 1302

Removing duplicate collection strings in memory

I am working on a hypothetical question: if there are duplicate string collections in memory, how would I go about removing the duplicates while maintaining the original order of the collections?

Upvotes: 0

Views: 230

Answers (3)

Manatherin

Reputation: 4187

Try something like this:

        List<String> stringlistone = new List<string>() { "Hello", "Hi" };
        List<String> stringlisttwo = new List<string>() { "Hi", "Bye" };
        IEnumerable<String> distinctList = stringlistone.Concat(stringlisttwo).Distinct(StringComparer.OrdinalIgnoreCase);

        List<List<String>> listofstringlist = new List<List<String>>() { stringlistone, stringlisttwo };
        IEnumerable<String> distinctlistofstringlist = listofstringlist.SelectMany(x => x).Distinct(StringComparer.OrdinalIgnoreCase);

It depends on how you join the lists, but it should give you an idea. I added StringComparer.OrdinalIgnoreCase in case you want the distinct list to treat "hi" and "Hi" as the same.

You can also just call Distinct on a single list, so if you did

        List<String> stringlistone = new List<string>() { "Hi", "Hello", "Hi" };

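        // Distinct returns IEnumerable<string>, so ToList() is needed to assign back to a List<string>.
        stringlistone = stringlistone.Distinct(StringComparer.OrdinalIgnoreCase).ToList();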

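stringlistone would then be a list with stringlistone[0] == "Hi" and stringlistone[1] == "Hello". In the LINQ to Objects implementation, Distinct yields elements in first-occurrence order, so the original order is kept.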
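
If you would rather not rely on the ordering behaviour of Distinct, the same result can be built by hand with a HashSet. This is only a minimal sketch: the class name OrderedDedup and the method DistinctInOrder are made up for illustration and are not part of the framework.

```csharp
using System;
using System.Collections.Generic;

public static class OrderedDedup
{
    // Hypothetical helper: returns the items of 'source' with later duplicates
    // dropped, keeping the first occurrence of each string in its original position.
    public static List<string> DistinctInOrder(
        IEnumerable<string> source,
        IEqualityComparer<string> comparer)
    {
        var seen = new HashSet<string>(comparer);
        var result = new List<string>();
        foreach (string item in source)
        {
            // HashSet<T>.Add returns false when an equal item is already present.
            if (seen.Add(item))
            {
                result.Add(item);
            }
        }
        return result;
    }
}
```

Calling OrderedDedup.DistinctInOrder(new[] { "Hi", "Hello", "hi" }, StringComparer.OrdinalIgnoreCase) gives a list containing "Hi" followed by "Hello".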

Upvotes: 1

agent-j

Reputation: 27923

Say you have a List<List<string>> that you read from a file or database (so the strings are not already interned) and you want no duplicate string instances in memory. You can use this code:

public void FoldStrings(List<List<string>> stringCollections)
{
   var interned = new Dictionary<string,string> ();
   foreach (var stringCollection in stringCollections)
   {
      for (int i = 0; i < stringCollection.Count; i++)
      {
         string str = stringCollection[i];
         string s;
         if (interned.TryGetValue (str, out s))
         {
            // We already have an instance of this string.
            stringCollection[i] = s;
         }
         else
         {
            // First time we've seen this string... add to dictionary.
            interned[str]=str;
         }
      }
   }
}
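
For comparison, the framework's own String.Intern does a similar job using a process-wide pool. The sketch below (the class InternDemo and its method names are made up for illustration) checks with object.ReferenceEquals that two strings built at run time are separate instances, and that interning both yields one shared instance.

```csharp
using System;

public static class InternDemo
{
    // Two strings built at run time hold equal values but are separate objects.
    public static bool SameInstanceBeforeIntern()
    {
        string a = new string(new[] { 'H', 'i' });
        string b = new string(new[] { 'H', 'i' });
        return object.ReferenceEquals(a, b);
    }

    // String.Intern returns the pooled reference for a given value,
    // so both calls hand back the same object.
    public static bool SameInstanceAfterIntern()
    {
        string a = new string(new[] { 'H', 'i' });
        string b = new string(new[] { 'H', 'i' });
        return object.ReferenceEquals(string.Intern(a), string.Intern(b));
    }
}
```

One reason to prefer a local dictionary, as in FoldStrings, is that strings added to the intern pool are unlikely to be released before the runtime shuts down, whereas the dictionary can be dropped as soon as the fold is done.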

Upvotes: 0

hungryMind

Reputation: 6999

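Don't worry about it if the strings come from literals in your code. The framework interns string literals, so all references to the same literal value point to the same location in memory. Strings built or read at run time are not interned automatically; String.Intern can be used for those.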

Upvotes: 0
