Furkan Gözükara

Reputation: 23810

How to iterate over and remove elements from a HashSet in the most efficient way

OK, here is what I came up with, but I wonder whether it is the most efficient way. I need to do this because of RAM usage issues.

HashSet<string> hsLinks = new HashSet<string>();
List<string> lstSortList = new List<string>();

// fill hashset with millions of records

while (true)
{
    string srLastitem = "";
    foreach (var item in hsLinks)
    {
        srLastitem = item;
        break;
    }
    lstSortList.Add(srLastitem);
    hsLinks.Remove(srLastitem);
    if (hsLinks.Count == 0)
        break;
}

c# .net 4.5.2 wpf application

Upvotes: 4

Views: 1174

Answers (2)

i3arnon

Reputation: 116548

It seems you're trying to move items from the HashSet to the List. If that's the case, simply move everything at once with List.AddRange and then use HashSet.Clear to empty the HashSet:

lstSortList.AddRange(hsLinks);
hsLinks.Clear();
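
Since the question is motivated by memory use, here is a minimal, self-contained sketch of the same move wrapped in a helper. The names HashSetDrain and MoveAll are hypothetical and only for illustration. The TrimExcess call is an addition that is not part of the answer above: as far as I know, Clear removes the items but leaves the set's internal arrays allocated, and TrimExcess is the documented way to ask the set to shrink its capacity, so treat it as an optional step to measure rather than a guaranteed saving.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetDrain
{
    // Hypothetical helper (not from the original answer): moves every item
    // from the set to the list, then releases the set's spare capacity.
    public static void MoveAll(HashSet<string> source, List<string> destination)
    {
        destination.AddRange(source); // copies the references, not the strings
        source.Clear();               // empties the set
        source.TrimExcess();          // assumption: worth calling if the set object stays alive
    }

    public static void Main()
    {
        var hsLinks = new HashSet<string> { "a", "b", "c" };
        var lstSortList = new List<string>();

        MoveAll(hsLinks, lstSortList);

        Console.WriteLine(lstSortList.Count); // prints 3
        Console.WriteLine(hsLinks.Count);     // prints 0
    }
}
```

If the HashSet is not needed afterwards at all, dropping the last reference to it and letting the garbage collector reclaim it achieves the same thing without TrimExcess.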

If (as Vajura suggested) you're worried about holding on to 2 copies of the references*, you can move the items in batches rather than one at a time:

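// Requires: using System.Linq;
const int batchSize = 1000;
var batch = new string[batchSize];
do
{
    // Copy up to batchSize items out of the set into the reusable buffer.
    var batchIndex = 0;
    foreach (var link in hsLinks.Take(batchSize))
    {
        batch[batchIndex] = link;
        batchIndex++;
    }

    // Last (partial) batch: trim the buffer so no stale entries are moved again.
    if (batchIndex < batchSize)
    {
        batch = batch.Take(batchIndex).ToArray();
    }

    // Remove the batch from the set, then append it to the list.
    hsLinks.ExceptWith(batch);
    lstSortList.AddRange(batch);
} while (hsLinks.Any());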

Choose a batch size that suits your memory constraints.


*Note: A reference is 4 or 8 bytes in size (in a 32-bit or 64-bit process respectively). When you add the strings (which are reference types in .NET) to the list, you are not copying them, only the references, whose size is mostly negligible.
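
To make the note concrete, here is a small sketch that checks whether the list ends up holding the very same string object as the set. The class and method names (ReferenceCopyDemo, IsSameInstance) are hypothetical. The string is built at run time with new string('x', 1000) so that literal interning cannot make the check pass by accident.

```csharp
using System;
using System.Collections.Generic;

public static class ReferenceCopyDemo
{
    // Hypothetical helper: true when the first list entry is the same object
    // as the original string, i.e. only the reference was copied.
    public static bool IsSameInstance(string original, List<string> list)
    {
        return object.ReferenceEquals(original, list[0]);
    }

    public static void Main()
    {
        var link = new string('x', 1000); // built at run time, so not an interned literal
        var hsLinks = new HashSet<string> { link };
        var lstSortList = new List<string>();

        lstSortList.AddRange(hsLinks);

        Console.WriteLine(IsSameInstance(link, lstSortList)); // prints True
        Console.WriteLine(IntPtr.Size); // pointer size in bytes for the current process
    }
}
```

The second line of output depends on whether the process runs as 32-bit or 64-bit, which is why no expected value is shown for it.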

Upvotes: 5

cbr

Reputation: 13642

If you're trying to move the items from hsLinks to lstSortList (and clear hsLinks afterwards), this is where you would want to use List<T>.AddRange():

lstSortList.AddRange(hsLinks);
hsLinks.Clear();

Upvotes: 2
