Elias Poulogiannis

Reputation: 31

System.Guid.NewGuid() in linq select

I wanted to generate a unique identifier for each result of a LINQ query I ran on some data. Initially I thought of using Guid for that, but after stumbling upon this problem I had to improvise. However, I'd like to see whether anyone has a solution that uses Guid, so here we go.

Imagine we have:

class Query
{
    public class Entry
    {
        public string Id { get; set; }
        public int Value { get; set; }
    }

    public static IEnumerable<Entry> GetEntries( IEnumerable<int> list)
    {
        var result = 
            from i in list
            select new Entry
            {
                Id = System.Guid.NewGuid().ToString("N"),
                Value = i
            };
        return result;
    }
}

Now we want Id to be unique for each entry, but we also need it to stay the same on every traversal of the IEnumerable we get from GetEntries. In other words, we want the following code:

List<int> list = new List<int> { 1, 2, 3, 4, 5 };
IEnumerable<Query.Entry> entries = Query.GetEntries(list);
Console.WriteLine("first pass");
foreach (var e in entries) { Console.WriteLine("{0} {1}", e.Value, e.Id); }
Console.WriteLine("second pass");
foreach (var e in entries) { Console.WriteLine("{0} {1}", e.Value, e.Id); }

to give us something like:

first pass

1 47f4a21a037c4ac98a336903ca9df15b
2 f339409bde22487e921e9063e016b717
3 8f41e0da06d84a58a61226a05e12e519
4 013cddf287da46cc919bab224eae9ee0
5 6df157da4e404b3a8309a55de8a95740

second pass

1 47f4a21a037c4ac98a336903ca9df15b
2 f339409bde22487e921e9063e016b717
3 8f41e0da06d84a58a61226a05e12e519
4 013cddf287da46cc919bab224eae9ee0
5 6df157da4e404b3a8309a55de8a95740

However we get:

first pass

1 47f4a21a037c4ac98a336903ca9df15b
2 f339409bde22487e921e9063e016b717
3 8f41e0da06d84a58a61226a05e12e519
4 013cddf287da46cc919bab224eae9ee0
5 6df157da4e404b3a8309a55de8a95740

second pass

1 a9433568e75f4f209c688962ee4da577
2 2d643f4b58b946ba9d02b7ba81064274
3 2ffbcca569fb450b9a8a38872a9fce5f
4 04000e5dfad340c1887ede0119faa16b
5 73a11e06e087408fbe1909f509f08d03

Taking a second look at my code above, I realized where my error was: the expression Guid.NewGuid().ToString("N") assigned to Id is evaluated every time we traverse the collection, so it yields a different value every time.

So what should I do then? Is there a way to ensure that I always get the same single copy of the collection? Is there a way to be sure that I won't get new instances of the query results on each traversal?

Thank you for your time in advance :)

Upvotes: 3

Views: 5457

Answers (6)

Henk Holterman

Reputation: 273244

This is inherent to all LINQ queries. Being repeatable is coincidental, not guaranteed.

You can solve it with a .ToList(), like this:

IEnumerable<Query.Entry> entries = Query.GetEntries(list).ToList();

Or better, move the .ToList() inside GetEntries().
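A minimal sketch of that second suggestion, reusing the Query and Entry types from the question. The Program class and the names firstPass and secondPass are only there for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Query
{
    public class Entry
    {
        public string Id { get; set; }
        public int Value { get; set; }
    }

    public static IEnumerable<Entry> GetEntries(IEnumerable<int> list)
    {
        var result =
            from i in list
            select new Entry
            {
                Id = Guid.NewGuid().ToString("N"),
                Value = i
            };

        // ToList() runs the query once, here, so every Guid is generated exactly once.
        return result.ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        IEnumerable<Query.Entry> entries = Query.GetEntries(new List<int> { 1, 2, 3, 4, 5 });

        // Both passes walk the same materialized list, so the Ids match.
        List<string> firstPass = entries.Select(e => e.Id).ToList();
        List<string> secondPass = entries.Select(e => e.Id).ToList();

        Console.WriteLine(firstPass.SequenceEqual(secondPass)); // True
    }
}
```

The caller still sees an IEnumerable<Entry>, so nothing else has to change; the only difference is that the Guids are fixed at the moment GetEntries returns.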

Upvotes: 6

Kamyar

Reputation: 18797

One suggestion (I don't know whether it applies to your case, though):
If you want to save the entries in a database, try assigning the Guid to your entry's primary key at the database level. This way, each entry will have a unique and persisted Guid as its primary key. Check out this link for more info.

Upvotes: 0

Oliver Hanappi

Reputation: 12346

That's because of the way LINQ works. When you return just the LINQ query, it is executed every time you enumerate over it. Therefore, Guid.NewGuid will be executed for each list item as many times as you enumerate over the query.

Try adding an item to the list after you have iterated over the query once, and you will see that on the second iteration the newly added item is also in the result set. That's because the LINQ query holds a reference to your list and not an independent copy.
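The experiment described above could look roughly like this. It is a small sketch with made-up names (DeferredDemo, CountsBeforeAndAfterAdd); it only assumes a List<int> as the source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns how many items the same query yields before and after
    // the underlying list grows.
    public static (int Before, int After) CountsBeforeAndAfterAdd()
    {
        var list = new List<int> { 1, 2, 3 };

        // Nothing is executed here; the query only remembers the list.
        IEnumerable<int> doubled = list.Select(i => i * 2);

        int before = doubled.Count(); // first enumeration sees 3 items
        list.Add(4);
        int after = doubled.Count();  // second enumeration sees the new item too

        return (before, after);
    }

    public static void Main()
    {
        var counts = CountsBeforeAndAfterAdd();
        Console.WriteLine(counts.Before); // 3
        Console.WriteLine(counts.After);  // 4
    }
}
```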

To always get the same result, return an array or list instead of the LINQ query. Change the return line of the GetEntries method to something like this:

return result.ToArray();

This forces immediate execution, which then happens only once.

Best Regards,
Oliver Hanappi

Upvotes: 1

mike

Reputation: 3166

Any reason you have to use LINQ? The following seems to work for me:

public static IEnumerable<Entry> GetEntries(IEnumerable<int> list)
{
  List<Entry> results = new List<Entry>();
  foreach (int i in list)
  {
    results.Add(new Entry() { Id = Guid.NewGuid().ToString("N"), Value = i });
  }
  return results;
}

Upvotes: 1

Vlad

Reputation: 35594

Perhaps you need to produce the list of entries once, and return the same list each time in GetEntries.

Edit:
Ah no, you get a different list each time! Well, then it depends on what you want. If you want the same Id for each specific Value, possibly across different lists, you need to cache the Ids: keep a Dictionary<int, Guid> that stores the GUIDs already allocated. If you want the GUIDs to be unique for each source list, you would perhaps need to cache the input lists together with the returned IEnumerables, and always check whether a result was already returned for a given input list.
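For the first case (same Id for each specific Value), a rough sketch of the Dictionary<int, Guid> idea might look like this. The class name CachedIds and the helper GetId are hypothetical, and the cache as written is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Entry
{
    public string Id { get; set; }
    public int Value { get; set; }
}

public static class CachedIds
{
    // Hypothetical cache: one Guid per distinct Value, shared by all calls.
    private static readonly Dictionary<int, Guid> Ids = new Dictionary<int, Guid>();

    public static string GetId(int value)
    {
        // Allocate a Guid the first time a value is seen, reuse it afterwards.
        if (!Ids.TryGetValue(value, out Guid id))
        {
            id = Guid.NewGuid();
            Ids[value] = id;
        }
        return id.ToString("N");
    }

    public static IEnumerable<Entry> GetEntries(IEnumerable<int> list)
    {
        // The query is still lazy, but re-running it only re-reads the cache.
        return list.Select(i => new Entry { Id = GetId(i), Value = i });
    }
}

public static class Program
{
    public static void Main()
    {
        var entries = CachedIds.GetEntries(new List<int> { 1, 2, 3 });
        var firstPass = entries.Select(e => e.Id).ToList();
        var secondPass = entries.Select(e => e.Id).ToList();
        Console.WriteLine(firstPass.SequenceEqual(secondPass)); // True
    }
}
```

Note the trade-off: two entries with the same Value share one Id, so this only fits if Ids are meant to identify values rather than individual entries.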

Edit:
If you don't want to share the same GUIDs between different runs of GetEntries, you should just "materialize" the query (for example, by replacing return result; with return result.ToList();), as was suggested in the comment on your question.

Otherwise the query will run each time you traverse the result. This is called lazy evaluation. Lazy evaluation is usually not a problem, but in your case it leads to the GUIDs being recalculated on each query run (i.e., on each loop over the result sequence).

Upvotes: 1

usr-local-ΕΨΗΕΛΩΝ

Reputation: 26874

You might consider not using Guid, at least not with "new".

GetHashCode() returns values that don't change when you traverse the list multiple times (although hash codes in general are not guaranteed to be unique).

The problem is that your list is IEnumerable<int>, so the hash code of each item coincides with its value.

You should re-evaluate your approach and use a different strategy. One thing that comes to mind is a pseudo-random number generator seeded with the hash code of the collection. It will always return the same numbers as long as it's initialized with the same seed. But, again, forget Guid.
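A sketch of how that could look, under a few assumptions of mine: the seed is computed from the list contents rather than from the collection's own GetHashCode() (which for a List<int> is tied to the instance, not the items), and the 16 random bytes are formatted through the Guid type purely for convenience. The names SeededIds and ContentSeed are made up. The resulting Ids are repeatable, but they are not real GUIDs and carry no uniqueness guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Entry
{
    public string Id { get; set; }
    public int Value { get; set; }
}

public static class SeededIds
{
    // Assumption: a simple content-based seed, so equal contents give equal Ids.
    private static int ContentSeed(IEnumerable<int> list)
    {
        int seed = 17;
        foreach (int i in list)
        {
            seed = unchecked(seed * 31 + i);
        }
        return seed;
    }

    // Iterator method: the body restarts on every enumeration, so the
    // generator is re-created with the same seed for each pass.
    public static IEnumerable<Entry> GetEntries(IEnumerable<int> list)
    {
        var random = new Random(ContentSeed(list));
        var bytes = new byte[16];
        foreach (int i in list)
        {
            random.NextBytes(bytes);
            // new Guid(byte[]) copies the bytes; used here only as a formatter.
            yield return new Entry { Id = new Guid(bytes).ToString("N"), Value = i };
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var entries = SeededIds.GetEntries(new List<int> { 1, 2, 3, 4, 5 });
        var firstPass = entries.Select(e => e.Id).ToList();
        var secondPass = entries.Select(e => e.Id).ToList();
        Console.WriteLine(firstPass.SequenceEqual(secondPass)); // True
    }
}
```

Creating the Random inside the iterator matters: if it were created once outside the query, the second pass would continue the random sequence and produce different Ids. The sequence for a given seed may also differ between .NET versions, so these Ids should not be persisted.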

Upvotes: 0
