Reputation: 10274
A bunch of key/value pairs, from an object that may have duplicate keys, need to be added to a dictionary. Only the first distinct instance of a key (and the instance's value) should be added to the dictionary.
Below is an example implementation that appears, at first, to work fine.
void Main()
{
    Dictionary<long, DateTime> items = new Dictionary<long, DateTime>();
    items = AllItems.Select(item =>
        {
            long value;
            bool parseSuccess = long.TryParse(item.Key, out value);
            return new { value = value, parseSuccess, item.Value };
        })
        .Where(parsed => parsed.parseSuccess && !items.ContainsKey(parsed.value))
        .Select(parsed => new { parsed.value, parsed.Value })
        .Distinct()
        .ToDictionary(e => e.value, e => e.Value);
    Console.WriteLine(string.Format("Distinct: {0}{1}Non-distinct: {2}", items.Count, Environment.NewLine, AllItems.Count));
}

public List<KeyValuePair<string, DateTime>> AllItems
{
    get
    {
        List<KeyValuePair<string, DateTime>> toReturn = new List<KeyValuePair<string, DateTime>>();
        for (int i = 1000; i < 1100; i++)
        {
            toReturn.Add(new KeyValuePair<string, DateTime>(i.ToString(), DateTime.Now));
            toReturn.Add(new KeyValuePair<string, DateTime>(i.ToString(), DateTime.Now));
        }
        return toReturn;
    }
}
If AllItems is modified to return many more pairs, however, then an ArgumentException occurs: "An item with the same key has already been added."
void Main()
{
    Dictionary<long, DateTime> items = new Dictionary<long, DateTime>();
    var AllItems = PartOne.Union(PartTwo);
    Console.WriteLine("Total items: " + AllItems.Count());
    items = AllItems.Select(item =>
        {
            long value;
            bool parseSuccess = long.TryParse(item.Key, out value);
            return new { value = value, parseSuccess, item.Value };
        })
        .Where(parsed => parsed.parseSuccess && !items.ContainsKey(parsed.value))
        .Select(parsed => new { parsed.value, parsed.Value })
        .Distinct()
        .ToDictionary(e => e.value, e => e.Value);
    Console.WriteLine("Distinct: {0}{1}Non-distinct: {2}", items.Count, Environment.NewLine, AllItems.Count());
}

public IEnumerable<KeyValuePair<string, DateTime>> PartOne
{
    get
    {
        for (int i = 10000000; i < 11000000; i++)
        {
            yield return (new KeyValuePair<string, DateTime>(i.ToString(), DateTime.Now));
        }
    }
}

public IEnumerable<KeyValuePair<string, DateTime>> PartTwo
{
    get
    {
        for (int i = 10000000; i < 11000000; i++)
        {
            yield return (new KeyValuePair<string, DateTime>(i.ToString(), DateTime.Now));
        }
    }
}
What is the best way to accomplish this? Note that long.TryParse needs to be present in the solution, as the real input may include keys that are not valid Int64 values.
Upvotes: 1
Views: 3203
Reputation: 22235
I haven't tried this yet, but something like this with a GroupBy should work.
items = AllItems.Select(item =>
    {
        long value;
        bool parseSuccess = long.TryParse(item.Key, out value);
        return new { value = value, parseSuccess, item.Value };
    })
    .Where(parsed => parsed.parseSuccess && !items.ContainsKey(parsed.value))
    .Select(parsed => new { parsed.value, parsed.Value })
    .GroupBy(x => x.value)
    // Note: Min picks the earliest DateTime in each group, which is not
    // necessarily the first instance; x.First().Value would take the first one.
    .Select(x => new { value = x.Key, Value = x.Min(y => y.Value) })
    .ToDictionary(e => e.value, e => e.Value);
Upvotes: 1
Reputation: 160852
Let's see: your Select() is currently projecting to the anonymous type
new { value = value, parseSuccess, item.Value };
Then you filter out all items where parsing failed, so essentially you have
new { value = value, true, item.Value };
Now you use Distinct() on the remaining items. Distinct() compares the whole anonymous object, so every distinct combination of (value, Value) is kept. That means you can end up with both, e.g., (1,2) and (1,3).
Finally you create your dictionary, but you may still have duplicate value keys, as in the example above. This explains why you get this exception.
As already posted, GroupBy() is the way to go in this case to simplify your expression.
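The duplicate-key behaviour described here can be reproduced in isolation. The following is only a minimal sketch: the class name DistinctDemo, its methods, and the sample pairs (1,2), (1,3), (2,5) are made up for illustration and are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctDemo
{
    // Hypothetical sample data: key 1 appears twice with different values.
    public static List<KeyValuePair<long, int>> Sample()
    {
        return new List<KeyValuePair<long, int>>
        {
            new KeyValuePair<long, int>(1, 2),
            new KeyValuePair<long, int>(1, 3),
            new KeyValuePair<long, int>(2, 5),
        };
    }

    // Distinct() compares every property of the anonymous type,
    // so (1, 2) and (1, 3) are different items and both survive.
    public static int CountAfterDistinct()
    {
        return Sample()
            .Select(p => new { value = p.Key, Value = p.Value })
            .Distinct()
            .Count();
    }

    // ToDictionary then sees key 1 twice and throws ArgumentException.
    public static bool ToDictionaryThrows()
    {
        try
        {
            Sample()
                .Select(p => new { value = p.Key, Value = p.Value })
                .Distinct()
                .ToDictionary(e => e.value, e => e.Value);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

In the question's code the second member is a DateTime.Now reading, so two entries with the same key only collapse under Distinct() when both readings happen to be identical.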
Upvotes: 1
Reputation: 117027
I would look at cleaning a few things up.
Using a Func<string, long?> is better in a LINQ query.
Func<string, long?> tryParse = t =>
{
    long v;
    if (!long.TryParse(t, out v))
    {
        return null;
    }
    return v;
};
Then the query looks like this:
var query =
    from item in AllItems
    let keyValue = tryParse(item.Key)
    where keyValue.HasValue
    group item.Value by keyValue.Value into g
    select new
    {
        key = g.Key,
        value = g.First(),
    };
And finally create the dictionary:
var items = query.ToDictionary(x => x.key, x => x.value);
Fairly simple.
Thanks for providing all the code required to test the solution.
Upvotes: 4
Reputation: 96477
Only the first distinct instance of a key (and the instance's value) should be added to the dictionary.
You can achieve this by using the Enumerable.GroupBy method and taking the first value in the group:
items = AllItems.Select(item =>
    {
        long value;
        bool parseSuccess = long.TryParse(item.Key, out value);
        return new { Key = value, parseSuccess, item.Value };
    })
    .Where(parsed => parsed.parseSuccess)
    .GroupBy(o => o.Key)
    .ToDictionary(e => e.Key, e => e.First().Value);
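To check the "first instance wins" behaviour on a small input, the same GroupBy-and-First idea can be wrapped in a helper. This is a sketch under assumptions: the class FirstWinsDemo, the method Build, and the string values are hypothetical and chosen only to keep the example short. It relies on Enumerable.GroupBy yielding each group's elements in source order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstWinsDemo
{
    // Parses each key with long.TryParse, drops keys that fail to parse,
    // and keeps the value of the first occurrence of each parsed key.
    public static Dictionary<long, string> Build(
        IEnumerable<KeyValuePair<string, string>> pairs)
    {
        return pairs
            .Select(p =>
            {
                long key;
                bool ok = long.TryParse(p.Key, out key);
                return new { key, ok, p.Value };
            })
            .Where(x => x.ok)
            // Elements inside each group keep their source order,
            // so First() is the first instance seen for that key.
            .GroupBy(x => x.key, x => x.Value)
            .ToDictionary(g => g.Key, g => g.First());
    }

    // Hypothetical input: a duplicate key "1" and an unparsable key "x".
    public static Dictionary<long, string> Example()
    {
        return Build(new[]
        {
            new KeyValuePair<string, string>("1", "a"),
            new KeyValuePair<string, string>("x", "b"),
            new KeyValuePair<string, string>("1", "c"),
            new KeyValuePair<string, string>("2", "d"),
        });
    }
}
```

With this input, key 1 maps to "a" (the later "c" is ignored), the "x" entry is skipped, and no ArgumentException is thrown because each key reaches ToDictionary exactly once.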
Upvotes: 5