Reputation: 275
I'm new to LINQ. I have code like this:
public class Data
{
public Dictionary<string,int> WordFrequency;
}
List<Data> dataList;
What I want is a single aggregated dictionary that combines WordFrequency across the whole list of Data objects. I know how to do this with loops (iterate over the List, then over each Dictionary); my question is, what is the LINQ syntax for this? Thank you.
EDIT: here is my (untested) looping approach, so you can see what I mean.
public static Dictionary<string, int> Combine()
{
    // The result must be initialized before it can be indexed.
    var result = new Dictionary<string, int>();
    foreach (Data data in dataList)
    {
        foreach (string key in data.WordFrequency.Keys)
        {
            if (!result.ContainsKey(key))
                result[key] = 0;
            result[key] += data.WordFrequency[key];
        }
    }
    return result;
}
Upvotes: 1
Views: 121
Reputation: 4189
Here is a query-based solution identical in most regards to Tim's:
Dictionary<string, int> allWordFrequency =
    (from d in dataList
     from kvp in d.WordFrequency
     group kvp.Value by kvp.Key)
     //    ^^^^^^^^^ this grouping projection...
    .ToDictionary(g => g.Key, g => g.Sum());
  // ...eliminates need for lambda here ^^
I appreciate how the two from clauses mimic the nested foreach loops in the looping-based approach of the post. Like Tim's solution, the query iterates the KeyValuePair entries of each Dictionary rather than its Keys collection, so it doesn't need to invoke the indexer to get the corresponding integer count.
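For comparison, here is a rough method-syntax sketch of the same idea: the GroupBy overload that takes an element selector plays the role of the grouping projection, so the groups already contain plain ints. The class name GroupProjectionDemo and the choice to pass the dictionaries in directly are my own assumptions to keep the snippet self-contained; they are not from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class GroupProjectionDemo
{
    // Hypothetical helper: takes the dictionaries directly instead of Data objects.
    public static Dictionary<string, int> Combine(IEnumerable<Dictionary<string, int>> dictionaries)
    {
        return dictionaries
            .SelectMany(dict => dict)                    // flatten, like the second 'from' clause
            .GroupBy(kvp => kvp.Key, kvp => kvp.Value)   // element selector keeps only the count
            .ToDictionary(g => g.Key, g => g.Sum());     // each group is a sequence of int
    }
}
```

With Data objects you would pass something like dataList.Select(d => d.WordFrequency) to it.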
Upvotes: 0
Reputation: 460380
So you want to flatten all dictionaries into a single one which, of course, has no duplicate keys?
You can use Enumerable.SelectMany to flatten them all and Enumerable.GroupBy to group the keys.
Dictionary<string, int> allWordFrequency = dataList
.SelectMany(d => d.WordFrequency)
.GroupBy(d => d.Key)
.ToDictionary(g => g.Key, g => g.Sum(d => d.Value));
I have presumed that you want to sum all frequencies.
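If you want to try it out, a minimal self-contained sketch might look like this. The WordFrequencyDemo class, the Combine and Sample method names, and the sample words are all made up for illustration; only Data and WordFrequency come from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Data
{
    public Dictionary<string, int> WordFrequency = new Dictionary<string, int>();
}

public static class WordFrequencyDemo
{
    // Hypothetical wrapper around the SelectMany/GroupBy query above.
    public static Dictionary<string, int> Combine(IEnumerable<Data> dataList)
    {
        return dataList
            .SelectMany(d => d.WordFrequency)   // flatten to KeyValuePair<string, int>
            .GroupBy(kvp => kvp.Key)            // one group per distinct word
            .ToDictionary(g => g.Key, g => g.Sum(kvp => kvp.Value));
    }

    // Made-up sample data, for illustration only.
    public static List<Data> Sample()
    {
        return new List<Data>
        {
            new Data { WordFrequency = new Dictionary<string, int> { { "apple", 2 }, { "pear", 1 } } },
            new Data { WordFrequency = new Dictionary<string, int> { { "apple", 3 }, { "plum", 4 } } }
        };
    }
}
```

Calling WordFrequencyDemo.Combine(WordFrequencyDemo.Sample()) should give apple = 5, pear = 1 and plum = 4. Note that this assumes no WordFrequency is null; if that can happen, filter those Data objects out first.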
Upvotes: 6