Reputation: 2364
I have the following list:
public class edgeData
{
public string source { get; set; }
public string target { get; set; }
}
var edgeList = new List<edgeData>();
var linkCount = 1;
I want to remove an edgeData entry from edgeList when its source occurs, in total, linkCount times or fewer.
For example, I build up my edgeList like this:
var newEdge = new edgeData();
newEdge.source = "James";
newEdge.target = 1;
edgeList.Add(newEdge);
newEdge = new edgeData();
newEdge.source = "Greg";
newEdge.target = 2;
edgeList.Add(newEdge);
newEdge = new edgeData();
newEdge.source = "James";
newEdge.target = 3;
edgeList.Add(newEdge);
Then I want to process the list to remove the entries whose source occurs linkCount times or fewer:
public List<edgeData> RemoveLinks(List<edgeData> edgeList, int linkCount)
{
var updatedEdgeData = new List<edgeData>();
// logic
return updatedEdgeData;
}
So in the example, the entry with "Greg" as the source would be removed because it occurs only once, which is equal to linkCount.
I tried doing this with a for loop, but it got ugly pretty quickly. I believe LINQ is the best option, but how can I achieve this?
Upvotes: 1
Views: 103
Reputation: 18155
You could do the following:
edgeList.GroupBy(x=>x.source)
.Where(x=>x.Count()>linkCount)
.SelectMany(x=>x)
.ToList();
You need to group by the source and filter out the groups that have linkCount items or fewer.
Please also note that, according to the OP's class definition, edgeData.target is a string, but your code assigns it a number. Hopefully that is a typo.
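To make the grouping approach concrete, here is a minimal sketch that fills in the RemoveLinks method from the question with the query above. The Program class and the Main method are only scaffolding I added for illustration, and I am assuming target is meant to be a string, so the sample values are written as "1", "2" and "3".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class edgeData
{
    public string source { get; set; }
    public string target { get; set; }
}

public static class Program
{
    // Keeps only the edges whose source occurs more than linkCount times.
    public static List<edgeData> RemoveLinks(List<edgeData> edgeList, int linkCount)
    {
        return edgeList
            .GroupBy(x => x.source)            // one group per distinct source
            .Where(g => g.Count() > linkCount) // drop groups with linkCount items or fewer
            .SelectMany(g => g)                // flatten the remaining groups
            .ToList();
    }

    public static void Main()
    {
        // Sample data from the question, with target as a string (assumption).
        var edgeList = new List<edgeData>
        {
            new edgeData { source = "James", target = "1" },
            new edgeData { source = "Greg", target = "2" },
            new edgeData { source = "James", target = "3" },
        };

        foreach (var edge in RemoveLinks(edgeList, 1))
        {
            Console.WriteLine($"{edge.source} -> {edge.target}");
        }
        // Prints:
        // James -> 1
        // James -> 3
    }
}
```

Note that the result comes back grouped by source, so entries with different sources that were interleaved in the input will not necessarily keep their original relative order.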
Update
As Harald pointed out, if the groups are huge, you could alternatively use:
edgeList.GroupBy(x=>x.source)
.Where(x=>x.Skip(linkCount).Any())
.SelectMany(x=>x)
.ToList();
Upvotes: 3
Reputation: 30454
I want to remove an edgeData entry from edgeList when the source collectively occurs less than or equal to linkCount.
I think you want your end result to contain only those items from your input sequence whose value of property Source occurs more times in the sequence than linkCount.
So if linkCount equals 5, you only want to keep those records where there are more than five occurrences of this Source in the input sequence.
For this we need to group your input into groups with the same value for Source. After that we only keep those groups that have more elements than linkCount in them:
IEnumerable<EdgeData> result = edgeList.GroupBy( edgeItem => edgeItem.Source)
// keep only the groups with enough elements:
.Where(group => group.Skip(linkCount).Any())
// Ungroup, so we get a neat sequence
.SelectMany(group => group);
The result of the GroupBy is a sequence of objects, each of which implements IGrouping<string, EdgeData>. Each such object is in itself a sequence of EdgeData, where every Source property has the same value. This value is in the Key of the IGrouping.
After making the groups, I keep only the groups that have more than linkCount items in them. I do this by skipping the first linkCount items of the sequence that the group is; if there are any items left, then the group has more than linkCount items.
I don't want to use Count(), because if your group has a zillion items, it would be a waste of processing power to count all those items, if you can stop counting after you've seen that there are more than linkCount.
The result of the Where is a sequence of IGrouping<string, EdgeData>. To ungroup, we use SelectMany, which makes it a neat sequence of EdgeData again.
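To see what the groups look like before they are flattened, here is a small sketch that prints each group's Key together with the outcome of the Skip(linkCount).Any() test. It uses the capitalized EdgeData and Source names from this answer, which differ from the class in the question; the sample data, the Program class and the Main method are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Capitalized variant of the question's class, as used in this answer (assumption).
public class EdgeData
{
    public string Source { get; set; }
    public string Target { get; set; }
}

public static class Program
{
    public static void Main()
    {
        int linkCount = 1;
        var edgeList = new List<EdgeData>
        {
            new EdgeData { Source = "James", Target = "1" },
            new EdgeData { Source = "Greg", Target = "2" },
            new EdgeData { Source = "James", Target = "3" },
        };

        foreach (IGrouping<string, EdgeData> group in edgeList.GroupBy(e => e.Source))
        {
            // True when at least one element remains after skipping linkCount
            // elements, i.e. when the group has more than linkCount elements.
            bool keep = group.Skip(linkCount).Any();
            Console.WriteLine($"{group.Key}: keep = {keep}");
        }
        // Prints:
        // James: keep = True
        // Greg: keep = False
    }
}
```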
Upvotes: 1
Reputation: 94
Basically, just count the occurrences of the names, then loop over the list and remove the entries you don't want (the ones without enough connections).
Dictionary<string, int> occurrence = new Dictionary<string, int>();
foreach (edgeData edge in edgeList)
{
if (occurrence.ContainsKey(edge.source))
occurrence[edge.source] += 1;
else
occurrence[edge.source] = 1;
}
int counter = 0;
while(counter < edgeList.Count)
{
if (occurrence[edgeList[counter].source] <= linkCount)
edgeList.RemoveAt(counter);
else
counter++;
}
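As a possible variation on the loop above, the removal step could be handed to List<T>.RemoveAll, which avoids managing the index by hand. This is only a sketch: the RemoveLinksInPlace name, the Program class and the sample data are made up for illustration, and the comparison uses <= so that it matches the question's "less than or equal to linkCount" rule.

```csharp
using System;
using System.Collections.Generic;

public class edgeData
{
    public string source { get; set; }
    public string target { get; set; }
}

public static class Program
{
    // Hypothetical helper: removes, in place, every edge whose source
    // occurs linkCount times or fewer.
    public static void RemoveLinksInPlace(List<edgeData> edgeList, int linkCount)
    {
        // First pass: count how often each source occurs.
        var occurrence = new Dictionary<string, int>();
        foreach (edgeData edge in edgeList)
        {
            // count is 0 when the key is not present yet.
            occurrence.TryGetValue(edge.source, out int count);
            occurrence[edge.source] = count + 1;
        }

        // Second pass: drop the edges whose source is not frequent enough.
        edgeList.RemoveAll(edge => occurrence[edge.source] <= linkCount);
    }

    public static void Main()
    {
        var edgeList = new List<edgeData>
        {
            new edgeData { source = "James", target = "1" },
            new edgeData { source = "Greg", target = "2" },
            new edgeData { source = "James", target = "3" },
        };

        RemoveLinksInPlace(edgeList, 1);
        Console.WriteLine(edgeList.Count); // prints 2
    }
}
```

Unlike the grouping queries, this version modifies the original list and keeps the surviving entries in their original order.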
Upvotes: 0