Reputation: 5965
I have the following array, which represents the items I'm trying to remove from a certain data structure instance:
string[] diseasesToRemove = new string[] { "D1", "D3", "D5" };
I'm able to do it using the following:
for (int i = 0; i < healthGroup.DiseaseGroups.Length; i++)
{
var dgDiseases = new List<Disease>(healthGroup.DiseaseGroups[i].Diseases);
for (int j = 0; j < dgDiseases.Count; j++)
{
if (diseasesToRemove.Contains(dgDiseases[j].Name))
{
dgDiseases.RemoveAt(j);
j--;
}
}
healthGroup.DiseaseGroups[i].Diseases = dgDiseases.ToArray();
}
However, I'm sure there's a better way using Linq or something else. Is there?
Here are the classes:
public class HealthGroup
{
public DiseaseGroup[] DiseaseGroups { get; set; }
public HealthGroup(DiseaseGroup[] diseaseGroups)
{
DiseaseGroups = diseaseGroups;
}
}
public class DiseaseGroup
{
public string Name { get; set; }
public Disease[] Diseases;
public DiseaseGroup(string name, Disease[] diseases)
{
Name = name;
Diseases = diseases;
}
}
public class Disease
{
public string Name { get; set; } = "My Disease";
public int Risk { get; set; } = 7;
public Disease(string name, int risk)
{
Name = name;
Risk = risk;
}
public override string ToString()
{
return $"{Name} with risk {Risk}";
}
}
And some boilerplate to generate the instance:
Disease d1 = new Disease("D1", 1);
Disease d2 = new Disease("D2", 2);
Disease d3 = new Disease("D3", 3);
Disease d4 = new Disease("D4", 4);
Disease d5 = new Disease("D5", 5);
Disease d6 = new Disease("D6", 6);
Disease d7 = new Disease("D7", 7);
DiseaseGroup dg1 = new DiseaseGroup("DG1", new Disease[] { d1, d2 });
DiseaseGroup dg2 = new DiseaseGroup("DG2", new Disease[] { d3, d4, d5 });
DiseaseGroup dg3 = new DiseaseGroup("DG3", new Disease[] { d6, d7 });
HealthGroup healthGroup = new HealthGroup(new DiseaseGroup[] { dg1, dg2, dg3 });
Upvotes: 3
Views: 128
Reputation: 38767
You could use Where
to simplify the code to this:
foreach (var diseaseGroup in healthGroup.DiseaseGroups)
{
diseaseGroup.Diseases
= diseaseGroup.Diseases.Where(g => !diseasesToRemove.Contains(g.Name)).ToArray();
}
Of course, like the original code, this generates a new list. A more performant option (albeit non-LINQ) might be to make Diseases
a List<Disease>
:
public class DiseaseGroup
{
public string Name { get; set; }
public List<Disease> Diseases;
public DiseaseGroup(string name, Disease[] diseases)
{
Name = name;
Diseases = new List<Disease>(diseases);
}
}
and then you could use similar code without the overhead of generating a new list and new array:
for (int i = 0; i < healthGroup.DiseaseGroups.Length; i++)
{
for (int j = healthGroup.DiseaseGroups[i].Diseases.Count - 1; j >= 0; --j)
{
if (diseasesToRemove.Contains(healthGroup.DiseaseGroups[i].Diseases[j].Name))
{
healthGroup.DiseaseGroups[i].Diseases.RemoveAt(j);
}
}
}
I also modified your for loop to work backwards to resolve the problem you had with the --j
part.
And, instead of using a string[]
for diseasesToRemove
, for a large set of items, it would probably be better to use a HashSet<string>
to store the diseases.
Edit to include requested one-liner:
healthGroup.DiseaseGroups = healthGroup.DiseaseGroups.Select(g => { return g.Diseases = g.Diseases.Where(g => !diseasesToRemove.Contains(g.Name)).ToArray(); }).ToArray();
This is abusing select a little, however :-)
Upvotes: 6
Reputation: 120450
These sorts of filtering jobs can be nicely expressed in terms of joins. They solve the problem of repeatedly iterating one group to check for matches in the other group (.Contains
here is O(n) and is called frequently... this doesn't need to happen).
Unfortunately, linq-to-objects only does inner joins out of the box, but a much-reused extension method that I keep around simplifies the job of performing set-based left-outer joins.
If you perform a left-outer join between the diseases and diseasesToRemove
, then you select items from the left-hand collection of the join (healthGroup.DiseaseGroups[i].Diseases
) that don't match anything on the right-hand collection (diseasesToRemove
), then you'll have removed anything that has a match.
Using the extension method .LeftOuterJoin
(listed below), you can filter your arrays as follows:
for (int i = 0; i < healthGroup.DiseaseGroups.Length; i++)
{
healthGroup.DiseaseGroups[i].Diseases =
healthGroup.DiseaseGroups[i].Diseases
.LeftOuterJoin(
diseasesToRemove,
d => d.Name,
dr => dr,
(d, dr) => ( d, dr ))
.Where(x => x.dr == null)
.Select(x => x.d)
.ToArray();
}
A left-outer join extension method:
public static class JoinExtensions
{
public static IEnumerable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(
this IEnumerable<TLeft> leftSeq,
IEnumerable<TRight> rightSeq,
Func<TLeft, TKey> keySelectorLeft,
Func<TRight, TKey> keySelectorRight,
Func<TLeft, TRight, TResult> projectionSelector)
{
return leftSeq
.GroupJoin(
rightSeq,
keySelectorLeft,
keySelectorRight,
(leftItem, rightItems) => new { leftItem, rightItems })
.SelectMany(
x => x.rightItems.DefaultIfEmpty(),
(x, rightItem) => projectionSelector(x.leftItem, rightItem));
}
}
Upvotes: 2