Reputation: 103
I am trying to find out the duplicate Elements in XElement , and make a generic function to remove duplicates .Something like:
public List<Xelement>RemoveDuplicatesFromXml(List<Xelement> xele)
{ // pass the Xelement List in the Argument and get the List back , after deleting the duplicate entries.
return xele;
}
the xml is as follows:
<Execute ID="7300" Attrib1="xyz" Attrib2="abc" Attrib3="mno" Attrib4="pqr" Attrib5="BCD" />
<Execute ID="7301" Attrib1="xyz" Attrib2="abc" Attrib3="mno" Attrib4="pqr" Attrib5="BCD" />
<Execute ID="7302" Attrib1="xyz1" Attrib2="abc" Attrib3="mno" Attrib4="pqr" Attrib5="BCD" />
I want get duplicates on every attribute excluding ID ,and then delete the one having lesser ID.
Thanks,
Upvotes: 0
Views: 1059
Reputation: 32455
Use Linq GroupBy
var doc = XDocument.Parse(yourXmlString);
var groups = doc.Root
.Elements()
.GroupBy(element => new
{
Attrib1 = element.Attribute("Attrib1").Value,
Attrib2 = element.Attribute("Attrib2").Value,
Attrib3 = element.Attribute("Attrib3").Value,
Attrib4 = element.Attribute("Attrib4").Value,
Attrib5 = element.Attribute("Attrib5").Value
});
var duplicates = group1.SelectMany(group =>
{
if(group.Count() == 1) // remove this if you want only duplicates
{
return group;
}
int minId = group.Min(element => int.Parse(element.Attribute("ID").Value));
return group.Where(element => int.Parse(element.Attribute("ID").Value) > minId);
});
Solution above will remove elements with lesser ID
which have duplicates by attributes.
If you want return only elements which have duplicates then remove if
fork from last lambda
Upvotes: 1
Reputation: 7672
You can implement custom IEqualityComparer
for this task
class XComparer : IEqualityComparer<XElement>
{
public IList<string> _exceptions;
public XComparer(params string[] exceptions)
{
_exceptions = new List<string>(exceptions);
}
public bool Equals(XElement a, XElement b)
{
var attA = a.Attributes().ToList();
var attB = b.Attributes().ToList();
var setA = AttributeNames(attA);
var setB = AttributeNames(attB);
if (!setA.SetEquals(setB))
{
return false;
}
foreach (var e in setA)
{
var xa = attA.First(x => x.Name.LocalName == e);
var xb = attB.First(x => x.Name.LocalName == e);
if (xa.Value == null && xb.Value == null)
continue;
if (xa.Value == null || xb.Value == null)
return false;
if (!xa.Value.Equals(xb.Value))
{
return false;
}
}
return true;
}
private HashSet<string> AttributeNames(IList<XAttribute> e)
{
return new HashSet<string>(e.Select(x =>x.Name.LocalName).Except(_exceptions));
}
public int GetHashCode(XElement e)
{
var h = 0;
var atts = e.Attributes().ToList();
var names = AttributeNames(atts);
foreach (var a in names)
{
var xa = atts.First(x => x.Name.LocalName == a);
if (xa.Value != null)
{
h = h ^ xa.Value.GetHashCode();
}
}
return h;
}
}
Usage:
var comp = new XComparer("ID");
var distXEle = xele.Distinct(comp);
Please note that IEqualityComparer
implementation in this answer only compare LocalName
and doesn't take namespace into considerataion. If you have element with duplicate local name attribute, then this implementation will take the first one.
You can see the demo here : https://dotnetfiddle.net/w2DteS
If you want to
delete the one having lesser ID
It means you want the largest ID, then you can chain the .Distinct
call with .Select
.
var comp = new XComparer("ID");
var distXEle = xele
.Distinct(comp)
.Select(z => xele
.Where(a => comp.Equals(z, a))
.OrderByDescending(a => int.Parse(a.Attribute("ID").Value))
.First()
);
It will guarantee that you get the element with largest ID.
Upvotes: 1