Reputation: 434
Here is my XML:
<Root>
  <FirstChild id="1" att="a">
    <SecondChild id="11" att="aa">
      <ThirdChild>123</ThirdChild>
      <ThirdChild>456</ThirdChild>
      <ThirdChild>789</ThirdChild>
    </SecondChild>
    <SecondChild id="12" att="ab">12</SecondChild>
    <SecondChild id="13" att="ac">13</SecondChild>
  </FirstChild>
  <FirstChild id="2" att="b">2</FirstChild>
  <FirstChild id="3" att="c">3</FirstChild>
</Root>
This XML document is very large, possibly 1 GB or more. For better query performance, I want to read it step by step. In the first step, I want to read only the FirstChild elements and their attributes, like this:
<FirstChild id="1" att="a"></FirstChild>
<FirstChild id="2" att="b">2</FirstChild>
<FirstChild id="3" att="c">3</FirstChild>
After that, I may want to get the SecondChild elements by the id of their parent, and so on:
<SecondChild id="11" att="aa"></SecondChild>
<SecondChild id="12" att="ab">12</SecondChild>
<SecondChild id="13" att="ac">13</SecondChild>
How can I do it?
Note: XDoc.Descendants() and XDoc.Elements() load the matching elements together with all of their child elements!
Upvotes: 4
Views: 2191
Reputation: 7692
I suggest creating a new element and copying the attributes.
// Get <FirstChild id="1" att="a">...</FirstChild> by looping, XPath, or any other method.
var sourceElement = ...;
// Create an empty element with the same name, then copy each attribute onto it.
var element = new XElement(sourceElement.Name);
foreach (var attribute in sourceElement.Attributes())
{
    element.Add(new XAttribute(attribute.Name, attribute.Value));
}
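For a file of around 1 GB, one possible way to do the "looping" part without building the whole tree is to stream it with XmlReader and create the attribute-only copies as you go. The following is only a minimal sketch under some assumptions: the class name ShallowReader, the method ReadShallow, and its depth parameter (Root is depth 0, FirstChild is depth 1) are made up for illustration, namespaces are ignored, and text content such as the "2" inside a FirstChild is not copied.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class ShallowReader
{
    // Lazily yields a copy (name + attributes only) of every element called
    // `name` that sits at the given depth. Descendants are read past by the
    // reader but never materialized as XElement objects.
    public static IEnumerable<XElement> ReadShallow(TextReader input, string name, int depth)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element) continue;
                if (reader.Depth != depth || reader.LocalName != name) continue;

                var copy = new XElement(reader.LocalName);
                if (reader.MoveToFirstAttribute())
                {
                    do
                    {
                        copy.Add(new XAttribute(reader.LocalName, reader.Value));
                    }
                    while (reader.MoveToNextAttribute());

                    // Step back from the last attribute to its owning element.
                    reader.MoveToElement();
                }

                yield return copy;
            }
        }
    }
}
```

Calling ShallowReader.ReadShallow(File.OpenText(path), "FirstChild", 1) in a foreach would then hand back one small XElement at a time, where path is a placeholder for your file. The reader still has to scan the whole file once, but memory use stays roughly constant because only the current node is held.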
Upvotes: 2
Reputation: 11773
In VB you could do this to get a list of the FirstChild elements:
'Dim yourpath As String = "your path here"
Dim xe As XElement
'to load from a file
'xe = XElement.Load(yourpath)
'for testing
xe = <Root>
         <FirstChild id="1" att="a">
             <SecondChild id="11" att="aa">
                 <ThirdChild>123</ThirdChild>
                 <ThirdChild>456</ThirdChild>
                 <ThirdChild>789</ThirdChild>
             </SecondChild>
             <SecondChild id="12" att="ab">12</SecondChild>
             <SecondChild id="13" att="ac">13</SecondChild>
         </FirstChild>
         <FirstChild id="2" att="b">2</FirstChild>
         <FirstChild id="3" att="c">3</FirstChild>
     </Root>
Dim ie As IEnumerable(Of XElement)
ie = xe...<FirstChild>.Select(Function(el)
                                  'create a copy
                                  Dim foo As New XElement(el)
                                  foo.RemoveNodes()
                                  Return foo
                              End Function)
Upvotes: 0
Reputation: 506
Provided that you have memory available to hold the file, I suggest treating each search step as an item in the outer collection of a PLINQ pipeline.
I would start with an XName collection for the node collections that you want to retrieve. By nesting queries within XElement constructors, you can return new instances of your target nodes, with only name and attribute information. With a .Where(...) statement or two, you could also filter the attributes being kept, allow for some child nodes to be retained, etc.
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace LinqToXmlExample
{
    public class Program
    {
        public static void Main(string[] args)
        {
            XElement root = XElement.Load("[your file path here]");

            // XML names are case-sensitive, so these must match the document exactly.
            XName[] names = new XName[] { "FirstChild", "SecondChild", "ThirdChild" };

            // One result_<name> element per name, holding attribute-only copies.
            IEnumerable<XElement> elements =
                names.AsParallel()
                    .Select(
                        name =>
                            new XElement(
                                $"result_{name}",
                                root.Descendants(name)
                                    .AsParallel()
                                    .Select(
                                        x => new XElement(name, x.Attributes()))))
                    .ToArray();
        }
    }
}
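As a rough illustration of the .Where(...) idea applied to the "SecondChild by parent id" step from the question, here is a small sketch. It assumes the document is already loaded into an XElement, and the names ChildQuery and ShallowChildren are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of LINQ to XML.
public static class ChildQuery
{
    // Finds every parentName element whose id attribute equals parentId and
    // returns attribute-only copies of its direct childName children.
    public static List<XElement> ShallowChildren(
        XElement root, string parentName, string parentId, string childName)
    {
        return root.Descendants(parentName)
            .Where(p => (string)p.Attribute("id") == parentId)
            .Elements(childName)
            .Select(c => new XElement(c.Name, c.Attributes()))
            .ToList();
    }
}
```

For the sample document, ChildQuery.ShallowChildren(root, "FirstChild", "1", "SecondChild") should give the three SecondChild elements with their id and att attributes and nothing else. The XElement constructor copies attributes that already belong to another element, so the source tree is left unchanged.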
Upvotes: 1