Reputation: 1126
I have written this LINQ to XML query:
Dim xd1 As XDocument = XDocument.Load("C:\doc1.xml")
Dim xd2 As XDocument = XDocument.Load("C:\doc2.xml")
Dim xd3 As XDocument = XDocument.Load("C:\doc3.xml")
Dim q = From a In xd1...<row>, b In xd2...<row>, c In xd3...<row> Where
a.@Field1 = "pippo" AndAlso b.@Field2 = a.@RifField2 AndAlso c.@Field3 = a.@RifField3 Select
b.@Field4, b.@Field5, c.@Field6
Dim s As String = ""
For Each a In q
s &= a.Field4 & " - " & a.Field5 & " - " & a.Field6 & vbCrLf
Next
TextBlock1.Text = s
But this code takes about 5 seconds to execute. I could certainly change the query, but while debugging I noticed that the line
Dim q = From ...
executes in an instant, and every following line runs very quickly until the "For Each" loop has run out of items and has to exit. At that point execution stalls for 5 seconds, and only then does the loop exit.
I get the same delay if I write
Dim q = (From ... ).ToArray
or if I write
Dim i As Long = q.Count
The strangest part is that it takes so long to determine that the list of items is finished and the loop must exit. One detail: the query q returns only 8 items.
Do you have any suggestions for solving my performance issue? Pileggi
Upvotes: 0
Views: 164
Reputation: 160892
Let's see here:
You have 3 XML files; let's say they have K, L and M row elements respectively.
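You are then computing a Cartesian product over all of these elements, which means there are K*L*M possible results to evaluate. That becomes a lot of work very quickly as K, L and M grow: if each file had just 1000 rows, you would have 1 billion possible results. This is why the query is so slow. It would also explain the pause at the end of your loop: LINQ queries are evaluated lazily, so after yielding the last match the enumerator still has to walk through all the remaining combinations before it can report that it is finished.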
You should filter first to avoid creating such a huge Cartesian product: move the condition a.@Field1 = "pippo" ahead of the product, so that only matching rows from the first file take part in it. This should improve performance significantly.
For example, if only 10 rows in the first XML file matched "pippo", you would now have only 10*1000*1000 = 10 million possible results. That is still a lot, but only 1/100 of the combinations in your current query.
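To make that arithmetic concrete, here is a small sketch that counts how many combinations each form of the query has to look at. It uses plain integer ranges instead of XML rows; the sizes (100 each), the 10 matching rows and the Visit helper are all made up for illustration.

```csharp
using System;
using System.Linq;

class CartesianCount
{
    // Number of (a, b, c) combinations that reached the where clause.
    static long visited;

    // Hypothetical helper: counts one combination and lets it through.
    static bool Visit() { visited++; return true; }

    static void Main()
    {
        const int K = 100, L = 100, M = 100; // hypothetical row counts
        const int Matching = 10;             // assumed rows in the first source that pass the filter

        // Filter inside the product: every combination reaches the where clause.
        var filterLast = from a in Enumerable.Range(0, K)
                         from b in Enumerable.Range(0, L)
                         from c in Enumerable.Range(0, M)
                         where Visit() && a < Matching
                         select a;
        visited = 0;
        int lastResults = filterLast.Count();
        Console.WriteLine($"filter last:  {visited} combinations, {lastResults} results");

        // Filter before the product: only matching rows take part in it.
        var filterFirst = from a in Enumerable.Range(0, K).Where(x => x < Matching)
                          from b in Enumerable.Range(0, L)
                          from c in Enumerable.Range(0, M)
                          where Visit()
                          select a;
        visited = 0;
        int firstResults = filterFirst.Count();
        Console.WriteLine($"filter first: {visited} combinations, {firstResults} results");
    }
}
```

With these sizes the first form visits 100*100*100 = 1000000 combinations and the second 10*100*100 = 100000, while both return the same 100000 results.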
In C# (I'm not a VB guy) that would be something like
var query = from a in xd1.Descendants("row").Where(x => (string)x.Attribute("Field1") == "pippo")
            from b in xd2.Descendants("row")
            from c in xd3.Descendants("row")
            // rest of query
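Going one step beyond the filtering above, the two remaining equality conditions could be expressed as joins, which avoids the Cartesian product altogether because LINQ to Objects builds a lookup on the inner sequence of each join. The following is only a sketch: the attribute names are taken from the question, but the inline XML is hypothetical sample data standing in for the three files.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class JoinSketch
{
    static void Main()
    {
        // Hypothetical sample data standing in for doc1.xml, doc2.xml and doc3.xml.
        var xd1 = XDocument.Parse(
            "<root><row Field1='pippo' RifField2='x' RifField3='y'/>" +
            "<row Field1='other' RifField2='x' RifField3='y'/></root>");
        var xd2 = XDocument.Parse("<root><row Field2='x' Field4='four' Field5='five'/></root>");
        var xd3 = XDocument.Parse("<root><row Field3='y' Field6='six'/></root>");

        var query =
            from a in xd1.Descendants("row")
            where (string)a.Attribute("Field1") == "pippo"   // filter before joining
            join b in xd2.Descendants("row")
                on (string)a.Attribute("RifField2") equals (string)b.Attribute("Field2")
            join c in xd3.Descendants("row")
                on (string)a.Attribute("RifField3") equals (string)c.Attribute("Field3")
            select new
            {
                Field4 = (string)b.Attribute("Field4"),
                Field5 = (string)b.Attribute("Field5"),
                Field6 = (string)c.Attribute("Field6")
            };

        // Same output format as the loop in the question.
        foreach (var r in query)
            Console.WriteLine($"{r.Field4} - {r.Field5} - {r.Field6}");
    }
}
```

For the sample data this prints "four - five - six". One difference to be aware of: a join skips rows whose key attribute is missing, whereas the original comparison may treat two missing attributes as equal.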
Upvotes: 1