Reputation: 28841
I have the following data:
<Subjects>
<Subject>
<Id>1</Id>
<Name>Maths</Name>
</Subject>
<Subject>
<Id>2</Id>
<Name>Science</Name>
</Subject>
<Subject>
<Id>2</Id>
<Name>Advanced Science</Name>
</Subject>
</Subjects>
and:
<Courses>
<Course>
<SubjectId>1</SubjectId>
<Name>Algebra I</Name>
</Course>
<Course>
<SubjectId>1</SubjectId>
<Name>Algebra II</Name>
</Course>
<Course>
<SubjectId>1</SubjectId>
<Name>Percentages</Name>
</Course>
<Course>
<SubjectId>2</SubjectId>
<Name>Physics</Name>
</Course>
<Course>
<SubjectId>2</SubjectId>
<Name>Biology</Name>
</Course>
</Courses>
I wish to efficiently get elements from both documents that share the share the same Ids.
I want to get the result like this:
<Results>
<Result>
<Table1>
<Subject>
<Id>1</Id>
<Name>Maths</Name>
</Subject>
</Table1>
<Table2>
<Course>
<SubjectId>1</SubjectId>
<Name>Algebra I</Name>
</Course>
<Course>
<SubjectId>1</SubjectId>
<Name>Algebra II</Name>
</Course>
<Course>
<SubjectId>1</SubjectId>
<Name>Percentages</Name>
</Course>
</Table2>
</Result>
<Result>
<Table1>
<Subject>
<Id>2</Id>
<Name>Science</Name>
</Subject>
<Subject>
<Id>2</Id>
<Name>Advanced Science</Name>
</Subject>
</Table1>
<Table2>
<Course>
<SubjectId>2</SubjectId>
<Name>Physics</Name>
</Course>
<Course>
<SubjectId>2</SubjectId>
<Name>Biology</Name>
</Course>
</Table2>
</Result>
</Results>
So far I have 2 solutions:
<Results>
{
for $e2 in $t2/Course
let $foriegnId := $e2/SubjectId
group by $foriegnId
let $e1 := $t1/Subject[Id = $foriegnId]
where $e1
return
<Result>
<Table1>
{$e1}
</Table1>
<Table2>
{$e2}
</Table2>
</Result>
}
</Results>
and the otherway round:
<Results>
{
for $e1 in $t1/Subject
let $id := $e1/Id
group by $id
let $e2 := $t2/Course[SubjectId = $id]
where $e2
return
<Result>
<Table1>
{$e1}
</Table1>
<Table2>
{$e2}
</Table2>
</Result>
}
</Results>
Is there a more efficient way of doing this? Perhaps taking advantages of multiple groups?
Update A major issue with my code at the moment is that it's performance is highly dependent on which table is bigger. For example the 1st solution is better in cases where the 2nd table is bigger and vice versa.
Upvotes: 1
Views: 49
Reputation: 163262
The solution you have looks reasonable to me. It will perform siginificantly better on a processor like Saxon-EE that does join optimization than on one (like Saxon-HE) that doesn't. If you want to hand-optimize it, your simplest approach is to switch to using XSLT: use the key()
function to replace the filter expression $t1/Subject[Id = $foriegnId]
which, in the absence of optimization, searches your second file once for each element selected in the first file.
Upvotes: 2