Reputation: 3
I'm working with a large volume of XML files that have common elements repeated across them. I've been able to concatenate them into a single file and sort it using XQuery, but I'm having difficulty taking the next step: merging elements based on key identifiers. For example, I have an XML file with the following structure:
<example>
  <Store ID="111">
    <Manager ID="123">
      <Employee>
        <EmployeeID>0001001</EmployeeID>
        <HireDate Value="1-Jan-2000"/>
        <Action ID="001" Type="S">
          <Details ID="a1">
            <TransactionType>I</TransactionType>
          </Details>
          <TransactionType>R</TransactionType>
        </Action>
        <TransactionType>R</TransactionType>
      </Employee>
      <TransactionType>R</TransactionType>
    </Manager>
    <TransactionType>R</TransactionType>
  </Store>
  <Store ID="111">
    <Manager ID="123">
      <Employee>
        <EmployeeID>0001001</EmployeeID>
        <HireDate Value="1-Jan-2000"/>
        <Action ID="003" Name="Ecg" Type="S">
          <Details ID="b1">
            <TransactionType>I</TransactionType>
          </Details>
          <TransactionType>R</TransactionType>
        </Action>
        <TransactionType>R</TransactionType>
      </Employee>
      <TransactionType>R</TransactionType>
    </Manager>
    <TransactionType>R</TransactionType>
  </Store>
  <Store ID="00102">
    <Manager ID="00302">
      <Employee>
        <EmployeeID>0002001</EmployeeID>
        <Sex Value="M"/>
        <Confidential Birthdate="1970-07-03"/>
        <Action ID="003" Name="Ecg" Type="S">
          <Details ID="c1">
            <TransactionType>I</TransactionType>
          </Details>
          <TransactionType>R</TransactionType>
        </Action>
        <TransactionType>R</TransactionType>
      </Employee>
      <TransactionType>R</TransactionType>
    </Manager>
    <TransactionType>R</TransactionType>
  </Store>
</example>
I would like to merge the first two Store elements, which share the same Store ID and Manager ID attribute values and the same EmployeeID element value, so that the resulting XML is as follows:
<example>
  <Store ID="111">
    <Manager ID="123">
      <Employee>
        <EmployeeID>0001001</EmployeeID>
        <HireDate Value="1-Jan-2000"/>
        <Action ID="001" Type="S">
          <Details ID="a1">
            <TransactionType>I</TransactionType>
          </Details>
          <TransactionType>R</TransactionType>
        </Action>
        <Action ID="003" Name="Ecg" Type="S">
          <Details ID="b1">
            <TransactionType>I</TransactionType>
          </Details>
          <TransactionType>R</TransactionType>
        </Action>
        <TransactionType>R</TransactionType>
      </Employee>
      <TransactionType>R</TransactionType>
    </Manager>
    <TransactionType>R</TransactionType>
  </Store>
  <Store ID="00102">
    <Manager ID="00302">
      <Employee>
        <EmployeeID>0002001</EmployeeID>
        <Sex Value="M"/>
        <Confidential Birthdate="1970-07-03"/>
        <Action ID="003" Name="Ecg" Type="S">
          <Details ID="c1">
            <TransactionType>I</TransactionType>
          </Details>
          <TransactionType>R</TransactionType>
        </Action>
        <TransactionType>R</TransactionType>
      </Employee>
      <TransactionType>R</TransactionType>
    </Manager>
    <TransactionType>R</TransactionType>
  </Store>
</example>
Any suggestions for XQuery approaches to achieve this result would be greatly appreciated, as would any alternative approaches (e.g. XSLT). Thanks!
Upvotes: 0
Views: 490
Reputation: 163322
XQuery 1.0 lacks any grouping capability, which makes this tricky. If you have access to XQuery 3.0, you can use the "group by" clause that it adds to FLWOR expressions.
Similarly in XSLT, there's no built-in grouping capability in 1.0, but there is in 2.0. With XSLT 2.0 you would typically do:
<!-- Group stores by a composite key: Store ID / Manager ID / EmployeeID -->
<xsl:for-each-group select="Store"
    group-by="concat(@ID, '/', Manager/@ID, '/', Manager/Employee/EmployeeID)">
  <Store ID="{@ID}">
    <Manager ID="{Manager/@ID}">
      <Employee>
        <!-- All Employee elements from the stores in the current group -->
        <xsl:variable name="e" select="current-group()/Manager/Employee"/>
        <xsl:copy-of select="($e/EmployeeID)[1]"/>
        <xsl:copy-of select="($e/HireDate)[1]"/>
        <xsl:copy-of select="$e/Action"/>
      </Employee>
    </Manager>
  </Store>
</xsl:for-each-group>
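I've made some assumptions here: each Store has a single Manager with a single Employee, you only want the first EmployeeID and HireDate, but you want all the Actions. Note that this does not copy anything else, such as the TransactionType, Sex or Confidential elements, so you'll have to adapt it to your actual requirements.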
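Since the question also asks about alternative approaches, here is a minimal sketch of the same grouping idea in Python, using only the standard library's xml.etree.ElementTree. It is not the XSLT above translated line for line: the function name merge_stores is hypothetical, and it assumes each Store holds exactly one Manager and each Manager exactly one Employee, as in the sample. It keeps the first Store seen for each (Store ID, Manager ID, EmployeeID) key, moves the Action elements of later duplicates into it, and leaves all other children of the first Store untouched.

```python
import xml.etree.ElementTree as ET


def merge_stores(root):
    """Merge <Store> children of root that share Store ID, Manager ID and EmployeeID.

    Assumption: one <Manager> per <Store> and one <Employee> per <Manager>.
    """
    first_employee = {}  # composite key -> Employee element of the first Store seen
    for store in root.findall("Store"):  # findall returns a list, so removal below is safe
        manager = store.find("Manager")
        employee = manager.find("Employee")
        key = (store.get("ID"), manager.get("ID"), employee.findtext("EmployeeID"))
        if key not in first_employee:
            first_employee[key] = employee
            continue
        target = first_employee[key]
        children = list(target)
        # Position of the last existing Action; fall back to the end if there is none
        last_action = max(
            (i for i, child in enumerate(children) if child.tag == "Action"),
            default=len(children) - 1,
        )
        # Insert the duplicate's Actions right after the existing ones
        for offset, action in enumerate(employee.findall("Action"), start=1):
            target.insert(last_action + offset, action)
        root.remove(store)  # drop the now-merged duplicate Store
    return root


SAMPLE = """<example>
  <Store ID="111"><Manager ID="123"><Employee>
    <EmployeeID>0001001</EmployeeID><Action ID="001"/><TransactionType>R</TransactionType>
  </Employee></Manager></Store>
  <Store ID="111"><Manager ID="123"><Employee>
    <EmployeeID>0001001</EmployeeID><Action ID="003"/><TransactionType>R</TransactionType>
  </Employee></Manager></Store>
  <Store ID="00102"><Manager ID="00302"><Employee>
    <EmployeeID>0002001</EmployeeID><Action ID="003"/>
  </Employee></Manager></Store>
</example>"""

merged = merge_stores(ET.fromstring(SAMPLE))
print(len(merged.findall("Store")))  # 2
```

Inserting the moved Actions after the last existing Action, rather than appending them, keeps the trailing TransactionType of the Employee in last position, which matches the desired output in the question.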
Upvotes: 1