Reputation: 544
I have an XML document,
<resultsets>
<row>
<first_name>Georgi</first_name>
<last_name>Facello</last_name>
</row>
<row>
<first_name>Bezalel</first_name>
<last_name>Simmel</last_name>
</row>
<row>
<first_name>Bezalel</first_name>
<last_name>Hass</last_name>
</row>
</resultsets>
I want to sort first names and remove duplicated first names to produce this:
<resultsets>
<row>
<first_name>Bezalel</first_name>
<last_name>Simmel</last_name>
</row>
<row>
<first_name>Georgi</first_name>
<last_name>Facello</last_name>
</row>
</resultsets>
Following are the code I wrote:
for $last_name at $count1 in doc("employees.xml")//last_name,
$first_name at $count2 in doc("employees.xml")//first_name
let $f := $first_name
where ( $count1=$count2 )
group by $f
order by $f
return
<row>
{$f}
{$last_name}
</row>
However, this code sort the XML document by first names, but failed to remove the duplicated first name ('Bezalel'), it returns:
<resultsets>
<row>
<first_name>Bezalel</first_name>
<last_name>Simmel</last_name>
</row>
<row>
<first_name>Bezalel</first_name>
<last_name>Hass</last_name>
</row>
<row>
<first_name>Georgi</first_name>
<last_name>Facello</last_name>
</row>
</resultsets>
I know how to solve this using two FLOWR statements. group by
behavior is weird, could you please explain why it does not remove the duplicates?
Is there any way we can solve this problem using ONE FLOWR loop and ONLY use $first_name
and $last_name
two variables? Thanks,
Upvotes: 1
Views: 57
Reputation: 167571
I would simply group the row
elements by the first_name
child and then output the first item in each group to ensure you don't get duplicates:
<resultssets>
{
for $row in resultsets/row
group by $fname := $row/first_name
order by $fname
return
$row[1]
}
</resultssets>
http://xqueryfiddle.liberty-development.net/jyyiVhf
As to how the group by
clause works, see https://www.w3.org/TR/xquery-31/#id-group-by which says:
The group by clause assigns each pre-grouping tuple to a group, and generates one post-grouping tuple for each group. In the post-grouping tuple for a group, each grouping key is represented by a variable that was specified in a GroupingSpec, and every variable that appears in the pre-grouping tuples that were assigned to that group is represented by a variable of the same name, bound to a sequence of all values bound to the variable in any of these pre-grouping tuples.
Upvotes: 2