melks

Reputation: 3

How to iterate over XML file, pulling out attribute pairs as rows in an output CSV file

I couldn't find an answer that matched this specific combination of requirements, and am stuck.

My file is like this:

<?xml version="1.0" encoding="UTF-8"?>
<lvl1>
  <lvl2>
    <lvl3>
      <topic label="grandparent" href="gp1.html">
        <topic label="parent" href="p1.html">
           <topic label="child" href="c1.html">
              <topic label="grandchild1" href="gc1.html"/>
              <topic label="grandchild2" href="gc2.html"/>
...

My desired output is like this:

grandparent,gp1.html
parent,p1.html
child,c1.html
grandchild1,gc1.html
grandchild2,gc2.html

That is, the goal is to flatten the label/href pairs into a CSV file, one pair per row. My source file has topic elements nested many levels deep, some of them with sibling topic elements.

I've tried things like:

let $input := (my_file.xml)
let $nl := "&#10;"
let $output :=
string-join(
for $topic in $input//topic 
return
string-join(
for $lab in $topic/*
return
$lab/@label/data()
, ',')
, $nl)

return $output

But that's not really even halfway there. I would be interested to know how far off I am. Thanks.

Upvotes: 0

Views: 44

Answers (1)

BeniBela

Reputation: 16927

You could use @* to select all attributes of a topic, but then their order is implementation-dependent. So select them explicitly with (@label,@href). There is also no need for a second for loop:

let $input := doc("my_file.xml")
let $nl := "&#10;"
let $output :=
  string-join(
    for $topic in $input//topic 
    return string-join($topic/(@label,@href), ',')
  , $nl)
return $output
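
To try the query without an external file, the sample can be inlined as a direct element constructor. This is only a sketch: the closing tags are assumed, because the sample in the question is truncated, and the real file may contain more topics.

```xquery
(: Inline copy of the question's sample. The closing tags are an
   assumption; the original sample is cut off after grandchild2. :)
let $input :=
  <lvl1>
    <lvl2>
      <lvl3>
        <topic label="grandparent" href="gp1.html">
          <topic label="parent" href="p1.html">
            <topic label="child" href="c1.html">
              <topic label="grandchild1" href="gc1.html"/>
              <topic label="grandchild2" href="gc2.html"/>
            </topic>
          </topic>
        </topic>
      </lvl3>
    </lvl2>
  </lvl1>
let $nl := "&#10;"
return string-join(
  (: //topic visits every topic element in document order :)
  for $topic in $input//topic
  return string-join(($topic/@label, $topic/@href), ','),
  $nl)
```

With this input the result should be the five rows from the question, starting with grandparent,gp1.html. The original attempt differs because $topic/* selects the child elements of each topic, so it joins the labels of the children and never reads the attributes of the topic itself.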

You do not even need the first for loop, because the path operator / already evaluates its right-hand side once for each topic:

let $input := doc("my_file.xml")
let $nl := "&#10;"
let $output :=
  string-join(
    $input//topic/string-join((@label,@href), ',')
  , $nl)
return $output
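
If a label or href can contain a comma, a double quote, or a line break, the plain join above would produce broken CSV rows. The following is a minimal sketch of one way to handle that, assuming the usual CSV convention of wrapping such fields in double quotes and doubling embedded quotes. The helper local:csv-field and the inline sample data are hypothetical and not part of the question.

```xquery
(: Hypothetical helper: quotes a field only when it contains
   a comma, a double quote, or a line break. :)
declare function local:csv-field($value as xs:string?) as xs:string {
  if (matches($value, '[,"\n\r]'))
  then concat('"', replace($value, '"', '""'), '"')
  else string($value)
};

(: Made-up sample with one label that needs quoting. :)
let $input :=
  <lvl1>
    <topic label="plain" href="a.html">
      <topic label="with, comma" href="b.html"/>
    </topic>
  </lvl1>
return string-join(
  $input//topic/string-join(
    (local:csv-field(@label), local:csv-field(@href)), ','),
  "&#10;")
```

Here the second row should come out with its label wrapped in double quotes, while the first row stays unquoted. If the data never contains those characters, the shorter queries above are sufficient.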

Upvotes: 2
