Benhur262

Reputation: 13

How to output child elements separately, not as one space-delimited string?

My current code for selecting some attributes from an XML file doesn't produce the output I expect:

[xml]$xml = Get-Content 'C:\Makro-Test\quandata.xml'
$xml.QUANDATASET.GROUPDATA.GROUP.SAMPLELISTDATA.SAMPLE | foreach {
  $_.id + ":" + $_.name + ":" + $_.COMPOUND.id + ":" + $_.COMPOUND.name +
    ":" + $_.COMPOUND.PEAK.analconc
}

It outputs:

1:Aminoacids_Routine_2016_05_30_002:1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23:Leu Iso Thre Val Lys Met Phen Try His Gly Ala Ser Arg Cys Tyr Pro Glu Glut Asp Aspa Tau Orn Cit:0.0000000000     0.0000000000   0.0000000000  0.0000000000 0.0000000000  0.0000000000  0.0000000000
2:Aminoacids_Routine_2016_05_30_003:1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23:Leu Iso Thre Val Lys Met Phen Try His Gly Ala Ser Arg Cys Tyr Pro Glu Glut Asp Aspa Tau Orn Cit:0.0000000000 0.2336617286 0.2147717292 0.2252815136  0.2299108827 0.2395318825  0.0000000000    0.0000000000 0.0000000000  0.2074479299     0.0000000000  0.0000000000

But I want the output to look like this:

1;Aminoacids_Routine_2016_05_30_002;1;Leu;0.0000000000
2;Aminoacids_Routine_2016_05_30_002;2;Iso;0.0000000000
...
1;Aminoacids_Routine_2016_05_30_003;1;Leu;0.0000000000
2;Aminoacids_Routine_2016_05_30_003;2;Iso;0.2336617286
...

The XML file:

<?xml version="1.0"?>
<QUANDATASET>
  <XMLFILE>
  <DATASET>
  <GROUPDATA>
    <GROUP>
      <METHODDATA/>
      <SAMPLELISTDATA>
        <SAMPLE id="1" groupid="1" name="Routine_2016_05_30_002">
          <COMPOUND id="1" sampleid="1" groupid="1" name="Leu">
            <PEAK foundscan="0" analconc="0.023423456">
              <ISPEAK/>
            </PEAK>
          </COMPOUND>
          <COMPOUND id="2" sampleid="1" groupid="1" name="Iso">
            <PEAK foundscan="0" analconc="0.123456789">
              <ISPEAK/>
            </PEAK>
          </COMPOUND>
          <COMPOUND id="3" sampleid="1" groupid="1" name="Thre">
          ...
          ...
          ...
        <SAMPLE id="2" groupid="1" name="Routine_2016_05_30_003">
          <COMPOUND id="1" sampleid="2" groupid="1" name="Leu">
          ...
          ...
          ...

Upvotes: 1

Views: 138

Answers (2)

Ansgar Wiechers

Reputation: 200193

Like @wOxxOm, I'd use SelectNodes() with an XPath expression, but I'd build the output with calculated properties instead:

$xml.SelectNodes('//COMPOUND') |
  Select-Object @{n='SampleID';e={[int]$_.ParentNode.id}},
                @{n='SampleName';e={$_.ParentNode.name}},
                @{n='CompoundID';e={[int]$_.id}},
                @{n='CompoundName';e={$_.name}},
                @{n='analconc';e={[double]$_.PEAK.analconc}}

That will give you objects to work with instead of a string. If you need the data written to a file you can export it via Export-Csv:

... | Export-Csv 'C:\path\to\quandata.csv' -NoTypeInformation -Delimiter ';'
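To illustrate why objects are easier to work with than a pre-joined string, here is a minimal sketch. The inline XML is a hypothetical, trimmed-down stand-in for quandata.xml (sample name `S1` and the reduced attribute set are assumptions), and `ConvertTo-Csv` is used in place of `Export-Csv` only so that nothing is written to disk:

```powershell
# Hypothetical miniature of the SAMPLE/COMPOUND/PEAK structure from the question.
[xml]$doc = @'
<SAMPLELISTDATA>
  <SAMPLE id="1" name="S1">
    <COMPOUND id="1" name="Leu"><PEAK analconc="0.023423456"/></COMPOUND>
    <COMPOUND id="2" name="Iso"><PEAK analconc="0.123456789"/></COMPOUND>
  </SAMPLE>
</SAMPLELISTDATA>
'@

# One object per COMPOUND, with typed properties.
$rows = $doc.SelectNodes('//COMPOUND') |
  Select-Object @{n='SampleID';e={[int]$_.ParentNode.id}},
                @{n='CompoundName';e={$_.name}},
                @{n='analconc';e={[double]$_.PEAK.analconc}}

# Typed values can be filtered and sorted before any text is produced.
$filtered = @($rows | Where-Object { $_.analconc -gt 0.1 } | Sort-Object CompoundName)

# Delimited text is generated only at the very end.
$csv = $rows | ConvertTo-Csv -NoTypeInformation -Delimiter ';'
$csv
```

Note that the `[double]` cast turns the attribute text into a number, so the original formatting of a value such as `0.0000000000` is not preserved; drop the cast if the text has to be kept verbatim.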

Upvotes: 2

woxxom

Reputation: 73506

You're not using XPath selectors but native PowerShell object property access.

When you access a property on an array as a whole, PowerShell 3.0 and newer automatically returns an array of that property's values, one per element.

With XML, an element that occurs more than once, such as COMPOUND, is returned as an array when you access it by name (that is, without an index), so this behavior applies to $_.COMPOUND.id: it is an array. In your code, concatenating it to a string converts the array to a string, which joins its elements with a space (the default value of $OFS).
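The effect can be reproduced in isolation. The following is a minimal sketch that uses a hypothetical two-compound sample (the names `S1`, `$doc`, `$ids` are made up for illustration) rather than the real quandata.xml:

```powershell
# Hypothetical sample, trimmed down from the structure in the question.
[xml]$doc = @'
<SAMPLE id="1" name="S1">
  <COMPOUND id="1" name="Leu"/>
  <COMPOUND id="2" name="Iso"/>
</SAMPLE>
'@

# COMPOUND occurs twice, so .COMPOUND is an array and .id yields one value per element.
$ids = $doc.SAMPLE.COMPOUND.id
$ids.Count                     # 2

# String + array converts the array to a string, joined with $OFS (a space by default).
$asString = 'ids=' + $ids
$asString                      # ids=1 2

# An explicit -join gives control over the delimiter.
$joined = $ids -join ';'
$joined                        # 1;2
```

This is why each line of the original output contains all compound ids, names, and concentrations at once instead of one compound per line.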

Solution 1: enumerate the child elements manually:

$delim = ':'
foreach ($sample in $xml.QUANDATASET.GROUPDATA.GROUP.SAMPLELISTDATA.SAMPLE) {
    foreach ($compound in $sample.COMPOUND) {
        $sample.id, $sample.name,
        $compound.id, $compound.name, [double]$compound.PEAK.analconc -join $delim
    }
}

Solution 2: actually use XPath to select all COMPOUND elements and access each one's SAMPLE via ParentNode:

$delim = ':'
foreach ($compound in $xml.SelectNodes('//COMPOUND')) {
    $sample = $compound.ParentNode
    $sample.id, $sample.name,
    $compound.id, $compound.name, [double]$compound.PEAK.analconc -join $delim
}

Instead of a pipeline, I'm using the foreach statement so that I have a nicely named iterator variable.

Upvotes: 1
