Reputation: 13
My current code to select some attributes in XML doesn't seem to work:
[xml]$xml = Get-Content 'C:\Makro-Test\quandata.xml'
$xml.QUANDATASET.GROUPDATA.GROUP.SAMPLELISTDATA.SAMPLE | foreach {
    $_.id + ":" + $_.name + ":" + $_.COMPOUND.id + ":" + $_.COMPOUND.name +
    ":" + $_.COMPOUND.PEAK.analconc
}
It outputs:
1:Aminoacids_Routine_2016_05_30_002:1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23:Leu Iso Thre Val Lys Met Phen Try His Gly Ala Ser Arg Cys Tyr Pro Glu Glut Asp Aspa Tau Orn Cit:0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
2:Aminoacids_Routine_2016_05_30_003:1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23:Leu Iso Thre Val Lys Met Phen Try His Gly Ala Ser Arg Cys Tyr Pro Glu Glut Asp Aspa Tau Orn Cit:0.0000000000 0.2336617286 0.2147717292 0.2252815136 0.2299108827 0.2395318825 0.0000000000 0.0000000000 0.0000000000 0.2074479299 0.0000000000 0.0000000000
But I want the output to look like this:
1;Aminoacids_Routine_2016_05_30_002;1;Leu;0.0000000000
2;Aminoacids_Routine_2016_05_30_002;2;Iso;0.0000000000
...
1;Aminoacids_Routine_2016_05_30_003;1;Leu;0.0000000000
2;Aminoacids_Routine_2016_05_30_003;2;Iso;0.2336617286
...
The XML file:
<?xml version="1.0"?>
<QUANDATASET>
<XMLFILE>
<DATASET>
<GROUPDATA>
<GROUP>
<METHODDATA/>
<SAMPLELISTDATA>
<SAMPLE id="1" groupid="1" name="Routine_2016_05_30_002">
<COMPOUND id="1" sampleid="1" groupid="1" name="Leu">
<PEAK foundscan="0" analconc="0.023423456">
<ISPEAK/>
</PEAK>
</COMPOUND>
<COMPOUND id="2" sampleid="1" groupid="1" name="Iso">
<PEAK foundscan="0" analconc="0.123456789">
<ISPEAK/>
</PEAK>
</COMPOUND>
<COMPOUND id="3" sampleid="1" groupid="1" name="Thre">
...
...
...
<SAMPLE id="2" groupid="1" name="Routine_2016_05_30_003">
<COMPOUND id="1" sampleid="2" groupid="1" name="Leu">
...
...
...
Upvotes: 1
Views: 138
Reputation: 200193
Like @wOxxOm I'd use SelectNodes() with an XPath expression, but I'd build the output with calculated properties instead:
$xml.SelectNodes('//COMPOUND') |
    Select-Object @{n='SampleID';e={[int]$_.ParentNode.id}},
                  @{n='SampleName';e={$_.ParentNode.name}},
                  @{n='CompoundID';e={[int]$_.id}},
                  @{n='CompoundName';e={$_.name}},
                  @{n='analconc';e={[double]$_.PEAK.analconc}}
That will give you objects to work with instead of strings. If you need the data written to a file, you can export it via Export-Csv:
... | Export-Csv 'C:\path\to\quandata.csv' -NoType -Delimiter ';'
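If the goal is the bare semicolon-separated lines from the question rather than a CSV file with a header row, one option is ConvertTo-Csv followed by a little string cleanup. The following is only a sketch: the inline XML is a made-up miniature of the question's file, and the quote stripping assumes no field contains the delimiter or a quote character.

```powershell
# Hypothetical miniature of the question's XML, for illustration only.
[xml]$demo = @'
<SAMPLELISTDATA>
  <SAMPLE id="1" name="Routine_002">
    <COMPOUND id="1" name="Leu"><PEAK analconc="0.5"/></COMPOUND>
    <COMPOUND id="2" name="Iso"><PEAK analconc="0.25"/></COMPOUND>
  </SAMPLE>
</SAMPLELISTDATA>
'@

# Same calculated properties as above, applied to the demo document.
$rows = $demo.SelectNodes('//COMPOUND') |
    Select-Object @{n='SampleID';e={[int]$_.ParentNode.id}},
                  @{n='SampleName';e={$_.ParentNode.name}},
                  @{n='CompoundID';e={[int]$_.id}},
                  @{n='CompoundName';e={$_.name}},
                  @{n='analconc';e={[double]$_.PEAK.analconc}}

# ConvertTo-Csv returns strings instead of writing a file. Skip the header
# row and strip the quotes that it puts around each field by default.
$lines = $rows |
    ConvertTo-Csv -NoTypeInformation -Delimiter ';' |
    Select-Object -Skip 1 |
    ForEach-Object { $_ -replace '"', '' }
$lines
```

The resulting strings can then be written with Set-Content if a file is needed.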
Upvotes: 2
Reputation: 73506
You're not using XPath selectors but native PowerShell object property access.
PowerShell 3.0 and newer automatically returns an array of the property's values when you access a property on an array as a whole (member enumeration).
In the case of XML, each repeated element like COMPOUND returns an array when accessed by name (that is, without an index), so the aforementioned behavior applies to $_.COMPOUND.id: this is an array! In your code it is then automatically coerced to a string by joining the elements with a space.
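A minimal sketch of that behavior, using a small inline document (the XML below is made up for illustration and only mimics the structure from the question):

```powershell
# Hypothetical miniature of the question's XML, for illustration only.
[xml]$demo = @'
<SAMPLE id="1" name="Routine_002">
  <COMPOUND id="1" name="Leu"><PEAK analconc="0.1"/></COMPOUND>
  <COMPOUND id="2" name="Iso"><PEAK analconc="0.2"/></COMPOUND>
</SAMPLE>
'@

# Two COMPOUND children, so property access yields a collection of ids...
$ids = $demo.SAMPLE.COMPOUND.id
$ids.Count    # 2

# ...which is joined with spaces (the default $OFS) as soon as it is
# concatenated onto a string.
$joined = $demo.SAMPLE.id + ':' + $ids
$joined       # 1:1 2
```

This is the same pattern that produces the long space-separated runs in the question's output.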
Solution 1: enumerate the child elements manually:
$delim = ':'
foreach ($sample in $xml.QUANDATASET.GROUPDATA.GROUP.SAMPLELISTDATA.SAMPLE) {
    foreach ($compound in $sample.COMPOUND) {
        $sample.id, $sample.name,
        $compound.id, $compound.name, [double]$compound.PEAK.analconc -join $delim
    }
}
Solution 2: actually use XPath to select all COMPOUND elements and access SAMPLE as ParentNode:
$delim = ':'
foreach ($compound in $xml.SelectNodes('//COMPOUND')) {
    $sample = $compound.ParentNode
    $sample.id, $sample.name,
    $compound.id, $compound.name, [double]$compound.PEAK.analconc -join $delim
}
Instead of a pipeline I'm using the foreach statement so that the iterator variable has a descriptive name.
Upvotes: 1