Reputation: 358
I have a some XML data that has a lot of attributes and multiple elements with the same name and I want to flatten it out into a CSV file. The data XML looks like this:
<?xml version="1.0" encoding="utf-8"?>
<SEGMENTS>
<SEGMENT NAME="webcluster">
<RESULTPAGE>
<RESULTSET FIRSTHIT="1" LASTHIT="100" HITS="100" TOTALHITS="100">
<HIT NO="1" RANK="19000" SITEID="0" MOREHITS="100">
<FIELD NAME="rank">19000</FIELD>
<FIELD NAME="id">1</FIELD>
<FIELD NAME="url">C:\website.com\folder1\file1.txt</FIELD>
<FIELD NAME="filename">file1.txt</FIELD>
<FIELD NAME="path">https://website.com/folder1/</FIELD>
</HIT>
<HIT NO="2" RANK="19000" SITEID="0" MOREHITS="100">
<FIELD NAME="rank">19000</FIELD>
<FIELD NAME="id">2</FIELD>
<FIELD NAME="url">C:\website.com\folder1\file2.txt</FIELD>
<FIELD NAME="filename">file2.txt</FIELD>
<FIELD NAME="path">https://website.com/folder1/</FIELD>
</HIT>
<HIT NO="3" RANK="18999" SITEID="0" MOREHITS="100">
<FIELD NAME="rank">18999</FIELD>
<FIELD NAME="id">3</FIELD>
<FIELD NAME="url">C:\website.com\folder5\file3.txt</FIELD>
<FIELD NAME="filename">file3.txt</FIELD>
<FIELD NAME="path">C:\website.com\folder\</FIELD>
</HIT>
</RESULTSET>
</RESULTPAGE>
</SEGMENT>
</SEGMENTS>
And I'm trying to iterate through it to produce something like this
HIT filename path ----- ---------- ------ 1 file1.txt C:\website.com\folder1\ 2 file2.txt C:\website.com\folder1\ 3 file3.txt C:\website.com\folder5\
My code is:
[xml]$xml=Get-Content .\xmlfile.xml
$hits = $xml.segments.segment.resultpage.resultset.hit
foreach($hit in $hits)
{
foreach($field in $hit.field)
{
if (field."NAME" -eq 'url')
{
write-output $hit.no $field."#VALUE"
}
}
}
And I keep getting errors. I can access the different elements and attributes by referencing them by ordinal location ($hits[0].field[4]) but I want to protect against future output where the FIELD values might be in a different order.
Can anyone suggest how I can get this done? I tried using select-XML and found that to be even more troublesome, but maybe that's the more elegant method.
Upvotes: 0
Views: 6215
Reputation: 665
combining approaches. Select allows to get fields in a particular order.
[xml]$xml=Get-Content .\xmlfile.xml
$hits = $xml.segments.segment.resultpage.resultset.hit
foreach($hit in $hits)
{
$r = @{hit = $hit.no; url = "N/A";filename="N/A"}
$hit.field | % { $r[$_."NAME"] = $_."#text" }
New-Object PSObject -Property $r | Select hit,url,filename
}
Upvotes: 0
Reputation: 665
Something like this seems to do it, although i don't like that.
[xml]$xml=Get-Content .\xmlfile.xml
$hits = $xml.segments.segment.resultpage.resultset.hit
foreach($hit in $hits)
{
$result = new-object PSObject -Property @{ hit = $hit.no; filename = ""; path = ""}
foreach($field in $hit.field)
{
if ($field."NAME" -eq 'url')
{
$result.path = $field."#text"
}
if ($field."NAME" -eq 'filename')
{
$result.filename = $field."#text"
}
}
write-output $result
}
or, just grabbing all fields, and then selecting relevant:
[xml]$xml=Get-Content .\xmlfile.xml
$hits = $xml.segments.segment.resultpage.resultset.hit
foreach($hit in $hits)
{
$result = new-object PSObject -Property @{ hit = $hit.no }
$hit.field | % { Add-Member -InputObject $result -MemberType NoteProperty -Name $_."NAME" -Value $_."#text"}
$result | select hit,url,filename | write-output
}
Upvotes: 1
Reputation: 201592
Try something like this:
Select-Xml -Xml $xml -XPath '//HIT' | Foreach {
$num=$_.Node.NO
$filenameAttr = $_.Node.Field | where {$_.Name -eq 'filename'}
$pathAttr = $_.Node.Field | where {$_.Name -eq 'path'}
new-object psobject -Property ([ordered]@{HIT=$num; filename = $filenameAttr.InnerText; path = $pathAttr.InnerText})
}
Upvotes: 0