Ali
Ali

Reputation: 1

How to read an XML file using PowerShell and filter the required data

How to read an XML file and extract data from that file which has large number of tags using PowerShell? I am using the below code to extract tag, but am unable to read data from sub tag.

$xmlFile= "D:\Testing\TestcasesOutput\1ac.xml"
$xmlConfig = [System.Xml.XmlDocument](Get-Content $xmlFile)
$XmlDocument.Breakfast_menu.price

I expect the output to read the whole xml file but unable to read whole xml file.

<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
    <food>
        <food>Belgian Waffles</food>
        <price>$5.95</price>
        <description>Two of our famous Belgian Waffles with plenty of real maple 
        syrup</description>
        <calories>650</calories>
    </food>
    <food>
        <food>Strawberry Belgian Waffles</food>
        <price>$7.95</price>
        <description>Light Belgian waffles covered with strawberries and whipped 
        cream</description>
        <calories>900</calories>
    </food>
    <food>
        <food>Berry-Berry Belgian Waffles</food>
        <price>$8.95</price>
        <description>Light Belgian waffles covered with an assortment of fresh 
        berries and whipped cream</description>
        <calories>900</calories>
    </food>
    <food>
        <food>French Toast</food>
        <price>$4.50</price>
        <description>Thick slices made from our homemade sourdough 
        bread</description>
        <calories>600</calories>
    </food>
    <food>
    <food>Homestyle Breakfast</food>
        <price>$6.95</price>
        <description>Two eggs, bacon or sausage, toast, and our ever-popular hash 
        browns</description>
        <calories>950</calories>
    </food>
</breakfast_menu>

Upvotes: 0

Views: 11719

Answers (2)

js2010
js2010

Reputation: 27423

Another nice way to open an xml file, and avoid any utf8 no bom encoding problems. I couldn't do it in one step. After a load & save, it puts the bom in, if it has an encoding tag. Utf8 is assumed with no tag.

<?xml version="1.0" encoding="utf-8"?>
<name>😊</name>
$xml = [xml]::new()
$xml.load("$pwd\file.xml")
$xml.save("$pwd\file.xml")

Upvotes: 0

Theo
Theo

Reputation: 61068

Reading XML with PowerShell is really easy.

Suppose your xml file looks similar to this:

<?xml version="1.0" encoding="UTF-8"?> 
<breakfast_menu> 
    <food> 
        <food>Belgian Waffles</food>
        <price>$5.95</price>
        <description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
        <calories>650</calories>
    </food>
    <food> 
        <food>Fried Egg</food>
        <price>$1.80</price>
        <description>blahblah</description>
        <calories>3500</calories>
    </food>
</breakfast_menu>

You simply read and have PowerShell parse the file into an object, using this

[xml]$xml = Get-Content 'D:\Testing\TestcasesOutput\1ac.xml'

Next, you can use the properties of this $xml object to get whatever it is you want to extract from it:

For instance, loop through all <food> items and output the information you want

$xml.breakfast_menu.food | ForEach-Object {
    [PSCustomObject]@{
        'MenuItem' = $_.food
        'Price'    = $_.price
    }
}

results in this output:

MenuItem        Price
--------        -----
Belgian Waffles $5.95
Fried Egg       $1.80

Or select just the one the item for 'Belgian Waffles':

$xml.breakfast_menu.food | Where-Object { $_.food -eq 'Belgian Waffles' } | 
                           Select-Object @{Name = 'MenuItem'; Expression = {$_.food}}, Price

outputs:

MenuItem        price
--------        -----
Belgian Waffles $5.95

If all you are after is the price for a certain food item, you can do this:

$xml.breakfast_menu.food | Where-Object { $_.food -eq 'Belgian Waffles' } | 
                           Select-Object -ExpandProperty Price

or even shorten that code:

($xml.breakfast_menu.food | Where-Object { $_.food -eq 'Belgian Waffles' }).price

Hope that explains


Edit

If you need to do this on multiple xml files and these files are inside the same root path, you could loop through using Get-ChildItem to get the xml files and process them like in the examples I gave.

Get-ChildItem -Path 'ROOTFOLDER OF THE FOLDERS WHERE THE XML FILES ARE KEPT' -Filter '*.xml' -File -Recurse | 
    ForEach-Object {
        [xml]$xml = Get-Content -Path $_.FullName
        # in this example simply output the menu items and their price for each xml file
        foreach ($item in $xml.breakfast_menu.food) {
            [PSCustomObject]@{
                'File'     = $_.FullName    # added the file FullName so you know where the item came from
                'MenuItem' = $item.food
                'Price'    = $item.price
            }
        }
    }

Or from several locations:

$folders = 'D:\Testing\TestcasesOutput\1ac7b5a0-2d62-403c-8394-5bd33330cbe7',
           'D:\Testing\TestcasesOutput\227c619a-b7d1-4da6-8fe5-f2c923ddcb7a',
           'D:\Testing\TestcasesOutput\d4370ae1-643f-4c44-ba41-7f640afcc276'

$result = Get-ChildItem -Path $folders -Filter '*.xml' -File | 
    ForEach-Object {
        [xml]$xml = Get-Content -Path $_.FullName
        # in this example simply output the menu items and their price for each xml file
        foreach ($item in $xml.breakfast_menu.food) {
            [PSCustomObject]@{
                'File'     = $_.FullName
                'MenuItem' = $item.food
                'Price'    = $item.price
            }
        }
    }

#output to screen:
$result

# output to CSV
$result | Export-Csv -Path 'PATH AND FILENAME FOR THE OUTPUT CSV FILE' -NoTypeInformation

Upvotes: 4

Related Questions