surge3333

Reputation: 195

Parse an XML file with PowerShell

I'm trying to parse an XML file and can't seem to extract the pieces I want. The XML file is for a homegrown system and I don't have any control over its layout.

The file looks like this –

$doc = [xml]@'
<?xml version="1.0" encoding="utf-8"?>
<rbacx >
    <namespace namespaceName="Team Name" namespaceShortName="ABC"/>
    <attributeValues>
        <attributeValue id="Role=Administrator">
            <value><![CDATA[Administrator]]></value>
            <attributes>
                <attribute name="Glossary">
                    <attributeValues>
                        <attributeValue><value><![CDATA[Administrator (service accounts)]]></value></attributeValue>
                    </attributeValues>
                </attribute>
            </attributes>
        </attributeValue>
        <attributeValue id="Role=Operator">
            <value><![CDATA[Operator]]></value>
            <attributes>
                <attribute name="Glossary">
                    <attributeValues>
                        <attributeValue><value><![CDATA[Operator (all accounts)]]></value></attributeValue>
                    </attributeValues>
                </attribute>
            </attributes>
        </attributeValue>
    </attributeValues>
    <accounts>
        <account id="FRED">
            <name><![CDATA[[email protected]]]></name>
            <endPoint>ABC</endPoint>
            <domain>ABC</domain>
            <comments/>
            <attributes>
                <attribute name="AppBoRID">
                    <attributeValues>
                        <attributeValue><value><![CDATA[FRED]]></value></attributeValue>
                    </attributeValues>
                </attribute>
                <attribute name="Role">
                    <attributeValues><attributeValueRef id="Role=Operator"/></attributeValues>
                </attribute>
            </attributes>
        </account>
        <account id="BARNEY">
            <name><![CDATA[[email protected]]]></name>
            <endPoint>ABC</endPoint>
            <domain>ABC</domain>
            <comments/>
            <attributes>
                <attribute name="AppBoRID">
                    <attributeValues>
                        <attributeValue><value><![CDATA[BARNEY]]></value></attributeValue>
                    </attributeValues>
                </attribute>
                <attribute name="Role">
                    <attributeValues><attributeValueRef id="Role=Administrator"/></attributeValues>
                </attribute>
            </attributes>
        </account>
        <account id="NonPeopleID_CC1234">
            <name><![CDATA[[email protected]]]></name>
            <endPoint>ABC</endPoint>
            <domain>ABC</domain>
            <comments/>
            <attributes>
                <attribute name="appUserName">
                    <attributeValues>
                        <attributeValue><value><![CDATA[WILMA]]></value></attributeValue>
                    </attributeValues>
                </attribute>
                <attribute name="CostCentre">
                    <attributeValues>
                        <attributeValue><value><![CDATA[1234]]></value></attributeValue>
                    </attributeValues>
                </attribute>
                <attribute name="Bank_Number">
                    <attributeValues>
                        <attributeValue><value><![CDATA[0000]]></value></attributeValue>
                    </attributeValues>
                </attribute>
                <attribute name="Directory">
                    <attributeValues>
                        <attributeValue><value><![CDATA[XYZ]]></value></attributeValue>
                    </attributeValues>
                </attribute>
                <attribute name="Role">
                    <attributeValues><attributeValueRef id="Role=Administrator"/></attributeValues>
                </attribute>
            </attributes>
        </account>
        <account id="NonPeopleID_CC1234">
            <name><![CDATA[[email protected]]]></name>
            <endPoint>ABC</endPoint>
            <domain>ABC</domain>
            <comments/>
            <attributes>
                <attribute name="appUserName">
                    <attributeValues>
                        <attributeValue><value><![CDATA[BETTY]]></value></attributeValue>
                    </attributeValues>
                </attribute>
                <attribute name="CostCentre">
                    <attributeValues>
                        <attributeValue><value><![CDATA[1234]]></value></attributeValue>
                    </attributeValues>
                </attribute>
                <attribute name="Bank_Number">
                    <attributeValues>
                        <attributeValue><value><![CDATA[0000]]></value></attributeValue>
                    </attributeValues>
                </attribute>
                <attribute name="Directory">
                    <attributeValues>
                        <attributeValue><value><![CDATA[XYZ]]></value></attributeValue>
                    </attributeValues>
                </attribute>
                <attribute name="Role">
                    <attributeValues><attributeValueRef id="Role=Operator"/></attributeValues>
                </attribute>
            </attributes>
        </account>
    </accounts>
</rbacx>

'@

I'd like to get back output looking like this –

Name            Role
FRED            Operator
BARNEY          Administrator
WILMA           Administrator
BETTY           Operator

I'm able to get the name@domain with this –

$Doc.rbacx.accounts.account.name 

where it returns - 

#cdata-section
--------------
[email protected]  
[email protected]
[email protected] 
[email protected]  

I can get all the value attributes with this -

$Doc.rbacx.accounts.account.attributes.attribute.attributevalues.attributevalue.value 

where it returns - 

#cdata-section
--------------
FRED          
BARNEY        
WILMA         
1234          
0000          
XYZ           
BETTY         
1234          
0000          
XYZ   

I can't seem to get the role associated with the user returned. I'm thinking it should be along the lines of this –

$Doc.rbacx.accounts.account.attributes.attribute.attributevalues.attributevalue.attributeValueRef 

However that doesn't return anything.

Any thoughts on how I can get the User & its associated Role output here?

Upvotes: 0

Views: 87

Answers (2)

arjabbar

Reputation: 6404

Once you are familiar with XPath, the rest is not too bad. What makes this harder is that the role ids are Role=Operator instead of just Operator, so you have to split the id attribute of each attributeValueRef node on the = sign. In the end you'll have something like this.
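To make the lookup-and-split step concrete, here is a minimal sketch of it in isolation. The `$sample` document is a cut-down stand-in for one account from the question, and the variable names are only illustrative. It also hints at why the dotted path in the question came back empty: attributeValueRef sits directly under attributeValues, not under attributeValue. The sketch uses the -split operator with a limit of 2 instead of the .Split() method, which is one reasonable choice rather than the only one.

```powershell
# Hypothetical cut-down sample: one account's Role attribute from the question's XML
$sample = [xml]@'
<account id="FRED">
    <attributes>
        <attribute name="Role">
            <attributeValues><attributeValueRef id="Role=Operator"/></attributeValues>
        </attribute>
    </attributes>
</account>
'@

# attributeValueRef is a child of attributeValues (there is no attributeValue in between),
# so the XPath walks attribute -> attributeValues -> attributeValueRef -> @id
$idNode = $sample.SelectSingleNode('/account/attributes/attribute[@name="Role"]/attributeValues/attributeValueRef/@id')

# The id looks like "Role=Operator"; split on the first '=' and keep the right-hand part
$role = ($idNode.Value -split '=', 2)[1]
$role
```

The full pipeline further down does the same thing per account, with `//attributeValueRef` used so the intermediate element does not have to be spelled out.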

Edit: The null-coalescing operator (??) is only available in PowerShell 7 and later.

Lots of alternatives given your PowerShell version can be found here.

$doc.SelectNodes("//accounts/account") | % {
    [pscustomobject]@{
        Name = $_.SelectSingleNode('attributes/attribute[@name="appUserName"]//value').'#cdata-section' ?? $_.id
        Role = $_.SelectSingleNode('attributes/attribute[@name="Role"]//attributeValueRef/@id').value.Split('=')[1]
    }
}

The output is a [PSCustomObject]. Running this will show your output in the console, but if you really need this to just be a string, you can pipe it through Out-String (optionally after Format-Table to control the layout).
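As a rough illustration of that last point, the sketch below uses two hand-written rows as stand-ins for the objects the pipeline produces (they are not read from the file). It assumes you want a table layout; Format-Table only shapes the display, and Out-String is what actually turns it into text.

```powershell
# Hypothetical sample rows standing in for the objects built from the XML
$rows = @(
    [pscustomobject]@{ Name = 'FRED';   Role = 'Operator' }
    [pscustomobject]@{ Name = 'BARNEY'; Role = 'Administrator' }
)

# Format-Table controls how the objects are laid out for display
$rows | Format-Table -AutoSize

# Out-String renders that layout into a single string you can log or write to a file
$text = $rows | Format-Table -AutoSize | Out-String
$text.GetType().Name
```

Keeping the objects as objects for as long as possible, and formatting only at the very end, tends to make later filtering or exporting (for example with Export-Csv) easier.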

Upvotes: 2

Doug Maurer

Reputation: 8868

Arjabbar's answer is fantastic, unless you are on Windows PowerShell 5.1, where the ?? operator is not supported. I'm sure there's a cleaner way to check, but this will achieve your result. If the account has an appUserName it will use that, otherwise it will use the ID.

$doc.SelectNodes("//accounts/account") | foreach {
    $name = if($appusername = $_.SelectSingleNode('attributes/attribute[@name="appUserName"]//value').'#cdata-section')
    {
        $appusername
    }
    else
    {
        $_.id
    }
    [PSCustomObject]@{
        Name = $name
        Role = $_.SelectSingleNode('attributes/attribute[@name="Role"]//attributeValueRef/@id').value.Split('=') | select -last 1
    }
}

Name   Role         
----   ----         
FRED   Operator     
BARNEY Administrator
WILMA  Administrator
BETTY  Operator 
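A different approach from the string split, sketched here as an alternative rather than a correction: each attributeValueRef id points at an attributeValue definition near the top of the file, so the role text can be looked up there instead of being parsed out of the id. The `$sample` document below is a trimmed-down copy of the question's XML, and `$roleById` and `$result` are illustrative names. It avoids the ?? operator, so it should behave the same on 5.1 and 7.

```powershell
# Trimmed-down copy of the question's XML, kept small so the sketch is self-contained
$sample = [xml]@'
<rbacx>
    <attributeValues>
        <attributeValue id="Role=Administrator"><value><![CDATA[Administrator]]></value></attributeValue>
        <attributeValue id="Role=Operator"><value><![CDATA[Operator]]></value></attributeValue>
    </attributeValues>
    <accounts>
        <account id="FRED">
            <attributes>
                <attribute name="Role">
                    <attributeValues><attributeValueRef id="Role=Operator"/></attributeValues>
                </attribute>
            </attributes>
        </account>
        <account id="BARNEY">
            <attributes>
                <attribute name="Role">
                    <attributeValues><attributeValueRef id="Role=Administrator"/></attributeValues>
                </attribute>
            </attributes>
        </account>
    </accounts>
</rbacx>
'@

# Build a lookup from each definition id (e.g. "Role=Operator") to its <value> text
$roleById = @{}
foreach ($def in $sample.SelectNodes('/rbacx/attributeValues/attributeValue[@id]')) {
    $roleById[$def.GetAttribute('id')] = $def.SelectSingleNode('value').InnerText
}

# Resolve each account's role reference through the lookup
$result = foreach ($account in $sample.SelectNodes('//accounts/account')) {
    $ref = $account.SelectSingleNode('attributes/attribute[@name="Role"]//attributeValueRef/@id')
    [pscustomobject]@{
        Name = $account.GetAttribute('id')
        Role = if ($null -ne $ref) { $roleById[$ref.Value] } else { $null }
    }
}
$result
```

The Name column here is just the account id to keep the sketch short; the appUserName fallback from the answers above would slot in the same way.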

Upvotes: 1
