JohnLBevan

Reputation: 24470

XPath to match Elements and Attributes

What is the correct XPath syntax to match both attributes and elements?

More Info

I wrote the function below to find the elements and attributes whose value equals a given string, and to return the path to each match:

function Get-XPathToValue {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory)]
        [xml]$Xml
        ,
        [Parameter(Mandatory)]
        [string]$Value
    )
    process {
        $Xml.SelectNodes("//*[.='{0}']" -f ($Value -replace "'","''")) | %{
            $xpath = ''
            $elem = $_
            while (($elem -ne $null) -and ($elem.NodeType -ne 'Document')) {
                $xpath = '/' + $elem.Name + $xpath 
                $elem = $elem.SelectSingleNode('..')
            }
            $xpath
        }
    }
}

This matches elements, but not attributes.

By replacing $Xml.SelectNodes("//*[.='{0}']" with $Xml.SelectNodes("//@*[.='{0}']" I can match attributes, but not elements.

Example

[xml]$sampleXml = @"
<root>
    <child1>
        <child2 attribute1='hello'>
            <ignoreMe>what</ignoreMe>
            <child3>hello</child3>
            <ignoreMe2>world</ignoreMe2>
        </child2>
        <child2Part2 attribute2="ignored">hello</child2Part2>
    </child1>
    <notMe>
        <norMe>Not here</norMe>
    </notMe>
</root>
"@

Get-XPathToValue -Xml $sampleXml -Value 'hello'

Returns:

/root/child1/child2/child3
/root/child1/child2Part2

Should Return:

/root/child1/child2/attribute1
/root/child1/child2/child3
/root/child1/child2Part2

What have you tried?

I tried matching on:

Upvotes: 0

Views: 1600

Answers (2)

Tomalak

Reputation: 338316

Your method of deriving an XPath expression has three flaws, as indicated in the comments to your question.

  1. It does not handle the case where there are multiple elements with the same name at the same level.
  2. It does not handle quotes in values properly.
  3. It does not handle XML namespaces.
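To illustrate point 2: XPath 1.0 (the version System.Xml implements) has no escape sequence for a quote inside a string literal, so doubling apostrophes SQL-style does not work. One common workaround is to assemble the literal with concat(). The sketch below assumes a hypothetical helper named ConvertTo-XPathLiteral and a made-up two-element document; neither comes from the question.

```powershell
# Hypothetical helper, not part of the original post.
function ConvertTo-XPathLiteral {
    param([string]$Value)
    # Split on apostrophes, then rejoin the pieces inside concat(),
    # passing each apostrophe as the double-quoted literal "'".
    "concat('', '" + ($Value -split "'" -join "', ""'"", '") + "')"
}

# Made-up test document containing a value with an apostrophe.
[xml]$doc = "<root><a>it's</a><a>plain</a></root>"

$literal = ConvertTo-XPathLiteral -Value "it's"
$literal                                      # concat('', 'it', "'", 's')
$doc.SelectNodes("//*[. = $literal]").Count   # 1
```

The leading empty string keeps concat() at two or more arguments even when the value contains no apostrophe at all.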

Here is my take on a function that addresses these points (I also gave it a name that I think is more appropriate within the cmdlet naming scheme):

function Convert-ValueToXpath {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory)]
        [xml]$Xml
        ,
        [Parameter(Mandatory)]
        [string]$Value
    )
    process {
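        # XPath 1.0 has no quote escaping: split on apostrophes and rebuild the literal with concat()
        $escapedValue = "concat('', '" + ($Value -split "'" -join "', ""'"", '") + "')"
        # Select elements and attributes in one pass; normalize-space() trims and collapses whitespace before comparing
        $Xml.SelectNodes("(//*|//@*)[normalize-space() = {0}]" -f $escapedValue) | % {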
            $xpath = ''
            $elem = $_
            while ($true) {
                if ($elem.NodeType -eq "Attribute") {
                    $xpath = '/@' + $elem.Name
                    $elem = $elem.OwnerElement
                } elseif ($elem.ParentNode) {
                    $precedingExpr = "./preceding-sibling::*[local-name() = '$($elem.LocalName)' and namespace-uri() = '$($elem.NamespaceURI)']"
                    $pos = $elem.SelectNodes($precedingExpr).Count + 1
                    $xpath = '/' + $elem.Name + "[" + $pos + "]" + $xpath
                    $elem = $elem.ParentNode
                } else {
                    break;
                }
            }
            $xpath
        }
    }
}
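The positional predicate is what separates same-named siblings. As a minimal sketch of just that step, assuming a made-up document with repeated item elements (not the sample from the question), the index is the number of preceding siblings with the same name, plus one:

```powershell
# Made-up document: three <item> siblings, two of which hold the search value.
[xml]$doc = '<root><item>a</item><item>hello</item><other/><item>hello</item></root>'

$paths = foreach ($node in $doc.SelectNodes("//item[. = 'hello']")) {
    # Count earlier siblings named "item"; XPath positions are 1-based.
    $pos = $node.SelectNodes("./preceding-sibling::*[local-name() = 'item']").Count + 1
    "/root[1]/item[$pos]"
}

$paths
# /root[1]/item[2]
# /root[1]/item[3]
```

Without the index, both matches would collapse to the same path, /root/item, which also selects the first item that does not hold the value.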

For your sample input I get these XPaths:

/root[1]/child1[1]/child2[1]/@attribute1
/root[1]/child1[1]/child2[1]/child3[1]
/root[1]/child1[1]/child2Part2[1]
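One way to sanity-check paths like these is to feed each one back into SelectSingleNode and confirm it lands on a node holding the value. The sketch below hard-codes the three paths and uses a trimmed copy of the sample document, so it runs on its own; the variable names are only illustrative.

```powershell
# Trimmed copy of the sample document from the question.
[xml]$sampleXml = @"
<root>
    <child1>
        <child2 attribute1='hello'>
            <child3>hello</child3>
        </child2>
        <child2Part2 attribute2="ignored">hello</child2Part2>
    </child1>
</root>
"@

$paths = '/root[1]/child1[1]/child2[1]/@attribute1',
         '/root[1]/child1[1]/child2[1]/child3[1]',
         '/root[1]/child1[1]/child2Part2[1]'

$results = foreach ($path in $paths) {
    # InnerText works for both attribute and element nodes.
    $node = $sampleXml.SelectSingleNode($path)
    '{0} -> {1}' -f $path, $node.InnerText
}

$results
# /root[1]/child1[1]/child2[1]/@attribute1 -> hello
# /root[1]/child1[1]/child2[1]/child3[1] -> hello
# /root[1]/child1[1]/child2Part2[1] -> hello
```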

Upvotes: 1

JohnLBevan

Reputation: 24470

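Combining the attribute query and the element query with the XPath union operator (|) resolved the issue: //@*[.={0}]|//*[./text()={0}], where {0} is the search value as an XPath string literal.

i.e.

function Get-XPathToValue {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory)]
        [xml]$Xml
        ,
        [Parameter(Mandatory)]
        [string]$Value
    )
    process {
        # XPath 1.0 cannot escape quotes inside a string literal, so build the literal with concat()
        $literal = "concat('', '" + ($Value -split "'" -join "', ""'"", '") + "')"
        $Xml.SelectNodes("//@*[.={0}]|//*[./text()={0}]" -f $literal) | %{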
            $xpath = ''
            $elem = $_
            while (($elem -ne $null) -and ($elem.NodeType -ne 'Document')) {
                $prefix = ''
                if($elem.NodeType -eq 'Attribute'){$prefix = '@'}
                $xpath = '/' + $prefix + $elem.Name + $xpath 
                $elem = $elem.SelectSingleNode('..')
            }
            $xpath
        }
    }
}
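The effect of the union operator can be seen in isolation with a small test. This is a sketch against a made-up document, not the sample from the question. It also shows how the two element predicates differ: .='hello' compares an element's whole string-value, while ./text()='hello' only looks at its direct text nodes.

```powershell
# Made-up document: one matching attribute, two elements with matching text,
# and a wrapper <c> whose only content is a matching child.
[xml]$doc = "<root><a x='hello'>hello</a><b>other</b><c><d>hello</d></c></root>"

$attrs = $doc.SelectNodes("//@*[.='hello']")                        # attributes only
$elems = $doc.SelectNodes("//*[./text()='hello']")                  # elements only
$both  = $doc.SelectNodes("//@*[.='hello']|//*[./text()='hello']")  # union of the two

$summary = '{0} attribute(s), {1} element(s), {2} in the union' -f $attrs.Count, $elems.Count, $both.Count
$summary
# 1 attribute(s), 2 element(s), 3 in the union

# With .= instead of ./text()=, <c> matches too, because its string-value is 'hello'.
$doc.SelectNodes("//*[.='hello']").Count   # 3
```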

Upvotes: 1
