Amaynut

Reputation: 4271

XPATH check if an attribute contains one of multiple values

I need to check whether an attribute of an XML node contains one of several values. Here's my XML:

<manifest>
    <item id="item_557c683790288" href="navigation.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="toc" href="navigation.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="item_557c68379035d" href="title-page.html" media-type="application/xhtml+xml" properties="scripted"/>
    <item id="item_557c683790414" href="imprint.html" media-type="application/xhtml+xml" properties="scripted svg"/>
    <item id="item_557c6837904b6" href="author.html" media-type="application/xhtml+xml" properties="scripted"/>
    <item id="item_557c683790572" href="file_557c6766c75a9.html" media-type="application/xhtml+xml" properties="scripted"/>
    <item id="item_557c683790625" href="liberio.css" media-type="text/css"/>
    <item id="item_557c6837906ef" href="assets/2dcc626f-387f-4658-d6f6-58570ae176e7.jpg" media-type="image/jpeg"/>
    <item id="item_557c6837907c4" href="assets/liberio_color.svg" media-type="image/svg+xml"/>
    <item id="item_557c683790879" href="assets/93d7f25284aeda831bde692e6b002b9f.png" media-type="image/png"/>
    <item id="item_557c683790949" href="assets/properties.js" media-type="application/javascript"/>
</manifest>

Right now I'm using the following expression:

$images = $this->opfSxml->xpath("//*[local-name()='manifest']/*[local-name()='item'][contains(@media-type,'png') or contains(@media-type, 'jpg') or contains(@media-type, 'ico') ]");

My code works, but repeating OR and CONTAINS for each value I'm checking against doesn't feel right to me.

Is there a shorter way to write this?

Upvotes: 3

Views: 2370

Answers (4)

Dimitre Novatchev

Reputation: 243489

Use:

//manifest/*[contains('png|jpeg|ico',substring-after(@media-type,'/'))]

This assumes that the media types are prefix-free strings -- that is, no such string is a prefix of another string.

In case the prefix-free assumption doesn't hold, use:

//manifest/*[contains('|png|jpeg|ico|',concat('|',substring-after(@media-type,'/'),'|'))]
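To see why the delimiters matter, here is a minimal Python sketch that emulates the two XPath 1.0 string functions used above. The helper names (substring_after, contains, matches_plain, matches_delimited) are hypothetical stand-ins, not part of any XPath library, and the emulation assumes the usual XPath 1.0 behaviour that contains() with an empty second argument is true and that a missing attribute converts to the empty string.

```python
def substring_after(s, sep):
    """Emulate XPath substring-after(): text after the first sep, '' if sep is absent."""
    _, found, rest = s.partition(sep)
    return rest if found else ""


def contains(haystack, needle):
    """Emulate XPath contains(); an empty needle is always contained."""
    return needle in haystack


def matches_plain(media_type):
    # First expression: contains('png|jpeg|ico', substring-after(@media-type, '/'))
    return contains("png|jpeg|ico", substring_after(media_type, "/"))


def matches_delimited(media_type):
    # Second expression: the subtype is wrapped in '|' so only whole entries match
    return contains("|png|jpeg|ico|", "|" + substring_after(media_type, "/") + "|")


# The last two samples are made-up edge cases, not values from the manifest above.
for mt in ["image/png", "image/jpeg", "image/svg+xml", "text/css", "image/p", ""]:
    print(repr(mt), matches_plain(mt), matches_delimited(mt))
```

Under these assumptions, the plain form also accepts a partial subtype such as "image/p" and an empty or missing media-type, whereas the delimited form accepts only the exact subtypes in the list.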

Upvotes: 2

Kirill Polishchuk

Reputation: 56182

In pure XPath -- no. The shortest XPath I can think of is:

//manifest/item[contains(@media-type,'png') or contains(@media-type, 'jpg') or contains(@media-type, 'ico') ]
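If the repetition is the main concern, one option is to keep the XPath as it is but generate the predicate from a list in the host language. The following is a rough Python sketch of that idea; the helper name contains_any is hypothetical, and the same approach should carry over to PHP with implode() and array_map().

```python
def contains_any(attribute, values):
    """Build "contains(@attr,'a') or contains(@attr,'b') ..." from a list of values."""
    values = list(values)
    for value in values:
        # Values are embedded in single-quoted XPath literals, so reject quotes
        if "'" in value:
            raise ValueError("value must not contain a single quote: %r" % value)
    return " or ".join("contains(@%s,'%s')" % (attribute, v) for v in values)


predicate = contains_any("media-type", ["png", "jpg", "ico"])
xpath = "//manifest/item[%s]" % predicate
print(xpath)
```

Note that the JPEG item in the sample manifest has media-type image/jpeg, so 'jpeg' rather than 'jpg' is the substring that would match it.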

Upvotes: 1

splash58

Reputation: 26153

I don't think this is any shorter, but here is an alternative that registers a PHP function for use inside the XPath:

// $str holds the XML document from the question
$dom = new DOMDocument;
$dom->loadXML($str);
$xp = new DOMXPath($dom);

// Returns true if the media type contains any of the listed substrings
function is_image($str) {
    // str_replace() stores the number of replacements it made in $c
    str_replace(['png','jpeg','ico'], '', $str, $c);
    return $c !== 0;
}

$xp->registerPHPFunctions();
$xp->registerNamespace("php", "http://php.net/xpath");

// And now our short Xpath :) 
$images = $xp->query("//item[php:function('is_image', string(@media-type))]");

foreach($images  as $img) 
   print_r($img); 

Upvotes: 0

jdweng

Reputation: 34429

Here is the method I would use. No ORs.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            string[] media_types = { "png", "jpg", "ico" };
            XElement manifest = XElement.Load(FILENAME);
            var items = manifest.Descendants("item").Select(x => new
            {
                id = (string)x.Attribute("id"),
                href = (string)x.Attribute("href"),
                media = (string)x.Attribute("media-type")
            }).ToList();
            var media = items.Where(x => media_types.Contains(x.href.Substring(x.href.LastIndexOf(".") + 1))).ToList();
        }
    }
}

Upvotes: -2
