Claus Jørgensen

Reputation: 26351

How to parse out the value of several attributes on different elements with xmllint/xpath?

For a given XML file called configurations.xml, I would like to extract the value of the name attribute of each conf element and store it in a variable for later use.

<configurations>
  <conf name="bob"/>
  <conf name="alice"/>
  <conf name="ted"/>
  <conf name="carol"/>
</configurations>

The expected output is:

bob
alice
ted
carol

I have xpath and xmllint available. An XPath of //conf/@name selects the right nodes, but the output looks like name="bob", and that name= wrapper is what I'm trying to avoid.

Upvotes: 3

Views: 3294

Answers (5)

wisbucky

Reputation: 37797

I searched everywhere for this seemingly simple answer. It appears that it is not possible for xmllint to print attribute values from multiple nodes. You can use string(//conf/@name), but that will only print a single value even if there are multiple nodes that match.

If you are stuck with xmllint, the only way is to use additional text processing. Here's a generic way that will parse out the attribute value. It assumes the values do not contain = or " characters.

xmllint --xpath '//conf/@name' configurations.xml |
tr ' ' '\n' | awk -F= '{print $2}' | sed 's/"//g'

The first pipe converts spaces to newlines.

The second pipe prints what comes after the =.

The last pipe removes every " character.
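The three stages can be tried without xmllint by feeding them a hard-coded sample line. This is only a sketch: the sample string is an assumption about what xmllint prints for //conf/@name (attributes separated by spaces), not output captured from a real run, and the function name extract_values is made up for illustration. It also adds an NF > 1 guard to awk so that blank lines are skipped.

```shell
# Hypothetical helper: reads name="value" pairs on stdin and prints one value per line.
# Stage 1 (tr) puts each attribute on its own line.
# Stage 2 (awk) keeps the part after the =, ignoring lines that have no =.
# Stage 3 (sed) strips the double quotes.
extract_values() {
    tr ' ' '\n' |
    awk -F= 'NF > 1 {print $2}' |
    sed 's/"//g'
}

# Assumed sample input, standing in for the output of xmllint.
printf ' name="bob" name="alice" name="ted" name="carol"\n' | extract_values
```

As the answer says, this breaks as soon as a value contains a space, an = or a " character.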

Upvotes: 0

xmlstarlet sel -t -m '//configurations/conf' -v '@name' -n a.xml

worked since xmllint does not seem capable. Good intro here.

Tested on: xmlstarlet version 1.5.0, Ubuntu 14.04.

It fails on large files, however: with ulimit -Sv 500000 (a limit of roughly 500 MB) it dies on a 1.2 GB XML file, and without the memory limit it jams my computer.

Upvotes: 3

Alin Pandichi

Reputation: 955

If you really want to use xpath and display only the attribute values without the "name=" part, then here's what worked for me:

xpath configurations.xml 'string(//conf/@name)' 2>/dev/null

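In plain English: wrap your XPath query in string(), and suppress the verbose output of xpath by adding 2>/dev/null at the end. Be aware that string() applied to a node-set returns the value of the first matching node only, so for the sample file this prints just bob.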
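To cover every conf element rather than just the first one, the string() call can be applied to one indexed node at a time. The following is a rough sketch using xmllint rather than xpath; it assumes xmllint supports --xpath, and the function name list_names is hypothetical. It runs xmllint once per element, so it is only reasonable for small files.

```shell
# Hypothetical helper: prints the name attribute of every conf element in file "$1".
list_names() {
    # count() yields a number, which xmllint prints as plain text.
    count=$(xmllint --xpath 'count(//conf)' "$1") || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        # (//conf)[i] selects the i-th conf element in document order;
        # string() then returns the value of its name attribute.
        xmllint --xpath "string((//conf)[$i]/@name)" "$1"
        i=$((i + 1))
    done
}

# Usage, guarded so nothing runs when xmllint or the file is missing.
if command -v xmllint >/dev/null 2>&1 && [ -f configurations.xml ]; then
    list_names configurations.xml
fi
```

A conf element without a name attribute shows up as an empty line, because string() of an empty node-set is the empty string.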

Upvotes: -1

gniourf_gniourf

Reputation: 46813

I don't know how to achieve what you're trying to achieve with xmllint only.

Since you have xpath installed, you have Perl's XML::XPath too. So a little bit of Perl:

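#!/usr/bin/perl

use strict;
use warnings;
use XML::XPath;

# Parse the file and select the name attribute of every conf element.
my $xp = XML::XPath->new(filename => 'configurations.xml');

my $nodeset = $xp->find('//conf/@name');
foreach my $node ($nodeset->get_nodelist) {
    # Print each value followed by a NUL byte, so values may contain any other character.
    print $node->getNodeValue, "\0";
}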

will output what you want, with each value terminated by a NUL character.

In a one-liner style:

perl -mXML::XPath -e 'foreach $n (XML::XPath->new(filename => "configurations.xml")->find("//conf/\@name")->get_nodelist) { print $n->getNodeValue,"\0"; }'

To retrieve them in, e.g., a Bash array:

#!/bin/bash

names=()
while IFS= read -r -d '' n; do
    names+=( "$n" )
done < <(
    perl -mXML::XPath -e 'foreach $n (XML::XPath->new(filename => "configurations.xml")->find("//conf/\@name")->get_nodelist) { print $n->getNodeValue,"\0" }'
)
# See what's in your array:
declare -p names

Note that at this point you have the option of turning to Perl and dropping Bash completely to solve your problem.

Upvotes: 1

Sriharsha Kalluru

Reputation: 1823

You can use the awk command to get this done.

[root@myserver tmp]# cat /tmp/test.xml
<configurations>
  <conf name="bob"/>
  <conf name="alice"/>
  <conf name="ted"/>
  <conf name="carol"/>
</configurations>
[root@myserver tmp]# awk -F \" '{print $2}' /tmp/test.xml |grep -v '^$'
bob
alice
ted
carol
[root@myserver tmp]#
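The command above prints the second "-delimited field of every line, so a file that starts with an XML declaration such as <?xml version="1.0"?> would also print 1.0. A slightly more defensive sketch is shown below. It still treats the XML as plain text and assumes each conf element sits on its own line; the function name conf_names is hypothetical.

```shell
# Hypothetical helper: reads XML text on stdin and prints the name value
# of each line that contains a <conf ...> element.
conf_names() {
    # match() sets RSTART and RLENGTH to the position and length of name="...";
    # skipping the 6 characters of name=" and the closing quote leaves the value.
    awk '/<conf[ \t]/ && match($0, /name="[^"]*"/) {
        print substr($0, RSTART + 6, RLENGTH - 7)
    }'
}

# Sample input with an XML declaration that must not be picked up.
printf '%s\n' '<?xml version="1.0"?>' '<configurations>' \
    '  <conf name="bob"/>' '  <conf name="alice"/>' '</configurations>' | conf_names
```

This is still text matching rather than real XML parsing, so for anything beyond simple, regularly formatted files a proper XML tool is the safer choice.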

Upvotes: -1
