Lonny Selinger

Reputation: 25

Extract tag contents based on value of another tag qualifier using xmllint

I'm trying to use xmllint to extract data from a tag if a condition exists on a previous tag. I know there are probably better tools but I'm limited to xmllint and/or system standard commands like sed, awk, etc.

xml file:

<?xml version="1.0" encoding="UTF-8"?>
<MainGroup>
<MainGroupEntry name="aaa" function="xxx">
    <EntryType type="AAA"/>
    <EntryDescription>Capture This A</EntryDescription>
    <EntryRandomList>Just,a,random,list,of,things,to,discard</EntryRandomList>
</MainGroupEntry>
<MainGroupEntry name="aaa" function="xxx">
    <EntryType type="AAA"/>
    <EntryDescription>Capture This A</EntryDescription>
    <EntryRandomList>Just,a,random,list,of,things,to,discard</EntryRandomList>
</MainGroupEntry>
<MainGroupEntry name="bbb" function="yyy">
    <EntryType type="BBB"/>
    <EntryDescription>Capture This B</EntryDescription>
    <EntryRandomList>Just,a,random,list,of,things,to,discard</EntryRandomList>
</MainGroupEntry>
<MainGroupEntry name="bbb" function="yyy">
    <EntryType type="BBB"/>
    <EntryDescription>Capture This B</EntryDescription>
    <EntryRandomList>Just,a,random,list,of,things,to,discard</EntryRandomList>
</MainGroupEntry>
</MainGroup>

What I'm trying to do is this: for every EntryType with type="AAA", print the accompanying EntryDescription. I've tried different variations of: xmllint --xpath '//MainGroupEntry/EntryType[@type="AAA"]/EntryDescription/text()' my_file.xml but I always get an empty XPath set. If I stop trying to get the description text, I can see the entries that match my 'type' condition:

xmllint --xpath '//MainGroupEntry/EntryType[@type="AAA"]' my_file.xml
<EntryType type="AAA"/><EntryType type="AAA"/>

I just can't seem to figure out how to only grab the text from the Description field. Thoughts?

Upvotes: 1

Views: 1278

Answers (1)

choroba

Reputation: 241858

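Your expression returns an empty set because EntryDescription is a sibling of EntryType, not its child. You can use the following-sibling axis and the text() node test to extract only the text from the description: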

xmllint --xpath '/MainGroup/MainGroupEntry/EntryType[@type="AAA"]/following-sibling::EntryDescription/text()' file.xml
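
An equivalent way to express the same condition is to put the test in a predicate on the parent element and then step down to the description. The following is a minimal sketch, not part of the original answer; the temporary file and the variable names sample and result are assumptions made for illustration, and the XML is a trimmed copy of the data from the question.

```shell
#!/bin/sh
# Hypothetical sample file, trimmed from the question's XML.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<MainGroup>
<MainGroupEntry name="aaa" function="xxx">
    <EntryType type="AAA"/>
    <EntryDescription>Capture This A</EntryDescription>
</MainGroupEntry>
<MainGroupEntry name="bbb" function="yyy">
    <EntryType type="BBB"/>
    <EntryDescription>Capture This B</EntryDescription>
</MainGroupEntry>
</MainGroup>
EOF

# Keep only the entries that have an EntryType child with type="AAA",
# then select the text of their EntryDescription child.
result=$(xmllint --xpath '/MainGroup/MainGroupEntry[EntryType/@type="AAA"]/EntryDescription/text()' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The predicate form does not depend on EntryDescription appearing after EntryType inside the entry, which may matter if the element order is not guaranteed.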

To separate the texts, you can use the --shell option with cat:

echo 'cat /MainGroup/MainGroupEntry/EntryType[@type="AAA"]/following-sibling::EntryDescription/text()' \
| xmllint --shell file.xml

You might need to pipe the output through grep -v ' -----\|/ >' to remove the separators and the prompt.
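
If filtering the shell output feels fragile, another option is to ask xmllint for the number of matches with count() and then print each match with string(), one call per node. This is only a sketch under stated assumptions: the temporary file, the variable names (sample, xpath, count, descriptions) and the trimmed XML are illustrative and not from the original answer, and it calls xmllint once per match, so it suits small files.

```shell
#!/bin/sh
# Hypothetical sample file, trimmed from the question's XML.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<MainGroup>
<MainGroupEntry name="aaa" function="xxx">
    <EntryType type="AAA"/>
    <EntryDescription>Capture This A</EntryDescription>
</MainGroupEntry>
<MainGroupEntry name="aaa" function="xxx">
    <EntryType type="AAA"/>
    <EntryDescription>Capture This A</EntryDescription>
</MainGroupEntry>
<MainGroupEntry name="bbb" function="yyy">
    <EntryType type="BBB"/>
    <EntryDescription>Capture This B</EntryDescription>
</MainGroupEntry>
</MainGroup>
EOF

# Node set of all descriptions that follow an EntryType with type="AAA".
xpath='/MainGroup/MainGroupEntry/EntryType[@type="AAA"]/following-sibling::EntryDescription'

# count() yields a number, which xmllint prints on a line of its own.
count=$(xmllint --xpath "count($xpath)" "$sample")

# string() of the i-th node prints its text followed by a newline,
# so each description ends up on a separate line.
descriptions=$(
    i=1
    while [ "$i" -le "$count" ]; do
        xmllint --xpath "string(($xpath)[$i])" "$sample"
        i=$((i + 1))
    done
)
printf '%s\n' "$descriptions"

rm -f "$sample"
```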

Upvotes: 1
