robert

Reputation: 2027

How to (e)grep XML for certain tag content?

How can I (e)grep all content inside a certain tag block?

Given the input file shown further below, I want to output everything between the B tags, including the tags themselves:

<B><C>Test</C></B>
<B>Test2</B>

I tried the following grep command to search all XML files for the content between the <B> and </B> tags:

grep '<B>.*</B>' *.xml

but it did not work for the following input:

<A>
 <B>
  <C>Test</C>
 </B>
 <D>
 </D>
 <B>
    Test2
 </B>
</A>

Any ideas?

Upvotes: 1

Views: 4147

Answers (2)

marbu

Reputation: 2021

When working with XML files, it is best to use XML-aware tools rather than line-based text tools.

XMLStarlet:

xmlstarlet sel -t -c '//B' file.xml

xmllint from libxml2:

xmllint --xpath '//B' file.xml

Upvotes: 0

Jeremy Stein

Reputation: 19651

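Use an awk range pattern, which prints every line from one matching <B> through the next line matching </B>: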

awk '/<B>/,/<\/B>/'
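
The grep attempt in the question fails because grep matches one line at a time, while the opening and closing tags sit on different lines. As a rough sketch, the range pattern can also be given an action that collapses each block onto one line, which is the output the question asks for. The temporary file from mktemp, the variable names `sample` and `joined`, and the whitespace stripping are assumptions made for illustration.

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<A>
 <B>
  <C>Test</C>
 </B>
 <D>
 </D>
 <B>
    Test2
 </B>
</A>
EOF

# Range pattern plus an action: strip leading/trailing blanks, print each
# line without a newline, and end the output line once </B> has been seen.
joined=$(awk '/<B>/,/<\/B>/ {
  gsub(/^[ \t]+|[ \t]+$/, "")
  printf "%s", $0
  if (/<\/B>/) print ""
}' "$sample")

printf '%s\n' "$joined"
rm -f "$sample"
```

This is still line-based: it assumes <B> carries no attributes and that B blocks are not nested. For anything beyond that, an XML parser is the safer choice.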

Upvotes: 3
