Navneet Gautam

Reputation: 17

How to get the values of two XML tags using sed

I have an XML file and I want to fetch the values of some tags. The XML looks something like this:

<?xml version="1.0" standalone = "no"?>
<!DOCTYPE handover_list PUBLIC"EN""h">
<X1>
<X2>
<X3>USA</X3>
<date_time>20170813T18:18-04:00</date_time>
<id action="I">XXXXXXXXXXXXXX</id>
<id action="I">YYYYYYYYYYYYYY</id>
<id action="I">ZZZZZZZZZZZZZZ</id>
</X2>
<X2>
<X3>UAE</X3>
<date_time>20160814T15:15-03:04</date_time>
<id action="I">AAAAAAAAAAAAAA</id>
<id action="I">BBBBBBBBBBBBBB</id>
<id action="I">CCCCCCCCCCCCCC</id>
</X2>
</X1>

What I'm using is:

sed -n 's:.*<X3>\(.*\)</X3>.*:\1:p' formated.xml
sed -n 's:.*<id action="I">\(.*\)</id>.*:\1:p' formated.xml

and it gives output like this:

USA
UAE
XXXXXXXXXXXXXX
YYYYYYYYYYYYYY
ZZZZZZZZZZZZZZ
AAAAAAAAAAAAAA
BBBBBBBBBBBBBB
CCCCCCCCCCCCCC

What I want is to merge the two sed commands above so that I get output like this:

USA
XXXXXXXXXXXXXX
YYYYYYYYYYYYYY
ZZZZZZZZZZZZZZ
UAE
AAAAAAAAAAAAAA
BBBBBBBBBBBBBB
CCCCCCCCCCCCCC

Upvotes: 0

Views: 253

Answers (3)

mop

Reputation: 433

GNU sed:

sed '/<X3>/{s/<[^>]*>//g;h};/ action=/{s/<[^>]*>//g;H};/<\/X2>/{g;b};d' formated.xml
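For readers unfamiliar with the hold space, the one-liner above can be read roughly as follows. This is only a sketch: `sample.xml` is a hypothetical file name holding a trimmed copy of the question's data (DOCTYPE left out), and the script is the same logic spread over several lines with comments.

```shell
# Hypothetical sample file: a shortened copy of the XML from the question.
cat > sample.xml <<'EOF'
<X1>
<X2>
<X3>USA</X3>
<date_time>20170813T18:18-04:00</date_time>
<id action="I">XXXXXXXXXXXXXX</id>
<id action="I">YYYYYYYYYYYYYY</id>
</X2>
<X2>
<X3>UAE</X3>
<date_time>20160814T15:15-03:04</date_time>
<id action="I">AAAAAAAAAAAAAA</id>
</X2>
</X1>
EOF

result=$(sed '
# X3 line: strip the tags, then overwrite the hold space with the value
/<X3>/{
  s/<[^>]*>//g
  h
}
# id line: strip the tags, then append the value to the hold space
/ action=/{
  s/<[^>]*>//g
  H
}
# closing X2: copy the collected values back and skip to the end, which prints them
/<\/X2>/{
  g
  b
}
# everything else (including lines already saved above) is deleted
d
' sample.xml)

printf '%s\n' "$result"
```

Because each `<X3>` line overwrites the hold space with `h`, every `<X2>` block starts with a fresh buffer, so the values of one block never leak into the next.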

Upvotes: 0

RomanPerekhrest

Reputation: 92854

The right way is to use an XML parser such as xmlstarlet (in that case, the <!DOCTYPE ...> declaration is redundant):

xmlstarlet sel -t -v '//X2/*[not(self::date_time)]' -n formated.xml

The output:

USA
XXXXXXXXXXXXXX
YYYYYYYYYYYYYY
ZZZZZZZZZZZZZZ
UAE
AAAAAAAAAAAAAA
BBBBBBBBBBBBBB
CCCCCCCCCCCCCC

Upvotes: 1

Cyrus

Reputation: 88583

Concatenate both sed commands with one ;:

sed -n 's:.*<X3>\(.*\)</X3>.*:\1:p' formated.xml
sed -n 's:.*<id action="I">\(.*\)</id>.*:\1:p' formated.xml

Combined into one sed command:

sed -n 's:.*<X3>\(.*\)</X3>.*:\1:p; s:.*<id action="I">\(.*\)</id>.*:\1:p' formated.xml

Output:

USA
XXXXXXXXXXXXXX
YYYYYYYYYYYYYY
ZZZZZZZZZZZZZZ
UAE
AAAAAAAAAAAAAA
BBBBBBBBBBBBBB
CCCCCCCCCCCCCC
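The order comes out right because sed runs every command of the script against each input line before moving to the next line, so values are printed in file order rather than grouped per tag. As a sketch of the same idea, the two expressions can also be passed with separate `-e` options; `sample.xml` below is a hypothetical file name holding a trimmed copy of the question's data.

```shell
# Hypothetical sample file: a shortened copy of the XML from the question.
cat > sample.xml <<'EOF'
<X1>
<X2>
<X3>USA</X3>
<date_time>20170813T18:18-04:00</date_time>
<id action="I">XXXXXXXXXXXXXX</id>
<id action="I">YYYYYYYYYYYYYY</id>
</X2>
<X2>
<X3>UAE</X3>
<date_time>20160814T15:15-03:04</date_time>
<id action="I">AAAAAAAAAAAAAA</id>
</X2>
</X1>
EOF

# Both substitutions are tried on every line; -n suppresses the default
# output and the p flag prints only the lines where a substitution succeeded.
result=$(sed -n \
  -e 's:.*<X3>\(.*\)</X3>.*:\1:p' \
  -e 's:.*<id action="I">\(.*\)</id>.*:\1:p' \
  sample.xml)

printf '%s\n' "$result"
```

Note that this relies on each tag sitting on its own line, as in the sample; for XML that is formatted differently, a parser-based approach is the safer choice.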

Upvotes: 1
