arnpry

Reputation: 1141

Parse Value from XML Tag

I am trying to extract a value (21.0) from an XML element, <value type="float">21.0</value>.

XML Text (xml_parse.txt)

<OBSERVATIONS type="dict">
  <air_temp_value_1 type="dict">
    <date_time type="str">2019-07-25T10:35:00Z</date_time>
    <value type="float">21.0</value>
  </air_temp_value_1>
</OBSERVATIONS>

Attempted Code

cat xml_parse.txt | sed -nr 's/.* OBSERVATIONS="([0-9.]+).*/\1/p'

Upvotes: 0

Views: 119

Answers (3)

Jotne

Reputation: 41460

Using awk

awk -F"[<>]" '/float/ {print $3}' xml_parse.txt
21.0

Upvotes: 1

choroba

Reputation: 242298

sed, like grep, processes the input line by line. XML is not line-based, so use an XML-aware tool.

For example, using xmllint:

xmllint --xpath '/OBSERVATIONS/air_temp_value_1/value/text()' file.xml

Or, in xsh (a wrapper around XML::LibXML that I happen to maintain), you can write:

open file.xml ;
echo (/OBSERVATIONS/air_temp_value_1/value) ;

Upvotes: 2

collapsar

Reputation: 17258

While it is possible to use sed or some other line-oriented processor, a more appropriate tool is xmlstarlet, which respects the XML structure.

Your task is accomplished by

xmlstarlet sel -T -t -m '/OBSERVATIONS/air_temp_value_1/value' -v . -n xml_parse.txt

It extracts the value from an XML element specified by its XPath (a syntax for selecting data [elements, attributes, text, ...] from an XML tree).

This of course assumes that xmlstarlet has been installed first. Possibly it is already available on your system.
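
If neither xmlstarlet nor xmllint is at hand, the same structure-aware lookup can be sketched with Python's standard library. This is only an illustration, not part of the commands above: the helper name extract_value and the inlined XML_TEXT constant are assumptions made for the example, and ElementTree supports only a limited subset of XPath, so the path is written relative to the root element rather than as a full /OBSERVATIONS/... expression.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question (normally read from xml_parse.txt).
XML_TEXT = """<OBSERVATIONS type="dict">
  <air_temp_value_1 type="dict">
    <date_time type="str">2019-07-25T10:35:00Z</date_time>
    <value type="float">21.0</value>
  </air_temp_value_1>
</OBSERVATIONS>"""


def extract_value(xml_text, path="air_temp_value_1/value"):
    """Return the text of the first element matching `path`, or None.

    `path` is relative to the root element (OBSERVATIONS here) and uses
    ElementTree's limited XPath subset. This helper is hypothetical.
    """
    root = ET.fromstring(xml_text)
    node = root.find(path)
    return None if node is None else node.text


if __name__ == "__main__":
    print(extract_value(XML_TEXT))  # prints 21.0
```

Because the lookup follows the element tree rather than the text layout, it gives the same result if the whole document is collapsed onto a single line, which is where the line-oriented approaches tend to break.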

Alternatively, you can rely on an XSLT processor and an appropriate stylesheet.

PS: I have no affiliation with xmlstarlet other than having used it.

Upvotes: 1
