user983223

Reputation: 1154

Parse XML and put data into a bash array

I have the following XML. I would like to loop through each node and push the <url> value into a bash array if <extern> == 1. Any idea how I should approach this?

<GraphXML>
  <graph isDirected="true">
    <node name="0">
      <label>font</label>
      <url>http://fonts.googleapis.com/css?</url>
      <data>
        <checktime>0.262211</checktime>
        <extern>1</extern>
      </data>
    </node>
    <node name="1">
      <label>logo</label>
      <url>http://example.com/example.png</url>
      <data>
        <dlsize>7545</dlsize>
        <checktime>0.280600</checktime>
        <extern>0</extern>
      </data>
    </node>
  </graph>
</GraphXML>

Upvotes: 1

Views: 1513

Answers (2)

koola

Reputation: 1734

Using bash

#!/bin/bash
declare -a ARR
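# Assumes each <url> appears before the <extern> of the same node.
while read -r line; do
    if [[ "$line" =~ ^\<(url|extern)\>(.*)\</[^\>]*\>$ ]]; then
        if [ "${BASH_REMATCH[1]}" == "extern" ]; then
            # extern is 0: drop the url that was just added
            # (quoted so the shell does not glob-expand the subscript)
            (( ${BASH_REMATCH[2]} == 0 )) && unset 'ARR[${#ARR[@]}-1]'
        else
            # url line: add it for now
            ARR+=("${BASH_REMATCH[2]}")
        fi
    fi
done < <(grep -oE '<(url|extern)>.*</(url|extern)>' file.xml)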

echo "${ARR[@]}"

Explanation

  • grep -oE - -E enables extended regex so the pattern can match either url or extern; -o prints only the matched part of each line.
  • done < <( - Uses process substitution to feed the grep output into the while loop.
  • while read -r line - Reads one line per iteration; the loop exits at EOF.
  • ^\<(url|extern)\>(.*)\</[^\>]*\>$ - Matches the line and saves the tag name and its text into the BASH_REMATCH array.
  • unset ARR[${#ARR[@]}-1] - Removes the last element of the array if the value of the extern element is 0.
  • ARR+=(...) - Short form to add new element to array.
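
To see what the regex in the loop captures, here is a minimal sketch that runs the same pattern against two hard-coded lines. The sample lines and the names `re`, `line` and `pairs` are assumptions for illustration only; they are not part of the script above.

```shell
#!/bin/bash
# Same pattern as in the loop, kept in a variable so no shell escaping is needed.
re='^<(url|extern)>(.*)</[^>]*>$'

pairs=()
for line in '<url>http://example.com/example.png</url>' '<extern>0</extern>'; do
    if [[ $line =~ $re ]]; then
        # BASH_REMATCH[1] holds the tag name, BASH_REMATCH[2] the text between the tags
        pairs+=("${BASH_REMATCH[1]}=${BASH_REMATCH[2]}")
    fi
done

printf '%s\n' "${pairs[@]}"
```

The first capture group tells the loop which kind of line it is looking at, and the second carries the value to store or test.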

Upvotes: 1

perreal

Reputation: 98038

Using xmllint:

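# Print the url and extern nodes through the xmllint shell, then strip the
# tags, the dashed separator lines and the shell prompt with sed.
# Note: the second sed expression also deletes hyphens inside URLs.
out=$(echo "cat /GraphXML/graph/node/url|/GraphXML/graph/node/data/extern" | \
        xmllint --shell input | sed 's/<[^>]*>//g;s/[-][-]*//g;s/\/[^>]*>//')
set -f        # keep glob characters such as "?" in URLs from expanding
set -- $out   # split the output into positional parameters
set +f
i=0
# The parameters alternate: url, extern, url, extern, ...
while [ $# -gt 0 ] ; do
  url=$1
  shift
  extern=$1
  shift
  if [ "$extern" -eq 1 ]; then
    array[$i]=$url
    let i++
  fi
done

echo "${array[*]}"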
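
The pairing step can be tried on its own, without xmllint. This is a minimal sketch, assuming the tokens arrive as alternating url and extern values; the hard-coded `out` string is made-up sample data standing in for the cleaned xmllint output.

```shell
#!/bin/bash
# Assumed sample data: alternating url and extern values, whitespace-separated.
out='http://fonts.googleapis.com/css? 1 http://example.com/example.png 0'

set -f          # disable globbing so the "?" in the URL is not expanded
set -- $out     # intentionally unquoted: split into positional parameters
set +f

array=()
while [ $# -ge 2 ]; do
    url=$1
    extern=$2
    shift 2
    # keep the url only when its extern flag is 1
    if [ "$extern" -eq 1 ]; then
        array+=("$url")
    fi
done

printf '%s\n' "${array[@]}"
```

Only the first URL ends up in `array`, because it is the only one whose extern flag is 1.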

Upvotes: 2