Reputation: 1154
I have the following XML. I would like to loop through each node and push the <url> value into a bash array if <extern> == 1. Any idea how I should approach this?
<GraphXML>
  <graph isDirected="true">
    <node name="0">
      <label>font</label>
      <url>http://fonts.googleapis.com/css?</url>
      <data>
        <checktime>0.262211</checktime>
        <extern>1</extern>
      </data>
    </node>
    <node name="1">
      <label>logo</label>
      <url>http://example.com/example.png</url>
      <data>
        <dlsize>7545</dlsize>
        <checktime>0.280600</checktime>
        <extern>0</extern>
      </data>
    </node>
  </graph>
</GraphXML>
Upvotes: 1
Views: 1513
Reputation: 1734
Using bash
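#!/bin/bash
declare -a ARR
# grep -o prints only the <url>...</url> and <extern>...</extern> matches, one per line
while read -r line; do
    if [[ "$line" =~ ^\<(url|extern)\>(.*)\</[^\>]*\>$ ]]; then
        if [ "${BASH_REMATCH[1]}" == "extern" ]; then
            # extern comes after its url, so drop the url just added when extern is 0
            (( ${BASH_REMATCH[2]} == 0 )) && unset ARR[${#ARR[@]}-1]
        else
            # a url line: append it for now
            ARR+=("${BASH_REMATCH[2]}")
        fi
    fi
done < <(grep -oE '<(url|extern)>.*</(url|extern)>' file.xml)
echo "${ARR[@]}"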
Explanation
- grep -oE: uses an extended regex (-E) to match either the url or the extern element and prints only the matched text (-o).
- done < <(...): uses process substitution to feed the output of grep into the while loop.
- while read -r line: reads one line at a time until EOF, then the while loop exits.
- ^\<(url|extern)\>(.*)\</[^\>]*\>$: matches the line and saves the tag name and the tag content into the BASH_REMATCH array.
- unset ARR[${#ARR[@]}-1]: removes the last element of the array if the value of the extern element is 0.
- ARR+=(...): short form for appending a new element to the array.
Upvotes: 1
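To make the regex step above concrete, here is a minimal sketch that runs the same pattern against three sample lines and prints what ends up in BASH_REMATCH. The sample lines are taken from the XML in the question; the helper name show_match is made up for this illustration, and the pattern is stored in a variable only so the backslash escapes are not needed.

```shell
#!/bin/bash
# Same pattern as in the answer, kept in a variable so that < and >
# do not have to be escaped inside [[ ... =~ ... ]].
re='^<(url|extern)>(.*)</[^>]*>$'

# show_match is a hypothetical helper used only for this illustration.
show_match() {
    if [[ $1 =~ $re ]]; then
        # BASH_REMATCH[1] is the tag name, BASH_REMATCH[2] the text between the tags
        printf '%s=%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
    else
        printf 'no match\n'
    fi
}

show_match '<url>http://example.com/example.png</url>'
show_match '<extern>0</extern>'
show_match '<label>logo</label>'
```

Only url and extern lines match, which is why the loop in the answer can tell the two apart by looking at BASH_REMATCH[1].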
Reputation: 98038
Using xmllint:
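# Dump every <url> and <extern> node in document order, then strip the tags,
# the separator dashes, and the "/ >" prompt printed by xmllint --shell.
# Note: the second sed expression also deletes hyphens inside URLs.
out=$(echo "cat /GraphXML/graph/node/url|/GraphXML/graph/node/data/extern" | \
    xmllint --shell input | sed 's/<[^>]*>//g;s/[-][-]*//g;s/\/[^>]*>//')
# Word-split the result into positional parameters: url extern url extern ...
set -- $out
i=0
while [ $# -gt 0 ] ; do
    url=$1
    shift
    extern=$1
    shift
    if [ "$extern" -eq 1 ]; then
        array[$i]=$url
        let i++
    fi
done
echo "${array[*]}"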
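To look at the pairing loop in isolation, the sketch below swaps the xmllint pipeline for a hard-coded string that stands in for its cleaned output. That stand-in is an assumption made so the example runs without xmllint installed. Like the loop above, it assumes every node has both a url and an extern value and that no URL contains whitespace.

```shell
#!/bin/bash
# Stand-in for the cleaned xmllint output: url, extern, url, extern, ...
# (hard-coded for illustration; values taken from the XML in the question)
out='http://fonts.googleapis.com/css? 1
http://example.com/example.png 0'

set -f          # turn off globbing so the "?" in the URL is never expanded
set -- $out     # word-split into positional parameters
set +f

array=()
while [ $# -ge 2 ]; do
    url=$1
    extern=$2
    shift 2
    # keep the url only when its extern value is 1
    if [ "$extern" -eq 1 ]; then
        array+=("$url")
    fi
done

printf '%s\n' "${array[@]}"
```

Using array+=(...) removes the need for a separate index counter, and shift 2 consumes one url/extern pair per iteration.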
Upvotes: 2