Reputation: 1
I have to convert XML to CSV using xmllint --shell or xmllint --xpath, because I'm not allowed to install additional packages.
I need only the first name and the phone number in the CSV. I tried the loops below to walk through the XML and write a CSV file, but they break when the first name contains a space (for example, Mary Jane) or when the phone is missing.
for f in $(echo 'cat //FIRSTNAME/text()' | xmllint --shell TEST.xml | sed '1d;$d' | sed 's/-------//')
do
echo $f >> $CSV_FILE_NAMES
done
for i in $(echo 'cat //HOMEPHONE/text()' | xmllint --shell TEST.xml | sed '1d;$d' | sed 's/-------//')
do
echo $i >> $CSV_FILE_PHONES
done
paste -d "," $CSV_FILE_NAMES $CSV_FILE_PHONES >> $CSV
Or this combined solution, which places every entity in a new line:
for f in $(echo 'cat //FIRSTNAME/text()|//HOMEPHONE/text()' | xmllint --shell TEST.xml | sed '1d;$d' | sed 's/-------//')
do
echo $f >> $CSV_FILE
done
Mark
9999999999
Jack
8888888888
Is there a different way to loop through an xml file?
Upvotes: 0
Views: 3910
Reputation: 247162
Given an XML file file.xml
<PEOPLE>
<PERSON>
<FIRSTNAME>Alice</FIRSTNAME>
<HOMEPHONE>555-1212</HOMEPHONE>
</PERSON>
<PERSON>
<FIRSTNAME>Bob</FIRSTNAME>
<HOMEPHONE>123-4567</HOMEPHONE>
</PERSON>
</PEOPLE>
Then
echo 'cat (//FIRSTNAME | //HOMEPHONE)/text()' | xmllint --shell file.xml
outputs
/ > -------
Alice
-------
555-1212
-------
Bob
-------
123-4567
/ >
which is readily parsable with awk, among other tools:
echo 'cat (//FIRSTNAME | //HOMEPHONE)/text()' | xmllint --shell file.xml | awk '
NR % 4 == 2 {printf "%s,", $0}
NR % 4 == 0 {print $0}
'
Alice,555-1212
Bob,123-4567
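The modulo test works because the shell output repeats in groups of four lines: a separator, a name, a separator, a phone number. As a rough sketch, the same awk program can be wrapped in a function and fed the captured output from above through a here-document, so it can be tried without running xmllint. The function name `pairs_to_csv` is made up for illustration, and the approach assumes every person has both fields; a missing HOMEPHONE would shift the grouping.

```shell
# Hypothetical helper: turn alternating name/phone lines from
# `xmllint --shell` into CSV rows. Assumes both fields are always present.
pairs_to_csv() {
  awk '
    NR % 4 == 2 { printf "%s,", $0 }   # 2nd line of each group: the name
    NR % 4 == 0 { print $0 }           # 4th line of each group: the phone
  '
}

# Sample input copied from the xmllint output shown above.
pairs_to_csv <<'EOF'
/ > -------
Alice
-------
555-1212
-------
Bob
-------
123-4567
/ > 
EOF
```

Because the program prints `$0` rather than individual fields, a name with a space such as Mary Jane stays on one line.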
Too bad you can't install other tools: xmlstarlet makes it pretty easy to format your output the way you like:
xmlstarlet sel -t -m //PERSON -v ./FIRSTNAME -o , -v ./HOMEPHONE -n file.xml
Alice,555-1212
Bob,123-4567
Upvotes: 4
Reputation: 56
For the XML sample you provided, I think it would be simpler to loop over all the ZVM_DATA elements and use the XPath concat function to join FIRSTNAME, HOMEPHONE, or any other fields you'd like to include:
for index in $(seq $(xmllint --xpath "count(//ZVM_DATA)" test.xml))
do
xmllint --xpath "concat(//ZVM_DATA[$index]/FIRSTNAME/text(),',',//ZVM_DATA[$index]/HOMEPHONE/text())" --format test.xml
done
It is not the cleanest approach, but unfortunately xmllint supports only XPath 1.0; otherwise it could be done in one command.
Edit: the result should look like this:
Michael ,7800002814
E,7800907671
Ryan,7909355223
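The loop above can be wrapped into a reusable step, sketched here under a few assumptions. The function name `zvm_to_csv` is invented, the record element is assumed to be ZVM_DATA as in this answer, and `normalize-space()` (a standard XPath 1.0 function) is swapped in for `text()` to trim stray whitespace such as the trailing space after Michael. The `(//ZVM_DATA)[$i]` form is used so the index counts records in document order even if they do not share a parent.

```shell
# Hypothetical helper: print one "FIRSTNAME,HOMEPHONE" row per ZVM_DATA element.
# Usage: zvm_to_csv test.xml > out.csv
zvm_to_csv() {
  file=$1
  # count() returns a number, which xmllint prints as plain text.
  total=$(xmllint --xpath 'count(//ZVM_DATA)' "$file")
  i=1
  while [ "$i" -le "$total" ]; do
    # (//ZVM_DATA)[$i] selects the i-th record in document order;
    # normalize-space() returns "" when the child element is missing.
    xmllint --xpath "concat(normalize-space((//ZVM_DATA)[$i]/FIRSTNAME), ',', normalize-space((//ZVM_DATA)[$i]/HOMEPHONE))" "$file"
    i=$((i + 1))
  done
}
```

A record without a HOMEPHONE yields a row with an empty second column, so the rows stay aligned with the records.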
Upvotes: 4
Reputation: 5557
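You can use awk in the following way (this assumes each tag is on its own line; the variables are cleared after each record so a missing phone does not reuse the previous value):
awk 'BEGIN{FS="[<>]"} /FIRSTNAME/ { v1=$3 } /HOMEPHONE/ { v2=$3 } /\/ZVM_DATA/ {printf "%s, %s\n", v1, v2; v1=v2=""}' test.xml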
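To see how this line-oriented awk approach behaves with a name containing a space and a record without a phone, here is a small sketch. It assumes each tag sits on its own line and that records are wrapped in ZVM_DATA elements; the sample data and the function name `tags_to_csv` are invented for illustration. Both variables are cleared after each record so a missing HOMEPHONE does not inherit the previous value.

```shell
# Hypothetical helper: line-based extraction, one tag per line assumed.
# Reads the files given as arguments, or standard input if there are none.
tags_to_csv() {
  awk '
    BEGIN { FS = "[<>]" }                 # split each line on angle brackets
    /<FIRSTNAME>/ { name = $3 }           # $3 is the text between the tags
    /<HOMEPHONE>/ { phone = $3 }
    /<\/ZVM_DATA>/ {                      # end of record: emit row, then reset
      printf "%s,%s\n", name, phone
      name = phone = ""
    }
  ' "$@"
}

# Invented sample: one full record, one record with no phone.
tags_to_csv <<'EOF'
<ROOT>
  <ZVM_DATA>
    <FIRSTNAME>Mary Jane</FIRSTNAME>
    <HOMEPHONE>7800002814</HOMEPHONE>
  </ZVM_DATA>
  <ZVM_DATA>
    <FIRSTNAME>Ryan</FIRSTNAME>
  </ZVM_DATA>
</ROOT>
EOF
```

This is not a real XML parser: it would miss tags split across lines or several tags on one line, which is where the xmllint-based answers are safer.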
Upvotes: 0