Maryse994

Reputation: 1

Bash shell XML to CSV

I have to parse XML to CSV using xmllint --shell or xmllint --xpath, because I'm not allowed to install additional packages.

I need only the first name and phone in the CSV, nothing else. I tried the loops below to walk through the XML and write a CSV file, but they break when the first name contains a space (for example, Mary Jane) or when the phone is missing.

for f in $(echo 'cat //FIRSTNAME/text()' | xmllint --shell TEST.xml | sed '1d;$d' | sed 's/-------//') 
do
   echo $f  >> $CSV_FILE_NAMES
done

for i in $(echo 'cat //HOMEPHONE/text()' | xmllint --shell TEST.xml | sed '1d;$d' | sed 's/-------//') 
do
   echo $i  >> $CSV_FILE_PHONES
done


paste -d "," $CSV_FILE_NAMES $CSV_FILE_PHONES >> $CSV

Or this combined version, which puts every value on its own line (its output follows the loop):

for f in $(echo 'cat //FIRSTNAME/text()|//HOMEPHONE/text()' | xmllint --shell TEST.xml | sed '1d;$d' | sed 's/-------//')
do
   echo $f  >> $CSV_FILE 
done
Mark

9999999999

Jack

8888888888

Is there a different way to loop through an XML file?

XML example

Upvotes: 0

Views: 3910

Answers (3)

glenn jackman

Reputation: 247162

Given an XML file file.xml

<PEOPLE>
    <PERSON>
        <FIRSTNAME>Alice</FIRSTNAME>
        <HOMEPHONE>555-1212</HOMEPHONE>
    </PERSON>
    <PERSON>
        <FIRSTNAME>Bob</FIRSTNAME>
        <HOMEPHONE>123-4567</HOMEPHONE>
    </PERSON>
</PEOPLE>

Then

echo 'cat (//FIRSTNAME | //HOMEPHONE)/text()' | xmllint --shell file.xml

outputs

/ >  -------
Alice
 -------
555-1212
 -------
Bob
 -------
123-4567
/ >

which is readily parsable with awk, among other tools:

echo 'cat (//FIRSTNAME | //HOMEPHONE)/text()' | xmllint --shell file.xml | awk '
  NR % 4 == 2 {printf "%s,", $0}
  NR % 4 == 0 {print $0}
'
Alice,555-1212
Bob,123-4567
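
The NR % 4 arithmetic above relies on every person having exactly one name and one phone, so a missing HOMEPHONE shifts all later rows. A minimal sketch of one way around that, assuming you drop /text() so that xmllint prints whole elements (tags included), and assuming every record has a non-empty FIRSTNAME that comes before its HOMEPHONE. The function name shell_output_to_csv is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not part of xmllint): reads `xmllint --shell` output for
#   cat //FIRSTNAME | //HOMEPHONE
# on stdin and writes "name,phone" lines. Assumes one element per output line
# and that each record starts with a non-empty FIRSTNAME.
shell_output_to_csv() {
    awk '
        # Remove every <...> tag from the line, leaving only the text value.
        function tagless(line) {
            gsub(/<[^>]*>/, "", line)
            return line
        }
        # Print the pending record, if any, and reset the state.
        function emit() {
            if (have) print name "," phone
            name = ""; phone = ""; have = 0
        }
        /<FIRSTNAME>/ { emit(); name = tagless($0); have = 1; next }
        /<HOMEPHONE>/ { phone = tagless($0); next }
        END { emit() }
    '
}
```

Usage would look like this: echo 'cat //FIRSTNAME | //HOMEPHONE' | xmllint --shell file.xml | shell_output_to_csv. Because awk reads whole lines, a name such as Mary Jane stays intact, and a person without a phone gets an empty second field. Values are not CSV-quoted, so a name containing a comma would still need extra handling.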

Too bad you can't install other tools: xmlstarlet makes it pretty easy to format the output the way you like:

xmlstarlet sel -t -m //PERSON -v ./FIRSTNAME -o , -v ./HOMEPHONE -n file.xml
Alice,555-1212
Bob,123-4567

Upvotes: 4

Othmane El Warrak

Reputation: 56

In the XML sample you provided, I think it would be simpler to loop over all the ZVM_DATA elements and use the XPath concat function to join FIRSTNAME, HOMEPHONE, and any other fields you'd like to include:

for index in $(seq $(xmllint --xpath "count(//ZVM_DATA)" test.xml))
do  
    xmllint --xpath "concat(//ZVM_DATA[$index]/FIRSTNAME/text(),',',//ZVM_DATA[$index]/HOMEPHONE/text())" --format test.xml
done

It is not the cleanest approach, but unfortunately xmllint supports only XPath 1.0; otherwise it could be done in one command.

Edit: the result should look like this:

Michael ,7800002814
E,7800907671
Ryan,7909355223
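
Note the trailing space in "Michael " above. A possible variant of the same loop, as a sketch only: it wraps each field in the XPath 1.0 normalize-space function to trim surrounding whitespace, and writes (//ZVM_DATA)[$i] so the index counts records across the whole document rather than per parent. The function name xml_to_csv is hypothetical, and it assumes an xmllint build whose --xpath option prints string and number results as plain text:

```shell
#!/bin/sh
# Hypothetical helper: print "FIRSTNAME,HOMEPHONE" for each ZVM_DATA record.
# A missing HOMEPHONE yields an empty field, because normalize-space() of an
# empty node-set is the empty string.
xml_to_csv() {
    file=$1
    # count() returns a number, which xmllint prints on a line of its own.
    count=$(xmllint --xpath 'count(//ZVM_DATA)' "$file") || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        xmllint --xpath "concat(normalize-space((//ZVM_DATA)[$i]/FIRSTNAME), ',', normalize-space((//ZVM_DATA)[$i]/HOMEPHONE))" "$file"
        i=$((i + 1))
    done
}
```

Called as xml_to_csv test.xml > out.csv. It starts one xmllint process per record, so it is slow on large files, but it keeps names with spaces intact and does not depend on every record having a phone.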

Upvotes: 4

jordanvrtanoski

Reputation: 5557

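You can use awk as follows, assuming each element sits on its own line. The variables are cleared after each record so that a missing phone is not filled in with the previous person's number:

awk 'BEGIN { FS = "[<>]" } /FIRSTNAME/ { v1 = $3 } /HOMEPHONE/ { v2 = $3 } /\/ZVM_DATA/ { printf "%s, %s\n", v1, v2; v1 = v2 = "" }' TEST.xml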

Upvotes: 0
