Jeff82

Reputation: 157

Trying to figure out how to convert function to accept piped stdin

I am working on a way to easily parse XML using bash for a defined purpose. I got this working with some code I found on this site, and because it worked so well I recoded everything around it. It currently works as a function, but I have to have the data in a file to be able to process it. Here it is in its working state:

[ ~]$ cat testxml.xml
<!DOCTYPE PARTS SYSTEM "parts.dtd">
<?xml-stylesheet type="text/css" href="xmlpartsstyle.css"?>
<PARTS>
   <TITLE>Computer Parts</TITLE>
   <PART>
      <ITEM>Motherboard</ITEM>
      <MANUFACTURER>ASUS</MANUFACTURER>
      <MODEL>P3B-F</MODEL>
      <COST> 123.00</COST>
   </PART>
   <PART>
      <ITEM>Video Card</ITEM>
      <MANUFACTURER>ATI</MANUFACTURER>
      <MODEL>All-in-Wonder Pro</MODEL>
      <COST> 160.00</COST>
   </PART>
   <PART>
      <ITEM>Sound Card</ITEM>
      <MANUFACTURER>Creative Labs</MANUFACTURER>
      <MODEL>Sound Blaster Live</MODEL>
      <COST> 80.00</COST>
   </PART>
   <PART>
      <ITEM> 20 inch Monitor</ITEM>
      <MANUFACTURER>LG Electronics</MANUFACTURER>
      <MODEL> 995E</MODEL>
      <COST> 290.00</COST>
   </PART>
</PARTS>

[ ~]$
[ ~]$ rdom () { local IFS=\> ; read -d \< E C ;} ; while rdom; do if [[ $E = 'PART' ]] || [[ $E = 'ITEM' ]] || [[ $E = 'COST' ]] ; then  echo $E: $C ; fi ; done < testxml.xml | xargs -L3
PART: ITEM: Motherboard COST: 123.00
PART: ITEM: Video Card COST: 160.00
PART: ITEM: Sound Card COST: 80.00
PART: ITEM: 20 inch Monitor COST: 290.00
[ ~]$

As you can see, this pulls out the data I am looking for, and I am able to reformat it to suit my needs. However, I would much prefer to have this accept input from stdin, such as the following:

cat out.xml2 | IFS=\> ; until [ EOF ]; do read -d \< E C ; if [[ $E = 'PART' ]] || [[ $E = 'ITEM' ]] || [[ $E = 'COST' ]] ; then  echo $E: $C ; fi ; done;

This code never ends the loop. This may be impossible, and I just don't understand how the loop in the first example ends, because it has "rdom" as the expression it waits on for loop termination. I've tried this with a while loop, etc. I'm not sure how to determine when there is no more data so that the loop can end. I feel like there may be a much better way to restructure this that I'm completely missing, though. I like being able to use stdin because it allows easy use in one-liners. The actual data I am parsing is much larger and multi-dimensional; I created this example for testing purposes. The first example works with the large data I have, though. The end result is that I am trying to get this to parse from stdin rather than from a file. Any recommendations are much appreciated.

Jeff

Upvotes: 1

Views: 34

Answers (1)

John1024

Reputation: 113814

Try:

$ rdom() { local IFS=\> ; while read -d \< E C ; do if [[ $E = 'PART' ]] || [[ $E = 'ITEM' ]] || [[ $E = 'COST' ]] ; then  echo $E: $C ; fi ; done; }
$ rdom <out.xml2
PART: 

ITEM: Motherboard
COST:  123.00
PART: 

ITEM: Video Card
COST:  160.00
PART: 

ITEM: Sound Card
COST:  80.00
PART: 

ITEM:  20 inch Monitor
COST:  290.00

Or, without using the function definition but still taking input from stdin:

{ IFS=\> ; while read -d \< E C ; do if [[ $E = 'PART' ]] || [[ $E = 'ITEM' ]] || [[ $E = 'COST' ]] ; then  echo $E: $C ; fi ; done; } <out.xml2

Because the question does not show desired output, I don't know if this is what you want.

Some comments:

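  1. cat out.xml2 | IFS=\> ; sends the text of out.xml2 to the variable assignment IFS=\>. A variable assignment does not read stdin, so after the assignment completes, the text is discarded. The pipeline ends at the semicolon, so the loop that follows never receives the file's contents.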

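  2. until [ EOF ]; do read -d \< E C ; ... does not do what you want. In shell, the string EOF is just three characters: [ EOF ] merely tests that this string is non-empty, which is always true, so it never checks whether the input is exhausted. By contrast, while read -d \< E C ; do ... will stop when the input is exhausted, because read returns a non-zero exit status at end of input.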
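
The two comments above can be checked with a short sketch. It assumes bash with default settings (in particular, the lastpipe option off). The helper name count_chunks and the sample input are made up for illustration and are not part of the question:

```shell
#!/usr/bin/env bash

# Comment 1: an assignment on the receiving end of a pipe ignores its stdin.
# With default bash settings it also runs in a subshell, so IFS in the
# current shell is left untouched.
old_ifs=$IFS
printf 'some text' | IFS=\>
if [ "$IFS" = "$old_ifs" ]; then ifs_unchanged=yes; else ifs_unchanged=no; fi

# Comment 2: [ EOF ] is a plain string test. The string is non-empty,
# so the test always succeeds (exit status 0), whatever stdin contains.
[ EOF ]
eof_test_status=$?

# read, on the other hand, fails once it reaches end of input,
# and that failure is what terminates a "while read" loop.
# count_chunks is a hypothetical helper that counts successful reads.
count_chunks() {
    local IFS=\> E C n=0
    while read -r -d \< E C; do
        n=$((n + 1))
    done
    echo "$n"
}

chunk_count=$(printf '<A>1</A><B>2</B>' | count_chunks)

echo "IFS unchanged: $ifs_unchanged"
echo "[ EOF ] exit status: $eof_test_status"
echo "successful reads before end of input: $chunk_count"
```

The loop in count_chunks ends on its own, with no EOF marker anywhere in the code, because the final read hits end of input before finding another < delimiter.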

Examples with piping

To demonstrate that the above work with piping, not just redirection from a file, try:

cat out.xml2 | rdom

Or:

cat out.xml2 | { IFS=\> ; while read -d \< E C ; do if [[ $E = 'PART' ]] || [[ $E = 'ITEM' ]] || [[ $E = 'COST' ]] ; then  echo $E: $C ; fi ; done; }

Alternative output format

Continuing with the use of cat as a stand-in for a pipeline:

$ cat out.xml2 | { IFS=\> ; while read -d \< E C ; do case "$E" in PART) printf "%s:" "$E";; ITEM) printf " %s: %s" "$E" "$C";; COST) printf " %s: %s\n" "$E" "$C";; esac ; done; }
PART: ITEM: Motherboard COST:  123.00
PART: ITEM: Video Card COST:  160.00
PART: ITEM: Sound Card COST:  80.00
PART: ITEM:  20 inch Monitor COST:  290.00
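
If the doubled spaces in that output are unwanted, the same loop can be wrapped in a function that trims leading whitespace from each value before printing. This is only a sketch: the function name parts_summary and the trimming step are additions for illustration, not something the question asked for.

```shell
#!/usr/bin/env bash

# parts_summary (hypothetical name) reads XML-like text from stdin and
# prints one line per PART, using the same read -d '<' technique as above.
parts_summary() {
    local IFS=\> E C
    while read -r -d \< E C; do
        # Strip leading whitespace from the content, e.g. " 123.00" -> "123.00".
        C="${C#"${C%%[![:space:]]*}"}"
        case "$E" in
            PART) printf '%s:' "$E" ;;
            ITEM) printf ' %s: %s' "$E" "$C" ;;
            COST) printf ' %s: %s\n' "$E" "$C" ;;
        esac
    done
}

# Small inline sample standing in for a real pipeline.
printf '<PART>\n<ITEM>Motherboard</ITEM>\n<COST> 123.00</COST>\n</PART>\n' | parts_summary
```

Because the function only reads stdin, it works equally with a pipe (cat out.xml2 | parts_summary) or a redirection (parts_summary <out.xml2).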

Upvotes: 1
