ibwell

Reputation: 31

How to format cat command as a table in bash

I have the following code that parses XML to display the node value of each element in a file.

#Abbreviation - symbol
cat elements/*.xml |  egrep "<symbol>.*</symbol>" |sed -e "s/<symbol>\(.*\)<\/symbol>/\1/"|tr "|" " "

#Weight - atomic-weight
cat elements/*.xml |  egrep "<atomic-weight>.*</atomic-weight>" |sed -e "s/<atomic-weight>\(.*\)<\/atomic-weight>/\1/"|tr "|" " "

#Number atomic-number
cat elements/*.xml |  egrep "<atomic-number>.*</atomic-number>" |sed -e "s/<atomic-number>\(.*\)<\/atomic-number>/\1/"|tr "|" " " > number

How can I format these three outputs as a table instead of one long sequential list?

Sample Data -

File1 -

  <symbol>Ag</symbol>
  <atomic-number>47</atomic-number>
  <atomic-weight>107.8682</atomic-weight>

File2 -

  <symbol>Ba</symbol>
  <atomic-number>56</atomic-number>
  <atomic-weight>137.327</atomic-weight>

Desired Output -

Symbol   Number   Weight
Ag       47       107.8682
Ba       56       137.327

Upvotes: 0

Views: 1013

Answers (3)

urznow

Reputation: 1811

Provided the input files are XML external general parsed entities, and so concatenate to well-formed XML when wrapped in a root element, you can use xmlstarlet to process them in one go:

printf '<doc>%s</doc>\n' "$(cat file*.xml)" |
xmlstarlet select --template --var ofs="'$(printf "\t")'" \
    --value-of 'concat("Symbol", $ofs, "Number", $ofs, "Weight")' --nl \
    --match '*/*[position() mod 3 = 1]' --sort 'A:T:-' '.' \
    --value-of 'concat(., $ofs, following-sibling::*[1], $ofs, following-sibling::*[2])' --nl
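  • printf: wraps a document element around the input
  • --var ofs: defines the output field separator (a tab)
  • first --value-of: emits the header
  • --match / --sort: selects every 3rd element (i.e. each symbol), sorted by symbol as ascending text (A:T:-)
  • second --value-of: emits each symbol followed by its next two siblings (number and weight)
  • to sort by atomic-number instead: --sort 'A:N:-' 'following-sibling::*[1]'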

Output:

Symbol  Number  Weight
Ag      47      107.8682
Ba      56      137.327
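
To see what the printf step hands to xmlstarlet, here is a minimal sketch that builds the wrapped document from two throwaway files. The temporary directory and the file names file1.xml and file2.xml are assumptions made up for this illustration; the element content is taken from the question's sample data.

```shell
#!/bin/bash
# Sketch only: the temp directory and file names are hypothetical.
tmp=$(mktemp -d)
printf '%s\n' '<symbol>Ag</symbol>' '<atomic-number>47</atomic-number>' \
    '<atomic-weight>107.8682</atomic-weight>' > "$tmp/file1.xml"
printf '%s\n' '<symbol>Ba</symbol>' '<atomic-number>56</atomic-number>' \
    '<atomic-weight>137.327</atomic-weight>' > "$tmp/file2.xml"

# Same wrapping step as in the answer: concatenate the fragments and
# surround them with a single <doc> root element.
wrapped=$(printf '<doc>%s</doc>\n' "$(cat "$tmp"/file*.xml)")
printf '%s\n' "$wrapped"

rm -rf "$tmp"
```

The result has one root element whose children repeat in groups of three (symbol, atomic-number, atomic-weight), which is what the 'position() mod 3 = 1' match relies on.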

Upvotes: 0

user1934428

Reputation: 22291

If the only thing you know is that it is valid XML, you had better use an XML parser. Such parsers come included, for instance, with Ruby or Perl. A parser would also let you handle file content that looks like

<atomic-weight>
  107.8682
</atomic-weight>

If however you can ensure that the input files follow exactly the format you have posted, you could do something like:

for file in File1 File2
do
  tr '<>' ' ' <"$file" | cut -d ' ' -f 3
done

If you need to format the data into columns at particular positions, you could do something like

for file in File1 File2
do
  printf ' put your format specification here ' $(tr '<>' ' ' <"$file" | cut -d ' ' -f 3)
done
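
As one possible way to fill in that format specification, the sketch below uses '%-8s %-8s %s\n'. It assumes each element sits on its own line with no leading whitespace (otherwise field 3 of cut is empty) and that the elements appear in the order symbol, atomic-number, atomic-weight. The temporary directory and the format_row helper are hypothetical names introduced for this example.

```shell
#!/bin/bash
# Sketch only: temp files stand in for File1 and File2 from the question.
tmp=$(mktemp -d)
printf '%s\n' '<symbol>Ag</symbol>' '<atomic-number>47</atomic-number>' \
    '<atomic-weight>107.8682</atomic-weight>' > "$tmp/File1"
printf '%s\n' '<symbol>Ba</symbol>' '<atomic-number>56</atomic-number>' \
    '<atomic-weight>137.327</atomic-weight>' > "$tmp/File2"

# format_row (hypothetical helper): print one table row for one file.
format_row() {
    # The $(...) is deliberately unquoted so the three extracted values
    # split into three separate printf arguments.
    printf '%-8s %-8s %s\n' $(tr '<>' '  ' < "$1" | cut -d ' ' -f 3)
}

table=$(
    printf '%-8s %-8s %s\n' Symbol Number Weight
    for file in "$tmp/File1" "$tmp/File2"; do
        format_row "$file"
    done
)
printf '%s\n' "$table"

rm -rf "$tmp"
```

The first two columns are left-aligned in 8-character fields and the last column is printed as-is, so no trailing padding is added.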

Upvotes: 0

Nic3500

Reputation: 8621

Try this:

#!/bin/bash

printf '%-9s %-9s %-9s\n' "Symbol" "Number" "Weight"

for F in *.xml
do
    symbol=$(grep -E "<symbol>.*</symbol>" "$F"               | sed -e "s/.*<symbol>\(.*\)<\/symbol>.*/\1/")
    number=$(grep -E "<atomic-number>.*</atomic-number>" "$F" | sed -e "s/.*<atomic-number>\(.*\)<\/atomic-number>.*/\1/")
    weight=$(grep -E "<atomic-weight>.*</atomic-weight>" "$F" | sed -e "s/.*<atomic-weight>\(.*\)<\/atomic-weight>.*/\1/")

    printf '%-9s %-9s %-9s\n' "$symbol" "$number" "$weight"
done
  • printf lets you control the width of each printed field and the alignment within that width (for text, integers, floats, ...).
  • to avoid any reinterpretation of the weight values (the number of decimals, for example), all values are printed as strings.
  • for printf, '%-9s' prints the value in a field 9 characters wide, left-aligned. Without the -, it is right-aligned.
  • printf does not output a newline unless you tell it to, which is why each format ends with \n.
  • I reused your grep ... | sed ... commands, with two changes: (1) grep -E instead of egrep; (2) .* added at the beginning and end of the sed pattern to strip anything before or after the <SOMETHING> tags.

The output I get is:

$ ./so.bash 
Symbol    Number    Weight   
Ag        47        107.8682 
Ba        56        137.327  
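
The notes on width, alignment, and printing values as strings can be checked with a small sketch like the one below. The square brackets are only there to make the padding visible, and the variable names are made up for this illustration.

```shell
#!/bin/bash
# Left-aligned vs. right-aligned in a 9-character field.
left=$(printf '[%-9s]' "Ag")
right=$(printf '[%9s]' "Ag")

# Printing a weight as a string keeps it exactly as read from the file;
# %f reformats it (typically to six decimals, depending on locale).
as_string=$(printf '%s' "137.327")
as_float=$(printf '%f' "137.327")

printf '%s\n' "$left" "$right" "$as_string" "$as_float"
```

The first two lines show the same value padded on the right and on the left respectively; the last two show why '%s' is the safer choice when the original number of decimals should be preserved.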

Upvotes: 2
