Hein

Reputation: 175

parse line output into table in bash

I have a file which contains only lines of the form

new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003

Is it possible to parse this output with bash into a form like

7,59,0.876,0.000433344,0.00003

so that I can then read it into Python?

Upvotes: 0

Views: 793

Answers (8)

Ed Morton

Reputation: 203254

$ sed -r 's/[^0-9.]+/,/g;s/^,//' file
7,59,0.876,0.000433344,0.00003

$ awk -F'[^0-9.]+' -v OFS=',' '{$1=$1;sub(/^,/,"")} 1' file
7,59,0.876,0.000433344,0.00003

$ sed -r 's/[^0-9.,;]+//g;s/;/,/g' file
7,59,0.876,0.000433344,0.00003

$ awk -F';' -v OFS=',' '{$1=$1;gsub(/[^0-9.,]/,"")} 1' file
7,59,0.876,0.000433344,0.00003

Personally I prefer the last two, as they don't add a comma and then remove it again, which always feels kinda kludgy and error-prone.
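
To make the "add a comma and then remove it again" point concrete, here is a small sketch run on the sample line from the question. It uses `sed -E` (the same extended-regex mode as `-r` in GNU sed), and the variable names are only for illustration.

```shell
# Sample line in the format from the question.
line='new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003'

# First approach, first substitution only: every run of non-numeric
# characters becomes a comma, including the leading "new file (".
with_leading=$(printf '%s\n' "$line" | sed -E 's/[^0-9.]+/,/g')
echo "$with_leading"    # ,7,59,0.876,0.000433344,0.00003

# Third approach: delete the unwanted characters, then turn ; into ,
# so no stray leading comma is ever produced.
direct=$(printf '%s\n' "$line" | sed -E 's/[^0-9.,;]+//g;s/;/,/g')
echo "$direct"          # 7,59,0.876,0.000433344,0.00003
```

The second substitution in the first approach (`s/^,//`) exists only to clean up that leading comma.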

Upvotes: 0

anubhava

Reputation: 785008

Using GNU awk:

cat file

new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003

awk -F ' *[=()] *' -v RS=' ; |\n' -v OFS= -v ORS= 'NF{print $2, (NR%4==0)? "\n":","}' file
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003

Upvotes: 0

clt60

Reputation: 63892

Many solutions, only perl missing ;)

perl -nlE '$,=",";say m/[\d.]+/g'
  • set the output field separator ($,) to ,
  • match only numbers (returns a list)
  • print the list

or (ofc) @neronlevelu's solution

perl -plE 's/[^\d,;.]//g;y/;/,/'
  • remove anything that isn't a digit or one of , ; .
  • change ; to , (y transliterates every occurrence of a character in the search list into the corresponding character in the replacement list, like tr)

Upvotes: 0

perreal

Reputation: 97938

Using sed:

sed 's/[^0-9,.][^0-9,.]*/ /g' input

for better formatting:

 sed 's/[^0-9,.][^0-9,.]*/ /g' input | column -to,

Gives:

7,59,0.876,0.000433344,0.00003

Upvotes: 1

NeronLeVelu

Reputation: 10039

sed 's/[^0-9,;.]//g;y/;/,/' YourFile
  1. Remove every character that is not a digit or one of , ; .
  2. Change ; to ,
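
The two steps can be run separately to see the intermediate result. This is just a sketch on the sample line from the question; the variable names are made up for illustration.

```shell
# Sample line in the format from the question.
line='new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003'

# Step 1 only: delete every character that is not a digit, comma, semicolon or dot.
step1=$(printf '%s\n' "$line" | sed 's/[^0-9,;.]//g')
echo "$step1"    # 7,59;0.876;0.000433344;0.00003

# Step 2: transliterate the remaining semicolons into commas.
step2=$(printf '%s\n' "$step1" | sed 'y/;/,/')
echo "$step2"    # 7,59,0.876,0.000433344,0.00003
```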

Upvotes: 3

Avinash Raj

Reputation: 174696

You could try the sed command below if the contents are exactly in the format you mentioned:

$ sed 's/^[^(]*(\([^)]*\))\s*;\s*\S*\s*=\s*\(\S\+\)\s*;\s*\S*\s*=\s*\(\S\+\)\s*;\s*\S*\s*=\s*\(\S\+\)$/\1,\2,\3,\4/' file
7,59,0.876,0.000433344,0.00003

Upvotes: 1

Kent

Reputation: 195039

Also GNU awk, with FPAT:

awk -v FPAT="[0-9.]+" '{for(i=1;i<=NF;i++)printf "%s%s", $i,(i!=NF?",":"\n")}'

test:

$ echo "new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003"|awk -v FPAT="[0-9.]+" '{for(i=1;i<=NF;i++)printf "%s%s", $i,(i!=NF?",":"\n")}'      
7,59,0.876,0.000433344,0.00003

The FPAT could be made stricter: as written, [0-9.]+ would also match a stray "." or something like 1.2.3.

Upvotes: 0

fedorqui

Reputation: 289535

You can grep for numbers:

$ grep -o '[0-9.]*' file
7
59
0.876
0.000433344
0.00003

With the -o flag we tell grep to print only the matched parts. This way, you get all your values without the surrounding text.

If you want it comma-separated, pipe to tr to replace every newline with a comma, and finally to sed to replace the trailing comma with a newline. Note that for a file with several lines this joins all values into one single line:

$ grep -o '[0-9.]*' file | tr -s '\n' ',' | sed 's/,$/\n/'
7,59,0.876,0.000433344,0.00003
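
If the file has many lines and each should become its own comma-separated row, one possible variant is to run grep once per input line and join the matches with paste. This is only a sketch: the helper name `to_csv` is hypothetical, the second sample line is invented for illustration, and the pattern is a slightly stricter one than above (digits with an optional fractional part).

```shell
# Hypothetical helper: convert each input line separately,
# so an N-line input yields N comma-separated rows.
to_csv() {
    while IFS= read -r line; do
        # grep -o prints one number per line; paste -sd, joins them with commas.
        printf '%s\n' "$line" | grep -oE '[0-9]+(\.[0-9]+)?' | paste -sd, -
    done
}

# First line is the sample from the question; the second is made-up test data.
out=$(printf '%s\n' \
    'new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003' \
    'new file (8,60) ; lim = 0.5 ; dim = 0.25 ; r_d = 0.125' | to_csv)
echo "$out"
```

On a real file this would be called as `to_csv < file`. Spawning grep and paste for every line is slow on large inputs, so the single-process sed or awk answers are probably the better choice there.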

Upvotes: 0
