Reputation: 175
I have a file which contains only lines of the form:
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
Is it possible to parse this output with bash into a form like
7,59,0.876,0.000433344,0.00003
so that I can then read it into Python?
Upvotes: 0
Views: 793
Reputation: 203254
$ sed -r 's/[^0-9.]+/,/g;s/^,//' file
7,59,0.876,0.000433344,0.00003
$ awk -F'[^0-9.]+' -v OFS=',' '{$1=$1;sub(/^,/,"")} 1' file
7,59,0.876,0.000433344,0.00003
$ sed -r 's/[^0-9.,;]+//g;s/;/,/g' file
7,59,0.876,0.000433344,0.00003
$ awk -F';' -v OFS=',' '{$1=$1;gsub(/[^0-9.,]/,"")} 1' file
7,59,0.876,0.000433344,0.00003
Personally I prefer the last 2, as they don't add a comma and then remove it again, which always feels kinda kludgy and error-prone.
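Since the stated goal is to read the result into Python, here is a minimal sketch of the reading side. It assumes the output of one of the commands above has been saved or piped in; the string `converted` and the helper name `read_rows` are made up for illustration.

```python
import csv
import io

# Stand-in for the output of any of the commands above (an assumption:
# in practice this would come from a file or from standard input).
converted = "7,59,0.876,0.000433344,0.00003\n"

def read_rows(stream):
    """Yield one list of floats per comma-separated line."""
    for row in csv.reader(stream):
        if row:  # skip blank lines
            yield [float(field) for field in row]

rows = list(read_rows(io.StringIO(converted)))
print(rows)
```

With a real file, `open("converted.csv", newline="")` (a hypothetical file name) could be passed to `read_rows` in place of the `io.StringIO` object.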
Upvotes: 0
Reputation: 785008
Using GNU awk:
cat file
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
awk -F ' *[=()] *' -v RS=' ; |\n' -v OFS= -v ORS= 'NF{print $2, (NR%4==0)? "\n":","}' file
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
Upvotes: 0
Reputation: 63892
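Many solutions, only perl missing ;)
perl -nlE '$,=",";say m/[\d.]+/g'
prints every match of [\d.]+ on the line, joined with , (the output field separator $, is set to a comma)
or (ofc) @neronlevelu's solution
perl -plE 's/[^\d,;.]//g;y/;/,/'
removes everything that is not a digit, a comma, a semicolon or a dot, then changes ; to ',' (the y transliterates all occurrences of the characters found in the search list with the corresponding character in the replacement list) - aka tr.
Upvotes: 0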
Reputation: 97938
Using sed:
sed 's/[^0-9,.][^0-9,.]*/ /g' input
for better formatting:
sed 's/[^0-9,.][^0-9,.]*/ /g' input | column -to,
Gives:
7,59,0.876,0.000433344,0.00003
Upvotes: 1
Reputation: 10039
sed 's/[^0-9,;.]//g;y/;/,/' YourFile
Upvotes: 3
Reputation: 174696
You could try the sed command below if the contents are in the format you mentioned:
$ sed 's/^[^(]*(\([^)]*\))\s*;\s*\S*\s*=\s*\(\S\+\)\s*;\s*\S*\s*=\s*\(\S\+\)\s*;\s*\S*\s*=\s*\(\S\+\)$/\1,\2,\3,\4/' file
7,59,0.876,0.000433344,0.00003
Upvotes: 1
Reputation: 195039
Also GNU awk with FPAT:
awk -v FPAT="[0-9.]+" '{for(i=1;i<=NF;i++)printf "%s%s", $i,(i!=NF?",":"\n")}'
test:
$ echo "new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003"|awk -v FPAT="[0-9.]+" '{for(i=1;i<=NF;i++)printf "%s%s", $i,(i!=NF?",":"\n")}'
7,59,0.876,0.000433344,0.00003
The FPAT could be made better.
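One way the pattern could be tightened, sketched in Python rather than awk: [0-9.]+ also accepts a lone dot or a number with a trailing period, whereas requiring digits with at most one fractional part does not. The names LOOSE, STRICT and to_csv are made up for this illustration, and the stricter pattern is only one possible choice.

```python
import re

# Same idea as FPAT="[0-9.]+": any run of digits and dots.
LOOSE = re.compile(r"[0-9.]+")
# Stricter assumption: digits, optionally followed by one ".digits" part.
STRICT = re.compile(r"[0-9]+(?:\.[0-9]+)?")

def to_csv(line, pattern=STRICT):
    """Join every match of pattern in line with commas."""
    return ",".join(pattern.findall(line))

sample = "new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003"
print(to_csv(sample))
```

On the sample line both patterns give the same result; they only differ on input such as a value followed by a sentence-ending period.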
Upvotes: 0
Reputation: 289535
You can grep for numbers:
$ grep -o '[0-9.]*' file
7
59
0.876
0.000433344
0.00003
With the -o flag we tell grep to print only the matched parts. This way, you have all your values but not the surrounding text.
If you want it comma-separated, pipe to tr to replace every newline with a comma, and finally to sed to replace the last comma with a newline:
$ grep -o '[0-9.]*' file | tr -s '\n' ',' | sed 's/,$/\n/'
7,59,0.876,0.000433344,0.00003
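Note that tr replaces every newline, so for a file with several lines the pipeline above joins all values into one long line. A rough Python sketch of the same extraction that keeps one output row per input line follows; the function name lines_to_csv is made up, and the two-line sample simply repeats the line from the question.

```python
import re

def lines_to_csv(text):
    """Return one comma-joined string of numbers per non-empty input line."""
    out = []
    for line in text.splitlines():
        # Same character class as the grep pattern; findall never
        # returns empty matches here because + needs at least one char.
        numbers = re.findall(r"[0-9.]+", line)
        if numbers:
            out.append(",".join(numbers))
    return out

# The line from the question, repeated to show per-line output.
sample = (
    "new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003\n"
    "new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003\n"
)
for row in lines_to_csv(sample):
    print(row)
```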
Upvotes: 0