Reputation: 23
I am new to Linux and grep and am looking for some direction on how to use grep. I am trying to get two specific numbers from a text file. I will need to do this for thousands of files, so I believe using grep or some equivalent would be best for my mental health.
The text file I am working with looks as follows:
*Average spectrum energy: 0.00100 MeV
Average sampled energy : 0.00100 MeV [ -0.0000%]
K/phi = <E*mu_tr/rho> = 6.529719E+02 10^-12 Gy cm^2 [ 0.0008%]
Kcol/phi = <E*mu_tr/rho>*(1-<g>) = 6.529719E+02 10^-12 Gy cm^2 [ 0.0008%]
<g> = 1.0000E-15 [ 0.4264%]
1-<g> = 1.000000 [ 0.0000%]
<mu_tr/rho> = <E*mu_tr/rho>/Eave = 4.075530E+03 cm^2/g [ 0.0008%]
<mu_en/rho> = <E*mu_tr/rho>*(1-<g>)/Eave = 4.075530E+03 cm^2/g [ 0.0008%]
<E*mu_en/rho> = 4.075530E+00 MeV cm^2/g
The values I am looking to extract from this are "0.00100" and "4.075530E+00".
At the moment I am using
grep -iE "Average spectrum energy|<E*mu_en/rho>" *
which lets me see the full lines, but I am not quite sure how to refine the search so that it shows me only the numbers instead of the whole line. Is this possible using grep?
As for moving the numbers into a new file, I believe the command is > newdata.txt. My question is: when using this with grep, can you change how it writes the data to the new text file? I am looking for the format of the numbers to be like this:
0.00100001 3.4877754595352117
0.00100367 3.4665273232204363
0.00100735 3.4453747056004884
0.00101104 3.4243696230289187
0.00101474 3.4035147003587718
Again, is that possible using grep > newdata.txt?
I really appreciate any help or direction people can give me. Thank you.
Upvotes: 2
Views: 154
Reputation: 19375
I'm not quite sure why it was giving the 4.075530E+03 value.
That's because * has the special meaning of repeating the previous item any number of times (including zero), so the pattern <E*mu_en/rho> does not match the text <E*mu_en/rho>, but rather < followed by any number of E followed by mu_en/rho>, which in particular matches <mu_en/rho>. To escape this special meaning and match a literal *, prepend a backslash, i.e. <E\*mu_en/rho>.
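To see the difference for yourself, here is a small sketch that runs both patterns against two of the lines from your file. The temporary file and the variable names are only for illustration.

```shell
# Throwaway sample file holding the two similar-looking lines from the question.
sample=$(mktemp)
printf '%s\n' \
  '<mu_en/rho> = <E*mu_tr/rho>*(1-<g>)/Eave = 4.075530E+03 cm^2/g [ 0.0008%]' \
  '<E*mu_en/rho> = 4.075530E+00 MeV cm^2/g' > "$sample"

# Unescaped: "E*" means "zero or more E", so only the <mu_en/rho> line matches.
unescaped=$(grep -E '<E*mu_en/rho>' "$sample")

# Escaped: "\*" is a literal asterisk, so only the <E*mu_en/rho> line matches.
escaped=$(grep -E '<E\*mu_en/rho>' "$sample")

printf 'unescaped: %s\nescaped:   %s\n' "$unescaped" "$escaped"
rm -f "$sample"
```

The unescaped pattern prints the 4.075530E+03 line, the escaped one prints the 4.075530E+00 line.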
I am not quite sure how to refine the search to only show me the numbers instead of just the whole line. Is this possible using grep?
It is if PCRE (grep -P) is available on the system. To show only (-o) the numbers, we can use \K, which resets the start of the match. Your modified grep command is then:
grep -hioP "(Average spectrum energy: *|<E\*mu_en/rho> *= )\K\S*" *
(Option -h drops the file names; the pattern item \S matches any character that is not white space.)
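The grep command prints each number on its own line. If you want the two values of a file side by side, one possible sketch is to pipe the output through paste. This assumes GNU grep (for -P) and that every file contains exactly one line of each kind, in this order; the sample file is made up from the lines in your question.

```shell
# Throwaway sample file with a few lines from the question.
sample=$(mktemp)
printf '%s\n' \
  '*Average spectrum energy: 0.00100 MeV' \
  'Average sampled energy : 0.00100 MeV [ -0.0000%]' \
  '<mu_en/rho> = <E*mu_tr/rho>*(1-<g>)/Eave = 4.075530E+03 cm^2/g [ 0.0008%]' \
  '<E*mu_en/rho> = 4.075530E+00 MeV cm^2/g' > "$sample"

# grep emits one number per line; "paste -d ' ' - -" joins every two
# consecutive lines into one, separated by a single space.
pairs=$(grep -hioP '(Average spectrum energy: *|<E\*mu_en/rho> *= )\K\S*' "$sample" \
        | paste -d ' ' - -)

printf '%s\n' "$pairs"   # prints: 0.00100 4.075530E+00
rm -f "$sample"
```

If a file is missing one of the two lines, the pairing shifts for all following files, so this is only safe when the input is regular.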
when using this with grep can you change how it writes the data to the new text file?
grep by itself cannot change the format of numbers (except maybe by cutting digits off). If you want that, we need another tool. And since we need another tool anyway, I'd consider using one that is capable of doing the whole job, e.g. awk:
awk '
/Average spectrum energy/ { printf "%.8f ", $4 }
/<E\*mu_en\/rho>/ { printf "%.16f\n", $3 }
' * >newdata.txt
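As a quick check of what the awk program produces, here is a sketch that runs it on a single made-up sample file instead of * and captures the result in a variable. The file and variable names are only for illustration.

```shell
# Throwaway sample file with the two relevant lines plus one near miss.
sample=$(mktemp)
printf '%s\n' \
  '*Average spectrum energy: 0.00100 MeV' \
  '<mu_en/rho> = <E*mu_tr/rho>*(1-<g>)/Eave = 4.075530E+03 cm^2/g [ 0.0008%]' \
  '<E*mu_en/rho> = 4.075530E+00 MeV cm^2/g' > "$sample"

# Same program as above: field 4 of the energy line is printed with
# 8 decimals, field 3 of the <E*mu_en/rho> line with 16 decimals.
line=$(awk '
  /Average spectrum energy/ { printf "%.8f ", $4 }
  /<E\*mu_en\/rho>/         { printf "%.16f\n", $3 }
' "$sample")

printf '%s\n' "$line"
rm -f "$sample"
```

The result is one line per file: the energy with 8 decimals, a space, and the <E*mu_en/rho> value converted from exponent notation to fixed-point with 16 decimals. The trailing digits only reflect floating-point conversion, since the input carries just seven significant digits.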
Upvotes: 0