Reputation: 17
I am having trouble extracting a specific value from a .txt file using grep and awk. Here is an excerpt from the .txt file: "
bravais-lattice index = 2
lattice parameter (alat) = 10.0000 a.u.
unit-cell volume = 250.0000 (a.u.)^3
number of atoms/cell = 2
number of atomic types = 1
number of electrons = 28.00
number of Kohn-Sham states= 18
kinetic-energy cutoff = 60.0000 Ry
charge density cutoff = 300.0000 Ry
convergence threshold = 1.0E-09
mixing beta = 0.7000"
I have also defined two variables, ELEMENT and lat. I want to extract the "unit-cell volume" value, which is 250.00. I tried the following, using grep and awk:
volume=`grep "unit-cell volume" ./latt.10/$ELEMENT.scf.latt_$lat.out | awk '{printf "%15.12f\n",$5}'`
However, when I run the bash script I always get 00.000000 instead of the correct value of 250.00.
Can anyone help, please? Thanks in advance.
Upvotes: 0
Views: 4372
Reputation: 204638
You never need grep when you're using awk since awk can do anything useful that grep can do. It sounds like this is all you need:
$ awk -F'=' '/unit-cell volume/{printf "%.2f\n",$2}' file
250.00
The above works because when FS is = that means $2 is <spaces>250.0000 (a.u.)^3, and when awk is asked to convert a string to a number it strips off leading spaces and anything after the numeric part, so that leaves 250.0000 to be converted to a number by %.2f.
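A minimal sketch of that string-to-number conversion, assuming the line looks exactly like the one in the question's excerpt (the variable names line, value and unit are only for illustration). Adding 0 forces awk to treat the field as a number:

```shell
# Sample line copied from the excerpt in the question.
line='unit-cell volume = 250.0000 (a.u.)^3'

# With FS set to "=", $2 is " 250.0000 (a.u.)^3".
# Adding 0 forces numeric conversion: leading spaces and the trailing unit are dropped.
value=$(printf '%s\n' "$line" | awk -F'=' '{ print $2 + 0 }')

# A field that does not start with a number, such as "(a.u.)^3", converts to 0.
unit=$(printf '%s\n' "$line" | awk '{ print $5 + 0 }')

echo "value=$value unit=$unit"   # prints: value=250 unit=0
```

The second command is the same conversion that produced the zero in the original script.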
In the script you posted, $5 was failing because the 5th space-separated field in:
$1          $2       $3  $4         $5
<unit-cell> <volume> <=> <250.0000> <(a.u.)^3>
is (a.u.)^3. You could have just added print $5 to see that.
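One way to do that kind of check for every field at once is to loop over the fields and print each with its number. This is only a debugging sketch run against the sample line from the question, not against the real output file:

```shell
# Sample line copied from the excerpt in the question.
line='unit-cell volume = 250.0000 (a.u.)^3'

# Print every whitespace-separated field together with its field number.
fields=$(printf '%s\n' "$line" |
    awk '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }')

echo "$fields"
# $1=[unit-cell]
# $2=[volume]
# $3=[=]
# $4=[250.0000]
# $5=[(a.u.)^3]
```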
Upvotes: 1
Reputation: 37464
Since you are processing key-value pairs where the key can contain a variable number of spaces, you would need to tune the field number ($4, $5, etc.) separately for each record you want to process, unless you set the field separator (FS) appropriately, to FS=" *= *". Then the key will always be in $1 and the value in $2.
Then use split to separate the value from the unit.
Also, you can lose that grep by defining a pattern (or condition, /unit-cell volume/) in awk for that print action:
$ awk 'BEGIN{FS=" *= *"} /unit-cell volume/{split($2,a," +");print a[1]}' file
250.0000
Explained:
$ awk '
BEGIN { FS=" *= *" }          # set appropriate field separator
/unit-cell volume/ {          # pattern or condition
    split($2,a," +")          # split value part to value and possible unit parts
    print a[1]                # output value part
}' file
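If you need more than one of these values, the same idea can be wrapped in a small shell function. This is a sketch under assumptions: get_value is a made-up helper name, and the temporary file just holds a few lines from the question's excerpt.

```shell
# get_value is a hypothetical helper, not part of awk or any other tool.
# Usage: get_value "key text" file
get_value() {
    awk -v key="$1" '
        BEGIN { FS = " *= *" }      # key ends up in $1, value (plus unit) in $2
        index($1, key) {            # literal substring match on the key field
            split($2, a, " +")      # separate the value from an optional unit
            print a[1]
            exit                    # stop at the first matching line
        }' "$2"
}

# Temporary sample file with lines taken from the excerpt in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
lattice parameter (alat) = 10.0000 a.u.
unit-cell volume = 250.0000 (a.u.)^3
number of Kohn-Sham states= 18
convergence threshold = 1.0E-09
EOF

volume=$(get_value "unit-cell volume" "$sample")
states=$(get_value "number of Kohn-Sham states" "$sample")
echo "volume=$volume states=$states"   # prints: volume=250.0000 states=18
rm -f "$sample"
```

Matching with index() rather than a regex means characters such as ( or . in a key are taken literally.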
Upvotes: 0
Reputation: 131800
awk '{printf "%15.12f\n",$5}'
You're asking awk to print out the fifth field of the line ($5).
unit-cell volume = 250.0000 (a.u.)^3
1         2      3 4        5
The fifth field is (a.u.)^3, which you are then asking awk to interpret as a number via the %f format code. It's not a number, though (or rather, it doesn't start with a number), and when awk is asked to treat a non-numeric string as a number, it uses 0 instead. Thus it prints 0.
Solution: use $4 instead.
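As a sketch, here is the original command with only the field number changed. The real path depends on ELEMENT and lat, whose values aren't shown, so a temporary file (outfile, a stand-in name) holding the one relevant line is used here instead. $(...) is used in place of backticks; the two are equivalent for this purpose.

```shell
# Stand-in for ./latt.10/$ELEMENT.scf.latt_$lat.out (assumption: the real file
# contains a line formatted like the one in the question's excerpt).
outfile=$(mktemp)
printf '%s\n' 'unit-cell volume = 250.0000 (a.u.)^3' > "$outfile"

# Same pipeline as in the question, but reading field 4 instead of field 5.
volume=$(grep "unit-cell volume" "$outfile" | awk '{ printf "%15.12f\n", $4 }')

echo "$volume"   # prints: 250.000000000000
rm -f "$outfile"
```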
By the way, you can skip invoking grep by using awk itself to select the line, e.g.
awk '/^ unit-cell/ {...}'
The /^ unit-cell/ is a regular expression that matches " unit-cell" (with a leading space) at the beginning of the line. Adjust as necessary if you have other lines that start with unit-cell which you don't want to select.
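To see what the anchor buys you, here is a small sketch comparing the anchored pattern with an unanchored one. The second sample line is hypothetical, added only to show a line that contains "unit-cell" without starting with it, and both lines are assumed to begin with a single space.

```shell
# Two sample lines, each with one leading space. The second is hypothetical.
sample=$(mktemp)
printf '%s\n' \
    ' unit-cell volume = 250.0000 (a.u.)^3' \
    ' new unit-cell volume = 245.1234 (a.u.)^3' > "$sample"

# Anchored: only lines that begin with " unit-cell" are selected.
anchored=$(awk '/^ unit-cell/ { print $4 }' "$sample")

# Unanchored: any line containing "unit-cell" is selected; count the matches.
unanchored_matches=$(awk '/unit-cell/ { n++ } END { print n }' "$sample")

echo "anchored=$anchored unanchored_matches=$unanchored_matches"
# prints: anchored=250.0000 unanchored_matches=2
rm -f "$sample"
```

If the real file indents its lines with more than one space, a pattern such as /^ *unit-cell/ would tolerate that.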
Upvotes: 3