Reputation: 586
I am trying to extract some specific values from files that are not laid out in clean columns. The files have the format
16O ADOPTED LEVELS, GAMMAS 1993TI07 93NP 199902
16O L 0.0 0+ STABLE
16O 2 L ISPIN=0
16O 3 L XREF=ABCDEFHIJKLMNOPQ
16O L 6049.4 10 0+ 67 PS 5
16O 2 L ISPIN=0
16O 3 L XREF=ABCEFIJKMP
16O G 6048.2 10 [E0] 100
16O L 6129.89 4 3- 18.4 PS 5
16O 2 L ISPIN=0$ MOMM1=+1.668 12 (1989RA17)
16O 3 L XREF=ABCEFHIJKLNOPQ
16O G 6128.63 4 100 [E3]
16O 2 G BE3W=13.5 7
I am interested in the values after the sequence 16O L, for instance 0.0, 6049.4, 6129.89, etc. In general, the values that I want to extract from those files come after the sequence (Number)(Element)(spaces)L(space).
The tricky thing is that if the (Element) consists of one letter there are 3 spaces, and if the (Element) consists of two letters there are 2 spaces. An example file is
10BE ADOPTED LEVELS, GAMMAS 2004TI06 04NP 200705
10BE L 0.0 0+ 1.51E+6 Y 4
10BE2 L ISPIN=1 $ %B-=100
10BE3 L XREF=ABDEFIJKLMNOPQSTUVWXYZabceghij
10BE cL T from weighted average of T{-1/2}=1.51 Ma 6 (Hofmann et al.,
10BE2cL Nucl. Instrum. Meth. Phys. Res. |b 24-25 (1987) 276),
10BE3cL T{-1/2}=1.53 Ma 5% (1993Mi26), and T{-1/2}=1.48 Ma 5% (1993Mi26).
10BE L 3368.03 3 2+ 125 FS 12
10BE2 L ISPIN=1 $ %IT=100
10BE3 L XREF=ABCDEFIJKLMNOPQRSTUVWXYZabceghij
10BE cL B(E2)=52 e{+2} fm{+4} 6 (1987Ra01).
10BE cL E from {+9}Be(n,|g) (1983Ke11). Other value: 3368.34 keV {I43}
10BE2cL (1999Bu26).
10BE2 L WIDTHG=3.66E-3 EV 35
10BE G 3367.415 30 100 E2
10BE2 G WIDTHG=3.66E-3 EV 35$BE2W=8.00 76
10BE L 5958.39 5 2+ 55 FS LT
10BE2 L ISPIN=1 $ %IT=100
10BE3 L XREF=DFJKLMPRTUWYbeghi
10BE cL E from {+9}Be(n,|g) (1983Ke11). Other value: 5958.3 keV {I3}
10BE2cL (1969Al17).
10BE G 2589.999 60 90 GTM1
10BE G 5955.9 5 10 LTE2
10BE L 13.05E3 10 290 KEV 130 A
10BE2 L %A GT 0
10BE3 L XREF=E
10BE cL E |G: from {+7}Li({+7}Li,|a+{+6}He) (2001Cu06).
Is there a way to get those values using awk?
Is there another language for these kinds of jobs?
I used
awk '/   L/ { print $3 } ' file
for the first file type (i.e. three spaces before L) and it works. I used
awk '/  L/ { print $3 } ' file
for the second file type (i.e. two spaces before L) and it gives weird results: it also prints values after the sequence (two spaces)G, and I cannot understand why. The only way I can make it work is to use
awk '/  L / { print $3 } ' file
(i.e. one extra space after L). Why is this happening for the second file type? Is there a way to use one command for both file types?
Upvotes: 1
Views: 184
Reputation: 45293
Using awk
awk '/[0-9]+[A-Z] {3}L / { print $3 } ' file
or
awk '$1~/^[0-9]+[A-Z]+$/&&$2=="L"{print $3}' file
Using grep
grep -iPo '\d+[A-Z] {3}L \K[\d.]*' file
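Since the number of spaces before L depends on the length of the element symbol, a test on awk's whitespace-separated fields may be easier to reuse than a fixed-width pattern. An unanchored pattern such as two spaces followed by L can match anywhere on a line (for example in front of a later field that starts with L), which is presumably why G lines showed up in the question. Below is a minimal sketch of the field-based idea; the function name `extract_levels` and the sample lines are only illustrative, and it assumes the nuclide ID is the first field and consists of digits followed by letters only.

```shell
# Hypothetical helper: print the third field (the level energy) of every "L" record.
# Assumption: the first field is the nuclide ID, i.e. digits followed only by letters.
# Continuation lines ("10BE2 L", "16O 2 L") and comment lines ("10BE cL") fail
# one of the two tests and are skipped.
extract_levels() {
    awk '$1 ~ /^[0-9]+[A-Za-z]+$/ && $2 == "L" { print $3 }' "$@"
}

# Small illustrative sample using a two-letter element.
printf '%s\n' \
    '10BE  L 0.0 0+ 1.51E+6 Y 4' \
    '10BE2 L ISPIN=1 $ %B-=100' \
    '10BE cL T from weighted average' \
    '10BE  G 5955.9 5 10      LTE2' \
    '10BE  L 3368.03 3 2+ 125 FS 12' |
    extract_levels
# prints:
# 0.0
# 3368.03
```

Because awk's default field splitting ignores the amount of whitespace, the same call (`extract_levels file`) should work for both the one-letter and the two-letter files.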
Upvotes: 1
Reputation: 114
Are you looking for the values on the lines containing "16O L"? If that's the case, this should do the job:
awk '/16O L/ { print $3 } ' filename
Upvotes: 1
Reputation: 195209
When I saw this question, I thought it would be an easy grep one-liner. I was wrong! I tested my grep line at least 10 times and it didn't work. Finally I found out why. "sh*t!"
The data in your example:
16O ....
I was thinking it was:
160 ....
See the difference (the letter O versus the digit 0)? :(
OK, here is the line:
grep -Po '^16O {3}L \K[\d.]*' file
it outputs:
0.0
6049.4
6129.89
6917.1
7116.85
8871.9
9585
9844.5
10356
10957
11080
11096.7
11260
11520
11600
12049
12440
12530
....
If you want it in your "general" form (two-digit number, one-letter element):
grep -Po '^\d\d[A-Z] {3}L \K[\d.]*' file
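To cover both spacing variants from the question in one expression, the pattern can allow one or two element letters and two or three spaces. The sketch below uses `sed -E` instead of `grep -P` (a swapped-in tool, chosen only because extended regular expressions in sed are widely available); the function name `levels_sed` is hypothetical. It captures the whole token after `L`, so a value written like 13.05E3 is kept intact.

```shell
# Hypothetical helper: print the token that follows "(Number)(Element)(spaces)L ".
# Assumptions: uppercase element symbols of 1 or 2 letters, then 2 or 3 spaces,
# then "L" and at least one space. Optional leading spaces are tolerated.
levels_sed() {
    sed -n -E 's/^ *[0-9]+[A-Z]{1,2} {2,3}L +([^ ]+).*/\1/p' "$@"
}

# Illustrative sample mixing both file types.
printf '%s\n' \
    '16O   L 6129.89 4 3- 18.4 PS 5' \
    '16O 2 L ISPIN=0' \
    '10BE  L 13.05E3 10 290 KEV 130 A' \
    '10BE2 L %A GT 0' \
    '10BE cL E from comment text' |
    levels_sed
# prints:
# 6129.89
# 13.05E3
```

The pattern is slightly more permissive than the description in the question, since it does not tie the number of spaces to the number of letters, which should not matter as long as the files follow the fixed layout.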
Upvotes: 0