Reputation: 122132
I have the input file:
$ cat bleu.out
BLEU = 16.67, 54.4/26.8/14.9/8.2 (BP=0.813, ratio=0.828, hyp_len=8982, ref_len=10844)
BLEU = 17.56, 55.1/27.6/15.8/9.4 (BP=0.804, ratio=0.821, hyp_len=8905, ref_len=10844)
BLEU = 17.95, 54.4/27.5/15.6/9.1 (BP=0.837, ratio=0.849, hyp_len=9206, ref_len=10844)
BLEU = 19.10, 54.8/28.1/16.3/9.7 (BP=0.860, ratio=0.869, hyp_len=9423, ref_len=10844)
BLEU = 19.29, 53.0/26.6/15.1/8.9 (BP=0.925, ratio=0.928, hyp_len=10058, ref_len=10844)
BLEU = 18.70, 55.7/28.7/16.4/9.4 (BP=0.839, ratio=0.851, hyp_len=9223, ref_len=10844)
BLEU = 18.63, 55.2/28.1/16.3/9.8 (BP=0.834, ratio=0.846, hyp_len=9178, ref_len=10844)
BLEU = 18.41, 54.2/27.4/15.5/9.2 (BP=0.857, ratio=0.867, hyp_len=9398, ref_len=10844)
BLEU = 18.70, 53.7/26.9/15.7/9.3 (BP=0.871, ratio=0.878, hyp_len=9526, ref_len=10844)
But when I need to extract a certain column, say the BLEU score that sits before the first comma, I have to chain multiple instances of cut, e.g.:
$ cat bleu.out | cut -f1 -d',' | cut -f3 -d ' '
16.67
17.56
17.95
19.10
19.29
18.70
18.63
18.41
18.70
Is there a way to apply multiple cut criteria in sequence within a single cut instance? E.g. something like cut-multi.sh -f1 -d',' -f3 -d' '?
If not, what other methods would perform the same operation as cut -f1 -d',' | cut -f3 -d' '? Solutions using awk, sed, or the like are also welcome.
Upvotes: 0
Views: 98
Reputation: 1517
awk -F'[ = ,]' '{print $4}' file
16.67
17.56
17.95
19.10
19.29
18.70
18.63
18.41
18.70
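To see why the score ends up in $4, it can help to print the fields one by one. The sketch below is only an illustration: show_fields is a hypothetical helper name, and the sample line is copied from the question's bleu.out. With the bracket expression as separator, every single space, = or comma ends a field, so the run " = " produces two empty fields before the number (the second space inside the brackets is redundant).

```shell
# Sample line copied from bleu.out in the question.
line='BLEU = 16.67, 54.4/26.8/14.9/8.2 (BP=0.813, ratio=0.828, hyp_len=8982, ref_len=10844)'

# show_fields is a hypothetical helper: print the first four fields,
# each with its index, using the same separator as the answer above.
show_fields() {
    awk -F'[ = ,]' '{ for (i = 1; i <= 4; i++) printf "%d:[%s]\n", i, $i }'
}

printf '%s\n' "$line" | show_fields
```

Fields 2 and 3 come out empty and field 4 holds 16.67, which is why the answer prints $4 rather than $2.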
Upvotes: 0
Reputation: 5298
Another solution with awk:
awk '{sub(/,$/, "", $3); print $3}' bleu.out
Remove the trailing comma from the third field and print it.
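The same idea can be made reusable for any whitespace-separated field. This is a minimal sketch, not part of the original answer: strip_field is a hypothetical helper name, and it works on a copy of the field so the input record is not modified.

```shell
# strip_field is a hypothetical helper: print whitespace-separated field N
# with one trailing comma removed. The field is copied into f first, so the
# record itself stays untouched.
strip_field() {
    awk -v n="$1" '{ f = $n; sub(/,$/, "", f); print f }'
}

# Sample line copied from bleu.out in the question.
printf '%s\n' 'BLEU = 16.67, 54.4/26.8/14.9/8.2 (BP=0.813, ratio=0.828, hyp_len=8982, ref_len=10844)' | strip_field 3
```

Run against the whole file as strip_field 3 < bleu.out, this should give the same column as the one-liner above.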
Upvotes: 0
Reputation: 18381
The following solution uses grep with Perl-compatible regular expressions (the -P option of GNU grep) and a lookahead. It prints the text between = and the first comma.
grep -oP '= \K.*?(?=,)' input
16.67
17.56
17.95
19.10
19.29
18.70
18.63
18.41
18.70
Or, as suggested by Sundeep:
grep -oP '= \K[^,]+' input
Upvotes: 3
Reputation: 52182
With sed:
$ sed 's/^[^=]*= \([^,]*\).*/\1/' bleu.out
16.67
17.56
17.95
19.10
19.29
18.70
18.63
18.41
18.70
This captures all characters that are not a comma, up to the first comma (\([^,]*\)), after the first occurrence of = and a space (^[^=]*= ), and substitutes the line with the capture group (\1).
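For readers who find the backslash-escaped group hard to read, the same substitution can be written with extended regular expressions. This is a sketch under the assumption that your sed supports -E (GNU and BSD sed both do); extract_bleu is a hypothetical helper name.

```shell
# extract_bleu is a hypothetical helper: the same substitution as above,
# written with -E so the capture group needs no backslashes.
extract_bleu() {
    sed -E 's/^[^=]*= ([^,]*).*/\1/'
}

# Sample line copied from bleu.out in the question.
printf '%s\n' 'BLEU = 16.67, 54.4/26.8/14.9/8.2 (BP=0.813, ratio=0.828, hyp_len=8982, ref_len=10844)' | extract_bleu
```

Only the first = matters here, because ^[^=]* cannot cross an equals sign; the later BP=, ratio= and similar pairs are consumed by the trailing .* and dropped.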
Upvotes: 2
Reputation: 23667
You can specify multiple field separators in awk:
$ awk -F'= *|,' '{print $2}' bleu.out
16.67
17.56
17.95
19.10
19.29
18.70
18.63
18.41
18.70
-F'= *|,' specifies either = followed by zero or more spaces, or a comma, as the field separator
{print $2} prints the second column
Upvotes: 4