Reputation: 1299
I have several lines of text. I want to extract the number after specific word using awk.
I tried the following code but it does not work.
At first, create the test file by: vi test.text
. There are 3 columns (the 3 fields are generated by some other pipeline commands using awk).
Index AllocTres CPUTotal
1 cpu=1,mem=256G 18
2 cpu=2,mem=1024M 16
3 4
4 cpu=12,gres/gpu=3 12
5 8
6 9
7 cpu=13,gres/gpu=4,gres/gpu:ret6000=2 20
8 mem=12G,gres/gpu=3,gres/gpu:1080ti=1 21
Please note there are several empty fields in this file.
what I want to achieve is to extract the number after the first gres/gpu=
in each line (if no gres/gpu=
occurs in this line, the default number is 0
) using a pipeline like: cat test.text | awk '{some_commands}'
to output 4 columns:
Index AllocTres CPUTotal GPUAllocated
1 cpu=1,mem=256G 18 0
2 cpu=2,mem=1024M 16 0
3 4 0
4 cpu=12,gres/gpu=3 12 3
5 8 0
6 9 0
7 cpu=13,gres/gpu=4,gres/gpu:ret6000=2 20 4
8 mem=12G,gres/gpu=3,gres/gpu:1080ti=1 21 3
Upvotes: 0
Views: 1677
Reputation: 3975
awk '
BEGIN{FS="\t"}
NR==1{
$(NF+1)="GPUAllocated"
}
NR>1{
$(NF+1)=FS 0
}
/gres\/gpu=/{
split($0, a, "=")
gp=a[3]; gsub(/[ ,].*/, "", gp)
$NF=FS gp
}1' test.text
Index AllocTres CPUTotal GPUAllocated
1 cpu=1,mem=256G 18 0
2 cpu=2,mem=1024M 16 0
3 4 0
4 cpu=12,gres/gpu=3 12 3
5 8 0
6 9 0
7 cpu=13,gres/gpu=4,gres/gpu:ret6000=2 20 4
8 mem=12G,gres/gpu=3,gres/gpu:1080ti=1 21 3
Upvotes: 0
Reputation: 133458
With your shown samples in GNU awk
you can try following code. Written and tested in GNU awk
. Simple explanation would be using awk
's match
function where using regex gres\/gpu=([0-9]+)
(escaping /
here) and creating one and only capturing group to capture all digits coming after =
. Once match is found printing current line followed by array's arr's 1st element +0
(to print zero in case no match found for any line) here.
awk '
FNR==1{
print $0,"GPUAllocated"
next
}
{
match($0,/gres\/gpu=([0-9]+)/,arr)
print $0,arr[1]+0
}
' Input_file
Upvotes: 1
Reputation: 11217
Using sed
$ sed '1s/$/\tGPUAllocated/;s~.*gres/gpu=\([0-9]\).*~& \t\1~;1!{\~gres/gpu=[0-9]~!s/$/ \t0/}' input_file
Index AllocTres CPUTotal GPUAllocated
1 cpu=1,mem=256G 18 0
2 cpu=2,mem=1024M 16 0
3 4 0
4 cpu=12,gres/gpu=3 12 3
5 8 0
6 9 0
7 cpu=13,gres/gpu=4,gres/gpu:ret6000=2 20 4
8 mem=12G,gres/gpu=3,gres/gpu:1080ti=1 21 3
Upvotes: 0
Reputation: 36390
Firstly: awk
do not need cat
, it could read files on its' own. Combining cat
and awk
is generally discouraged as useless use of cat.
For this task I would use GNU AWK
following way, let file.txt
content be
cpu=1,mem=256G
cpu=2,mem=1024M
cpu=12,gres/gpu=3
cpu=13,gres/gpu=4,gres/gpu:ret6000=2
mem=12G,gres/gpu=3,gres/gpu:1080ti=1
then
awk 'BEGIN{FS="gres/gpu="}{print $2+0}' file.txt
output
0
0
0
3
0
0
4
3
Explanation: I inform GNU AWK
that field separator (FS
) is gres/gpu=
then for each line I do print 2nd field increased by zero. For lines without gres/gpu=
$2
is empty string, when used in arithmetic context this is same as zero so zero plus zero gives zero. For lines with at least one gres/gpu=
increasing by zero provokes GNU AWK
to find longest prefix which is legal number, thus 3
(4th line) becomes 3
, 4,
(7th line) becomes 4
, 3,
(8th line) becomes 3
.
(tested in GNU Awk 5.0.1)
Upvotes: 2