Reputation: 99408
I want to extract the number following Pages: in text such as:
Tagged: no
Form: none
Pages: 3
Encrypted: no
The following awk command doesn't work well, because its output has leading whitespace before the 3:
$ awk -F': ' '$1=="Pages" {print $2}'
3
while
awk -F'[: ]' '$1=="Pages" {print $2}'
produces nothing, even though I think I specified two possible characters as the field separator.
So how can I use awk to extract the number after Pages: without the preceding whitespace? Thanks.
Upvotes: 2
Views: 67
Reputation: 37394
Why bother with -F? Just:
$ awk '/^Pages/{print $2}' foo
3
EDIT: Oh, @BenjaminW. already suggested this in the comments. Props++.
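To see why no -F is needed, here is a minimal sketch. The sample input is an assumption: it uses several spaces after each colon, which is what the question describes. With the default field separator, awk splits on runs of blanks, so $1 is "Pages:" and $2 is the bare number.

```shell
# Assumed sample input: several spaces after each colon.
input='Tagged:         no
Form:           none
Pages:          3
Encrypted:      no'

# Default FS splits on runs of blanks, so leading spaces never end up in $2.
pages=$(printf '%s\n' "$input" | awk '/^Pages/{print $2}')
echo "$pages"
```

If other keys could also start with "Pages", comparing $1 == "Pages:" instead of matching /^Pages/ would be stricter.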
Upvotes: 1
Reputation: 67467
-F'[: ]+' is not right. Although it works in this case, it wouldn't if there were empty fields. The right delimiter to use is ': +'. See the examples below:
$ echo "a: : b" | awk -F'[: ]+' '{print NF}'
2
$ echo "a: : b" | awk -F': +' '{print NF}'
3
This should solve your problem:
$ awk -F': +' '/^Pages/{print $2}' file
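The difference also shows up when a key or value itself contains spaces. The line below is a hypothetical example (it is not in the question's input), used only to sketch how the two separators behave.

```shell
# Hypothetical line whose key and value both contain spaces.
line='Page size:      612 x 792 pts'

# ': +' splits only at the colon followed by spaces, so the value stays whole.
whole=$(printf '%s\n' "$line" | awk -F': +' '{print $2}')

# '[: ]+' also splits on every inner space, so $2 is the second word of the key.
split=$(printf '%s\n' "$line" | awk -F'[: ]+' '{print $2}')

echo "$whole"
echo "$split"
```

Here the first command yields the full value, while the second yields only "size".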
Upvotes: 2
Reputation: 4112
You can try this:
awk -F': ' '$1=="Pages" {gsub(/ /, "", $2); print $2} '
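Note that gsub(/ /, "", $2) deletes every space in the field, which is fine for a number. If the value might contain inner spaces, a variant that strips only the leading ones could look like this (a sketch, assuming the input line has several spaces after the colon):

```shell
# Assumed input line with several spaces after the colon.
line='Pages:          3'

# sub() with an anchored pattern removes only the leading spaces of $2.
pages=$(printf '%s\n' "$line" | awk -F': ' '$1=="Pages" {sub(/^ +/, "", $2); print $2}')
echo "$pages"
```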
Upvotes: 1
Reputation: 32066
Looks like you need to tell awk that the separator can be more than one character:
awk -F'[: ]+' '$1=="Pages" {print $2}'
Note the + in the regex.
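A quick side-by-side sketch of what the + changes, assuming an input line with several spaces after the colon. Without it, each single colon or space is its own separator, so $2 is the empty string between the colon and the first space.

```shell
# Assumed input line with several spaces after the colon.
line='Pages:          3'

# '[: ]' treats every single ':' or ' ' as a separator, so $2 is empty.
without_plus=$(printf '%s\n' "$line" | awk -F'[: ]' '$1=="Pages" {print $2}')

# '[: ]+' treats the whole run ':          ' as one separator, so $2 is the number.
with_plus=$(printf '%s\n' "$line" | awk -F'[: ]+' '$1=="Pages" {print $2}')

printf '[%s] [%s]\n' "$without_plus" "$with_plus"
```

This prints `[] [3]`, which matches the "produces nothing" behaviour described in the question.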
Upvotes: 3