Tim

Reputation: 99408

Misusing field separators in awk

I want to extract the number following Pages: in text such as

Tagged:         no
Form:           none
Pages:          3
Encrypted:      no

The following awk command doesn't quite work, because the output keeps the whitespace before the 3:

$ awk -F': ' '$1=="Pages" {print $2}' 
         3

while

awk -F'[: ]' '$1=="Pages" {print $2}' 

produces nothing, even though I think I have specified two possible characters as field separators.

So how can I use awk to extract the number after Pages: without the leading whitespace? Thanks.

Upvotes: 2

Views: 67

Answers (4)

James Brown

Reputation: 37394

Why bother with -F? Just:

$ awk '/^Pages/{print $2}' foo
3

EDIT: Oh, @BenjaminW. already suggested this in the comments. Props++.
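For readers wondering why this works without -F: with the default field separator, awk splits on runs of blanks and ignores leading ones, so the colon stays attached to the first field and the number becomes $2. A minimal sketch, assuming the sample line from the question is fed on standard input, that matches the first field exactly instead of using a regex:

```shell
# Default FS: "Pages:" (colon included) is $1, and "3" is $2.
printf 'Pages:          3\n' | awk '$1 == "Pages:" {print $2}'
# prints 3
```

The exact comparison against "Pages:" avoids also matching other keys that merely start with Pages, which /^Pages/ would accept.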

Upvotes: 1

karakfa

Reputation: 67467

-F'[: ]+' is not right. Although it works in this case, it wouldn't if there were empty fields. The right delimiter to use is ': +'. See the examples below:

$ echo "a:  : b" | awk -F'[: ]+' '{print NF}'
2

$ echo "a:  : b" | awk -F': +' '{print NF}'
3

This should solve your problem:

$ awk -F': +' '/^Pages/{print $2}' file
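To make the empty-field difference visible, the sketch below prints every field in brackets for the same test string used above. The variable name show_fields is only a helper introduced for this illustration.

```shell
# Bracket each field so that empty ones are visible.
# show_fields is a hypothetical helper name holding the awk program.
show_fields='{ for (i = 1; i <= NF; i++) printf "[%s]", $i; print "" }'

# '[: ]+' swallows ":  : " as one separator, so the empty field disappears.
printf 'a:  : b\n' | awk -F'[: ]+' "$show_fields"   # prints [a][b]

# ': +' matches ":  " and ": " separately, so the empty field survives.
printf 'a:  : b\n' | awk -F': +' "$show_fields"     # prints [a][][b]
```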

Upvotes: 2

Mustafa DOGRU

Reputation: 4112

You can try this:

awk -F': ' '$1=="Pages" {gsub(/ /, "", $2); print $2} '
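One thing to keep in mind is that gsub(/ /, "", $2) removes every space in the field, not only the leading ones. That makes no difference for a bare number, but it would for a value containing spaces. A sketch of a variant that strips only the leading spaces with sub(); the Title line is a made-up example and is not part of the question's data:

```shell
# Question's case: leading spaces are stripped, the number is kept.
printf 'Pages:          3\n' |
  awk -F': ' '$1 == "Pages" {sub(/^ +/, "", $2); print $2}'
# prints 3

# Hypothetical value with an inner space: gsub removes it, sub keeps it.
printf 'Title:   My Report\n' |
  awk -F': ' '$1 == "Title" {gsub(/ /, "", $2); print $2}'
# prints MyReport
printf 'Title:   My Report\n' |
  awk -F': ' '$1 == "Title" {sub(/^ +/, "", $2); print $2}'
# prints My Report
```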

Upvotes: 1

Andy Ray

Reputation: 32066

Looks like you need to tell awk it's more than one character:

awk -F'[: ]+' '$1=="Pages" {print $2}'

Note the + in the regex.
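To see why the + matters, it helps to print the field count. Without it, every single ':' or ' ' is a separator of its own, so the run after Pages yields a series of empty fields and $2 is one of them. A small sketch, using a shortened sample line (colon plus three spaces) chosen here to keep the count easy to follow:

```shell
line='Pages:   3'   # illustrative line: colon followed by three spaces

# '[: ]': four one-character separators, so five fields and an empty $2.
printf '%s\n' "$line" | awk -F'[: ]' '{print NF, "[" $2 "]"}'
# prints 5 []

# '[: ]+': the whole run ":   " is one separator, so $2 is the number.
printf '%s\n' "$line" | awk -F'[: ]+' '{print NF, "[" $2 "]"}'
# prints 2 [3]
```

So the original command did print something: an empty $2, which shows up as a blank line.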

Upvotes: 3
