Reputation: 5504

awk regex and space inside

I have this awk script:

BEGIN {
  FS = "][ \t\v]+"
}

# Note the space after the + at the end of the regex.
NF == 2 && $1 ~ /[:alpha:][:digit:]+ / {
  print $1, "<<<";
}

It doesn't match any line in a file like the following:

I1130 15:18:42.526808 17329 thrift_bridge.cpp:126] AAA
E1130 15:18:42.527042 16076 thrift_bridge.hpp:288] BBB

But if I remove the trailing space from the regex, both lines appear in the output. Why?

Upvotes: 1

Answers (1)

Reputation: 89639

It's because your character class syntax is wrong. It should be:

/[[:alpha:]][[:digit:]]+ /

Without the outer square brackets, [:alpha:] and [:digit:] aren't treated as predefined POSIX character classes but as ordinary bracket expressions that list individual characters.

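/[:alpha:][:digit:]+/ is the same as /[ahlp:][dgit:]+/, and it matches p: (in cpp: or hpp:) on each line. Once you add the trailing space, nothing matches, because that p: is followed by a digit rather than a space.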
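
To see this for yourself, here is a small sketch that runs the expanded bracket expressions against the first field of one sample line. It uses the explicit form /[ahlp:][dgit:]+/ so the behaviour does not depend on how a particular awk parses [:alpha:]; the variable names line, hit and miss are only illustrative.

```shell
# First field of the first sample line (everything before "] ").
line='I1130 15:18:42.526808 17329 thrift_bridge.cpp:126'

# Without the trailing space: print whatever substring the regex matched.
hit=$(printf '%s\n' "$line" |
  awk '{ if (match($0, /[ahlp:][dgit:]+/)) print substr($0, RSTART, RLENGTH) }')

# With the trailing space: print "none" when there is no match.
miss=$(printf '%s\n' "$line" |
  awk '{ if (match($0, /[ahlp:][dgit:]+ /)) print substr($0, RSTART, RLENGTH); else print "none" }')

echo "without space: $hit"   # without space: p:
echo "with space:    $miss"  # with space:    none
```

The only place where a character from the first set is followed by one from the second is the p: in cpp:126, and no space follows it.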

As @John1024 pointed out, mawk doesn't support POSIX character classes, so with mawk you must write:

/[a-zA-Z][0-9]+ /

or use gawk, which is available on Linux.
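
As a quick check, the script from the question can be run with the portable ranges against the two sample lines. This is a minimal sketch: the input is fed through printf instead of a file, and the variable name out is only illustrative.

```shell
out=$(printf '%s\n' \
  'I1130 15:18:42.526808 17329 thrift_bridge.cpp:126] AAA' \
  'E1130 15:18:42.527042 16076 thrift_bridge.hpp:288] BBB' |
awk '
BEGIN {
  # Same field separator as in the question: "]" followed by whitespace.
  FS = "][ \t\v]+"
}

# Portable ranges instead of POSIX classes; the trailing space is kept.
NF == 2 && $1 ~ /[a-zA-Z][0-9]+ / {
  print $1, "<<<"
}
')

echo "$out"
```

Both lines are printed with <<< appended, because I1130 and E1130 are each a letter followed by digits and then a space.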

Upvotes: 5