DJDMorrison

Reputation: 1327

Do Perl and awk regexes differ from standard regex?

int abc0/0.1
int abc0/1
int abc0/1.2

I'm using regexr to create a regex rule that will match a line if there is a '.' near the end of the line. I have the following rule:

int [A-Za-z]*[0-9/]*\.[0-9]*

which works perfectly in regexr but doesn't work when I use it with awk. Are there differences I need to know about?

This is the line I am using. It's worked fine on previous, simpler matches, just not this one.

`awk -v RS=! -v ORS= '/int [A-Za-z]*[0-9/]*\.[0-9]*/{print FILENAME}' file`;

Thank you

Upvotes: 0

Views: 140

Answers (3)

Ed Morton

Reputation: 203209

There is no such thing as a standard regexp. There's only regexp for tool X, where X is your tool of choice. There are some general guidelines for regexps, but every tool has caveats and its own rules for which flavors of regexp it uses and how to specify them.

For example, / is a RE that matches a forward slash but try using / in a regexp context in awk or sed:

sed '///' file
awk '///' file

and both will fail with syntax errors, since the / char is also the regexp delimiter in those tools and so a literal / needs to be escaped. With grep, on the other hand:

grep '/' file

it will work just fine. Every tool has its own caveats, and many tools have multiple ways of specifying the same regexp, none of them exactly the same as in other tools.
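As a small sketch of the delimiter point, the following pipes two made-up lines through each tool, escaping the slash for awk and sed and leaving it bare for grep. The sample input and the variable names are assumptions for illustration only.

```shell
# Hypothetical sample input: one line containing a slash, one without.
input=$(printf 'a/b\nab\n')

# awk and sed use / as the regexp delimiter here, so the literal / is escaped.
awk_out=$(printf '%s\n' "$input" | awk '/\//')
sed_out=$(printf '%s\n' "$input" | sed -n '/\//p')

# grep takes the regexp as a plain argument, so no escaping is needed.
grep_out=$(printf '%s\n' "$input" | grep '/')

# Each tool should select only the line that contains a slash.
printf 'awk:  %s\nsed:  %s\ngrep: %s\n' "$awk_out" "$sed_out" "$grep_out"
```

All three select the same line; only the way the slash has to be written differs.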

This might be a more robust version of what you're looking for:

$ awk '/int [[:alpha:]]*[[:digit:]/]*\.[[:digit:]]/' file
int abc0/0.1
int abc0/1.2

but the RE you posted should have worked just fine:

$ awk '/int [A-Za-z]*[0-9/]*\.[0-9]*/' file
int abc0/0.1
int abc0/1.2
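The two REs above differ in one detail: the first requires at least one digit after the dot, while the second allows none. A rough sketch of that difference follows, using the question's sample lines plus one hypothetical line ending in a bare dot. The patterns here are written with [0-9] and an escaped slash, which is my own choice in the hope of behaving the same across awk implementations, not something taken from the answer.

```shell
# The question's three sample lines plus one hypothetical line ending in a bare dot.
lines=$(printf 'int abc0/0.1\nint abc0/1\nint abc0/1.2\nint abc0/3.\n')

# Requires at least one digit after the dot.
strict=$(printf '%s\n' "$lines" | awk '/int [A-Za-z]*[0-9\/]*\.[0-9]/')

# Allows zero digits after the dot, so the bare-dot line matches as well.
loose=$(printf '%s\n' "$lines" | awk '/int [A-Za-z]*[0-9\/]*\.[0-9]*/')

printf 'strict:\n%s\nloose:\n%s\n' "$strict" "$loose"
```

With this input the strict pattern should select the 0.1 and 1.2 lines only, and the loose pattern should additionally select the line ending in "3.".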

Upvotes: 1

Jason Hu

Reputation: 6333

There is no single standard for regex. If you have to ask, the original regex had only three metacharacters: ., *, and ?. All other characters represent themselves. Regex syntax varies, but after Perl came out it gradually took the majority of the "market", and the engines that came after it generally try to be compatible with Perl's. That is why you will see the term "Perl-compatible regex syntax", but it's still not a standard.

Upvotes: 0

anubhava

Reputation: 784938

You need to escape / inside the regex:

awk -v RS=! -v ORS= '/int [A-Za-z]*[0-9\/]*\.[0-9]*/{print FILENAME}' file
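A minimal way to try that command, assuming a scratch file created with mktemp that holds the question's sample lines (the file and the variable names are hypothetical):

```shell
# Hypothetical scratch file holding the question's sample lines.
tmp=$(mktemp)
printf 'int abc0/0.1\nint abc0/1\nint abc0/1.2\n' > "$tmp"

# RS=! makes awk split records on "!" instead of newlines. The sample has
# no "!", so the whole file is a single record and FILENAME prints once.
# ORS= is empty, so no newline follows the printed name.
result=$(awk -v RS=! -v ORS= '/int [A-Za-z]*[0-9\/]*\.[0-9]*/{print FILENAME}' "$tmp")

# Report whether awk printed the name of the file it was given.
if [ "$result" = "$tmp" ]; then
    echo "matched: $result"
else
    echo "no match"
fi

rm -f "$tmp"
```

Because the regexp is tested against the whole record, it is enough for any one line in the file to contain a match.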

Upvotes: 1
