Johnathan

Reputation: 877

awk regexp match on only one tab

I have a simple input file for awk, called tabmatch.input, with the following content:

        : (test1
            : (test2

The first line has one tab, then the ":", and the second line has two tabs, then the ":". The words "test1" and "test2" could be any word in the real file I am trying to parse.

I am trying to create a regexp that matches the first line, but not the second. For example, I tried this:

user@lab-client:~$ cat tabmatch.input |awk '/\t: \(test/ {  {print $2} }'
(test1
(test2

Even though I specify only one \t followed by ":", it still matches the line with two \t before the ":". If I instead match on two \t, it only matches the second line, which has two \t.

user@lab-client:~$ cat tabmatch.input |awk '/\t\t: \(test/ {  {print $2} }'
(test2

I have looked around but have not found why \t matches several tabs, or how to make it match only one.

Other attempts I have made are:

user@lab-client:~$ cat tabmatch.input |awk '/[\t]: \(test/ {  {print $2} }'
(test1
(test2

user@lab-client:~$ cat tabmatch.input |awk '/[\t]?: \(test/ {  {print $2} }'
(test1
(test2

Upvotes: 1

Views: 1944

Answers (1)

Tom Fenech

Reputation: 74685

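Both of your lines match the pattern. The regex is not anchored, so it can match anywhere in the line: on the second line, the second tab followed by ": (test" satisfies it, and the first tab is simply skipped over.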
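To see where the unanchored pattern actually matches, here is a small sketch using awk's match() function, which sets RSTART to the position where the match begins. The printf line is only my assumption about how to recreate the sample file with real tab characters.

```shell
# Recreate the sample input: one leading tab on line 1, two on line 2.
printf '\t: (test1\n\t\t: (test2\n' > tabmatch.input

# match() returns non-zero on success and sets RSTART to the
# 1-based character position where the match starts.
starts=$(awk 'match($0, /\t: \(test/) { print NR ": match starts at character " RSTART }' tabmatch.input)
echo "$starts"
# 1: match starts at character 1
# 2: match starts at character 2
```

On the second line the match starts at character 2, that is, at the second tab, which is why the line is still selected.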

If you want to only match one tab from the start of the line, then you need to add an anchor ^:

awk '/^\t: \(test/ { print $2 }' tabmatch.input

I removed the inner curly braces as they weren't doing anything useful.

Bear in mind that awk can read files all by itself so you don't need to pipe data to it using cat.
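As a quick check, the anchored pattern can be run against a recreated copy of the input. The second command is not part of the original problem: it is a sketch for the hypothetical case where the single tab may appear somewhere other than the start of the line, and it requires the tab to be preceded by either the line start or a non-tab character.

```shell
# Recreate the sample input with real tabs (one on line 1, two on line 2).
printf '\t: (test1\n\t\t: (test2\n' > tabmatch.input

# Anchored at the start of the line: exactly one tab before the ":".
anchored=$(awk '/^\t: \(test/ { print $2 }' tabmatch.input)
echo "$anchored"    # (test1

# Assumed variant: the tab must follow the line start or a non-tab character.
single=$(awk '/(^|[^\t])\t: \(test/ { print $2 }' tabmatch.input)
echo "$single"      # (test1
```

Both commands read tabmatch.input directly, without cat.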

Upvotes: 2
