How can I use a lookbehind in a regexp inside the awk/gawk action parameter?

Question

I'm attempting a pretty specific command here. I have a function that outputs quite a bit of text, like:

Country:USA, GDP:984843, id:12345
Country:Spain, GDP:29292, id:23456
Country:Italy, GDP:929393, id:34567

That function is called countries.

So my command is:

countries | gawk -v RS='' '/Spain/ {match($0, (/(?<=id:)[0-9]+/), a); print a[0]; exit;}'

So countries gives me the long list of text.

I then use gawk to select the line with Spain.

Once gawk finds the line, the action is match(...); print a[0]; exit;. Only one line contains 'Spain', and that line should be $0. The regexp should then do a positive lookbehind for the substring id:, match the digits that follow it, and store them in the array a.

Then I want to print those digits, but the search fails: it consistently prints nothing. I know most of the command works, so I think the problem is the lookbehind. If I remove the lookbehind and just search for the first run of digits, it successfully returns the first set of numbers.

David Ehrmann · Accepted Answer

If you're just looking to print the ids, wouldn't it be easier to do this?

countries | grep 'Country:Spain' | grep -o '[, ]*id:[0-9]*' | cut -d ':' -f 2

If your fields always appear in the same order, it's even easier. With the sample above, id is the third comma-separated field:

countries | grep 'Country:Spain' | cut -d ',' -f 3 | cut -d ':' -f 2
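On the lookbehind itself: gawk's regular expressions are POSIX extended regexes, which have no lookaround syntax, so (?<=id:) cannot work there. Also note that RS='' switches gawk to paragraph mode, so with no blank lines in the input the whole output becomes a single record rather than one record per line. The following is only a sketch of three possible workarounds, not a definitive fix. The countries function here is a stand-in that prints the sample data from the question, and the spain_id_* function names are made up for illustration. The grep variant assumes GNU grep built with PCRE support (-P).

```shell
# Stand-in for the asker's `countries` function (sample data from the question).
countries() {
  printf '%s\n' \
    'Country:USA, GDP:984843, id:12345' \
    'Country:Spain, GDP:29292, id:23456' \
    'Country:Italy, GDP:929393, id:34567'
}

# gawk only: the third argument to match() receives the capture groups,
# so a[1] holds the digits that follow "id:". No lookbehind is needed.
spain_id_gawk() {
  countries | gawk '/Country:Spain/ { if (match($0, /id:([0-9]+)/, a)) print a[1]; exit }'
}

# Any POSIX awk: match() sets RSTART and RLENGTH for the whole match,
# so skip the 3 characters of "id:" with substr().
spain_id_posix() {
  countries | awk '/Country:Spain/ { if (match($0, /id:[0-9]+/)) print substr($0, RSTART + 3, RLENGTH - 3); exit }'
}

# If a real lookbehind is wanted: GNU grep with -P (PCRE) supports it.
spain_id_grep_pcre() {
  countries | grep 'Country:Spain' | grep -oP '(?<=id:)[0-9]+'
}
```

Calling spain_id_posix (or spain_id_gawk, where gawk is installed) prints 23456 for the sample data. The default record separator is kept so that each line is its own record and /Country:Spain/ selects only the Spain line.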
