Jae Nulton

Reputation: 391

How to AWK print only specific item?

I have a log file that looks like this:

RPT_LINKS=1,T1999
RPT_NUMALINKS=1
RPT_ALINKS=1,1999TK,2135,2009,31462,29467,2560
RPT_TXKEYED=1
RPT_ETXKEYED=0

I have used grep to isolate the line I am interested in, the one containing RPT_ALINKS. From that line, I want to use AWK to print only the link that ends with TK.

I am really close with this:

grep -w 'RPT_ALINKS' stats2.log | awk -F 'TK' '{print  FS }'

But, as I am sure those smarter than me already know, I am getting only the TK back. How do I get the entire field, so that the command returns 1999TK?

Upvotes: 0

Views: 575

Answers (6)

Ed Morton

Reputation: 203522

With a sed that has -E for EREs, e.g. GNU or OSX/BSD sed:

$ sed -En 's/^RPT_ALINKS=(.*,)?([^,]*TK)(,.*|$)/\2/p' file
1999TK

With GNU awk for the 3rd arg to match():

$ awk 'match($0",",/^RPT_ALINKS=(.*,)?([^,]*TK),.*/,a){print a[2]}' file
1999TK

Upvotes: 1

Jotne

Reputation: 41456

Instead of looping through the fields, you can use another approach.
This should be fast, since a loop takes time.

awk -F"TK" '/RPT_ALINKS/ {b=split($1,a,",");print a[b]FS}' stats2.log
1999TK

Here the field separator is set to TK, and awk searches for the line that contains RPT_ALINKS.
That gives $1=RPT_ALINKS=1,1999 and $2=,2135,2009,31462,29467,2560
Our value is always whatever follows the last comma in $1.
So split $1 on commas with the split function; b then holds the number of pieces.
Since we know the value is in the last piece, we print a[b] and append FS, which contains TK.
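
To make the intermediate values in that explanation visible, here is a small sketch that prints them for the sample line from the question. The line is inlined with printf instead of being read from stats2.log, and the variable names line and out are only for illustration.

```shell
# Sample line from the question, inlined so the snippet is self-contained.
line='RPT_ALINKS=1,1999TK,2135,2009,31462,29467,2560'

# Show what awk sees once the field separator is the literal string TK.
out=$(printf '%s\n' "$line" | awk -F'TK' '/RPT_ALINKS/ {
    b = split($1, a, ",")      # split the text before TK on commas
    print "$1 = " $1           # RPT_ALINKS=1,1999
    print "b = " b             # 2
    print "a[b] = " a[b]       # 1999
    print a[b] FS              # last piece plus the separator: 1999TK
}')
printf '%s\n' "$out"
```

One caveat: on an RPT_ALINKS line with no TK item at all, $1 is the whole line, so this approach would print the last link with TK appended. It fits best when a TK link is known to be present.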

Upvotes: 0

Hemang

Reputation: 1671

Here is a simple solution

awk -F ',|=' '/^RPT_ALINKS/ { for (i=1; i<=NF; i++) if ($i ~ /TK$/) print $i }' stats2.log

It looks only at records that begin with RPT_ALINKS and checks every field there. If a field ends with TK, it prints that field.
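
Because every field is checked, this loop should also cope with more than one TK link on the same line. A quick sketch, using a made-up line (not from the original log) that contains two TK items:

```shell
# Hypothetical line with two TK links, for illustration only.
line='RPT_ALINKS=2,1999TK,2135TK,2009'

# Same awk program as above, fed from printf instead of stats2.log.
out=$(printf '%s\n' "$line" |
    awk -F ',|=' '/^RPT_ALINKS/ { for (i=1; i<=NF; i++) if ($i ~ /TK$/) print $i }')
printf '%s\n' "$out"
```

Each matching field is printed on its own line, here 1999TK followed by 2135TK.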

Upvotes: 2

ghoti

Reputation: 46846

Dang, I was just about to post the double-grep alternative, but got scooped. And all the good awk solutions are taken as well.

Sigh. So here we go in bash, for fun.

$ mapfile a < stats2.log
$ for i in "${a[@]}"; do [[ $i =~ ^RPT_ALINKS=(.+,)*([^,]+TK) ]] && echo "${BASH_REMATCH[2]}"; done
1999TK

This has the disadvantage of running way slower than awk and not using fields. Oh, and it won't handle multiple *TK items on a single line. And like sed, this is processing lines as patterns rather than fields, which saps elegance. And by using mapfile, we limit the size of input you can handle because your whole log is loaded into memory. Of course you don't really need to do that, but if you were going to use a pipe, you'd use a different tool anyway. :-)
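
For what it's worth, a streaming variant that avoids two of those drawbacks (loading the whole log, and missing a second *TK item) might look like the sketch below. It is not the code above: it swaps the regex for plain parameter expansion, uses only POSIX shell features so it also runs in bash, and the function name print_tk_links is made up for illustration.

```shell
# Hypothetical helper: reads log lines on stdin, prints every link ending in TK.
# Note: POSIX sh has no "local", so line, rest and item are ordinary variables.
print_tk_links() {
    while IFS= read -r line; do
        # Skip everything except the RPT_ALINKS line.
        case $line in
            RPT_ALINKS=*) ;;
            *) continue ;;
        esac
        rest=${line#*=}                 # drop "RPT_ALINKS="
        # Walk the comma-separated items one at a time.
        while [ -n "$rest" ]; do
            item=${rest%%,*}            # text up to the first comma
            case $item in
                *TK) printf '%s\n' "$item" ;;
            esac
            case $rest in
                *,*) rest=${rest#*,} ;; # move past the first comma
                *)   rest= ;;           # that was the last item
            esac
        done
    done
}

# Demo with inlined data; the second line is a made-up example with two TK links.
printf '%s\n' 'RPT_LINKS=1,T1999' 'RPT_ALINKS=2,1999TK,2135,2009TK' | print_tk_links
```

With the real log it would be called as print_tk_links < stats2.log. It will still be slower than awk on a large file.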

Happy Thursday.

Upvotes: 1

kvantour

Reputation: 26481

If there is only a single TK in that line and TK is always at the end of its field:

awk '/RPT_ALINKS/{match($0,/[^=,]*TK/); print substr($0,RSTART,RLENGTH)}' stats2.log

You can also use a double grep

grep -w 'RPT_ALINKS' stats2.log | grep -wo '[^=,]*TK'

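The following sed solution also works nicely (the \? is a GNU sed extension; -n and the p flag keep the other lines from being printed):

sed -n '/RPT_ALINKS/s/\(^.*[,=]\)\([^=,]*TK\)\(,.*\)\?/\2/p' stats2.log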

Upvotes: 3

Cyrus

Reputation: 88636

It doesn't get any more elegant

awk -F '=' '$1=="RPT_ALINKS" {n=split($2,array,",")
            for(i=1; i<=n; i++)
              if (array[i] ~ /TK$/)
                {print array[i]}}
           ' stats2.log

n=split($2,array,","): splits 1,1999TK,2135,2009,31462,29467,2560 on commas into the array named array. n holds the number of array elements, here 7.
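
To check that description, the sketch below prints n and each array element for the sample line from the question. The line is inlined with printf so the snippet does not depend on stats2.log; the variable names line and out are only for illustration.

```shell
# Sample line from the question, inlined so the snippet is self-contained.
line='RPT_ALINKS=1,1999TK,2135,2009,31462,29467,2560'

out=$(printf '%s\n' "$line" | awk -F '=' '$1 == "RPT_ALINKS" {
    n = split($2, array, ",")      # $2 is everything after the "="
    print "n = " n                 # number of elements
    for (i = 1; i <= n; i++)
        print i ": " array[i]      # index and value, one per line
}')
printf '%s\n' "$out"
```

The first line of output is n = 7, and element 2 is 1999TK, the only one that matches /TK$/.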

Upvotes: 2
