Andy

Reputation: 23

Selecting a field after a string using awk

I'm very new to awk, having just been introduced to it over the weekend, and I have a question that I'm hoping someone can help me with.

How would one select a field that follows a specific string?

And how would I extend that to select more than one field following a specific string?

As an example, any given line in my text file looks something like this:

2 of 10 19/4/2014 school name random text distance 800m more random text time 2:20:22 winner someonefast. 

Some attributes are very consistent, so I can easily extract those fields, for example 2, 10 and the date. However, there is often a lot of variable text before the next field that I wish to extract, hence the question: using awk, can I extract the field that follows a given string? For example, I'm interested in the fields following the /distance/ or /time/ strings, in combination with $1, $3, $4 and $5.

Your help will be greatly appreciated.

Andy

Upvotes: 1

Views: 872

Answers (3)

Ed Morton

Reputation: 204731

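When you have name = value situations like you do here, it's best to create an array that maps each name to the value that follows it and then just print the values for the names you're interested in. Clearing the array at the start of each line keeps a value from an earlier line from showing up on a line that lacks that name, e.g.:

$ awk '{delete v; for (i=1;i<NF;i++) v[$i]=$(i+1); print $1, $3, $4, $5, v["distance"], v["time"]}' file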
2 10 19/4/2014 school 800m 2:20:22
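
If the set of names to print varies, the same name-to-value array could be driven by a list passed in from the shell. This is only a sketch: the variable `names` and its comma-separated format are assumptions made for illustration, not part of the answer above.

```shell
line='2 of 10 19/4/2014 school name random text distance 800m more random text time 2:20:22 winner someonefast.'

# names is a hypothetical comma-separated list of labels to look up.
picked=$(printf '%s\n' "$line" | awk -v names="distance,time,winner" '
BEGIN { count = split(names, want, ",") }
{
    delete v                                   # forget the previous line
    for (i = 1; i < NF; i++) v[$i] = $(i + 1)  # map each word to its successor
    out = $1
    for (k = 1; k <= count; k++) out = out OFS v[want[k]]
    print out
}')
echo "$picked"
```

For the sample line this prints `2 800m 2:20:22 someonefast.`; a name that never appears on a line simply yields an empty value.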

Upvotes: 1

konsolebox

Reputation: 75628

Basic:

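awk '{
    # Reset per line so a value from an earlier line is never reused
    distance = time = ""
    # Fields 1-5 are fixed; scan the rest for the keywords
    for (i = 6; i < NF; ++i) {
        if ($i == "distance") distance = $(i + 1)
        if ($i == "time") time = $(i + 1)
    }
    print $1, $3, $4, $5, distance, time
}' file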

Output:

2 10 19/4/2014 school 800m 2:20:22

However, this prints only $5 for the school name. If the name spans several words, the words after $5 are lost, so you would need to add another condition to capture them.

A better solution is to separate the columns with a delimiter other than a space, such as a tab, and set FS to \t.
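
As a rough sketch of the tab idea, assume the data could be regenerated with a tab between each column. The tab-separated line below is hypothetical (the file in the question is space-separated), but it shows that the same loop works unchanged once FS is a tab, and that a multi-word school name then stays in $5:

```shell
# Hypothetical tab-separated version of the sample line (assumed layout).
line=$(printf '2\tof\t10\t19/4/2014\tschool name\trandom text\tdistance\t800m\tmore random text\ttime\t2:20:22\twinner\tsomeonefast.')

result=$(printf '%s\n' "$line" | awk -F '\t' '{
    distance = time = ""                  # reset per line
    for (i = 6; i < NF; ++i) {
        if ($i == "distance") distance = $(i + 1)
        if ($i == "time") time = $(i + 1)
    }
    print $1, $3, $4, $5, distance, time  # $5 is now the whole "school name"
}')
echo "$result"
```

This prints `2 10 19/4/2014 school name 800m 2:20:22`. The output is still joined with spaces because only FS was changed, not OFS.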

Upvotes: 0

jaypal singh

Reputation: 77185

Using awk you can select the field following a string. Here is an example:

echo '2 of 10 19/4/2014 school name random text distance 800m more random text time 2:20:22 winner someonefast.' |
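awk '{
    extract = ""    # start fresh on every line
    for(i=1; i<=NF; i++) {
        # Keep the fixed fields 1, 3, 4 and 5
        if ( i ~ /^[1345]$/ ) {
            extract = (extract ? extract FS $i : $i)
        }
        # Whole-field match, so words such as "sometimes" are ignored
        if ( $i ~ /^(distance|time)$/ ) {
            extract = (extract ? extract FS $(i+1): $(i+1))
        }
    }
    print extract
}'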
2 10 19/4/2014 school 800m 2:20:22

Here we let awk split each line on its default delimiter (runs of whitespace) and use a for loop to iterate over every field. NF holds the number of fields in the current line, so the loop runs from 1 through NF.

The first conditional block inspects the field number. If it is 1, 3, 4 or 5, the field's value is appended to a variable called extract, with the values separated by the field separator.

The second conditional block checks whether the value of the field is distance or time. If it is, we append to extract again, but this time we use $(i+1) instead of the current field: the value of the next field, in other words the field that follows the specific string.
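
To pick up more than one field after a keyword, as the question also asks, the same loop could copy the next n fields instead of only $(i+1). A minimal sketch, where `key` and `n` are illustrative variable names passed in with -v (they are assumptions, not part of the answer above):

```shell
line='2 of 10 19/4/2014 school name random text distance 800m more random text time 2:20:22 winner someonefast.'

# key and n are hypothetical parameters: the keyword to look for and how
# many fields after it to print.
after=$(printf '%s\n' "$line" | awk -v key="time" -v n=3 '{
    out = ""
    for (i = 1; i <= NF; i++) {
        if ($i == key) {
            # Collect up to n fields after the keyword, stopping at end of line.
            for (j = i + 1; j <= i + n && j <= NF; j++)
                out = (out == "" ? $j : out FS $j)
            break    # only the first occurrence of the keyword is used
        }
    }
    print out
}')
echo "$after"
```

With key="time" and n=3 the sample line gives `2:20:22 winner someonefast.`; if fewer than n fields remain, only the available ones are printed.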

Upvotes: 1
