ares

Reputation: 121

How to grep the first occurrence of a word after a pattern

I have the output of an analysis, and I would like to grep a keyword "X", which always appears, each time a phrase "Y" occurs. The keyword "X" appears many times, but I only want the first occurrence after each "Y".

For example, I would like to get the Folder name that follows each occurrence of Iter = 10, i.e. F1 and F4.

Iter   = 10
Folder = F1

Iter   = 5
Folder = F2

Iter   = 6
Folder = F3

Iter   = 10
Folder = F4

Any ideas?

Hexdump -c output of file (as requested by @Inian):

0000000   I   t   e   r               =       1   0  \n   F   o   l   d
0000010   e   r       =       F   1  \n  \n   I   t   e   r            
0000020   =       5  \n   F   o   l   d   e   r       =       F   2  \n
0000030  \n   I   t   e   r               =       6  \n   F   o   l   d
0000040   e   r       =       F   3  \n  \n   I   t   e   r            
0000050   =       1   0  \n   F   o   l   d   e   r       =       F   4
0000060  \n                                                            
0000061

Upvotes: 0

Views: 4416

Answers (3)

Inian

Reputation: 85530

You could use awk for this. awk applies /pattern/{action} rules to each line of the input file. Here we first match the line Iter = 10 and set a flag; on the next line that starts with Folder, we print the last whitespace-delimited field, which awk exposes as $NF, and reset the flag so that later Folder lines are ignored until the next match.

awk '/\<Iter   = 10\>/{flag=1; next} flag && /^Folder/{print $NF; flag=0;}' file

or, if your awk does not support the \< and \> word-boundary operators (they are a GNU awk extension), try

awk '/Iter   = 10/{flag=1; next} flag && /^Folder/{print $NF; flag=0;}' file
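
A minimal sketch of the same flag technique, comparing fields instead of matching a literal string, so the amount of whitespace does not matter and Iter = 100 is not mistaken for Iter = 10. The helper name folders_after_iter and the temporary sample file are assumptions made for illustration; they are not part of the answer above.

```shell
# Hypothetical sample file reproducing the data from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Iter   = 10
Folder = F1

Iter   = 5
Folder = F2

Iter   = 6
Folder = F3

Iter   = 10
Folder = F4
EOF

# folders_after_iter is a hypothetical helper: $1 = iteration number, $2 = file.
folders_after_iter() {
    awk -v n="$1" '
        # Set the flag on an Iter line whose last field equals n.
        $1 == "Iter" && $NF == n { flag = 1; next }
        # Print the value of the first Folder line after it, then clear the flag.
        flag && $1 == "Folder"   { print $NF; flag = 0 }
    ' "$2"
}

folders_after_iter 10 "$sample"
```

On the question's data this should print F1 and F4, one per line.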

Upvotes: 3

James Brown

Reputation: 37394

You could also use grep:

$ grep -A 1 "Iter.*10" file | grep Folder | grep -o "[^ ]*$"
F1
F4

Explained:

  • grep -A 1 "Iter.*10" file searches for the desired pattern and prints one line of trailing context (-A 1)
  • grep Folder then keeps only the lines containing the keyword Folder
  • grep -o "[^ ]*$" prints the last space-separated part of each remaining line

If there is noise between the Iter and Folder lines, you could remove it by running grep "\(Iter.*10\|Folder\)" file first.
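
As a rough sketch, the noise-filtering step and the pipeline above can be chained as follows. The helper name folders_via_grep and the noisy sample file are hypothetical, and grep -E with | is used here in place of the \| form purely as an assumption about portability.

```shell
# Hypothetical noisy input: an unrelated line sits between Iter and Folder.
noisy=$(mktemp)
cat > "$noisy" <<'EOF'
Iter   = 10
Foo    = bar
Folder = F1

Iter   = 5
Foo    = bar
Folder = F2

Iter   = 10
Foo    = bar
Folder = F4
EOF

# folders_via_grep is a hypothetical helper: $1 = file.
folders_via_grep() {
    grep -E 'Iter.*10|Folder' "$1" |   # drop everything except Iter 10 and Folder lines
        grep -A 1 'Iter.*10' |         # each Iter 10 line plus the line after it
        grep Folder |                  # keep the Folder lines, dropping the "--" separators
        grep -o '[^ ]*$'               # last space-separated token on each line
}

folders_via_grep "$noisy"
```

Note that Iter.*10 also matches values such as 100 or 210; anchoring the pattern (for example 'Iter *= *10$') would be stricter.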

The above expects the Iter line to appear before the Folder line. If that is not the case, awk is the cure. For example, with this data (line order varies, and there is noise):

Folder = F1
Foo    = bar
Iter   = 10

Iter   = 5
Foo    = bar
Folder = F2

$ awk -v RS="" -F"\n" '                        # record separated by empty line
/Iter/ && / 10$/ {                             # look for record with Iter 10
    for(i=1;i<=NF;i++)                         # iterate all fields (lines within record)
        if(split($i,a," *") && a[1]=="Folder") # split Folder line to components
            print a[3]                         # output value
}
' file
F1

Upvotes: 2

tryingToLearn

Reputation: 11659

grep is simply a regex search.

For more complex operations, you can use awk.

E.g.

awk '/Iter *= *10/ { getline; print $0 }' /path/to/file

where /path/to/file is the file containing your text to be searched
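
A hedged variant of the getline idea, assuming the Folder line immediately follows the Iter line: it checks that getline actually read a line (it returns 0 at end of file) and prints only the last field rather than the whole line. The helper name next_line_value and the sample file are hypothetical.

```shell
# Hypothetical sample file based on the question's data.
data=$(mktemp)
printf 'Iter   = 10\nFolder = F1\n\nIter   = 5\nFolder = F2\n\nIter   = 10\nFolder = F4\n' > "$data"

# next_line_value is a hypothetical helper: $1 = file.
next_line_value() {
    awk '/^Iter *= *10$/ {
        if ((getline line) > 0) {          # only proceed if a next line exists
            n = split(line, parts, " ")    # split that line on runs of whitespace
            print parts[n]                 # last field, e.g. F1
        }
    }' "$1"
}

next_line_value "$data"
```

Unlike the flag-based approach, this only looks at the single line after the match, so any noise between the Iter and Folder lines would break it.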

EDIT: Just after posting my answer I read Inian's answer and it is more elaborate and accurate.

Upvotes: 0
