Goku

Reputation: 534

grep match next word after pattern until first space

I'm trying to get the word after "MODULE", where:

  1. There can be one or more spaces between MODULE and the "to-be-matched word".

  2. There is a single space between the "to-be-matched word" and the word that follows it.

  3. The to-be-matched word can be of any pattern.

    HAL_POINT ITERATION IMPLEMENTED VERSION MODULE  1.2.3/4 OLKI 9FEB17 3MAR2018 
    3.2.6    
    CHK_PONT VALUES IMPLEMENTED VERSION MODULE 350/4 OLKI 9FEB17 3APR2018 
    3.2.6
    HAL_POINT ITERATION JIO_PO POINT MODULE     RT/6T OLKI 9FEB17 3MAR2018 
    3
    

I tried

echo $variable | grep -oP '(?<=MODULE\s)\d.\d.\d\/\d'

and

echo $variable | grep -oP '(?<=MODULE\s\s)\d.\d.\d\/\d' 

for the first line, but I want something more elegant and generic.

The to-be-matched words are 1.2.3/4, 350/4, and RT/6T.

Upvotes: 1

Views: 5791

Answers (4)

RavinderSingh13

Reputation: 133518

Could you please try the following as well?

awk 'sub(/.*MODULE +/,"") && sub(/ +.*/,"")' Input_file

Explanation: The first sub removes everything from the start of the line through MODULE and the spaces that follow it. The second sub removes everything from the next space to the end of the line. The two calls are joined with &&, so the condition is true only when both substitutions succeed. Since no action is given, awk then prints the modified line by default.
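As a minimal sketch of how this behaves, the command can be run against the three MODULE lines from the question. The variable names `sample` and `result` are only illustrative and are not part of the original answer.

```shell
# Sample lines copied from the question ('sample' is a hypothetical variable name).
sample='HAL_POINT ITERATION IMPLEMENTED VERSION MODULE  1.2.3/4 OLKI 9FEB17 3MAR2018
CHK_PONT VALUES IMPLEMENTED VERSION MODULE 350/4 OLKI 9FEB17 3APR2018
HAL_POINT ITERATION JIO_PO POINT MODULE     RT/6T OLKI 9FEB17 3MAR2018'

# First sub strips everything through "MODULE" and its trailing spaces;
# second sub strips from the next space to the end of the line.
result=$(printf '%s\n' "$sample" | awk 'sub(/.*MODULE +/,"") && sub(/ +.*/,"")')
printf '%s\n' "$result"
```

Note that a line is printed only if both substitutions succeed, so a line where the wanted word is the very last thing on the line (with no space after it) would be skipped.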

Upvotes: 1

Inian

Reputation: 85590

You could use awk if you are sure the words are delimited by whitespace, since by default awk splits each input line into fields on whitespace. For your given input, all you need is

awk '{ for( i=1; i<=NF ;i++ ) if ( $i == "MODULE" ) { print $(i+1); break } }' 

The for loop runs up to NF, the number of whitespace-separated fields in the current line, so it visits every field until it finds MODULE.
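A small sketch of the loop applied to the question's sample data, assuming the lines are held in a shell variable (`sample` and `result` are illustrative names):

```shell
# Sample lines copied from the question ('sample' is a hypothetical variable name).
sample='HAL_POINT ITERATION IMPLEMENTED VERSION MODULE  1.2.3/4 OLKI 9FEB17 3MAR2018
CHK_PONT VALUES IMPLEMENTED VERSION MODULE 350/4 OLKI 9FEB17 3APR2018
HAL_POINT ITERATION JIO_PO POINT MODULE     RT/6T OLKI 9FEB17 3MAR2018'

# Walk the fields of each line; when a field equals MODULE, print the next one.
# Runs of spaces do not matter because awk collapses them when splitting fields.
result=$(printf '%s\n' "$sample" |
  awk '{ for (i = 1; i <= NF; i++) if ($i == "MODULE") { print $(i+1); break } }')
printf '%s\n' "$result"
```

Because the comparison is against the whole field, a word such as MODULES would not trigger a match here.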

If you still want to use grep, you could improve the regex as shown below. In PCRE, (\s+)? optionally matches a run of one or more whitespace characters, and \K discards everything matched so far, so only the part after the whitespace is printed.

grep -oP '(?<=MODULE)(\s+)?\K([^ ]*)'

See the regular expression working on your given input at regex101.

Upvotes: 4

P....

Reputation: 18371

You can use grep. Here \K requires the text on its left to match but leaves it out of the reported match, [^ ]+ matches one or more characters other than a space, and -o prints only the matched text.

grep -oP 'MODULE\s+\K[^ ]+'
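As a rough illustration, here is the pattern run over the question's sample lines. This assumes GNU grep built with PCRE support (the -P option); `sample` and `result` are illustrative names.

```shell
# Sample lines copied from the question ('sample' is a hypothetical variable name).
sample='HAL_POINT ITERATION IMPLEMENTED VERSION MODULE  1.2.3/4 OLKI 9FEB17 3MAR2018
CHK_PONT VALUES IMPLEMENTED VERSION MODULE 350/4 OLKI 9FEB17 3APR2018
HAL_POINT ITERATION JIO_PO POINT MODULE     RT/6T OLKI 9FEB17 3MAR2018'

# MODULE\s+ consumes the keyword and any following whitespace,
# \K drops that part from the match, and [^ ]+ captures the next word.
result=$(printf '%s\n' "$sample" | grep -oP 'MODULE\s+\K[^ ]+')
printf '%s\n' "$result"
```

The variable is passed through printf with quotes so that the original spacing and line breaks are preserved on the way into grep.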

Upvotes: 2

tink

Reputation: 15214

And another awk approach, without looping (gensub is a GNU awk extension, so this needs gawk).

Assuming your text were in a file called goku:

awk '/MODULE/{print gensub(/^.*MODULE +([^ ]+).*$/, "\\1","1")}' goku
1.2.3/4
350/4
RT/6T

Upvotes: 3
