Reputation: 534
I'm trying to get the word after "MODULE", where:
there can be one or more spaces between MODULE and the "to-be-matched word";
there is a single space between the "to-be-matched word" and its next word;
the to-be-matched word can be of any pattern.
HAL_POINT ITERATION IMPLEMENTED VERSION MODULE 1.2.3/4 OLKI 9FEB17 3MAR2018
3.2.6
CHK_PONT VALUES IMPLEMENTED VERSION MODULE 350/4 OLKI 9FEB17 3APR2018
3.2.6
HAL_POINT ITERATION JIO_PO POINT MODULE RT/6T OLKI 9FEB17 3MAR2018
3
I tried
echo $variable | grep -oP '(?<=MODULE\s)\d.\d.\d\/\d'
and
echo $variable | grep -oP '(?<=MODULE\s\s)\d.\d.\d\/\d'
for the 1st line, but I want it to be more elegant and generic.
The to-be-matched words are 1.2.3/4, 350/4 or RT/6T.
Upvotes: 1
Views: 5791
Reputation: 133518
Could you please try the following as well?
awk 'sub(/.*MODULE +/,"") && sub(/ +.*/,"")' Input_file
Explanation: The first sub deletes everything from the start of the line up to and including MODULE and the spaces that follow it. The second sub deletes everything from the first remaining space to the end of the line. Because the two calls are joined by && and no action is given, awk prints the line by default only when both substitutions succeed.
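As a quick check of the explanation above, here is a minimal sketch that wraps the same awk command in a shell function and feeds it a few lines modelled on the question. The function name word_after_module is hypothetical, and the extra spaces after MODULE in the second line are added only to exercise the "one or more spaces" case.

```shell
# Hypothetical helper: print the word that follows MODULE on each input line.
# Both sub() calls must succeed for the line to be printed.
word_after_module() {
    awk 'sub(/.*MODULE +/, "") && sub(/ +.*/, "")'
}

# Two lines containing MODULE and one without it; the last one is skipped.
printf '%s\n' \
    'HAL_POINT ITERATION IMPLEMENTED VERSION MODULE 1.2.3/4 OLKI 9FEB17 3MAR2018' \
    'CHK_PONT VALUES IMPLEMENTED VERSION MODULE   350/4 OLKI 9FEB17 3APR2018' \
    '3.2.6' |
    word_after_module
```

Note that the second sub needs a space after the wanted word, so a line where that word is the last thing on the line would not be printed by this command.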
Upvotes: 1
Reputation: 85590
You could use awk if you are sure that the words are delimited by spaces, since by default awk splits each input line into fields on white-space characters. For your given input all you need is
awk '{ for( i=1; i<=NF ;i++ ) if ( $i == "MODULE" ) { print $(i+1); break } }'
The for loop runs up to NF, the number of fields in the current line, so it checks every white-space-separated field and prints the one that follows MODULE.
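The field loop can be tried against a shell variable like the one in the question. This is a sketch under a few assumptions: the function name word_after_module_loop is hypothetical, the variable is quoted so its spacing is preserved, and the loop stops at NF - 1 (a small change from the command above) so that a trailing MODULE with nothing after it prints nothing.

```shell
# Hypothetical helper: scan the fields and print the one right after MODULE.
# Using i < NF means a MODULE in the last field is ignored rather than
# printing an empty line.
word_after_module_loop() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "MODULE") { print $(i + 1); break } }'
}

variable='HAL_POINT ITERATION JIO_PO POINT MODULE RT/6T OLKI 9FEB17 3MAR2018'
printf '%s\n' "$variable" | word_after_module_loop
```

Because awk splits on runs of white space by default, the number of spaces after MODULE does not matter here.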
If you still insist on using grep, you could improve the regex as below. In PCRE you can match a variable number of white-space characters with (\s+)? and then use \K to drop everything matched so far, so only the part after the white-space is printed.
grep -oP '(?<=MODULE)(\s+)?\K([^ ]*)'
See the Regular Expression from regex101 working for your given input.
Upvotes: 4
Reputation: 18371
You can use grep. Here \K keeps the text matched to its left out of the output, [^ ]+ matches one or more characters other than a space, and -o prints only the matched text.
grep -oP 'MODULE\s+\K[^ ]+'
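For completeness, a small sketch of how this grep command could be applied to the question's variable. It assumes a grep built with PCRE support (the -P option, as in GNU grep); the function name word_after_module_grep is hypothetical.

```shell
# Hypothetical helper: same pattern as above, reading from standard input.
# Requires a grep that supports -P (PCRE), e.g. GNU grep.
word_after_module_grep() {
    grep -oP 'MODULE\s+\K[^ ]+'
}

# Quoting "$variable" keeps the multiple spaces after MODULE intact.
variable='CHK_PONT VALUES IMPLEMENTED VERSION MODULE   350/4 OLKI 9FEB17 3APR2018'
printf '%s\n' "$variable" | word_after_module_grep
```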
Upvotes: 2
Reputation: 15214
And another awk approach, without looping (gensub is a GNU awk extension).
Assuming your text were in a file called goku:
awk '/MODULE/{print gensub(/^.*MODULE +([^ ]+).*$/, "\\1","1")}' goku
1.2.3/4
350/4
RT/6T
Upvotes: 3