Reputation: 725
Let's take this file, textfile.txt, as an example:
foo
bar
foo
bar
foo**word1**bar
foo
bar**word2**foo
foo
foo
bar
foo**word1**bar
foo
foo
bar**word2**foo
foo
foo
bar
foo**word1**bar
foo
bar**word2**foo
foo
bar
foo**word1**bar
foo
bar
foo
bar
bar**word2**foo
foo
What I am trying to do is: search a file for a first word, here **word1**, and whenever it is found, search that same line and the next two lines for a second word, here **word2**.
I tried using grep with the -n option to find **word1** and get its line number. With that line number, I extract the matching line and the next two with sed, and then run another grep to search for **word2**. This should work for every occurrence of **word1** and **word2**.
But it doesn't feel like it's the best way to achieve this.
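For reference, a rough sketch of the attempt described above. The sample file is a shortened, hypothetical stand-in for textfile.txt, and the message text is made up for illustration.

```shell
# Hypothetical sample: word2 follows the first word1 within two lines,
# but follows the second word1 only after three lines.
sample=$(mktemp)
printf '%s\n' foo 'foo**word1**bar' foo 'bar**word2**foo' foo \
    'foo**word1**bar' foo foo 'bar**word2**foo' foo > "$sample"

report=$(
    # grep -n prints "LINE:content"; cut keeps only the line number.
    grep -n 'word1' "$sample" | cut -d: -f1 |
    while IFS= read -r n; do
        # Print line n and the two lines after it, then look for word2 there.
        if sed -n "${n},$((n + 2))p" "$sample" | grep -q 'word2'; then
            echo "word1 on line $n: word2 found within two lines"
        fi
    done
)
printf '%s\n' "$report"
```

This re-reads the file once per occurrence of word1, which is part of why it feels clumsy.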
In this example, there should be 3 positive matches: the last one does not count because **word2** is 4 lines after **word1**, and I want a maximum of 2 lines ahead.
Concerning awk's output, I would like to print the line numbers where the two words matched, along with the lines themselves.
I also have a shell script that produces output. For each matching pair of words, I would like to print "my_script_result" + "awk_result" > file
Upvotes: 0
Views: 439
Reputation: 26703
Choosing grep from the tagged tools:
echo shelloutput && grep -nA2 "word1" EgrepToy.txt | egrep "word2"
Output:
shelloutput
7-bar**word2**foo
20-bar**word2**foo
Since I am not sure whether I correctly understood "In this example, there should be 3 positive matches" (I think OP and I are counting the "next lines" differently), here is an alternative that yields three:
echo shelloutput && grep -nA3 "word1" EgrepToy.txt | egrep "word2"
Output:
shelloutput
7-bar**word2**foo
14-bar**word2**foo
20-bar**word2**foo
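Both solutions work basically identically:
echo shelloutput: prints the placeholder text
&&: runs the next command only if the previous one succeeded
egrep word1: finds the lines that contain word1
-A2: also prints the two lines after each match (-A3 for three)
-n: prefixes every output line with its line number
| egrep word2: keeps only the output lines that contain word2
Echoing shelloutput is a placeholder for anything you want to do.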
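If the placeholder text should appear together with each match in a file, as the question asks, something along these lines might do. It is only a sketch: the sample file, the variable names and the result file are hypothetical, and `echo shelloutput` still stands in for the real script.

```shell
# Hypothetical sample: word2 follows the first word1 within two lines,
# but follows the second word1 only after three lines.
sample=$(mktemp)
result=$(mktemp)
printf '%s\n' foo 'foo**word1**bar' foo 'bar**word2**foo' foo \
    'foo**word1**bar' foo foo 'bar**word2**foo' foo > "$sample"

# Placeholder for the real script; run once and reuse its output.
script_result=$(echo shelloutput)

# Prefix every match with the script result and collect the lines in a file.
grep -nA2 'word1' "$sample" | grep 'word2' |
while IFS= read -r match; do
    printf '%s %s\n' "$script_result" "$match"
done > "$result"

cat "$result"
```

Note that grep separates the line number from a context line with "-" and from a line matching word1 with ":", so a line containing both words would show up as "N:..." rather than "N-...".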
Upvotes: 0
Reputation: 26703
Choosing sed from the tagged tools:
echo shelloutput && sed -En "/word1/{/word2/{=;p;};N;/word2/{=;p;};N;s/^.*\n//;/word2/{=;p;};N;s/^.*\n//;/word2/{=;p;}}" EgrepToy.txt
Output:
shelloutput
7
bar**word2**foo
14
bar**word2**foo
20
bar**word2**foo
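Works like this:
echo shelloutput &&: prints the placeholder text, then runs sed if that succeeded
/word1/{ ... }: runs the commands in braces for each line that contains word1
/word2/{=;p;};: if the pattern space contains word2, prints the current line number (=) and the pattern space (p)
N;: appends the next input line to the pattern space
s/^.*\n//;: deletes everything up to and including the last newline, so only the newest line remains
/word2/{=;p;};: tests the newest line for word2 and prints as before
The last three steps repeat for each further line to scan.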
If you want two matches, i.e. only the two following lines scanned for word2, repeat only once by deleting one N;s/^.*\n//;/word2/{=;p;}; from the command.
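As a sketch, the two-line variant with one repetition removed could look like this. The sample file is a shortened, hypothetical stand-in for the real one, and GNU sed is assumed.

```shell
# Hypothetical sample: word2 follows the first word1 within two lines,
# but follows the second word1 only after three lines.
sample=$(mktemp)
printf '%s\n' foo 'foo**word1**bar' foo 'bar**word2**foo' foo \
    'foo**word1**bar' foo foo 'bar**word2**foo' foo > "$sample"

# The original command with one "N;s/^.*\n//;/word2/{=;p;};" group deleted,
# so only the word1 line and the two lines after it are tested for word2.
matches=$(sed -En '/word1/{/word2/{=;p;};N;/word2/{=;p;};N;s/^.*\n//;/word2/{=;p;}}' "$sample")

printf '%s\n' "$matches"
```

Lines that N has already pulled into the pattern space are not tested for word1 again, so a word1 sitting inside another word1's window would be missed.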
Upvotes: 0
Reputation: 195029
This awk one-liner may help:
awk '/word1/{ok=1}ok && /word2/{print NR,$0}' file
In the line above, /word1/ is your first word and /word2/ is your second word. The output is the matching line numbers and the matching lines.
It works this way:
The script reads lines from the beginning of the file. Once word1 is found, it sets the variable ok to 1 (true). The second part checks that ok is set AND that the line matches word2; if so, it prints the output. Thus, if word2 is matched before word1 has been found, ok is false and the line is skipped.
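To accept word2 only on the word1 line or within the next two lines, also remember the line number of word1 in s:
awk '/word1/{ok=1;s=NR}ok && NR<=s+2 && /word2/{print NR,$0}' file
Output: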
7 bar**word2**foo
20 bar**word2**foo
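To print both lines of each pair, as the question asks, the same idea can be extended. This is a sketch rather than a tested drop-in: the variable names max, start and first and the "NR: line" output format are assumptions, and the sample file is a shortened, hypothetical stand-in.

```shell
# Hypothetical sample: word2 follows the first word1 within two lines,
# but follows the second word1 only after three lines.
sample=$(mktemp)
printf '%s\n' foo 'foo**word1**bar' foo 'bar**word2**foo' foo \
    'foo**word1**bar' foo foo 'bar**word2**foo' foo > "$sample"

pairs=$(awk -v max=2 '
    # Remember the line number and text of the most recent word1 line.
    /word1/ { start = NR; first = $0 }
    # word2 on the word1 line itself or at most max lines after it.
    start && NR <= start + max && /word2/ {
        print start ": " first
        print NR ": " $0
        start = 0   # pair reported; wait for the next word1
    }
' "$sample")

printf '%s\n' "$pairs"
```

Resetting start after a hit means each word1 is paired with at most one word2; drop that line if every word2 inside the window should be reported.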
Upvotes: 1