Amir Amf

Reputation: 45

How to search for a text between two patterns in one line with multiple occurrences using sed?

I have a file like:

"This is a sample file to find text between p1 (the first pattern) and p2 (the second pattern) in one line, where we can have multiple occurrences of p1 followed by p2 and I only want the text between p1 and p2 in the same line."

I want to print output like the following: (the first pattern) and followed by and

I tried the command:

cat filename | sed -e 's/.*p1\(.*\)p2.*/\1/g'

but it only prints the last occurrence: and

Thanks in advance

Upvotes: 2

Views: 633

Answers (2)

David C. Rankin

Reputation: 84561

If you do not have GNU grep, you should be able to tweak your sed expression to accomplish what you need, for example:

sed 's/^.*p1\(..*\)p2.*$/\1/'

which is essentially what you have, aside from the additional '.' that ensures there is at least 1 character between p1 and p2 to output via the backreference. This approach is limited to one occurrence of p1 and p2 per line: because the leading '.*' is greedy, it normally captures only the text following the last p1. The 'g' flag for a global replace is superfluous, since the expression already matches the whole line.

Example

If p1 is code and p2 is endcode and 111_000 is the wanted text, you could do the following:

$ printf "A line with a code111_000endcode and stuff\n" | 
sed 's/^.*code\(..*\)endcode.*$/\1/'
111_000
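
To see the one-occurrence limit in practice, here is a small sketch with two p1 ... p2 pairs on one line. The sample text and the variable name last_only are made up for illustration. The greedy leading '.*' runs up to the last p1, so only the second pair's text survives:

```shell
# Two p1...p2 pairs on one line; the sample text is an assumption for illustration.
line='x p1 one p2 y p1 two p2 z'

# The greedy ^.* consumes everything up to the last p1, so only the
# text between that p1 and the following p2 is kept.
last_only=$(printf '%s\n' "$line" | sed 's/^.*p1\(..*\)p2.*$/\1/')

# Brackets make the surrounding spaces visible.
printf '[%s]\n' "$last_only"
```

This should print [ two ], which matches the behaviour described in the question: only the last occurrence comes out.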

If you have GNU grep, go that route; if not, give this a shot.

Upvotes: 1

hek2mgl

Reputation: 158010

If you have GNU grep, you can use Perl-compatible regular expressions:

grep -oP 'p1\K.+?(?=p2)' filename

Explanation:

  • \K resets the start of the match after p1 has been matched. This keeps p1 out of the match result.
  • .+? matches one or more arbitrary characters, non-greedily, meaning only up to the nearest occurrence of p2.
  • (?=p2) is a lookahead assertion. It means that the previous pattern, .+?, must be followed by p2, while p2 itself stays out of the match.
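
As a quick check of the pieces above, here is a minimal sketch run against a made-up line with two p1 ... p2 pairs. The helper names between and between_trimmed are hypothetical, and both assume a GNU grep built with PCRE support. The -o option prints only the matched parts, each on its own line. The second helper is a variation, not part of the command above: it adds \s* on both sides so the whitespace next to p1 and p2 is left out.

```shell
# Sample line with two p1...p2 pairs (made up for illustration).
line='start p1 first p2 middle p1 second p2 end'

# Hypothetical helper: print each text between p1 and p2 on its own line.
# Assumes GNU grep built with PCRE support (-P).
between() {
    grep -oP 'p1\K.+?(?=p2)'
}

# Variation: also leave out the whitespace next to p1 and p2.
between_trimmed() {
    grep -oP 'p1\s*\K.+?(?=\s*p2)'
}

printf '%s\n' "$line" | between
printf '%s\n' "$line" | between_trimmed
```

The first call should print " first " and " second " with the surrounding spaces intact; the second should print "first" and "second" without them.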

Upvotes: 2
