Reputation: 1285
I have a basic question. I have a string like the one below:
on one off abcd on two off
I want to extract every string between 'on' and 'off'. The result I am expecting here is 'one' and 'two'.
I believe this is possible with sed.
I tried sed 's/on\(.*\)off/\1/g'
but this returns one off abcd on two
Upvotes: 0
Views: 39
Reputation: 10039
sed 's/\(.*\) off.*/ \1³/;s/ off /³/g;s/ on /²/g;s/³[^²]*²/³²/g;s/^[^²]*²/²/;s/²/\
/g;s/.//;s/³//g'
This uses ² and ³ as delimiters instead of on and off, because POSIX sed cannot negate a group, only a character class. Any other characters that do not occur in the string could be used instead (it is probably best to avoid metacharacters such as &, ...).
Upvotes: 0
Reputation: 41456
Here is an awk version:
awk -v RS=" " '/\<off\>/ {f=0} f; /\<on\>/ {f=1}' file
one
two
Upvotes: 0
Reputation: 44043
With sed, I think the easiest way is to use two sed processes:
echo 'on one off abcd on two off' | sed 's/\<on\>[[:space:]]*/\non\n/g; s/[[:space:]]*\<off\>/\noff\n/g' | sed -n '/^on$/,/^off$/ { //!p; }'
one
two
This falls into two parts:
sed 's/\<on\>[[:space:]]*/\non\n/g; s/[[:space:]]*\<off\>/\noff\n/g'
puts the on and off delimiters on easily recognizable lines of their own, and
sed -n '/^on$/,/^off$/ { //!p; }'
prints just the stuff between them.
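To make the two stages easier to follow, here is a small sketch that captures the intermediate output of the first sed before feeding it to the second. The variable names stage1 and result are only for illustration, and the sketch assumes GNU sed (for \n in the replacement and for \< and \>); other sed implementations may behave differently.

```shell
# Stage 1: put every on/off delimiter on a line of its own.
# Assumes GNU sed: \n in the replacement and \< \> are GNU extensions.
stage1=$(echo 'on one off abcd on two off' |
  sed 's/\<on\>[[:space:]]*/\non\n/g; s/[[:space:]]*\<off\>/\noff\n/g')

# Show the intermediate text: delimiters and payload are now separate lines.
printf '%s\n' "$stage1"

# Stage 2: select each on..off range, but skip the delimiter lines themselves.
# The empty regex // reuses the last regex that was matched against the line.
result=$(printf '%s\n' "$stage1" | sed -n '/^on$/,/^off$/ { //!p; }')
printf '%s\n' "$result"
```

Once each delimiter sits alone on a line, the range address in the second stage has something unambiguous to anchor on, which is why the split into two processes keeps both expressions short.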
Alternatively, you could do it with Perl (which supports non-greedy matching and lookarounds):
$ echo 'on one off abcd on two off' | perl -pe 's/.*?\bon\b\s*(.*?)\s*\boff\b.*?((?=\bon\b)|$)/\1\n/g; s/\n$//'
one
two
Where the
s/.*?\bon\b\s*(.*?)\s*\boff\b.*?((?=\bon\b)|$)/\1\n/g
puts each piece of text found between \bon\b and \boff\b (where \b matches word boundaries) on a line of its own. The main trick is that .*? matches non-greedily, which is to say it matches the shortest string necessary to find a match for the full regex. The (?=\bon\b) is a zero-length lookahead term, so that the .*? matches only up to another on delimiter or the end of the line (this is to discard data between off and on).
The s/\n$// just removes the last newline, which we don't need or want.
Upvotes: 2