Reputation: 1750
I'm trying to split a file up. sed can be used to do this, for example:
sed -e '0,/expr/d' filename
would give the part of the file after the first "expr". But what if there is more than one occurrence and I want to split after the nth occurrence? I figured out that if I want to split at the second occurrence, then
sed -e '0,/expr/! {/expr/,$d}' filename
gives the top part of the file, up to but not including the second match of "expr". The exclamation point (!) negates the first range, so the commands in the braces apply only to the lines after the first match.
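To see what the two commands keep, here is a small check on a made-up six-line file. The file contents and variable names are assumptions for illustration, and GNU sed is assumed because the 0,/re/ address form is a GNU extension.

```shell
# Hypothetical sample file: "expr" appears on lines 2, 4 and 6.
sample=$(mktemp)
printf '%s\n' one 'expr A' two 'expr B' three 'expr C' > "$sample"

# Delete everything through the first match; lines 3-6 remain.
after_first=$(sed -e '0,/expr/d' "$sample")

# Outside the range that ends at the first match, delete from the next
# match to the end; lines 1-3 remain (the second match is not printed).
before_second=$(sed -e '0,/expr/! {/expr/,$d}' "$sample")

printf '%s\n' "after first match:" "$after_first"
printf '%s\n' "before second match:" "$before_second"
rm -f "$sample"
```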
But what about more general cases? For example, from the second last occurrence.
I've been using sed here, but I think awk would have elegant solutions too.
Upvotes: 4
Views: 417
Reputation: 58420
This might work for you (GNU sed):
sed -nr 'x;/^X{2}/{x;p;b};x;/REGEXP/{x;s/^/X/;x}' file
This will print out anything after the 2nd match of REGEXP.
N.B. The REGEXP may occur one or more times per line but will only be counted once.
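As a rough sketch of how the count could be made a parameter, the script can be put in double quotes so the shell substitutes the number into the X{...} interval. The sample file and the variable names below are made up for illustration; GNU sed is still assumed.

```shell
# Hypothetical sample file: "expr" appears on lines 2, 4 and 6.
sample=$(mktemp)
printf '%s\n' one 'expr A' two 'expr B' three 'expr C' > "$sample"

n=2   # print everything after the n-th matching line

# The hold space collects one X per matching line; once it holds n of
# them, every following line is printed. Double quotes let $n expand.
after_nth=$(sed -nr "x;/^X{$n}/{x;p;b};x;/expr/{x;s/^/X/;x}" "$sample")

printf '%s\n' "$after_nth"
rm -f "$sample"
```

With n=2 this prints the lines that follow the second match, and the matching line itself is left out.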
Upvotes: 0
Reputation: 11703
Some more variations of awk, in addition to @rici's solutions:
Up to and including the $n-th match:
awk -v n=$n 'p<n; /regex/{p++}' file
Up to but not including the $n-th match:
awk -v n=$n '/regex/{p++} p<n' file
From and including the $n-th match:
awk -v n=$n '/regex/{p++} p>=n' file
From and not including the $n-th match:
awk -v n=$n 'p>=n; /regex/{p++}' file
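A quick way to compare the four variants is to run them on the same small input. This is only a sketch: the sample file, the regex "expr" and the variable names are made up for illustration.

```shell
# Hypothetical sample file: "expr" appears on lines 2, 4 and 6.
sample=$(mktemp)
printf '%s\n' one 'expr A' two 'expr B' three 'expr C' > "$sample"
n=2

# Print-then-count includes the matching line; count-then-print excludes it
# (for the "up to" forms), and the other way round for the "from" forms.
upto_incl=$(awk -v n="$n" 'p<n; /expr/{p++}' "$sample")
upto_excl=$(awk -v n="$n" '/expr/{p++} p<n' "$sample")
from_incl=$(awk -v n="$n" '/expr/{p++} p>=n' "$sample")
from_excl=$(awk -v n="$n" 'p>=n; /expr/{p++}' "$sample")

printf '%s\n' "up to, including:" "$upto_incl"
printf '%s\n' "up to, excluding:" "$upto_excl"
printf '%s\n' "from, including:" "$from_incl"
printf '%s\n' "from, excluding:" "$from_excl"
rm -f "$sample"
```

The only difference between each pair is whether the print condition is tested before or after the counter p is incremented.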
But what about more general cases? For example, from the second last occurrence.
In that case, a simple approach is to read the file in reverse with tac, apply the options above, and reverse the output again.
From and including the $n-th last match:
tac file | awk -v n=$n 'p<n; /regex/{p++}' | tac
From and not including the $n-th last match:
tac file | awk -v n=$n '/regex/{p++} p<n' | tac
Up to and including the $n-th last match:
tac file | awk -v n=$n '/regex/{p++} p>=n' | tac
Up to and not including the $n-th last match:
tac file | awk -v n=$n 'p>=n; /regex/{p++}' | tac
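For instance, counting from the end of a made-up six-line file (the file contents and variable names are assumptions for illustration, and tac from GNU coreutils is assumed to be installed):

```shell
# Hypothetical sample file: "expr" appears on lines 2, 4 and 6.
sample=$(mktemp)
printf '%s\n' one 'expr A' two 'expr B' three 'expr C' > "$sample"
n=2   # second-to-last match, i.e. "expr B"

# Reverse, cut relative to the n-th match from the top, reverse back.
from_incl=$(tac "$sample" | awk -v n="$n" 'p<n; /expr/{p++}' | tac)
upto_excl=$(tac "$sample" | awk -v n="$n" 'p>=n; /expr/{p++}' | tac)

printf '%s\n' "from the 2nd last match, including it:" "$from_incl"
printf '%s\n' "up to the 2nd last match, excluding it:" "$upto_excl"
rm -f "$sample"
```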
Note for OS X users, as pointed out by @mklement0 in the comments:
Poor [stock] OS X users (as of OS X 10.9) are out of luck: no tac there.
On OS X you can use tail -r (note that tail on Linux appears not to support -r).
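If the same script has to run on both systems, one option is a small wrapper that picks whichever tool is present. This is a sketch, not a tested cross-platform solution: reverse_lines is a hypothetical helper name, and the awk fallback (which buffers the whole input in memory) is an addition that is not part of the answer above.

```shell
# Hypothetical helper: reverse the lines read from standard input.
reverse_lines() {
  if command -v tac >/dev/null 2>&1; then
    tac                                  # GNU coreutils
  elif tail -r /dev/null >/dev/null 2>&1; then
    tail -r                              # BSD / OS X tail
  else
    # Fallback: store every line, then print them last to first.
    awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }'
  fi
}

# Usage: from and including the 2nd last match of "expr".
n=2
result=$(printf '%s\n' one 'expr A' two 'expr B' three 'expr C' |
  reverse_lines | awk -v n="$n" 'p<n; /expr/{p++}' | reverse_lines)
printf '%s\n' "$result"
```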
Upvotes: 2
Reputation: 241721
Simple awk solutions:
Up to and including the $n-th match of /regex/:
awk -vn=$n '{print}/regex/&&!--n{exit}'
Up to but not including the $n-th match:
awk -vn=$n '/regex/&&!--n{exit}{print}'
In both of the above cases, setting n to 0 will print the whole file. Also, both uses of {print} can be changed to 1; because the default action is {print}. (Or just 1 in the second program.)
For completeness:
Everything after the $n-th match:
awk -vn=$n 'n<=0;/regex/{--n}'
Note: As pointed out in a comment by @mklement0, there is a bug in command-line option parsing in versions of BSD Awk (aka "one-true-awk", the version written and, as far as I know, still maintained by Brian Kernighan) prior to May 23, 2010; this apparently includes the version distributed with Mac OS X (as of v10.9). As a result, if you use one of these awk versions, you need to write -v n=$n instead of -vn=$n.
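As a small illustration, here are the three programs run against a made-up file, written with the -v n=... spelling so that the option-parsing bug does not matter. The sample contents and variable names are assumptions.

```shell
# Hypothetical sample file: "expr" appears on lines 2, 4 and 6.
sample=$(mktemp)
printf '%s\n' one 'expr A' two 'expr B' three 'expr C' > "$sample"
n=2

# --n counts matches down; !--n becomes true on the n-th one, and exit
# stops awk there, so the rest of the file is never read.
upto_incl=$(awk -v n="$n" '{print}/expr/&&!--n{exit}' "$sample")
upto_excl=$(awk -v n="$n" '/expr/&&!--n{exit}{print}' "$sample")
after=$(awk -v n="$n" 'n<=0;/expr/{--n}' "$sample")

# With n=0 the counter goes negative and never reaches zero: no exit.
whole=$(awk -v n=0 '{print}/expr/&&!--n{exit}' "$sample")

printf '%s\n' "up to, including:" "$upto_incl"
printf '%s\n' "up to, excluding:" "$upto_excl"
printf '%s\n' "after:" "$after"
rm -f "$sample"
```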
Upvotes: 2