Reputation: 65
I want to delete all the lines after the last occurrence of a pattern, keeping the matching line itself.
file.txt
honor
apple
redmi
nokia
apple
samsung
lg
htc
What I want file.txt to contain:
honor
apple
redmi
nokia
apple
What I have tried:
sed -i '/apple/q' file.txt
This deletes all the lines after the first occurrence of the pattern instead:
honor
apple
Upvotes: 5
Views: 1842
Reputation: 58578
This might work for you (GNU sed):
sed '/apple/,$!b;//!H;//{x;//p;x;h};${x;P};d' file
Print as usual any lines that are not in the range from the first appearance of apple to the end of the file. Within that range, append lines that do not contain the word apple to the hold space (HS). For a line that does contain apple, first swap to the HS and print its contents if the word apple is there (that is, the previous matching line and everything buffered after it), then swap back and replace the HS with the current line. Delete every line in the range rather than printing it directly. At the end of the file, swap to the HS and print only its first line, which is the last line containing apple; the remaining buffered lines are discarded.
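To make the walk-through concrete, here is a small sketch that runs the command on the sample data from the question. It assumes GNU sed; the temporary file and the `result` variable exist only for illustration.

```shell
# Build the sample file from the question in a temporary location.
sample=$(mktemp)
printf '%s\n' honor apple redmi nokia apple samsung lg htc > "$sample"

# Lines before the first "apple" are printed directly; from then on,
# output only happens when a later "apple" (or end of file) is reached.
result=$(sed '/apple/,$!b;//!H;//{x;//p;x;h};${x;P};d' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The trailing samsung, lg and htc lines sit in the hold space at end of file, and `P` prints only the first line of it, so they never reach the output.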
If slurping a large file is not a problem use:
sed -rz 's/(.*apple[^\n]*).*/\1\n/' file
This uses greedy matching to capture everything up to and including the last line containing the word apple.
Upvotes: 1
Reputation: 47239
Given that you are dealing with large input, I would go with a two-pass coreutils solution, e.g.:
n=$(grep -Fn apple infile | tail -n1 | cut -d: -f1)
[ -n "$n" ] && head -n$n infile > outfile
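This uses grep's fixed-string matching (-F) together with line numbering (-n) to list every line containing apple; tail -n1 keeps the last of them and cut extracts its line number. Then head is used to extract the lines up to and including that one.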
You did not specify what happens when no apples are found, so this solution does nothing when that occurs.
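If the no-match case needs to be visible to a calling script, the same two passes could be wrapped in a function that reports it through its exit status. This is only a sketch: the name `keep_through_last` and the choice of returning 1 are assumptions, not something the question asks for.

```shell
# Hypothetical helper, not part of the answer above.
# Prints FILE up to and including the last line containing the fixed
# string PATTERN. Prints nothing and returns 1 when there is no match.
keep_through_last() {
    pattern=$1
    file=$2
    # Line number of the last matching line (empty if there is none).
    n=$(grep -Fn -e "$pattern" "$file" | tail -n 1 | cut -d: -f1)
    [ -n "$n" ] || return 1
    head -n "$n" "$file"
}

sample=$(mktemp)
printf '%s\n' honor apple redmi nokia apple samsung lg htc > "$sample"
keep_through_last apple "$sample"
keep_through_last mango "$sample" || echo "no match, nothing written" >&2
rm -f "$sample"
```

Returning a non-zero status lets the caller decide whether a missing pattern is an error or simply means the file should be left alone.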
Upvotes: 0
Reputation: 104102
If you don't mind having everything in memory, you can do:
$ awk '/^apple$/{last=NR}
{lines[NR]=$0}
END{for(li=1;li<=last;li++) print lines[li]}' file
honor
apple
redmi
nokia
apple
Upvotes: 0
Reputation: 204638
Simple, robust 2-pass approach using almost no memory:
$ awk 'NR==FNR{if (/apple/) hit=NR; next} {print} FNR==hit{exit}' file file
honor
apple
redmi
nokia
apple
If that doesn't execute fast enough THEN it's time to try some alternatives to see if any produce a performance improvement.
Upvotes: 7
Reputation: 67567
Here is another awk solution that does not scan the file twice:
$ awk 'f {buf=buf $0 ORS}
/apple/ {printf "%s", buf; buf=""; if(!f)print; f=1}
!f' file
honor
apple
redmi
nokia
apple
Upvotes: 0
Reputation: 212664
If you don't want to reverse the file as Barmar suggests, you will either have to read the file in reverse using lower-level tools (e.g., fseek) or read it twice:
sed $(awk '/apple/{a=NR}END{print a+1}' input),\$d input
(Note that if the pattern does not appear in the file, this will output nothing. That's an edge case you should worry about.)
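One possible way to handle that edge case is to check the line number before handing it to sed. The sketch below prints the file unchanged when the pattern never appears, which is an assumed policy; the function name `truncate_or_copy` is hypothetical, and it matches a fixed string rather than a regular expression.

```shell
# Hypothetical helper: keep lines through the last one containing the fixed
# string PATTERN; if PATTERN never appears, print the file unchanged
# (an assumed policy, not something the question specifies).
truncate_or_copy() {
    # First pass: n + 0 forces a numeric 0 when no line matched.
    last=$(awk -v pat="$1" 'index($0, pat) { n = NR } END { print n + 0 }' "$2")
    if [ "$last" -gt 0 ]; then
        sed "${last}q" "$2"    # second pass: print lines 1..last, then quit
    else
        cat "$2"
    fi
}

sample=$(mktemp)
printf '%s\n' honor apple redmi nokia apple samsung lg htc > "$sample"
truncate_or_copy apple "$sample"
truncate_or_copy mango "$sample"
rm -f "$sample"
```

Using `q` at a known line number stops reading as soon as the last match has been printed, so the second pass does not touch the tail of the file.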
Upvotes: 1
Reputation: 782693
Reverse the file, print everything starting from the first occurrence of the pattern, then reverse the result:
tac file.txt | sed -n '/apple/,$p' | tac > newfile.txt
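The three stages of that pipeline can be looked at one by one. This is only an illustrative sketch on the question's sample data; it assumes `tac` from GNU coreutils is available, and the variable names are made up for the example.

```shell
sample=$(mktemp)
printf '%s\n' honor apple redmi nokia apple samsung lg htc > "$sample"

# Stage 1: reverse the file, so the last "apple" becomes the first one seen.
reversed=$(tac "$sample")
# Stage 2: keep everything from that first match to the end of the stream.
kept=$(printf '%s\n' "$reversed" | sed -n '/apple/,$p')
# Stage 3: reverse again to restore the original line order.
result=$(printf '%s\n' "$kept" | tac)

printf '%s\n' "$result"
rm -f "$sample"
```

After stage 2 the stream starts at the last apple and still holds every earlier line, which is why reversing it once more gives the wanted file.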
Alternatively, you can find the line number of the last match, then use that to print the first N lines of the file:
line=$(awk '/apple/ { line=NR } END {print line}' file.txt)
head -n "$line" file.txt > newfile.txt
Upvotes: 5