Reputation: 7089
I've got a bunch of files in which copies of the header row are repeated among the data rows. Is there a way to use sed to delete every occurrence except the one on the first line? I was thinking of something like:
sed -i '/textOnlyInHeader/d' file.txt
Except this will delete the header as well. As a bonus, is there a way to do this recursively for all files in a bunch of subdirectories?
Upvotes: 1
Views: 86
Reputation: 58401
This might work for you (GNU sed):
sed '1h;1b;G;/^\(.*\)\n\1$/!P;d' file
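To see what the one-liner does, here is a small check on a made-up sample. The temporary file and its contents are assumptions for illustration only; the sed command itself is unchanged. Note that it compares whole lines against line 1, so it does not need a text pattern.

```shell
# Assumption: GNU sed. The sample data below is invented for the demo.
# 1h;1b             save line 1 (the header) in the hold space, then print it
# G                 on later lines, append the saved header to the current line
# /^\(.*\)\n\1$/!P  print the current line unless it is identical to the header
# d                 discard the pattern space so nothing is printed twice
sample=$(mktemp)
printf 'id,name\n1,a\nid,name\n2,b\n' > "$sample"
out=$(sed '1h;1b;G;/^\(.*\)\n\1$/!P;d' "$sample")
printf '%s\n' "$out"
# prints:
# id,name
# 1,a
# 2,b
rm -f "$sample"
```

Add -i once the output looks right, since without it the file is left untouched.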
Upvotes: 2
Reputation: 10039
sed -i '1 !{
/textOnlyInHeader/ d
}' file.txt
This skips the first line and applies your sed command to every other line.
For the recursive case, you can pass sed a list of files instead of just file.txt. Build the list first with the shell (find, ls, a loop, ...) and pass it to sed as arguments. With GNU sed, -i treats each file separately, so the address 1 means the first line of every file.
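A minimal sketch of that recursive case, assuming GNU sed and find. The function name strip_repeated_headers, the *.txt filter, and the example path are placeholders, not part of the original answer.

```shell
# Hypothetical helper: strip_repeated_headers DIR PATTERN
# Assumptions: GNU sed (-i without a suffix, and -i implies -s so line 1 is
# per file); PATTERN is a basic regex that contains no "/" characters.
strip_repeated_headers() {
  dir=$1
  pattern=$2
  # find hands the matching files to sed in batches via "{} +".
  # The three -e fragments are joined with newlines into:  1!{ /PATTERN/d }
  find "$dir" -type f -name '*.txt' \
    -exec sed -i -e '1!{' -e "/$pattern/d" -e '}' {} +
}

# Usage (placeholder path):
# strip_repeated_headers /path/to/files textOnlyInHeader
```

Letting find pass the names directly avoids word splitting, so file names with spaces are handled.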
Upvotes: 1
Reputation: 2883
I know there's already an accepted answer using gawk, but using sed:
sed -i -e '2,$s/textOnlyInHeader/DELETELINE/' -e '/DELETELINE/d' file.txt
For the recursive answer, I concur with Steve that a loop with find is the way to go.
Upvotes: -1
Reputation: 54392
I think gawk would be best for this. Try:
gawk -i inplace 'NR==1 { r = $0; print } r == $0 { next }1' file.txt
For all files in a single directory, change NR to FNR and run:
gawk -i inplace '...' *.txt
For all files in many subdirectories, you can use a for loop:
for i in $(find /path/to/files -type f -name '*.txt'); do ... ; done
If you're using a non-GNU AWK, or a gawk older than 4.1.0 (which introduced the inplace extension), you will need to write to a temp file first:
awk '...' file.txt > file.tmp && mv file.tmp file.txt
Upvotes: 3