Reputation: 113
I have a very large file from which I need to delete a specific line (line number 941573).
I'm somewhat new to this environment, but I've been googling the problem to no avail.
I've tried using the sed command as follows, but it doesn't seem to be working:
sed -e '941572,941574d' filenameX > newfilenameY
I've also tried
sed -e '941573d' filenameX > newfilenameY
Yet both the 'newfilenameY' file and the original file 'filenameX' still contain the line that I'm trying to delete. It's a FASTQ file, though I don't see how that would make any difference. Like I said, I'm new to Unix, so maybe I've gotten the sed command wrong.
Upvotes: 9
Views: 23311
Reputation: 44434
I generated a test file with 1000000 lines and tried your command:
sed -e '941573d' filenameX > newfilenameY
It worked fine on Linux.
Maybe we have some other misunderstanding. Line numbers count from one, not zero. If you counted from zero then you would find line 941572 was missing.
Did you try a diff? That would highlight any unexpected changes:
diff filenameX newfilenameY
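Both checks can be tried on a small throwaway file first. The following is only a sketch: the names demo.txt and demo_out.txt and the line number 7 are made up for illustration, and the same commands should apply to the real file with 941573 in place of 7.

```shell
# Build a small throwaway file (demo.txt and demo_out.txt are illustrative names).
tmpdir=$(mktemp -d)
seq 1 10 > "$tmpdir/demo.txt"

# Print only line 7 to confirm it is the line we mean (sed counts lines from 1).
target=$(sed -n '7p' "$tmpdir/demo.txt")
echo "line 7 is: $target"

# Delete line 7 and write the result to a new file.
sed '7d' "$tmpdir/demo.txt" > "$tmpdir/demo_out.txt"

# The output should have exactly one line fewer than the input.
in_lines=$(wc -l < "$tmpdir/demo.txt")
out_lines=$(wc -l < "$tmpdir/demo_out.txt")
echo "input: $in_lines lines, output: $out_lines lines"

# diff exits non-zero when the files differ, so that is not treated as an error here.
diff "$tmpdir/demo.txt" "$tmpdir/demo_out.txt" || true
```

If the line printed by sed -n is not the one you expected, the line number is the problem rather than the d command.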
I don't know much about FASTQ format, but are you sure we are talking about text file line numbers, and not sequence numbers?
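Some sed implementations limit line length (historically to around 4000 bytes; POSIX requires support for at least 8192 bytes of pattern space, and GNU sed has no built-in limit). Do any of your lines exceed that? (That's unlikely, but I thought it worth asking.)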
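If line length is a concern, the longest line can be measured directly. This is a minimal sketch on a made-up temporary file; substitute the real file name. Note that awk's length counts characters, which equals bytes for plain ASCII data such as FASTQ.

```shell
# Create a small sample file; the contents are illustrative only.
tmpfile=$(mktemp)
printf 'short\na much longer line here\nmid\n' > "$tmpfile"

# Track the maximum line length seen; "max + 0" prints 0 for an empty file.
longest=$(awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }' "$tmpfile")
echo "longest line: $longest characters"
```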
Upvotes: 0
Reputation:
The d command deletes a line or a range of lines, so your second approach works.
$ sed '941573d' input > output
Long Example:
% for i in $(seq 1000000)
do
echo "$i" >> input
done
% wc -l input
1000000 input
% sed '941573d' input > output
% wc -l output
999999 output
% diff -u input output
--- input 2012-10-22 13:22:41.404395295 +0200
+++ output 2012-10-22 13:22:43.400395358 +0200
@@ -941570,7 +941570,6 @@
941570
941571
941572
-941573
941574
941575
941576
Short Example:
% cat input
foo
bar
baz
qux
% sed '3d' input > output
% cat output
foo
bar
qux
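Note that in both examples sed only writes the edited text to the new file, so the original still contains the deleted line. If the goal is to change the original, one portable approach is to write to a temporary file and move it over the original afterwards. A sketch, assuming a scratch directory and the hypothetical temporary name input.tmp:

```shell
# Work in a scratch directory; the file name "input" mirrors the example above.
workdir=$(mktemp -d)
printf 'foo\nbar\nbaz\nqux\n' > "$workdir/input"

# sed writes the edited text to stdout; the input file itself is not modified.
sed '3d' "$workdir/input" > "$workdir/output"
before=$(wc -l < "$workdir/input")

# To replace the original, move the new file over it only if sed succeeded.
sed '3d' "$workdir/input" > "$workdir/input.tmp" && mv "$workdir/input.tmp" "$workdir/input"
after=$(wc -l < "$workdir/input")

echo "before: $before lines, after: $after lines"
cat "$workdir/input"
```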
Upvotes: 10
Reputation: 333
Here is how to remove one or more lines from a file.
Syntax:
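sed '{<n>|/<string>/|/<regex>/}d' <fileName>
sed '<addr1>[,<addr2>]d' <fileName>
{a|b} = choose one of the alternatives (the braces are notation, not typed)
[...] = optional part (the brackets are notation, not typed)
/.../ = delimiters around a string or regex
n = line number
string = string found in the line
regex = regular expression matching the searched pattern
addr = address of a line (line number or /pattern/)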
d = delete
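The address forms above can be sketched on a small sample file. The file contents and variable names here are made up for illustration:

```shell
# Five-line sample file; the contents are illustrative only.
sample=$(mktemp)
printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n' > "$sample"

# Delete a single line by number.
by_number=$(sed '2d' "$sample")

# Delete a range of lines, from line 2 through line 4.
by_range=$(sed '2,4d' "$sample")

# Delete every line matching a regular expression (lines starting with g).
by_regex=$(sed '/^g/d' "$sample")

# Delete from the first line matching beta through the next line matching delta.
by_pattern_range=$(sed '/beta/,/delta/d' "$sample")

printf '%s\n' "$by_range"
```

In each case the sample file is left unchanged, because sed writes its result to standard output.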
Upvotes: 1