Reputation: 434
I need to remove lines beginning with '#' from a text file, but ignore the first line because it is the header. How can I make grep skip the first line and remove any line beginning with # from the rest of the file?
cat sample.txt
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,xyz
cat sample.txt | grep -v "^\s*[#\;]\|^\s*$" > "out.txt"
but this removes the header too!
Upvotes: 4
Views: 20252
Reputation: 63
Applying an arbitrary command to all but the first line - a "header" - of a file or stream of tabular data is such a common task for me that I define a helper utility called body for it:
As a shell function (put this in your ~/.bashrc or equivalent):
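body() {
    # Read the header line verbatim and print it unchanged.
    IFS= read -r header
    printf '%s\n' "$header"
    # Run the given command on the remaining input.
    "$@"
}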
Now:
$ cat sample.txt | body grep -v '^#'
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
1,2,xyz
1,2,abc
1,2,xyz
Credit: adapted from Command line tools for doing data science, where it is one of many handy data tools you can put on your shell's PATH. I wish many of these could be canonicalized as standard UNIX tools.
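To write the result to a file, as the question does, the function can read from a redirect instead of a pipe from cat. This is only a sketch: the scratch directory and the shortened sample data are assumptions made for illustration.

```shell
# Same helper as in the answer above.
body() {
    IFS= read -r header
    printf '%s\n' "$header"
    "$@"
}

# Hypothetical scratch directory and a shortened copy of the sample data.
dir=$(mktemp -d)
printf '%s\n' '#"EVENT",VERSION, NAME' '1,2,xyz' '#"EVENT",VERSION, NAME' '1,2,abc' > "$dir/sample.txt"

# Redirect the file into body instead of piping it through cat,
# and write the filtered result to out.txt.
body grep -v '^#' < "$dir/sample.txt" > "$dir/out.txt"
cat "$dir/out.txt"
```

The header is consumed by read before grep starts, so grep only ever sees the data rows.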
Upvotes: 2
Reputation: 58371
This might work for you (GNU sed):
sed '1b;/^#/d' file
Ignore the first line and delete any other lines that start with #.
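If your sed is not GNU sed, it may be stricter about a ; directly after the b command. A minimal sketch of the same idea with each command in its own -e expression, which should sidestep that; strip_comments is a hypothetical wrapper name used only for this example.

```shell
# strip_comments is a hypothetical wrapper name, not part of sed.
strip_comments() {
    # -e '1b'    : on line 1, branch to the end of the script (the line is printed as-is)
    # -e '/^#/d' : on every other line, delete it if it starts with #
    sed -e '1b' -e '/^#/d' "$@"
}

printf '%s\n' '#"EVENT",VERSION, NAME' '1,2,xyz' '#"EVENT",VERSION, NAME' '1,2,abc' |
    strip_comments
```

Called without arguments it filters standard input; called with a file name it filters that file.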
Upvotes: 2
Reputation: 203189
This will cause any awk to print each line if its line number is 1 or it doesn't start with #:
$ awk 'NR==1 || !/^#/' file
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
1,2,xyz
1,2,abc
1,2,xyz
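The grep pattern in the question also drops blank lines and lines whose first non-blank character is # or ;. If you want that behaviour too, the condition can be extended. This is a sketch under that assumption; filter_rows is a hypothetical name and the input lines are made up for illustration.

```shell
# filter_rows is a hypothetical wrapper name used only for this example.
filter_rows() {
    # Keep line 1 unconditionally. For the rest, skip a line when, after
    # optional spaces or tabs, it starts with # or ; or is empty.
    awk 'NR==1 || !/^[ \t]*([#;]|$)/' "$@"
}

printf '%s\n' '#"EVENT",VERSION, NAME' '1,2,xyz' '' '  # indented comment' ';note' '1,2,abc' |
    filter_rows
```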
Upvotes: 1
Reputation: 88563
With sed:
sed '2,${/^#/d}' sample.txt
From the second row (2) to the last row ($): search (/.../) for rows beginning (^) with # and delete (d) them. The default action of sed is to print the current row.
Output:
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
1,2,xyz
1,2,abc
1,2,xyz
Upvotes: 7
Reputation: 37039
Try a combination of head and grep like so:
head -1 sample.txt > out.txt && grep -v "^#" sample.txt >> out.txt
Result
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
1,2,xyz
1,2,abc
1,2,xyz
Alternate method
grep "^#" sample.txt | head -1 > out.txt && grep -v "^#" sample.txt >> out.txt
That is, grep the lines beginning with #, keep only the first one, and write it to a file. Then grep all lines not starting with # and append those lines to the same output file.
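Both commands above rely on the header itself starting with #. If it did not, the first command would write the header twice and the alternate would pick up the wrong line. A sketch that does not depend on the header's content, assuming a hypothetical scratch directory and made-up sample data:

```shell
# Hypothetical scratch directory; the header here deliberately has no leading #.
dir=$(mktemp -d)
printf '%s\n' 'EVENT,VERSION,NAME' '1,2,xyz' '#"EVENT",VERSION, NAME' '1,2,abc' > "$dir/sample.txt"

{
    # Line 1 is copied unchanged, whatever it contains.
    head -n 1 "$dir/sample.txt"
    # Everything from line 2 onward is filtered for lines starting with #.
    tail -n +2 "$dir/sample.txt" | grep -v '^#'
} > "$dir/out.txt"

cat "$dir/out.txt"
```

Grouping the two commands in braces lets a single redirect write the whole result, so there is no need to append with >>.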
Upvotes: 1