curlywei

Reputation: 720

Delete string after '#' using sed

I have a text file that looks like:

#filelists.txt
a
# aaa
b
#bbb
c #ccc

I want to delete everything from '#' to the end of each line. If a line starts with '#', the whole line should be deleted.

So I ran this sed command in my shell:

sed -e "s/#*//g" -e "/^$/d" filelists.txt

I expected the result to be:

a
b
c

but the actual result is:

filelists.txt
a
 aaa
b
bbb
c ccc

What's wrong with my sed command?

I understood '*' to mean "any", so I assumed '#*' matches the string after '#'.

Is that not right?

Upvotes: 2

Views: 6092

Answers (2)

Wiktor Stribiżew

Reputation: 626845

You may use

sed 's/#.*//;/^$/d' file > outfile

The s/#.*// command removes # and the rest of the line, and /^$/d then drops any line that is empty.

See an online test:

s="#filelists.txt
a
# aaa
b
#bbb
c #ccc"

sed 's/#.*//;/^$/d' <<< "$s"

Output:

a
b
c 
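Note that the last output line is c followed by a space, because the space before # in "c #ccc" is left in place. If that matters, one possible variation (my assumption, not part of the command above) is to also match whitespace in front of the #, using the POSIX [[:space:]] class:

```shell
s="#filelists.txt
a
# aaa
b
#bbb
c #ccc"

# Also consume any whitespace immediately before '#', so "c #ccc" becomes "c"
# instead of "c " with a trailing space. Empty lines are then deleted as before.
trimmed=$(printf '%s\n' "$s" | sed 's/[[:space:]]*#.*//;/^$/d')

printf '%s\n' "$trimmed"
```

This only trims whitespace directly before a #; lines without a # are left untouched.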

Another idea: match only lines containing #, remove # and the rest of the line there, and delete the line if nothing is left:

sed '/#/{s/#.*//;/^$/d}' file > outfile


This way, empty lines that were already in the input are kept, since only lines containing # are edited or deleted.
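The difference between the two commands only shows up when the input already contains blank lines. A small sketch, assuming a hypothetical sample with one blank line after "a" (the variable names are illustrative, and the brace syntax is used as written above, which GNU sed accepts):

```shell
# Hypothetical sample: similar to the question's input, plus one blank line.
s=$(printf '%s\n' '#filelists.txt' 'a' '' '# aaa' 'b' 'c #ccc')

# First command: strips comments, then deletes every empty line,
# including the one that was blank in the original input.
all_dropped=$(printf '%s\n' "$s" | sed 's/#.*//;/^$/d')

# Second command: only lines containing '#' are edited or deleted,
# so the original blank line survives.
blank_kept=$(printf '%s\n' "$s" | sed '/#/{s/#.*//;/^$/d}')

printf '%s\n' "$all_dropped"
echo '---'
printf '%s\n' "$blank_kept"
```

The first result has no blank line between a and b; the second one still does.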

Upvotes: 3

Amadan

Reputation: 198324

* does not mean "any" (at least not in a regular expression context). * means "zero or more of the preceding pattern element", so #* matches "zero or more # characters". Since each of your lines has at most one #, only that # is deleted and the rest of the line is left intact.

You need s/#.*//: "delete # followed by zero or more of any character".
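To see the two patterns side by side, here is a minimal sketch on a single line taken from the question (the variable names are just for illustration):

```shell
line='c #ccc'

# '#*' matches zero or more '#' characters, so only the '#' itself is removed.
star_only=$(printf '%s\n' "$line" | sed 's/#*//g')

# '#.*' matches '#' followed by any characters up to the end of the line.
hash_to_eol=$(printf '%s\n' "$line" | sed 's/#.*//')

# Brackets make the trailing space visible.
printf '[%s]\n' "$star_only" "$hash_to_eol"
```

This prints [c ccc] for the first pattern and [c ] for the second.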

EDIT: I originally suggested grep -v, but had not noticed the third case (# in the middle of a line).

Upvotes: 2
