user56512

Regular expression and sed to remove all occurrences of some string from hundreds of files

I'm perusing the web for information about regular expressions and sed usage. I've also got sed's manual open. Still, I'm posting this question here because I'm sure someone uses the two often enough that they can probably answer this question before I work out a solution.

I've got a few hundred HTML documents with links like the following:
http://www.example.com/subfolder/abc.asp?page=1#main
I need to remove the "#main" from each link.

Does a pattern pop into mind?

Upvotes: 4

Views: 20428

Answers (3)

anubhava

Reputation: 785551

Try this sed command:

sed 's/^\(.*\)#.*$/\1/'

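Or this simpler sed command, which needs no capture group. Note that both commands delete everything from a # to the end of the line, so they fit input that has one URL per line: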

sed 's/#.*$//'
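
To see how a pattern anchored to the end of the line behaves next to one that matches only the literal fragment, here is a minimal sketch. The sample line is made up for illustration and assumes the link sits inside an anchor tag with more markup after it:

```shell
# Hypothetical sample line: a link followed by more markup on the same line.
line='<a href="http://www.example.com/subfolder/abc.asp?page=1#main">Next</a>'

# Deletes from the first '#' to the end of the line, including the closing markup.
truncated=$(printf '%s\n' "$line" | sed 's/#.*$//')

# Deletes only the literal fragment and leaves the rest of the line intact.
targeted=$(printf '%s\n' "$line" | sed 's/#main//g')

printf '%s\n' "$truncated" "$targeted"
```

On real HTML files the second form is likely the safer choice, since it touches nothing but the fragment itself.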

Upvotes: 7

Dan Breen

Reputation: 12934

Here's a snippet that works with perl on the command line. It's not sed, but I had it on hand:

perl -i -pe 's/#main//g' *.html

To have it keep a backup of each file (with a .bak extension), use:

perl -pi.bak -e 's/#main//g' *.html
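
Before running an in-place edit, it can help to check which files actually contain the fragment. A minimal sketch using grep in a throwaway directory; the file names and contents are made up for illustration:

```shell
# Throwaway directory with two hypothetical sample files.
workdir=$(mktemp -d)
printf '%s\n' '<a href="abc.asp?page=1#main">1</a>' > "$workdir/a.html"
printf '%s\n' '<a href="abc.asp?page=2">2</a>' > "$workdir/b.html"

# List the files that contain the fragment, and count the matching lines in one of them.
affected=$(cd "$workdir" && grep -l '#main' *.html)
count=$(grep -c '#main' "$workdir/a.html")

printf 'affected: %s (matching lines: %s)\n' "$affected" "$count"

# Clean up the throwaway directory.
rm -rf "$workdir"
```

Running the same grep again after the edit should list no files if every occurrence was removed.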

Upvotes: 4

Paul Creasey

Reputation: 28864

Assuming that #main is specific enough to do a simple find and replace:

find . -name '*.html' -print0 | xargs -0 sed -i 's/#main//g'
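
A bare -i works with GNU sed, while the BSD sed shipped with macOS expects a backup suffix after -i. Attaching the suffix directly (-i.bak) is accepted by both and also leaves a copy of each original file. A minimal sketch in a throwaway directory, with a made-up sample file:

```shell
# Throwaway directory tree with one hypothetical file holding two links on one line.
workdir=$(mktemp -d)
mkdir -p "$workdir/sub"
printf '%s\n' '<a href="abc.asp?page=1#main">1</a> <a href="abc.asp?page=2#main">2</a>' \
  > "$workdir/sub/page.html"

# -i.bak edits each file in place and keeps the original next to it as <name>.bak.
# The g flag removes every occurrence on a line, not just the first.
find "$workdir" -name '*.html' -print0 | xargs -0 sed -i.bak 's/#main//g'

result=$(cat "$workdir/sub/page.html")
backup=$(cat "$workdir/sub/page.html.bak")
printf '%s\n' "$result"

# Clean up the throwaway directory.
rm -rf "$workdir"
```

Once the results look right, the .bak files can be removed with a second find pass.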

Upvotes: 2
