technerdius

Reputation: 353

Remove characters in all text files in a directory using sed

I have a lot of text files that are email templates. Many of them, for some reason, have the following line:

Best Regards,œ

That strange character at the end is what I am interested in removing from all of these files with a single command.

I tried:

for f in *
do 
  sed 's/"Best Regards,œ"/"Best Regards,"/g' $f | tee $f.t && mv $f.t $f
done 

This ran through the process but did not actually remove the 'œ' character.

Please let me know what I am doing incorrectly so I can remove this character and maybe other non-alphanumeric characters using regex [:alnum:], perhaps.

I fixed the issue with removing the unwanted character with:

for f in * 
do 
  sed 's/Best\ Regards\,\œ/Best\ Regards\,/g' $f | tee $f.t && mv $f.t $f   
done 

However, this still does not remove all of the non-alphanumeric characters from each line of each file. The other things I have tried either do not execute or remove the entire line.

I appreciate your help.

Upvotes: 0

Views: 845

Answers (2)

ghoti

Reputation: 46856

If ① you don't want to have to worry about Unicode, UTF-anything, LANG, etc., and ② you are confident that the lines starting with the words "Best Regards," are the ONLY lines you want to affect, you can simply do this:

sed -i .bak '/^Best Regards,.*/s//Best Regards,/' *

Note that this processes all files in the current directory. If you want to do this in subdirectories, you could use find, with all its goodness. For example:

find /path/to/start/ -type f -exec \
  sed -i .bak '/^Best Regards,.*/s//Best Regards,/' {} \;

or if your shell is bash, you could use globstar:

shopt -s globstar
for f in **/*; do
  [ -f "$f" ] || continue   # globstar also matches directories; skip them
  sed -i .bak '/^Best Regards,.*/s//Best Regards,/' "$f"
done

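Rather than using tee and mv, these solutions use sed's built-in "in-place" option, which creates a .bak file for each file it edits. Consult the documentation for your implementation of sed to learn more about how to use the -i option; it works a little differently with different seds. In particular, the form "-i .bak" with a space is the BSD/macOS syntax, while GNU sed expects the suffix attached to the option, as in "-i.bak".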

This approach eliminates the need to search for that character in particular, so you won't need to worry about how it's being represented. Beware though, it will also eliminate any other text that follows the search string on the same line.
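To see that caveat in action before touching real files, the same expression can be tried on a throwaway file without -i. This is only a sketch, assuming a shell with mktemp available; the sample file content is made up for illustration:

```shell
# Build a scratch file that mimics a template (hypothetical sample content).
tmp=$(mktemp)
printf 'Hello,\nBest Regards,œ trailing text\n' > "$tmp"

# No -i here: sed only prints the result and leaves the file untouched.
# The empty pattern in s//.../ reuses the address regex /^Best Regards,.*/,
# so everything after "Best Regards," on that line is dropped.
result=$(sed '/^Best Regards,.*/s//Best Regards,/' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

The second line comes back as just "Best Regards,", so both the stray character and the "trailing text" are gone.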

Upvotes: 2

hek2mgl

Reputation: 158100

You don't need the loop. You can pass the results of the glob expression directly to sed and use the -i option for in place editing of files:

sed -i.bak 's/Best Regards,œ/Best Regards,/' *

-i.bak changes the input file in place and creates a backup file with the extension .bak.
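A quick way to check the backup behaviour is to run the command in a scratch directory first. The following is a sketch under a few assumptions: mktemp -d is available, the file name template.txt is invented for the demo, and the character is stored with the same encoding in the script and in the file:

```shell
# Scratch directory with one fake template (hypothetical file name and content).
dir=$(mktemp -d)
printf 'Best Regards,œ\n' > "$dir/template.txt"

# -i.bak with the suffix attached is accepted by both GNU and BSD sed.
sed -i.bak 's/Best Regards,œ/Best Regards,/' "$dir"/*

edited=$(cat "$dir/template.txt")       # the file after in-place editing
backup=$(cat "$dir/template.txt.bak")   # the untouched copy sed left behind
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"

rm -rf "$dir"
```

Once the edited files look right, the .bak copies can be deleted.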

Some implementations of sed, such as GNU sed, support -i without an argument; others (BSD, macOS) require an argument but accept an empty string. In both cases sed keeps no backup file and simply changes the original file.

With GNU sed:

sed -i 's/Best Regards,œ/Best Regards,/' *
# OR (BSD, macOS)
sed -i '' 's/Best Regards,œ/Best Regards,/' *
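For the follow-up about stripping other unwanted characters, one possible approach is a negated bracket expression evaluated in the C locale, so that every byte outside plain ASCII letters, digits, punctuation and whitespace is deleted. This is a sketch, not a drop-in fix: strip_non_ascii is a hypothetical helper name, and the pattern will also delete legitimate accented letters. Negating [:alnum:] alone would remove spaces and commas as well, which is probably not what you want.

```shell
# strip_non_ascii is a made-up helper name for this sketch, not part of sed.
# LC_ALL=C makes sed treat input as single bytes, so the two bytes of a
# UTF-8 "œ" fall outside alnum/punct/space and are removed.
strip_non_ascii() {
  LC_ALL=C sed 's/[^[:alnum:][:punct:][:space:]]//g'
}

# Try it on a sample line before applying it to files.
printf 'Best Regards,œ\n' | strip_non_ascii
```

The same expression can be combined with -i.bak and a glob once the output on a sample line looks correct.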

Upvotes: 2
