Aa me

Reputation: 52

Removing punctuation using sed

I am trying to write a script that removes punctuation from a text file.

I tried using sed, but I am open to other suggestions (such as awk).

This is my code so far:

declare -a marks=('\.' '\,' '\;' '\:')

for i in {0..3}
do
    sed -i 's/${marks[i]}//g' test.txt
done
cat test.txt

I think my main problem is that I am not using escape characters correctly.

Upvotes: 0

Views: 297

Answers (2)

Edouard Thiel

Reputation: 6228

The command tr is great for that:

tr -d '[:punct:]' < test.txt > tmp.txt && mv -f tmp.txt test.txt

-d stands for delete.
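To see the effect before touching a real file, here is a minimal sketch that runs the same tr invocation on a string. The sample text and the variable names sample and cleaned are made up for illustration.

```shell
# Hypothetical sample input, not taken from the question's test.txt.
sample='Hello, world; this: is a test.'

# tr reads stdin and deletes every character in the [:punct:] class.
cleaned=$(printf '%s\n' "$sample" | tr -d '[:punct:]')

printf '%s\n' "$cleaned"
```

Note that [:punct:] covers more than the four marks in the question: apostrophes, hyphens, quotes and brackets are deleted as well.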

The temporary file tmp.txt should not already exist; mktemp -u is one way to generate an unused temporary file name.

Here is a small script which removes any punctuation in the files passed as arguments:

#! /bin/bash
t=$(mktemp -u)
for f ; do
    tr -d '[:punct:]' < "$f" > "$t" && mv -f "$t" "$f"
done

for f is a shortcut for for f in "$@", which iterates over each argument without word splitting.
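A small sketch of that shortcut, using a hypothetical function show_args so it can be tried without creating any files: each argument arrives intact, even when it contains a space.

```shell
# show_args is an illustrative helper, not part of the script above.
show_args() {
    # With no "in ..." list, for iterates over the positional parameters.
    for f; do
        printf '<%s>\n' "$f"
    done
}

# The first argument contains a space and is still treated as one item.
out=$(show_args "my file.txt" other.txt)
printf '%s\n' "$out"
```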

Upvotes: 3

Shawn

Reputation: 52549

Using ed instead:

printf "%s\n" 'g/[[:punct:]]/s/[[:punct:]]//g' w | ed -s test.txt

This removes all punctuation characters from the file and writes the remaining text back to it.
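Since the question started from sed, the same bracket expression works there too. In the question's loop the sed script is in single quotes, so the shell never expands ${marks[i]} and sed receives that text literally. The sketch below avoids the loop altogether; the sample string and variable names are made up for illustration.

```shell
# Hypothetical sample input.
sample='One, two; three: four! (five).'

# Remove every punctuation character with a POSIX character class.
cleaned=$(printf '%s\n' "$sample" | sed 's/[[:punct:]]//g')

# Remove only the four marks listed in the question. Inside a bracket
# expression they need no backslash escapes.
only_four=$(printf '%s\n' "$sample" | sed 's/[.,;:]//g')

printf '%s\n%s\n' "$cleaned" "$only_four"
```

To edit a file in place as in the question, the same expressions can be passed to sed -i with GNU sed.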

Upvotes: 2
