Reputation: 53
The purpose of the program is to remove punctuation from text. It also accepts a -c option
to remove characters of your choice instead.
#!/bin/bash
old_IFS=$IFS
IFS=$’\n’
if [ “$1” == “-c” ];then
if [ -f $2 ];then
for line in $(<$2)
do
echo $line | tr -d $3
done
IFS=$old_IFS
else
echo $2 | tr -d $3
fi
else
if [ -f $1 ];then
for line in $(cat $1)
do
echo $line | tr -d '[:punct:]'
done
IFS=$old_IFS
else
echo $1 | tr -d '[:punct:]'
fi
fi
And the text file is:
"Twaddle!", you say? I’ll have you know
there’s a {deep} truth
in what I said.
If I just want to remove punctuation, the result is:
Twaddle you say Ill have you k
iheres a deep truth
what I said
Other characters are lost, such as the "now" in "know". Can anyone find what the problem is?
Upvotes: 0
Views: 1252
Reputation: 113864
The difficulties that you are having are due to the use of non-ASCII characters. In particular, look at:
IFS=$’\n’
That line does not work as intended because those are not normal ASCII single quotes. The result is that the character n, among others, ends up in the variable IFS. This causes word splitting on n, which is why the n disappears out of know.
Use instead:
IFS=$'\n'
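To see the difference for yourself, here is a minimal sketch that splits the same text with both assignments. The names bad, good and count_fields are only illustrative and are not part of the original script. It assumes bash and a script file saved as UTF-8.

```shell
#!/bin/bash
# Typographic quotes: bash sees no quoting here, so "$" and the curly quotes
# stay literal and the backslash merely escapes the letter n.
bad=$’\n’
# ASCII quotes: ANSI-C quoting, which yields a single newline character.
good=$'\n'

# count_fields IFS_VALUE TEXT: print how many fields TEXT splits into.
count_fields() {
    local IFS="$1"
    set -f        # no filename expansion while splitting
    set -- $2     # unquoted on purpose, so word splitting on IFS happens
    set +f
    echo "$#"
}

count_fields "$bad" "I know"     # prints 2: the text is split at the letter n
count_fields "$good" "I know"    # prints 1: no newline in the text, so no split
```

With the typographic quotes, n is one of the separator characters, so every n in the input is swallowed as a field boundary.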
The double-quotes are also nonstandard and should be replaced with ASCII double-quotes. In particular, this line:
if [ “$1” == “-c” ];then
should be replaced with:
if [ "$1" == "-c" ];then
The script's logic can be rearranged and simplified:
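#!/bin/bash
# Default: delete every punctuation character.
remove='[:punct:]'
if [ "$1" == "-c" ]
then
    # With -c, the third argument lists the characters to delete.
    remove=$3
    shift    # the file name or text is now $1 in both cases
fi
if [ -f "$1" ]
then
    tr -d "$remove" <"$1"
else
    echo "$1" | tr -d "$remove"
fi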
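As a quick way to try that logic without creating a script file, it can be wrapped in a function. This is only a sketch: the function name strip_chars is made up for illustration, and it keeps the argument order of the original script (-c, then the text or file name, then the characters to delete).

```shell
#!/bin/bash
# strip_chars [-c] TEXT_OR_FILE [CHARS]
# Hypothetical wrapper around the simplified logic above.
strip_chars() {
    local remove='[:punct:]'      # default: delete all punctuation
    if [ "$1" = "-c" ]; then
        remove=$3                 # characters to delete instead
        shift                     # text or file name is now $1
    fi
    if [ -f "$1" ]; then
        tr -d "$remove" <"$1"
    else
        printf '%s\n' "$1" | tr -d "$remove"
    fi
}

strip_chars 'a,b.c!'          # prints abc
strip_chars -c 'banana' 'an'  # prints b
```

Because tr reads the file directly, there is no loop and no need to touch IFS at all.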
Upvotes: 1
Reputation: 10039
#!/bin/bash
if [ "$1" = '-c' ]
then
Pattern="$( echo "$3" | sed 's/[]\[&\\{}()"]/\\&/g' )"
File="$2"
else
Pattern="[[:punct:]]"
File="$1"
fi
sed -i "s/${Pattern}//g" "${File}"
This uses sed, with some limited escaping of special characters in the pattern given to -c. Note that sed -i edits the file in place.
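One difference from the tr versions is worth checking before relying on this: tr -d treats its argument as a set of characters, while an unbracketed sed pattern is matched as a sequence. A small sketch of that difference follows. The helper names delete_set and delete_seq are invented for illustration, and the sed helper assumes the argument contains no regex metacharacters or slashes.

```shell
#!/bin/bash
# delete_set TEXT CHARS: remove every occurrence of each character in CHARS.
delete_set() { printf '%s\n' "$1" | tr -d "$2"; }

# delete_seq TEXT STRING: remove every occurrence of STRING as a whole.
# Assumption: STRING has no regex metacharacters and no "/".
delete_seq() { printf '%s\n' "$1" | sed "s/$2//g"; }

delete_set banana an    # prints b
delete_seq banana an    # prints ba
```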
Upvotes: 0