read file line by line and remove characters shell script

Question

The program removes punctuation from text. It also accepts an option -c to remove characters of your choosing instead.

#!/bin/bash
old_IFS=$IFS
IFS=$’\n’
if [ “$1” == “-c” ];then
 if [ -f $2 ];then
  for line in $(<$2)
  do
   echo $line | tr -d $3
  done
  IFS=$old_IFS
 else
  echo $2 | tr -d $3
 fi
else
 if [ -f $1 ];then
  for line in $(cat $1)
  do
   echo $line | tr -d '[:punct:]'
  done
  IFS=$old_IFS
 else
   echo $1 | tr -d '[:punct:]'
 fi
fi

And the text file is:

"Twaddle!", you say?  I’ll have you know
there’s a {deep} truth
in what I said.

If I just want to remove punctuation, the result is:

Twaddle you say  Ill have you k
iheres a deep truth
 what I said

Other characters are lost too, such as the "now" in "know". Can anyone find what the problem is?

John1024 · Accepted Answer

The difficulties that you are having are due to the use of non-ASCII characters. In particular, look at:

IFS=$’\n’

That line does not work as intended because those are typographic (curly) quotes, not normal ASCII single quotes. The shell treats them as ordinary characters rather than as quoting, so the character n, among others, ends up in the variable IFS. This causes word splitting on n, which is why the n disappears from know.

Use instead:

IFS=$'\n'
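The effect of the two quote styles on word splitting can be checked with a small sketch. It assumes the original assignment contained a backslash-n escape between the quotes; split_with is a hypothetical helper written for this illustration, not part of the script in the question.

```shell
#!/bin/bash
# split_with (hypothetical helper): word-split the string "know" under the
# IFS value passed as $1 and join the resulting fields with "|".
split_with() {
    local IFS=$1
    local word=know
    local field out
    set -- $word            # unquoted on purpose so that IFS splitting applies
    out=$1
    shift
    for field in "$@"; do
        out="$out|$field"
    done
    echo "$out"
}

# Curly quotes are ordinary characters, so \n is an unquoted escape that
# collapses to a plain "n": the value is the four characters $ ’ n ’.
bad=$’\n’
# ASCII quotes form the $'...' syntax, so this is a single newline.
good=$'\n'

bad_result=$(split_with "$bad")
good_result=$(split_with "$good")
echo "curly quotes: $bad_result"
echo "ASCII quotes: $good_result"
```

With the curly-quoted value, "know" is split at the n into two fields; with the ASCII-quoted value it stays in one piece.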

The double-quotes are also nonstandard and should be replaced with ASCII double-quotes. In particular, this line:

if [ “$1” == “-c” ];then

should be replaced with:

if [ "$1" == "-c" ];then

Alternative script

The script's logic can be rearranged and simplified:

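#!/bin/bash
# Default: delete punctuation. With -c, delete the characters given in $3.
remove='[:punct:]'
if [ "$1" == "-c" ]
then
    remove=$3
    shift
fi
# After the optional shift, $1 is either a file name or literal text.
if [ -f "$1" ]
then
  tr -d "$remove" <"$1"
else
  echo "$1" | tr -d "$remove"
fi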
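If the input really does need to be handled one line at a time, a while read loop is a common alternative to looping over $(<file), because it does not depend on IFS word splitting at all. The following is only a sketch of that idiom, not the answer's method; strip_lines is a hypothetical function name, and the sample text is ASCII-only for simplicity.

```shell
#!/bin/bash
# strip_lines (hypothetical helper): read stdin line by line and delete the
# characters named in $1, or punctuation when no argument is given.
strip_lines() {
    local remove='[:punct:]'
    local line
    if [ $# -gt 0 ]; then
        remove=$1
    fi
    # IFS= preserves leading and trailing whitespace; -r keeps backslashes
    # literal. The extra test handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line" | tr -d "$remove"
    done
}

result=$(printf '%s\n' '"Twaddle!", you say?' 'there is a {deep} truth' | strip_lines)
printf '%s\n' "$result"

# Passing an argument plays the role of the -c option.
custom=$(printf 'know\n' | strip_lines n)
printf '%s\n' "$custom"
```

For whole files, running tr once on the file as in the script above is simpler and faster; the loop is only worthwhile when each line needs additional handling.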
