Junipar70

Reputation: 3

Need to clean a file using sed or grep

I have a file with the following content:

more NotRequired.txt
[abc-xyz_pqr-pe2_123]
[lon-abc-tkt_1202]
[wat-7600-1_414]
[indo-pak_isu-5_761]

I am reading the file above and want to remove every line it lists from Need2Clean???.txt. I have tried both sed and grep, without success.

myFile="NotRequired.txt"
while IFS= read -r HKline
do
  sed -i '/$HKline/d' Need2CleanSED.txt
done < "$myFile"


myFile="NotRequired.txt"
while IFS= read -r HKline
do
  grep -vE \"$HKline\" Need2CleanGRP.txt > Need2CleanGRP.txt
done < "$myFile"

It looks as if the variable and the [] characters are causing the problem.

Upvotes: 1

Views: 275

Answers (3)

user7712945

Reputation:

Try GNU sed:

sed -Ez 's/\n/\|/g;s!\[!\\[!g;s!\]!\\]!g; s!(.*).!/\1/d!' NotRequired.txt| sed -Ef - Need2CleanSED.txt

Two sed processes are chained by a shell pipe. The first slurps NotRequired.txt all at once (sed -z), replaces each \n with |, and escapes the [ and ] metacharacters as \[ and \]. The second process then uses that output as its regex script against the input file, i.e. Need2CleanSED.txt. Output of the first process:

/\[abc-xyz_pqr-pe2_123\]|\[lon-abc-tkt_1202\]|\[wat-7600-1_414\]|\[indo-pak_isu-5_761\]/d

Add the -u (unbuffered) option if you want sed to avoid batching its I/O and work more like direct I/O.
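
To inspect the generated script before applying it, the pipeline can be split in two. This is only a sketch, assuming GNU sed (for -z and -E); the contents of Need2CleanSED.txt and the names delete.sed and cleaned.txt are made up for illustration, since the question does not show them.

```shell
#!/bin/sh
# Work in a scratch directory so no real file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two of the patterns from the question.
cat > NotRequired.txt <<'EOF'
[abc-xyz_pqr-pe2_123]
[lon-abc-tkt_1202]
EOF

# Hypothetical file to clean.
cat > Need2CleanSED.txt <<'EOF'
[abc-xyz_pqr-pe2_123]
keep = 1
[lon-abc-tkt_1202]
[keep-me_42]
EOF

# Step 1: build the sed script and save it (delete.sed is a made-up name).
sed -Ez 's/\n/\|/g;s!\[!\\[!g;s!\]!\\]!g; s!(.*).!/\1/d!' NotRequired.txt > delete.sed

# Step 2: apply the saved script, writing to a new file instead of editing in place.
sed -E -f delete.sed Need2CleanSED.txt > cleaned.txt

cat cleaned.txt
```

Looking at delete.sed first makes it easier to spot a pattern that was escaped incorrectly before any data is removed.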

Upvotes: 0

Ed Morton

Reputation: 204015

What you're doing is extremely inefficient and error-prone. Just do this:

grep -vF -f NotRequired.txt Need2CleanGRP.txt > tmp &&
mv tmp Need2CleanGRP.txt

Thanks to grep -F the above treats each line of NotRequired.txt as a string rather than a regexp so you don't have to worry about escaping RE metachars like [ and you don't need to wrap it in a shell loop - that one command will remove all undesirable lines in one execution of grep.

Never do command file > file, by the way: the shell sets up the > file redirection first, which empties file before command gets a chance to read it. Always do command file > tmp && mv tmp file instead.
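
A self-contained way to try this on throwaway data is sketched below. The patterns come from the question; the contents of Need2CleanGRP.txt are invented for illustration, since the question does not show them.

```shell
#!/bin/sh
# Work in a scratch directory so no real file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Strings to drop, one per line (two of the entries from the question).
cat > NotRequired.txt <<'EOF'
[abc-xyz_pqr-pe2_123]
[lon-abc-tkt_1202]
EOF

# Hypothetical file to clean.
cat > Need2CleanGRP.txt <<'EOF'
[abc-xyz_pqr-pe2_123]
keep = 1
[lon-abc-tkt_1202]
[keep-me_42]
EOF

# -F: fixed strings, -v: print non-matching lines, -f: read patterns from a file.
# Write to a temporary file and rename it only if grep succeeded.
grep -vF -f NotRequired.txt Need2CleanGRP.txt > tmp &&
mv tmp Need2CleanGRP.txt

cat Need2CleanGRP.txt
```

Two things worth keeping in mind: -F matches substrings, so a line only needs to contain one of the strings to be dropped, and an empty line in NotRequired.txt acts as an empty pattern that matches every line.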

Upvotes: 3

Jack

Reputation: 6168

Your assumption is correct. The [...] construct matches any single character in that set, so you have to preface ("escape") the brackets with \. The easiest way is to do that in your original file:

sed -i -e 's:\[:\\[:' -e 's:\]:\\]:' "${myFile}"

If you don't like that, you can probably run the sed command at the point where the file is fed into the loop (this uses bash process substitution):

done < <(sed -e 's:\[:\\[:' -e 's:\]:\\]:' "${myFile}")

Finally, you can use sed on each HKline variable:

HKline=$( printf '%s\n' "$HKline" | sed -e 's:\[:\\[:' -e 's:\]:\\]:' )
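
Put together, the per-line variant might look like the sketch below. It makes a few assumptions: the contents of Need2CleanSED.txt are invented, the escaped variable is a made-up name, and only [ and ] are escaped, so patterns containing other special characters such as ., * or / would need more work. It also writes to a temporary file instead of using sed -i, and puts the sed expression in double quotes, because single quotes stop the shell from expanding the variable.

```shell
#!/bin/sh
# Work in a scratch directory so no real file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two of the patterns from the question.
cat > NotRequired.txt <<'EOF'
[abc-xyz_pqr-pe2_123]
[lon-abc-tkt_1202]
EOF

# Hypothetical file to clean.
cat > Need2CleanSED.txt <<'EOF'
[abc-xyz_pqr-pe2_123]
keep = 1
[lon-abc-tkt_1202]
[keep-me_42]
EOF

myFile="NotRequired.txt"
while IFS= read -r HKline
do
  # Escape [ and ] so sed matches them literally instead of as a bracket expression.
  escaped=$(printf '%s\n' "$HKline" | sed -e 's:\[:\\[:g' -e 's:]:\\]:g')
  # Double quotes let the shell substitute $escaped into the sed address.
  sed "/$escaped/d" Need2CleanSED.txt > tmp && mv tmp Need2CleanSED.txt
done < "$myFile"

cat Need2CleanSED.txt
```

This still runs one sed process per pattern, so it only makes sense for short pattern files.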

Upvotes: 0
