Reputation: 3
I have these files:
NotRequired.txt (contains the lines that need to be removed)
Need2CleanSED.txt (big file, needs cleaning)
Need2CleanGRP.txt (big file, needs cleaning)
Content:
more NotRequired.txt
[abc-xyz_pqr-pe2_123]
[lon-abc-tkt_1202]
[wat-7600-1_414]
[indo-pak_isu-5_761]
I am reading the file above and want to remove those lines from the Need2Clean???.txt files. I have tried sed and grep, without success.
myFile="NotRequired.txt"
while IFS= read -r HKline
do
sed -i '/$HKline/d' Need2CleanSED.txt
done < "$myFile"
myFile="NotRequired.txt"
while IFS= read -r HKline
do
grep -vE \"$HKline\" Need2CleanGRP.txt > Need2CleanGRP.txt
done < "$myFile"
It looks as if the variable and the [ ] characters are causing the problem.
Upvotes: 1
Views: 275
Reputation:
Try GNU sed:
sed -Ez 's/\n/\|/g;s!\[!\\[!g;s!\]!\\]!g; s!(.*).!/\1/d!' NotRequired.txt| sed -Ef - Need2CleanSED.txt
Two sed processes are chained by a shell pipe. The first slurps NotRequired.txt all at once (sed -z), replaces each \n with |, and escapes the [ and ] metacharacters as \[ and \]. The second process uses the result as its regex script for the input file, i.e. Need2CleanSED.txt. Output of the first process:
/\[abc-xyz_pqr-pe2_123\]|\[lon-abc-tkt_1202\]|\[wat-7600-1_414\]|\[indo-pak_isu-5_761\]/d
Add the -u (unbuffered) option if you want sed to flush its output more often instead of writing it in batches.
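A minimal sketch of the two stages run separately, so the generated script can be inspected before it is applied. The scratch directory, the "keep me" lines, and the variable names script and cleaned are assumptions made up for illustration; GNU sed is assumed because of -z and -f -.

```shell
# Work in a throwaway directory so no real files are touched (illustrative setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two of the patterns from the question, plus hypothetical data to clean.
printf '%s\n' '[abc-xyz_pqr-pe2_123]' '[lon-abc-tkt_1202]' > NotRequired.txt
printf '%s\n' 'keep me 1' 'x [abc-xyz_pqr-pe2_123] y' 'keep me 2' '[lon-abc-tkt_1202]' > Need2CleanSED.txt

# Stage 1: turn the pattern file into a single "/a|b/d" sed command.
script=$(sed -Ez 's/\n/\|/g;s!\[!\\[!g;s!\]!\\]!g; s!(.*).!/\1/d!' NotRequired.txt)
printf '%s\n' "$script"

# Stage 2: feed that command to a second sed as its script (-f - reads it from stdin).
cleaned=$(printf '%s\n' "$script" | sed -Ef - Need2CleanSED.txt)
printf '%s\n' "$cleaned"
```

Note that the second sed only prints the cleaned text; Need2CleanSED.txt itself is not modified unless -i is added. Only [ and ] are escaped here, so patterns containing other regex metacharacters (or a /) would need extra handling.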
Upvotes: 0
Reputation: 204015
What you're doing is extremely inefficient and error prone. Just do this:
grep -vF -f NotRequired.txt Need2CleanGRP.txt > tmp &&
mv tmp Need2CleanGRP.txt
Thanks to grep -F, the above treats each line of NotRequired.txt as a string rather than a regexp, so you don't have to worry about escaping RE metachars like [, and you don't need to wrap it in a shell loop: that one command removes all undesirable lines in a single execution of grep.
Never do command file > file, by the way: the shell sets up the > file redirection, which empties file, before command gets a chance to read it! Always do command file > tmp && mv tmp file instead.
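A minimal, self-contained sketch of the same command on made-up sample data. The scratch directory, the "keep me" lines, and the result variable are assumptions for illustration, not part of the original files.

```shell
# Work in a throwaway directory so no real files are touched (illustrative setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Patterns to remove, one fixed string per line (two taken from the question).
printf '%s\n' '[abc-xyz_pqr-pe2_123]' '[lon-abc-tkt_1202]' > NotRequired.txt

# Hypothetical contents standing in for the big file.
printf '%s\n' 'keep me 1' 'x [abc-xyz_pqr-pe2_123] y' 'keep me 2' '[lon-abc-tkt_1202]' > Need2CleanGRP.txt

# -F: fixed strings, -v: print non-matching lines, -f: read patterns from a file.
grep -vF -f NotRequired.txt Need2CleanGRP.txt > tmp &&
mv tmp Need2CleanGRP.txt

result=$(cat Need2CleanGRP.txt)
printf '%s\n' "$result"
```

One caveat: an empty line in NotRequired.txt acts as an empty pattern that matches every line, so the output would be empty. Also, because grep exits non-zero when it prints nothing, the && leaves the original file untouched in that case.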
Upvotes: 3
Reputation: 6168
Your assumption is correct. The [...] construct matches any single character in that set, so you have to preface ("escape") the brackets with \. The easiest way is to do that in your original file:
sed -i -e 's:\[:\\[:' -e 's:\]:\\]:' "${myFile}"
If you don't like that, you can probably apply the sed command at the point where the file is fed into the loop (this form needs bash process substitution):
done < <(sed -e 's:\[:\\[:' -e 's:\]:\\]:' "${myFile}")
Finally, you can run sed on each HKline variable (quoted, so the shell does not treat the brackets as a glob):
HKline=$( printf '%s\n' "$HKline" | sed -e 's:\[:\\[:' -e 's:\]:\\]:' )
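Putting the pieces together, here is a sketch of the question's loop with the per-line escaping applied. It also switches the sed script to double quotes, because inside single quotes the shell never expands $HKline and sed sees the literal text. The scratch directory, the "keep me" lines, and the names escaped and result are assumptions for illustration; sed -i without a suffix assumes GNU sed.

```shell
# Work in a throwaway directory so no real files are touched (illustrative setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two patterns from the question, plus hypothetical data to clean.
printf '%s\n' '[abc-xyz_pqr-pe2_123]' '[lon-abc-tkt_1202]' > NotRequired.txt
printf '%s\n' 'keep me 1' 'x [abc-xyz_pqr-pe2_123] y' 'keep me 2' '[lon-abc-tkt_1202]' > Need2CleanSED.txt

myFile="NotRequired.txt"
while IFS= read -r HKline
do
    # Escape [ and ] so sed treats them literally (other metacharacters are not handled).
    escaped=$(printf '%s\n' "$HKline" | sed -e 's:\[:\\[:g' -e 's:\]:\\]:g')
    # Double quotes let the shell expand $escaped inside the sed script.
    sed -i "/$escaped/d" Need2CleanSED.txt
done < "$myFile"

result=$(cat Need2CleanSED.txt)
printf '%s\n' "$result"
```

This still runs sed once per pattern and rewrites the big file each time, so it is best suited to short pattern lists.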
Upvotes: 0