Lastwish

Reputation: 327

Batch Script - Deleting specific lines from a text file

Suppose there are two files, Temp1.txt and Temp2.txt, containing the data below:

Temp1.txt:

xxxx xxxxx xxxxxxxx xxxxx xxxxx
yyyyy yyyy yyy yyyyyyy yyyy yyy
zz zzzzz zz zzzz zzz zzz zz z z

Temp2.txt :

xxxx xxxxx xxxxxxxx xxxxx xxxxx
zz zzzzz zz zzzz zzz zzz zz z z
aaaa aa aaaa aa aaaaa aaa aaaaaa

The requirement is to delete from Temp1 the lines that match lines in Temp2, and preferably save the result to a different file. The output should look like this:

Temp.txt :

yyyyy yyyy yyy yyyyyyy yyyy yyy

This is what I have so far:

@echo off
SETLOCAL ENABLEDELAYEDEXPANSION
FOR /F "Delims=" %%A IN ('type "Temp2.txt"') DO (
    SET STRING=%%A
    FINDSTR /V /C:%STRING% "Temp1.txt" > Temp.txt
)

But I think this code will keep the matching data instead of deleting it. It needs correction.

Upvotes: 1

Views: 3888

Answers (2)

dbenham

Reputation: 130809

FINDSTR by itself ought to be a great solution. Reading the documentation, one would think the following literal search should work.

findstr /vlxg:"temp2.txt" "temp1.txt" >temp.txt

But several FINDSTR bugs and limitations prevent the above from being reliable.

The solution is to do a regular expression search instead. But this requires escaping the regular expression meta characters within temp2.txt. This is a perfect task for my JREPL.BAT regular expression find/replace utility. JREPL.BAT is a hybrid JScript/batch script that runs natively on any Windows machine from XP onward.

jrepl "[.*^$[\\]" "\$&" /f "temp2.txt"|findstr /rvxg:/ "temp1.txt" >"temp.txt"

The above works as follows:

  • The JREPL command escapes meta characters within temp2.txt and the output is piped to FINDSTR
  • The FINDSTR /R option treats all search strings as regular expressions
  • The /V option causes matching lines to be suppressed, and non-matching lines are printed
  • The /X option means a search string must match the entire line
  • The /G:/ option instructs FINDSTR to read the search strings from stdin (the pipe)
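To make the escaping step concrete, here is a minimal Python sketch of what the JREPL call does to each line of temp2.txt: every character from the set `. * ^ $ [ \` gets a backslash in front of it. This is only an illustration of the transformation, not a replacement for JREPL; the function name `escape_for_findstr` is made up for this example.

```python
import re

# Same set of characters as the JREPL search pattern: . * ^ $ [ and backslash
META = re.compile(r"[.*^$\[\\]")


def escape_for_findstr(line: str) -> str:
    """Put a backslash in front of each character matched by META."""
    return META.sub(lambda m: "\\" + m.group(0), line)


if __name__ == "__main__":
    # Hypothetical input line, not taken from the question's data
    print(escape_for_findstr("price is $5.00 [approx]"))
```

Lines without any of those characters pass through unchanged, which is why the sample data in the question would look the same before and after this step.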

The JREPL | FINDSTR solution has the following limitations, all due to FINDSTR behavior:

  • All lines in temp2.txt must be <= 511 characters, even after the meta characters have been escaped
  • All lines in temp1.txt must be terminated by \r\n (carriage return linefeed)
  • \r must not appear anywhere within temp1.txt other than at the end of a line.

The limitations can be eliminated and the solution is much simpler if you download GNU grep for Windows, a port of the standard Unix utility.

grep -x -v -F -f "temp2.txt" "temp1.txt" >"temp.txt"
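For readers unfamiliar with those grep options, the combination amounts to "keep every line of temp1.txt that is not, character for character, one of the lines in temp2.txt". A rough Python sketch of that logic, using the sample data from the question, might look like this (the function name `lines_not_in` is hypothetical, and the lists stand in for the file contents):

```python
def lines_not_in(source_lines, exclude_lines):
    """Return the source lines that do not appear verbatim in exclude_lines.

    Whole-line comparison corresponds to -x, literal comparison to -F,
    and keeping the non-matching lines to -v.
    """
    exclude = set(exclude_lines)
    return [line for line in source_lines if line not in exclude]


# Sample data from the question
temp1 = [
    "xxxx xxxxx xxxxxxxx xxxxx xxxxx",
    "yyyyy yyyy yyy yyyyyyy yyyy yyy",
    "zz zzzzz zz zzzz zzz zzz zz z z",
]
temp2 = [
    "xxxx xxxxx xxxxxxxx xxxxx xxxxx",
    "zz zzzzz zz zzzz zzz zzz zz z z",
    "aaaa aa aaaa aa aaaaa aaa aaaaaa",
]

if __name__ == "__main__":
    print("\n".join(lines_not_in(temp1, temp2)))
```

The order of temp1.txt is preserved, and lines that exist only in temp2.txt (the "aaaa" line here) have no effect on the result.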

Upvotes: 2

Stephan

Reputation: 56155

You don't even need a script for this. It's a single command:

findstr /x /v /G:temp2.txt temp1.txt >temp.txt

/x compares whole lines

/v prints only lines that do NOT match

/g reads the search strings from a file (temp2.txt)

Upvotes: 2
