Reputation: 339
I want to delete the lines in FILE1 that contain a pattern listed in FILE2.
How do I do this using shell/bash or Tcl?
For example:
FILE1:
This is ECO_01
This is ECO_02
This is ECO_03
This is ECO_04
FILE2:
ECO_02
ECO_04
Output:
This is ECO_01
This is ECO_03
Upvotes: 4
Views: 2097
Reputation: 247210
Another Tcl solution:
set fid [open file2 r]
set patterns [lmap line [split [read -nonewline $fid] \n] {string trim $line}]
close $fid
set fid [open file1 r]
set lines [split [read -nonewline $fid] \n]
close $fid
set wanted [lsearch -inline -all -regexp -not $lines [join $patterns "|"]]
puts [join $wanted \n]
This is ECO_01
This is ECO_03
Ref: lsearch man page
Upvotes: 0
Reputation: 137787
In Tcl, you'd load the file of patterns in and use them to then do the filtering. It's probably simplest to keep the main filtering flow going from standard input to standard output; you can redirect those from/to files easily enough. Since you seem to be wanting to use “is pattern a substring of” as a matching rule, you can do that with string first, leading to this code:
# Load in the patterns from the file named by the first argument
set f [open [lindex $argv 0]]
set patterns [split [string trimright [read $f] \n] \n]
close $f
# Factor out the actual matching
proc matches {theString} {
    global patterns
    foreach pat $patterns {
        # Change the next line to use other matching rules
        if {[string first $pat $theString] >= 0} {
            return true
        }
    }
    return false
}
# Read all input lines and print all non-matching lines
while {[gets stdin line] >= 0} {
    # Call the proc by its defined name, matches
    if {![matches $line]} {
        puts $line
    }
}
I find it helpful to factor out procedures with key bits like “does this line match any of my patterns?” You'd probably call the above code a bit like this:
tclsh doFiltering.tcl patterns.txt <input.txt >output.txt
Upvotes: 2
Reputation: 2129
You can use the sed command, as shown below, to delete the matching lines from FILE1. Each line of FILE2 is used as a sed pattern, and FILE1 is edited in place once per pattern.
macOS:
for i in `cat FILE2.txt`
do
sed -i '' "/$i/d" FILE1.txt
done
Linux:
for i in `cat FILE2.txt`
do
sed -i "/$i/d" FILE1.txt
done
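A possible variation, offered as a sketch rather than a replacement: writing to a temporary file avoids the differing sed -i syntax between macOS and Linux, and a while read loop avoids word splitting on the pattern lines. The scratch directory, the sample data, and the FILE1.tmp name are assumptions for illustration. Each pattern is still interpreted as a regular expression, so a pattern containing / or other special characters would need escaping first.

```shell
# Work in a scratch directory with the sample data from the question (assumed for this demo)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'This is ECO_01' 'This is ECO_02' 'This is ECO_03' 'This is ECO_04' > FILE1.txt
printf '%s\n' 'ECO_02' 'ECO_04' > FILE2.txt

# Delete matching lines one pattern at a time, without relying on sed -i
while IFS= read -r pat; do
    [ -n "$pat" ] || continue    # skip blank lines, which would give sed an empty regex
    sed "/$pat/d" FILE1.txt > FILE1.tmp && mv FILE1.tmp FILE1.txt
done < FILE2.txt

# Show what is left in FILE1.txt
result=$(cat FILE1.txt)
printf '%s\n' "$result"
```

This still rewrites FILE1.txt once per pattern, so for a long pattern list a single-pass tool is likely to be faster.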
Upvotes: 0
Reputation: 67567
The most generic solution is:
$ grep -vf file2 file1
Note that a substring match anywhere in the line counts. To restrict this to an exact match on a specific field (here assumed to be the last), use:
$ awk 'NR==FNR{a[$1]; next} !($NF in a)' file2 file1
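To make the difference between the two commands concrete, here is a small sketch run on the question's sample data plus one extra line, This is ECO_021, which is an assumption added for illustration. It also uses grep -F, which treats each line of file2 as a literal string rather than a regular expression; that flag is a suggested variation, not part of the command above.

```shell
# Scratch directory and sample files (the ECO_021 line is a made-up extra case)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'This is ECO_01' 'This is ECO_02' 'This is ECO_03' 'This is ECO_04' 'This is ECO_021' > file1
printf '%s\n' 'ECO_02' 'ECO_04' > file2

# -F: patterns are literal strings; -v: keep non-matching lines; -f: read patterns from file2
substring_result=$(grep -vFf file2 file1)

# Keep a line only if its last field is not exactly one of the entries in file2
field_result=$(awk 'NR==FNR{a[$1]; next} !($NF in a)' file2 file1)

printf 'substring match:\n%s\n' "$substring_result"
printf 'exact last field:\n%s\n' "$field_result"
```

With substring matching, ECO_02 also removes the ECO_021 line; with the exact field comparison, that line is kept.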
Upvotes: 4