nEJC

Reputation: 555

grep redirect non-matching

I'm doing a simple grep for lines starting with some pattern, like:

grep -E "^AAA" myfile > newfile

I would also like to redirect the non-matching lines to another file in the same pass.
I know I could simply run grep twice and use -v the second time, but the files are relatively large, and reading them only once would save valuable time.

I was thinking of something along the lines of redirecting the non-matching lines to stderr, like:

grep -E -magic_switch "^AAA" myfile > newfile 2> newfile.nonmatch

Is this somehow possible with grep, or should I just code it myself?

(It may be relevant that I'm doing this in a bash script.)

Upvotes: 5

Views: 1013

Answers (5)

apottere

Reputation: 303

You can use process substitution to duplicate the stream as the file is being read (inspired by https://unix.stackexchange.com/a/71511). This should be almost as fast as a single grep, since the file is still read only once.

Something like this should work:

tee >(grep 'pattern' > matches.txt) < file.txt | grep -v 'pattern' > non-matches.txt
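
One caveat with the process substitution above: bash does not necessarily wait for the `>(...)` process, so matches.txt may still be incomplete when the pipeline returns. A minimal sketch of a variant that swaps the process substitution for a named pipe plus `wait`, so both output files are finished before they are used. The file names and sample lines are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: file names and sample data are hypothetical.
workdir=$(mktemp -d)
printf '%s\n' 'AAA one' 'BBB two' 'AAA three' 'CCC four' > "$workdir/input.txt"

# A named pipe carries a second copy of the stream to the matching grep.
mkfifo "$workdir/copy"
grep -E '^AAA' < "$workdir/copy" > "$workdir/matches.txt" &

# tee reads the input once, feeding both the FIFO and the non-matching grep.
tee "$workdir/copy" < "$workdir/input.txt" | grep -Ev '^AAA' > "$workdir/non-matches.txt"

# Wait for the background grep so matches.txt is complete before it is read.
wait

matches=$(cat "$workdir/matches.txt")
nonmatches=$(cat "$workdir/non-matches.txt")
printf 'matches:\n%s\nnon-matches:\n%s\n' "$matches" "$nonmatches"
rm -rf "$workdir"
```

The input is still read only once; the cost is one extra FIFO and a background job to keep track of.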

Upvotes: 2

Zombo

Reputation: 1

Here is a function for you:

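function perg {
  # Matching lines go to stdout, all others to stderr.
  # The redirection target is parenthesized because unparenthesized
  # concatenation after ">" is ambiguous in awk.
  awk '{y = $0~z ? "out" : "err"; print > ("/dev/std" y)}' z="$1" "$2"
}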

Use it with a file

perg '^AAA' myfile > newfile 2> newfile.nonmatch

or from a pipe

cat myfile | perg '^AAA' > newfile 2> newfile.nonmatch

Upvotes: 0

Dennis Williamson

Reputation: 360535

This will work:

awk '/pattern/ {print; next} {print > "/dev/stderr"}' inputfile

or

awk -v matchfile=/path/to/file1 -v nomatchfile=/path/to/file2 '/pattern/ {print > matchfile; next} {print > nomatchfile}' inputfile

or

#!/usr/bin/awk -f
BEGIN {
    pattern     = ARGV[1]
    matchfile   = ARGV[2]
    nomatchfile = ARGV[3]
    for (i=1; i<=3; i++) delete ARGV[i]
}

$0 ~ pattern {
    print > matchfile
    next
}

{
    print > nomatchfile
}

Call the last one like this:

./script.awk regex outputfile1 outputfile2 inputfile
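
Since the question mentions a bash script, here is a minimal sketch of the second form wrapped in a shell function, with the regex passed in as an awk variable instead of being hard-coded. The function name `split_lines` and the file names are hypothetical. Note that `awk -v` processes backslash escapes in the value, so a regex containing backslashes may need them doubled.

```shell
#!/usr/bin/env bash
# Sketch only: split_lines and all file names below are hypothetical.

# usage: split_lines REGEX MATCHFILE NOMATCHFILE INPUTFILE
split_lines() {
    awk -v re="$1" -v matchfile="$2" -v nomatchfile="$3" '
        $0 ~ re { print > matchfile; next }   # matching lines
                { print > nomatchfile }       # everything else
    ' "$4"
}

workdir=$(mktemp -d)
printf '%s\n' 'AAA one' 'BBB two' 'AAA three' > "$workdir/myfile"

# The input file is read once; each line lands in exactly one output file.
split_lines '^AAA' "$workdir/newfile" "$workdir/newfile.nonmatch" "$workdir/myfile"

matched=$(cat "$workdir/newfile")
unmatched=$(cat "$workdir/newfile.nonmatch")
printf 'matched:\n%s\nunmatched:\n%s\n' "$matched" "$unmatched"
rm -rf "$workdir"
```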

Upvotes: 6

zwol

Reputation: 140796

I don't believe this can be done with grep, but it's only a few lines of Perl:

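#!/usr/bin/perl
# usage: script regexp match_file nomatch_file < input
use strict;
use warnings;

my $regexp = shift;
# Three-argument open with error checking; each shift takes the next argument.
open(my $match,   '>', shift) or die "cannot open match file: $!";
open(my $nomatch, '>', shift) or die "cannot open nomatch file: $!";

while (<STDIN>) {
    if (/$regexp/o) {
        print $match $_;
    } else {
        print $nomatch $_;
    }
}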

or Python, if you prefer:

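#!/usr/bin/python
# usage: script regexp match_file nomatch_file < input

import re
import sys

exp = re.compile(sys.argv[1])

# The with statement closes both output files when the loop finishes.
with open(sys.argv[2], "w") as match, open(sys.argv[3], "w") as nomatch:
    for line in sys.stdin:
        # search() matches anywhere in the line, like grep;
        # match() would only match at the start of the line.
        if exp.search(line):
            match.write(line)
        else:
            nomatch.write(line)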

(Both totally untested. Your mileage may vary. Void where prohibited.)

Upvotes: 2

Brian Agnew

Reputation: 272377

I fear this may not be possible. I'd use Perl and do something like:

while (<>) {
    if (/^AAA/) {
        print STDOUT $_;
    } else {
        print STDERR $_;
    }
}

Upvotes: 2
