cpashia

Reputation: 1

How to check if a list of strings are present in two separate files

I have two files. "File A" is a list of IP addresses, each with its corresponding MAC address on the same line. "File B" is a list of MAC addresses only. I need to compare the two files and list the lines from File A whose MAC addresses are not found in File B.

FILE A:

172.0.0.1 AA:BB:CC:DD:EE:01
172.0.0.2 AA:BB:CC:DD:EE:02
172.0.0.3 AA:BB:CC:DD:EE:03

FILE B:

AA:BB:CC:DD:EE:01
AA:BB:CC:DD:EE:02

So the output should be:

172.0.0.3 AA:BB:CC:DD:EE:03

I am looking for solutions in sed, awk, grep, Python, or really anything that gives me the file I want.

Upvotes: 0

Views: 261

Answers (9)

potong

Reputation: 58430

This might work for you (GNU sed):

sed 's|.*|/&/Id|' fileb | sed -f - filea

Upvotes: 0

Birei

Reputation: 36262

One way using awk: it saves the MACs from fileB in an array, then checks the second field of each fileA line against that array and prints the line only when the MAC is not found.

awk '
    FNR == NR {
        data[ $0 ] = 1;
        next;
    }
    FNR < NR && !($2 in data)
' fileB fileA

Output:

172.0.0.3 AA:BB:CC:DD:EE:03

Upvotes: 1

Ashwini Chaudhary

Reputation: 250971

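# FILEB and FILEA stand for the paths to the two files.
with open(FILEB) as file1, open(FILEA) as file2:
    # Set of MACs listed in file B.
    macs = {mac.strip() for mac in file1}
    # Map each MAC in file A to its IP address.
    ip_by_mac = {line.split()[1]: line.split()[0] for line in file2}
    for mac in ip_by_mac:
        if mac not in macs:
            print("{0} {1}".format(ip_by_mac[mac], mac))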

output:

172.0.0.2 AA:BB:CC:DD:EE:05
172.0.0.4 AA:BB:CC:DD:EE:06
172.0.0.6 AA:BB:CC:DD:EE:03
172.0.0.66 AA:BB:CC:DD:EE:0E

Upvotes: 1

Phil Cooper

Reputation: 5877

Python:

macs = set(line.strip() for line in open('fileb'))
with open('filea') as ips:
    for line in ips:
        ip,mac = line.split()
        if mac not in macs:
            print(line.rstrip())

EDIT: OK, so everyone posted the same Python answer. I reach for Python first too, but gawk at this:

awk 'NR == FNR {fileb[$1];next} !($2 in fileb)' fileb filea

EDIT2: The OP removed the leading $ from the lines, so the Python and awk versions change and fgrep comes out to play.

fgrep -v -f fileb filea
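
For anyone unsure what that one-liner actually does, here is a rough Python sketch of the same filtering rule: keep every line of filea that contains none of the fileb entries as a literal substring. The function name fixed_string_invert and the inline sample lists are made up for illustration, and this is only an approximation of grep, not a reimplementation.

```python
def fixed_string_invert(lines, patterns):
    """Keep lines that contain none of the patterns as a literal substring."""
    # Assumption: blank pattern lines are dropped here. GNU grep instead treats
    # an empty pattern as matching every line, so -v would then print nothing.
    needles = [p.strip() for p in patterns if p.strip()]
    return [line for line in lines if not any(n in line for n in needles)]


# Sample data from the question (hypothetical in-memory stand-ins for the files).
file_a = [
    "172.0.0.1 AA:BB:CC:DD:EE:01",
    "172.0.0.2 AA:BB:CC:DD:EE:02",
    "172.0.0.3 AA:BB:CC:DD:EE:03",
]
file_b = ["AA:BB:CC:DD:EE:01", "AA:BB:CC:DD:EE:02"]

for line in fixed_string_invert(file_a, file_b):
    print(line)
```

Two things follow from this rule: the match is case-sensitive (grep's -i option relaxes that), and a pattern matches anywhere in the line rather than only in the MAC column. With fixed-width MAC addresses the second point rarely matters, but it is worth knowing before reusing the command on other data.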

Upvotes: 1

jfs

Reputation: 414315

#!/usr/bin/env python
with open('fileb') as fileb, open('filea') as filea:
    macs = set(map(str.strip, fileb))
    for line in filea:
        ip_mac = line.split()
        if len(ip_mac) == 2 and ip_mac[1] not in macs:
            print(" ".join(ip_mac))

Upvotes: 1

mgilson

Reputation: 309929

with open('filea','r') as fa:    
    with open('fileb','r') as f:
        MACS=set(line.strip() for line in f)

    for line in fa:
        IP,MAC=line.split()
        if MAC not in MACS:
            print(line.strip())

Upvotes: 1

Rob Davis

Reputation: 15772

Does your input really have a dollar sign at the start of every line, or is that a formatting quirk of your question? If you can get rid of the dollar signs, then you can use this:

fgrep -v -f fileb filea

Upvotes: 4

corsiKa

Reputation: 82579

I could whip up a Java example that you could translate to whatever language you want:

import java.io.*;
import java.util.*;
class Macs {
    public static void main(String...args)throws Exception {
        Set<String> macs = loadLines("macs.txt");
        Set<String> ips = loadLines("ips.txt");

        for(String raw : ips) {
            String[] tokens = raw.split("\\s"); // by space
            String ip = tokens[0];
            String mac = tokens[1];
            if(!macs.contains(mac))
                System.out.println(raw);
        } 
    }

    static Set<String> loadLines(String filename) throws Exception {
        Scanner sc = new Scanner(new File(filename));
        Set<String> lines = new HashSet<String>();
        while(sc.hasNextLine()) {
            // substring(1) removes leading $
            lines.add(sc.nextLine().substring(1).toLowerCase());
        }
        return lines;
    }
}

Redirecting this output to a file will give you your result.

With the following input files:

macs.txt

$AA:BB:CC:DD:EE:01
$AA:BB:CC:DD:EE:02
$AA:BB:CF:DD:EE:09
$AA:EE:CF:DD:EE:09

ips.txt

$172.0.0.1 AA:BB:CC:DD:EE:01
$172.0.0.2 AA:BB:CC:DD:EE:02
$172.0.0.2 AA:BB:CC:DD:EE:05
$172.0.0.66 AA:BB:CC:DD:EE:0E
$172.0.0.4 AA:BB:CC:DD:EE:06
$172.0.0.5 AA:BB:CF:DD:EE:09
$172.0.0.6 AA:BB:CC:DD:EE:03

Result:

c:\files\j>java Macs
172.0.0.6 aa:bb:cc:dd:ee:03
172.0.0.66 aa:bb:cc:dd:ee:0e
172.0.0.2 aa:bb:cc:dd:ee:05
172.0.0.4 aa:bb:cc:dd:ee:06

Upvotes: 0

Mark Ransom

Reputation: 308206

Python is easiest. Read File B into a dictionary, then go through File A and look for a match in the dictionary.
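
A minimal sketch of that idea, assuming each File A line is exactly "IP MAC" separated by whitespace. The function name unmatched_lines and the inline sample lists are hypothetical, used here instead of real files so the snippet runs on its own.

```python
def unmatched_lines(file_a_lines, file_b_lines):
    """Return the File A lines whose MAC (second field) is not listed in File B."""
    # Dictionary keyed by MAC, as described above; a set would work equally well.
    known = {mac.strip(): True for mac in file_b_lines if mac.strip()}
    result = []
    for line in file_a_lines:
        fields = line.split()
        # Skip blank or malformed lines instead of raising on them.
        if len(fields) == 2 and fields[1] not in known:
            result.append(line.rstrip("\n"))
    return result


# Sample data from the question, with newlines as they would come from a file.
file_a = [
    "172.0.0.1 AA:BB:CC:DD:EE:01\n",
    "172.0.0.2 AA:BB:CC:DD:EE:02\n",
    "172.0.0.3 AA:BB:CC:DD:EE:03\n",
]
file_b = ["AA:BB:CC:DD:EE:01\n", "AA:BB:CC:DD:EE:02\n"]

for line in unmatched_lines(file_a, file_b):
    print(line)  # prints: 172.0.0.3 AA:BB:CC:DD:EE:03
```

With real files, passing the open file objects in place of the two lists should work the same way, since the function only iterates over lines.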

Upvotes: 0
