Reputation: 11
I have multiple files in the same directory. Each file represents a user and contains the IPs used to log into that account, one per line.
I want to create a script that checks whether the same IP occurs in multiple files and prints the duplicates.
I've tried using awk but with no luck, any help appreciated!
Upvotes: 0
Views: 1993
Reputation: 158
How about something like:
diff -u <(cat * | sort) <(cat * | sort | uniq)
In other words, the difference between all the files concatenated and sorted, and all the files concatenated, sorted, and then the duplicates removed.
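If the diff markers are not needed, the repeated lines can also be listed directly. This is only a minimal sketch: repeated_lines is a hypothetical helper name, and it assumes each file holds one IP per line. Like the diff above, it also reports an IP that is repeated inside a single file.

```shell
# repeated_lines is a hypothetical helper name used for illustration.
repeated_lines() {
    # sort reads all the given files together;
    # uniq -d prints one copy of each line that occurs more than once
    sort "$@" | uniq -d
}
```

It could be called as repeated_lines * from inside the directory that holds the user files.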
Upvotes: 0
Reputation: 2103
Assuming that no IP address is repeated within the same file, this should work for IPv4 addresses in many Bash versions:
#!/bin/bash
# For IPv4 addresses, assuming no IP address is repeated within the same file; the result is stored in /tmp/repeated-ips
mkdir -p /tmp
# -r: recurse, -h: omit file names, -E: extended regex, -o: print only the matched text
grep -rhEo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' /home/user/folder > /tmp/ipaddresses-holder
# uniq -d keeps only the addresses that occur more than once
sort /tmp/ipaddresses-holder | uniq -d > /tmp/repeated-ips
exit 0
The script below is a little more complex, but it works whether or not an IP address is repeated within a single file:
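#!/bin/bash
# For IPv4 addresses; the result is stored in /tmp/repeated-ips
mkdir -p /tmp
# Without -h, each match is prefixed with its file name ("file:IP")
grep -rEo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' /home/user/folder > /tmp/ipaddresses-holder
# Drop repeats of the same IP within the same file
sort -u /tmp/ipaddresses-holder > /tmp/ipaddresses-holder2
# Strip the file names again, leaving one IP per file it occurs in
grep -rhEo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' /tmp/ipaddresses-holder2 > /tmp/ipaddresses-holder3
# Any IP still listed more than once occurs in more than one file
sort /tmp/ipaddresses-holder3 | uniq -d > /tmp/repeated-ips
exit 0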
In both cases, the result is stored in the file /tmp/repeated-ips.
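The same idea can be sketched without temporary files by removing repeats per file before counting across files. This is only a sketch under assumptions: cross_file_ips is a hypothetical helper name, the files are passed as arguments instead of being found recursively, and the pattern matches anything that looks like an IPv4 address without validating the octets.

```shell
# cross_file_ips is a hypothetical helper name used for illustration.
cross_file_ips() {
    local f
    for f in "$@"; do
        # -o prints only the matched text; sort -u collapses repeats within this file
        grep -Eo '[0-9]{1,3}(\.[0-9]{1,3}){3}' "$f" | sort -u
    done | sort | uniq -d   # anything still repeated came from different files
}
```

For example, cross_file_ips /home/user/folder/* prints each shared address once on standard output, which can be redirected to a file if needed.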
Upvotes: 1
Reputation: 8140
Not sure I understand your question correctly, so here's what I think you want to do:
You have several files. Each file refers to a specific user and logs every IP address that user has logged in from. Example:
$ cat alice.txt
192.168.1.1
192.168.1.5
192.168.1.1
192.168.1.1
$ cat bob.txt
192.168.0.1
192.168.1.3
192.168.1.2
192.168.1.3
$ cat eve.txt
192.168.1.7
192.168.1.5
192.168.1.7
192.168.0.7
You want to find out whether the same IP address appears in multiple files.
Here's what I came up with.
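#!/usr/bin/env bash
# Pass the user files as arguments, e.g. alice.txt bob.txt eve.txt
for source_file in "$@"
do
    # Check each distinct IP of this file against all the other files
    for search_term in $(sort -u "${source_file}")
    do
        # -F: fixed string, -x: match the whole line, so 192.168.1.1 does not match 192.168.1.10
        # --exclude skips the file the IP came from
        found=$(grep -Fx --exclude="${source_file}" "${search_term}" "$@")
        if [[ -n "${found}" ]]; then
            echo "Found ${search_term} from ${source_file} also here:"
            echo "${found}"
        fi
    done
done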
It's probably not the best solution.
Upvotes: 0
Reputation: 1898
Use the following awk command:
awk '$0 in a {print FILENAME, "IP:", $0, "also in:", a[$0]; next} {a[$0] = FILENAME}' /tmp/user*
This assumes that each file contains only IP addresses, one per line, like this:
[tmp]$cat /tmp/user1
1.1.1.1
[tmp]$cat /tmp/user2
2.2.2.2
[tmp]$cat /tmp/user3
1.1.1.1
Output
[tmp]$awk '$0 in a {print FILENAME, "IP:", $0, "also in:", a[$0]; next} {a[$0] = FILENAME}' /tmp/user*
/tmp/user3 IP: 1.1.1.1 also in: /tmp/user1
Explanation
awk '
$0 in a {                       # if the IP already exists in array a
    # print the output; a statement may continue on the next line after a comma
    print FILENAME, "IP:", $0,
          "also in:", a[$0]
    next                        # get the next record without further processing
}
{ a[$0] = FILENAME }            # if we get here, we are seeing the IP for the first time, so store it
' /tmp/user*
Upvotes: 1