Tom

Reputation: 9643

Use perl to output unique lines from a log file?

In a previous question I asked how to output from a log file depending on a regex: How to use grep to output unique lines of code from a file?

The script I'm using now outputs lists such as:

11.12.13.14 www.mydomain.org.uk
11.12.13.16 www.mydomain.org.uk
105.2.3.1 www.myseconddomain.org.uk
105.2.3.1 myseconddomain.org.uk

What I would like to do is drop lines that share the same class C network (the same first three octets of the IP), keeping only the first line for each. So I would want to tweak the previous answer to output:

11.12.13.14 www.mydomain.org.uk
105.2.3.1 www.myseconddomain.org.uk

How can I accomplish that?

Upvotes: 1

Views: 103

Answers (2)

Ilmari Karonen

Reputation: 50328

Here's a Perl one-liner that should do the trick:

perl -ne 'print if /^((\d+\.){3})/ and not $seen{$1}++' < logfile.txt

The regexp /^((\d+\.){3})/ matches the first three octets of the IP (or, to be exact, three sequences of one or more digits, each followed by a period, at the beginning of the line) and captures them in $1. The expression $seen{$1}++ then increments the corresponding element in the hash %seen (creating it if needed) and returns the value before the increment, which is false if and only if that value of $1 has not been seen before. So a line is printed only the first time its three-octet prefix appears.

Upvotes: 2

Kent

Reputation: 195049

Try this awk one-liner:

awk '!a[$1]++ && !b[$2]++' file

Test:

kent$  echo "11.12.13.14 www.mydomain.org.uk
11.12.13.16 www.mydomain.org.uk
105.2.3.1 www.myseconddomain.org.uk
105.2.3.1 myseconddomain.org.uk"|awk '!a[$1]++ && !b[$2]++'
11.12.13.14 www.mydomain.org.uk
105.2.3.1 www.myseconddomain.org.uk
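Note that the one-liner above keys on the complete IP ($1) and the hostname ($2), not on the first three octets. It gives the wanted result for this sample because the two 11.12.13.x lines happen to share a hostname. If the requirement is strictly "one line per class C prefix", a variant along these lines might be closer. This is only a sketch: the function name dedupe_by_prefix and the array names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): keep the first line seen for
# each distinct three-octet prefix of the IP in the first field.
dedupe_by_prefix() {
    awk '{
        split($1, octet, ".")                         # break the IP into octets
        key = octet[1] "." octet[2] "." octet[3]      # first three octets
        if (!seen[key]++) print                       # print on first sighting
    }'
}

# Sample lines taken from the question.
printf '%s\n' \
    '11.12.13.14 www.mydomain.org.uk' \
    '11.12.13.16 www.mydomain.org.uk' \
    '105.2.3.1 www.myseconddomain.org.uk' \
    '105.2.3.1 myseconddomain.org.uk' |
    dedupe_by_prefix
# Prints:
# 11.12.13.14 www.mydomain.org.uk
# 105.2.3.1 www.myseconddomain.org.uk
```

Unlike the original one-liner, this also drops a second 11.12.13.x line when its hostname differs from the first.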

Upvotes: 0
