Reputation: 6739
I want to know how many users have visited google.com through my proxy within the last 30 minutes.
awk -v bt=$(date "+%s" -d "30 minutes ago") '($1 > bt) && $4~/google.com/ {printf("%s|%s|%s|%s\n", strftime("%F %T",$1), $2 , $3, $4)} ' access.log
The logs look like this
2017-02-19 12:09:44|[email protected]|200|https://google.com/
2017-02-19 12:10:23|[email protected]|200|https://google.com/
Now I can easily count the number of records:
awk -v bt=$(date "+%s" -d "30 minutes ago") '($1 > bt) && $4~/google.com/ {printf("%s|%s|%s|%s\n", strftime("%F %T",$1), $2 , $3, $4)} ' access.log | wc -l
Output is 2.
How can I modify the command to count only records with a unique email? In the above case the output should be 1.
Upvotes: 1
Views: 668
Reputation: 16997
To list the matching records (first occurrence of each email only):
awk -v FS='|' -v bt="$(date +'%Y-%m-%d %H:%M:%S' -d '30 minutes ago')" '
($1 > bt) && $4~/google.com/ && !seen[$2]++
' access.log
To get the count of unique emails:
awk -v FS='|' -v bt="$(date +'%Y-%m-%d %H:%M:%S' -d '30 minutes ago')" '
($1 > bt) && $4~/google.com/ && !seen[$2]++{ count++ }
END{ print count+0 }
' access.log
For Testing
# Current datetime of my system
$ date +'%Y-%m-%d %H:%M:%S'
2017-02-26 00:06:19
# 30 minutes ago what was datetime
$ date +'%Y-%m-%d %H:%M:%S' -d '30 minutes ago'
2017-02-25 23:36:20
# Input file, I modified datetime to check command
$ cat f
2017-02-25 23:10:44|[email protected]|200|https://google.com/
2017-02-25 23:45:23|[email protected]|200|https://google.com/
Output 1: list the matching records
$ awk -v FS='|' -v bt="$(date +'%Y-%m-%d %H:%M:%S' -d '30 minutes ago')" '
($1 > bt) && $4~/google.com/ && !seen[$2]++
' f
2017-02-25 23:45:23|[email protected]|200|https://google.com/
Output 2: count the unique emails
$ awk -v FS='|' -v bt="$(date +'%Y-%m-%d %H:%M:%S' -d '30 minutes ago')" '
($1 > bt) && $4~/google.com/ && !seen[$2]++{ count++ }
END{ print count+0 }
' f
1
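The deduplication above relies on the awk idiom !seen[$2]++. As a rough, self-contained sketch of how it behaves, the snippet below uses a hypothetical sample log (the example.com addresses and the temporary file are placeholders, not data from the question) and a fixed cutoff instead of calling date, so the result is reproducible. It also escapes the dot in the URL pattern, which is a small deviation from the command above.

```shell
# Hypothetical sample log: datetime|email|status|url (addresses are placeholders)
log=$(mktemp)
cat > "$log" <<'EOF'
2017-02-25 23:10:44|alice@example.com|200|https://google.com/
2017-02-25 23:45:23|alice@example.com|200|https://google.com/
2017-02-25 23:50:02|alice@example.com|200|https://google.com/
2017-02-25 23:55:10|bob@example.com|200|https://google.com/
2017-02-25 23:58:41|bob@example.com|200|https://example.org/
EOF

# Fixed cutoff for reproducibility; in practice it would come from
# date +'%Y-%m-%d %H:%M:%S' -d '30 minutes ago' (GNU date).
cutoff='2017-02-25 23:36:20'

# seen[$2]++ yields 0 the first time an email is encountered, so
# !seen[$2]++ is true only for the first matching record of each email.
# The timestamps compare as strings, which works because the
# YYYY-MM-DD HH:MM:SS format sorts chronologically.
unique_count=$(awk -F'|' -v bt="$cutoff" '
  ($1 > bt) && $4 ~ /google\.com/ && !seen[$2]++ { count++ }
  END { print count+0 }
' "$log")

echo "$unique_count"
rm -f "$log"
```

With this sample, the first record is older than the cutoff and the last one is not a google.com URL, so only two distinct emails are counted.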
Upvotes: 1
Reputation: 301
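Simply pipe the logs to
sort -u -t "|" -k 2,2
Here -k 2,2 limits the sort key to the email field; a bare -k 2 would compare everything from field 2 to the end of the line.
So you will have something like:
awk -v bt=$(date "+%s" -d "30 minutes ago") '($1 > bt) && $4~/google.com/ {printf("%s|%s|%s|%s\n", strftime("%F %T",$1), $2 , $3, $4)} ' access.log | sort -u -t "|" -k 2,2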
Upvotes: 0
Reputation: 351
You can use sort to select unique email accounts.
You can also refer to is-there-a-way-to-uniq-by-column
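As a minimal sketch of the sort-based approach, assuming records in the pipe-delimited format shown in the question (the example.com addresses below are hypothetical placeholders), the key can be restricted to the email field and the surviving lines counted:

```shell
# Hypothetical records in the question's output format (placeholder addresses)
records='2017-02-19 12:09:44|alice@example.com|200|https://google.com/
2017-02-19 12:10:23|alice@example.com|200|https://google.com/
2017-02-19 12:11:05|bob@example.com|200|https://google.com/'

# -t '|' sets the field separator, -k 2,2 restricts the key to field 2 only,
# and -u keeps a single record per distinct key (here, per email).
unique_records=$(printf '%s\n' "$records" | sort -u -t '|' -k 2,2)

# Count the surviving lines; tr strips the padding some wc versions add.
unique_emails=$(printf '%s\n' "$unique_records" | wc -l | tr -d ' ')

printf '%s\n' "$unique_records"
echo "$unique_emails"
```

Unlike the awk approach, this reorders the records by email, so it fits best when only the count or the set of addresses matters.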
Upvotes: 0