Reputation: 2277
I need some help printing out a sorted txt logfile. There is no problem with the printing itself, except that I don't want to print the same IP number more than once.
This is my code:
text_file = open("access_log.txt")
entire_file = text_file.readlines()
text_file.close()
for line in reversed(entire_file):
    try:
        arr = line.split(' ')
        date = arr[3]
        print arr[0], "- - ", date[1:], " ",arr[6]
    except IndexError, e:
        error = e
As you can see, I just want to print out the IP number, the date and the page that has been visited, but only once per IP.
As you can probably tell, I'm a total beginner =) Thanks
Upvotes: 2
Views: 279
Reputation: 3097
You can use groupby() from itertools to group an iterable by a key that you specify, and then operate only on the key (or the first item in each group), as long as the iterable is sorted by that key:
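from itertools import groupby
from operator import itemgetter

split = lambda l: l.split(' ')
with open("access_log.txt") as f:
    for key, group in groupby(sorted(map(split, f)), key=itemgetter(0)):
        line = next(group)
        print key, "- - ", line[3][1:], " ", line[6]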
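For anyone on Python 3, here is a minimal sketch of the same groupby() idea wrapped in a function. The function name first_entry_per_ip and the choice to skip lines with too few fields are my own assumptions, not part of the answer above. It sorts by IP only, so the stable sort keeps the earliest line of each IP at the front of its group.

```python
from itertools import groupby
from operator import itemgetter


def first_entry_per_ip(lines):
    """Return (ip, date, page) for the first log line of each IP, ordered by IP.

    Hypothetical helper for illustration; assumes space-separated fields
    with the IP at index 0, the date at index 3 and the page at index 6.
    """
    # Split every line and drop the ones that are too short to index safely.
    rows = [parts for parts in (line.split(' ') for line in lines) if len(parts) > 6]
    # groupby() only merges adjacent items, so the rows must be sorted by IP first.
    # list.sort() is stable, so each group starts with the earliest input line.
    rows.sort(key=itemgetter(0))
    result = []
    for ip, group in groupby(rows, key=itemgetter(0)):
        first = next(group)
        # first[3] looks like "[10/Oct/2000:13:55:36"; strip the leading bracket.
        result.append((ip, first[3][1:], first[6]))
    return result
```

Note that the output comes back ordered by IP string rather than in log order, which may or may not be what you want.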
Upvotes: 0
Reputation: 212825
# empty set of already seen ips:
seen_ips = set()
with open("access_log.txt") as f:
    for line in f:
        arr = line.split(' ')
        date = arr[3]
        # if the ip has not been seen yet, print it and add it to the seen_ips set:
        if arr[0] not in seen_ips:
            print arr[0], "- - ", date[1:], " ",arr[6]
            seen_ips.add(arr[0])
        # else (i.e. ip already seen) ignore and go on with the next line
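If you are on Python 3 and want to keep the reversed order from the question, the set-based idea could be sketched roughly like this. The function name unique_visits and the decision to skip short lines (instead of catching IndexError) are assumptions made for illustration.

```python
def unique_visits(lines):
    """Return (ip, date, page) once per IP, walking the log from the last line.

    Hypothetical helper for illustration; assumes space-separated fields
    with the IP at index 0, the date at index 3 and the page at index 6.
    """
    seen_ips = set()
    visits = []
    for line in reversed(lines):
        arr = line.split(' ')
        # Skip malformed lines that do not have a page field.
        if len(arr) <= 6:
            continue
        ip = arr[0]
        if ip not in seen_ips:
            seen_ips.add(ip)
            # arr[3] looks like "[10/Oct/2000:13:55:36"; strip the leading bracket.
            visits.append((ip, arr[3][1:], arr[6]))
    return visits


if __name__ == "__main__":
    # File name taken from the question; adjust as needed.
    with open("access_log.txt") as f:
        for ip, date, page in unique_visits(f.readlines()):
            print(ip, "- - ", date, " ", page)
```

Because the lines are read in reverse, each IP is reported with its most recent visit. Membership tests on a set take roughly constant time, so this stays fast even for large logs.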
Upvotes: 4