Dymond

Reputation: 2277

Sorting a txt file with Python

I need some help printing out a sorted txt log file. The printing itself works, but I don't want to print the same IP number more than once.

This is my code.

text_file = open("access_log.txt")
entire_file = text_file.readlines()
text_file.close()

for line in reversed(entire_file):
    try:
        arr = line.split(' ')
        date = arr[3] 
        print arr[0], "- - ", date[1:], " ",arr[6] 
    except IndexError, e:
        error = e

As you can see, I just want to print out the IP number, the date and the page that was visited, but only once per IP.

As you can probably tell, I'm a total beginner =) Thanks

Upvotes: 2

Views: 279

Answers (2)

Austin Marshall

Reputation: 3097

You can use groupby() from itertools to group an iterable by a key that you specify, and then operate only on the key (or on the first item in each group). groupby() only groups adjacent items, so the input has to be sorted by that key first:

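from itertools import groupby
from operator import itemgetter
split = lambda l: l.split(' ')
with open("access_log.txt") as f:
    # sort the split lines so that equal IPs end up next to each other
    for key, group in groupby(sorted(map(split, f)), key=itemgetter(0)):
        line = next(group)
        print key, "- - ", line[3][1:], " ", line[6]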
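
In case it helps, here is a minimal sketch of the same groupby() idea as a function that returns the result instead of printing it, written so it also runs on Python 3. The function name `one_line_per_ip` and the sample log lines are made up for illustration; the field positions (0, 3, 6) come from the question's code. It sorts on the IP only, and since Python's sort is stable, the first line of each group is the first occurrence in the file.

```python
from itertools import groupby
from operator import itemgetter


def one_line_per_ip(lines):
    """Return one (ip, date, page) tuple per IP, ordered by IP string."""
    rows = [line.split(' ') for line in lines]
    # Skip lines that are too short to hold the date and page fields.
    rows = [row for row in rows if len(row) >= 7]
    # groupby() only merges adjacent items, so sort by IP first.
    # The sort is stable, so lines with the same IP keep their file order.
    rows.sort(key=itemgetter(0))
    result = []
    for ip, group in groupby(rows, key=itemgetter(0)):
        first = next(group)
        result.append((ip, first[3][1:], first[6]))
    return result


# Hypothetical sample data in the layout the question's code assumes.
sample = [
    '5.6.7.8 - - [10/Oct/2011:13:55:36 +0200] "GET /a.html HTTP/1.1" 200 100\n',
    '1.2.3.4 - - [10/Oct/2011:13:56:01 +0200] "GET /b.html HTTP/1.1" 200 100\n',
    '5.6.7.8 - - [10/Oct/2011:13:57:12 +0200] "GET /c.html HTTP/1.1" 200 100\n',
]

for ip, date, page in one_line_per_ip(sample):
    print("%s - -  %s   %s" % (ip, date, page))

# Without sorting, equal keys that are not adjacent form separate groups:
unsorted_keys = [key for key, _ in groupby(['a', 'b', 'a'])]
```

Note that the IPs are compared as strings here, so the output order is alphabetical rather than numeric.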

Upvotes: 0

eumiro

Reputation: 212825

# empty set of already seen ips:
seen_ips = set()

with open("access_log.txt") as f:
    for line in f:
        arr = line.split(' ')
        date = arr[3] 

        # if the ip has not been seen yet, print it and add it to the seen_ips set:
        if arr[0] not in seen_ips:
            print arr[0], "- - ", date[1:], " ",arr[6]
            seen_ips.add(arr[0])
        # else (i.e. ip already seen) ignore and go on with the next line
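
A possible variation on the above, as a rough sketch only: the question reads the file in reverse, so this version walks the lines backwards (the most recent request per IP wins) and skips lines that are too short instead of catching IndexError. The function name `latest_hit_per_ip` and the sample lines are made up for illustration, and the code is written so it also runs on Python 3.

```python
def latest_hit_per_ip(lines):
    """Return (ip, date, page) tuples, newest first, one per IP."""
    seen_ips = set()
    hits = []
    for line in reversed(list(lines)):
        arr = line.split(' ')
        # Lines without enough fields cannot hold a date and a page.
        if len(arr) < 7:
            continue
        ip = arr[0]
        if ip in seen_ips:
            continue  # a later request from this IP was already recorded
        seen_ips.add(ip)
        hits.append((ip, arr[3][1:], arr[6]))
    return hits


# Hypothetical sample data in the layout the question's code assumes.
sample = [
    '1.2.3.4 - - [10/Oct/2011:13:55:36 +0200] "GET /index.html HTTP/1.1" 200 2326\n',
    '5.6.7.8 - - [10/Oct/2011:13:56:01 +0200] "GET /about.html HTTP/1.1" 200 512\n',
    '1.2.3.4 - - [10/Oct/2011:13:57:12 +0200] "GET /contact.html HTTP/1.1" 200 734\n',
    'garbage\n',
]

for ip, date, page in latest_hit_per_ip(sample):
    print("%s - -  %s   %s" % (ip, date, page))
```

With a real file, something like `latest_hit_per_ip(open("access_log.txt"))` should work the same way, since the function copies the lines into a list before reversing them.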

Upvotes: 4
