user1272849

Reputation: 21

How to make this python script faster?

I am trying to find the overlap between two lists of lines read from two files. I first build the first list with:

while True:
    line=f.readline()
    if not line:
        break
    list_1.append(line)

and then use this list to scan through the second file:

while True:
    line1=f1.readline()
    if not line1:
        break
    for i in list_1:
        if i==line1[:17]:
            list_2.append(line1)

Upvotes: 0

Views: 247

Answers (3)

garnertb

Reputation: 9584

If you are trying to find the differences between two files, you can also use the difflib module, which is included in the Python standard library. The module provides classes and functions for comparing sequences. It can be used, for example, to compare files, and it can produce difference information in various formats, including HTML and context and unified diffs. You can find useful comparison methods in the difflib documentation.

difflib.SequenceMatcher(None, file1.read(), file2.read())
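As a rough sketch of how this could apply to the question, assuming both files fit in memory, SequenceMatcher can be run on lists of lines instead of raw strings, and its matching blocks then give the lines that align in both files. The helper name `common_lines` is hypothetical. This is an ordered alignment and not a set intersection, so lines that appear in a different order in the two files may be missed.

```python
import difflib


def common_lines(lines_a, lines_b):
    """Return the lines that SequenceMatcher aligns as equal in both sequences."""
    # autojunk=False turns off the heuristic that ignores very frequent items.
    matcher = difflib.SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    result = []
    # Each block is a Match(a, b, size) tuple; the final dummy block has size 0.
    for block in matcher.get_matching_blocks():
        result.extend(lines_a[block.a:block.a + block.size])
    return result
```

With open file objects it could be called as `common_lines(file1.readlines(), file2.readlines())`.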

Upvotes: 2

Lauritz V. Thaulow

Reputation: 50985

ThiefMaster's answer will output the common lines in arbitrary order. If you want the items output in the same order as they appear in one of the files, first read the other file into a set:

with open("file1.txt") as f:
    file1_set = set(f)

Then search through the file that controls the order:

with open("file2.txt") as f:
    list2 = [line for line in f if line in file1_set]

If the generated list2 does not fit in memory (I guess this is rather far-fetched), we can still make it work by writing the results to an output file as we go:

with open("file2.txt") as f:
    with open("out.txt", "w") as out:
        for line in f:
            if line in file1_set:
                out.write(line)
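The code in the question compares only the first 17 characters of each line in the second file against the lines of the first file, whereas the snippets above compare whole lines. A minimal sketch of the same set-based idea adapted to that case follows. It assumes the lines of the first file, with the trailing newline stripped, are the keys to look for; the function name, parameter names, and the `prefix_len` default are hypothetical.

```python
def lines_with_known_prefix(keys_path, data_path, prefix_len=17):
    """Return lines of data_path whose first prefix_len characters are a key."""
    # Assumption: each line of the first file is one key (newline removed).
    with open(keys_path) as f:
        keys = {line.rstrip("\n") for line in f}
    # Set membership is a hash lookup, so each line costs roughly constant
    # time instead of a scan over the whole first list.
    with open(data_path) as f1:
        return [line for line in f1 if line[:prefix_len] in keys]
```

The result keeps the order of the second file, as in the list comprehension above.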

Upvotes: 4

ThiefMaster

Reputation: 318478

As long as neither file is excessively big, store all lines of each file in a set and then intersect the two sets:

lines_1 = set(f)
lines_2 = set(f1)
lines_in_both = lines_1 & lines_2
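The snippet assumes `f` and `f1` are already-open file objects, as in the question. A self-contained sketch that opens the files itself might look like this; the function name and the path parameters are hypothetical.

```python
def lines_in_both(path_1, path_2):
    """Return the set of lines that occur in both files."""
    with open(path_1) as f, open(path_2) as f1:
        # Iterating a file object yields its lines, newline included.
        return set(f) & set(f1)
```

Lines are compared including their trailing newline, so a final line without a newline will not match the same text followed by a newline in the other file.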

Upvotes: 2
