Compare item in two big lists

I have two "big" lists! Both of them have about over 24.000 items and I have to select:

As my calculation, if I run loop to find the difference, there will have 24.000x2=48.000 loops!

Is there anyway to compare faster than my way?

just an example:

values of list 1: | a | a | b | c | d | e |

values of list 2: | a | b | c | g | a |

The results must be: => d, e, g

Thank a lot!

Upvotes: 1

Views: 267

Answers (2)

Veedrac
Veedrac

Reputation: 60167

In Python you can just do:

first  = set("aabcde")
second = set("abcga")

first ^ second
#>>> {'g', 'e', 'd'}

It will be slightly faster to do:

first  = "aabcde"
second = "abcga"

first, second = sorted([first, second], key=len)
set(first).symmetric_difference(second)
#>>> {'e', 'g', 'd'}

to avoid making a set from the larger list.

You might even want:

first  = "aabcde"
second = "abcga"

set_first = set(first)
set_first.symmetric_difference_update(second)
set_first
#>>> {'e', 'g', 'd'}

Even so, 24k items is tiny so there's no real worry.

Manually, the obvious way is:

first  = set("aabcde")
second = set("abcga")

difference = set()

for item in first:
    if item not in second:
        difference.add(item)

for item in second:
    if item not in first:
        difference.add(item)

difference
#>>> {'e', 'g', 'd'}

Upvotes: 1

thb
thb

Reputation: 14454

Yes. Good question. Read the members of each list into a tree structure (for example, std::set in C++). This orders your lists. Then, step through the two trees in tandem, deleting duplicates as you go.

An even better (but harder to understand) technique reads only one of the two lists into a hash-keyed structure (for example, std::unordered_set in C++11).

Or you can do a quicksort on both lists first, and forget about the trees. You have a lot of options. All the efficient options of which I can think however involve first sorting, or binning, or keying, at least one of the two lists; but, yes, I agree with you that 24,000 items are enough to merit a better approach than the first, naive one that comes to mind.

Upvotes: 0

Related Questions