Angelo

Reputation: 5059

Set Operations in Python and code optimisation

I am trying to write generic code to perform set operations on any number of input files.

Normally, for a set operation on a fixed number of input files, I use something like this:

my_set1 = set(map(str.strip, open('filename1.txt')))
my_set2 = set(map(str.strip, open('filename2.txt')))
common = my_set1.intersection(my_set2)

Each file has only one column.

What I am aiming for now is a single script that offers all of the set operations, something like:

python set.py -i file1,file2,file3,file4 -o inter

These inputs come from the user, who chooses both the number of input files and the kind of operation to perform.

If anyone can show me how this can be done for one operation, I can write the others, such as union and difference, myself.

Upvotes: 0

Views: 174

Answers (1)

Martijn Pieters

Reputation: 1121484

The set.intersection() and set.intersection_update() methods take any iterable, not just sets.
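
As a quick illustration of passing non-set iterables to these methods, here is a minimal sketch. The sample values and variable names are made up for the example:

```python
base = {"a", "b", "c"}

# set.intersection() accepts a list (or any other iterable) and
# returns a new set; base itself is left unchanged.
common = base.intersection(["b", "c", "d"])

# set.intersection_update() modifies the set in place and returns None.
# Here the argument is a generator expression rather than a set.
inplace = set(base)
inplace.intersection_update(x for x in ("c", "d"))
```

The same holds for set.update() and set.difference_update(), which is why the file object, wrapped in map(), can be passed directly.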

Since you are only interested in the end product (the intersection of all the files), you'd best use set.intersection_update() here, which modifies the set in place instead of building a new set for each file.

Start with one set, then keep updating it with the rest of the files:

with open(files[0]) as infh:
    myset = set(map(str.strip, infh))

for filename in files[1:]:
    with open(filename) as infh:
        myset.intersection_update(map(str.strip, infh))

You can make the method used dynamic based on the command-line switch:

ops = {'inter': set.intersection_update,
       'union': set.update,
       'diff': set.difference_update}

with open(files[0]) as infh:
    myset = set(map(str.strip, infh))

for filename in files[1:]:
    with open(filename) as infh:
        ops[operation](myset, map(str.strip, infh))
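
To tie this to the command line from the question, a complete script could look roughly like the sketch below. It assumes the -i switch takes a comma-separated list of filenames and -o takes one of the keys of the mapping; the helper names read_lines, combine, parse_args and main are hypothetical, chosen only for this example:

```python
import argparse

# Same mapping as above: each switch value maps to an in-place set method.
OPS = {
    "inter": set.intersection_update,
    "union": set.update,
    "diff": set.difference_update,
}


def read_lines(filename):
    """Return the stripped lines of a one-column file as a set."""
    with open(filename) as infh:
        return set(map(str.strip, infh))


def combine(files, operation):
    """Apply the chosen operation across the files, left to right."""
    result = read_lines(files[0])
    for filename in files[1:]:
        with open(filename) as infh:
            OPS[operation](result, map(str.strip, infh))
    return result


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Set operations on files")
    # Assumption: filenames are comma-separated and contain no commas.
    parser.add_argument("-i", dest="files", required=True,
                        type=lambda s: s.split(","),
                        help="comma-separated list of input files")
    parser.add_argument("-o", dest="operation", required=True,
                        choices=sorted(OPS),
                        help="set operation to apply")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    for item in sorted(combine(args.files, args.operation)):
        print(item)
```

Calling main() at the bottom of set.py would then handle an invocation such as python set.py -i file1,file2,file3,file4 -o inter. Note that the order of the files matters for diff: the result is the first file's lines minus everything found in the later files.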

Upvotes: 2
