Reputation: 5059
I am trying to write generic code to perform a set operation on any number of input files.
Normally, for a set operation on a fixed number of input files, I use something like this:
my_set1 = set(map(str.strip, open('filename1.txt')))
my_set2 = set(map(str.strip, open('filename2.txt')))
common = my_set1.intersection(my_set2)
where each file has only one column.
Now I want to support all the set operations from the command line, with an invocation something like:
python set.py -i file1,file2,file3,file4 -o inter
These inputs come from the user, who chooses both the number of input files and the operation to perform.
If someone can show me how to do this for one operation, I can write the others, such as union and difference, myself.
Upvotes: 0
Views: 174
Reputation: 1121484
The set.intersection() and set.intersection_update() methods take any iterable, not just sets.
Since you are only interested in the end product (the intersection between the files), you'd best use set.intersection_update() here.
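As a quick illustration of the "any iterable" point, here is a minimal sketch using made-up in-memory data instead of files (the values are purely illustrative):

```python
# Start from a real set; the other operands can be any iterable.
common = {"a", "b", "c"}

# A plain list works as the argument.
common.intersection_update(["b", "c", "d"])

# So does a lazy generator, e.g. stripped lines as they would come from a file.
common.intersection_update(line.strip() for line in ["c\n", "x\n"])

print(common)
```

Because the set is updated in place, no intermediate set has to be built for each additional input.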
Start with one set, then keep updating it with the rest of the files:
with open(files[0]) as infh:
    myset = set(map(str.strip, infh))
for filename in files[1:]:
    with open(filename) as infh:
        myset.intersection_update(map(str.strip, infh))
You can make the method used dynamic based on the command-line switch:
ops = {'inter': set.intersection_update,
       'union': set.update,
       'diff': set.difference_update}
with open(files[0]) as infh:
    myset = set(map(str.strip, infh))
for filename in files[1:]:
    with open(filename) as infh:
        ops[operation](myset, map(str.strip, infh))
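To tie this to the command line from the question, one possible sketch uses argparse. The function names combine_files and main, the long option names, and the choice to print the sorted result are my own assumptions, not something the question specifies:

```python
import argparse

# Dispatch table: each entry updates a set in place from any iterable.
OPS = {
    "inter": set.intersection_update,
    "union": set.update,
    "diff": set.difference_update,
}


def combine_files(files, operation):
    """Apply the named set operation across the files, left to right."""
    with open(files[0]) as infh:
        result = set(map(str.strip, infh))
    for filename in files[1:]:
        with open(filename) as infh:
            OPS[operation](result, map(str.strip, infh))
    return result


def main(argv=None):
    # Hypothetical interface modelled on: -i file1,file2,file3 -o inter
    parser = argparse.ArgumentParser(
        description="Set operations on one-column files")
    parser.add_argument("-i", "--input", required=True,
                        help="comma-separated list of input files")
    parser.add_argument("-o", "--operation", required=True,
                        choices=sorted(OPS),
                        help="set operation to apply")
    args = parser.parse_args(argv)
    result = combine_files(args.input.split(","), args.operation)
    for line in sorted(result):
        print(line)
```

Calling main() at the bottom of the script (or main(["-i", "file1,file2", "-o", "inter"]) from other code) runs it. Note that 'diff' is order-dependent: it keeps the lines of the first file that appear in none of the later files.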
Upvotes: 2