Reputation: 3038
In the code I'm writing, I need to intersect two lists like these:
chr1 aatt
chr8 tagg
chr11 aaaa
chr7 gtag
chr8 tagt
chr1 tttt
chr7 gtag
chr11 aaaa
chr9 atat
#These lists consist of one str per line, which has a "\t" in the middle.
#Also note that they are in a different order
How can I get the intersection of these two lists?
chr7 gtag
chr11 aaaa
I'm also able to generate lists of two strings per line, like this:
('chr1', 'aatt')
('chr8', 'tagg')
('chr11', 'aaaa')
('chr7', 'gtag')
('chr8', 'tagt')
('chr1', 'tttt')
('chr7', 'gtag')
('chr11','aaaa')
('chr9', 'atat')
The important point in this case is that the two columns must be treated as one.
Thanks for your time!
Upvotes: 1
Views: 464
Reputation: 775
There may be a performance gain from not creating two sets from the lists, which requires hashing every item in both lists, and instead creating only one set and iterating through the second list. Knowing which list is large and which is small also helps: build the set from the small one.
def intersect(smallList, largeList):
    values = set(smallList)
    intersection = []
    for v in largeList:
        if v in values:
            intersection.append(v)
    return intersection
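As a rough usage sketch with the data from the question (assuming each line is held as a single tab-separated string, and with the variable names `list_a`, `list_b` and `common` made up for illustration), the result keeps the order of the large list:

```python
def intersect(small_list, large_list):
    # Hash only the small list, then scan the large one once.
    values = set(small_list)
    return [v for v in large_list if v in values]

# Each entry is one string with a tab between the two columns,
# so both columns are compared together as a single value.
list_a = ["chr1\taatt", "chr8\ttagg", "chr11\taaaa", "chr7\tgtag"]
list_b = ["chr8\ttagt", "chr1\ttttt", "chr7\tgtag", "chr11\taaaa", "chr9\tatat"]

common = intersect(list_a, list_b)
for line in common:
    print(line)
```

Note that if the large list contains the same line more than once, this version will return it more than once too.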
Upvotes: 0
Reputation: 21
Use set intersection.
setC = set(listA) & set(listB)
listC = list(setC) # if you really need a list
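A small sketch of this with the question's lines, assuming they are stored as single tab-separated strings (the data layout and the names `list_a`, `list_b` are my assumption). Sets have no defined order, so `sorted` is used here only to get a repeatable result:

```python
# One string per line; the tab keeps both columns in a single hashable value.
list_a = ["chr1\taatt", "chr8\ttagg", "chr11\taaaa", "chr7\tgtag"]
list_b = ["chr8\ttagt", "chr1\ttttt", "chr7\tgtag", "chr11\taaaa", "chr9\tatat"]

set_c = set(list_a) & set(list_b)

# Plain string sorting, so "chr11..." sorts before "chr7...".
list_c = sorted(set_c)
print(list_c)
```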
Upvotes: 2
Reputation: 6620
Use Python sets
listA = (
    ('chr1', 'aatt'),
    ('chr8', 'tagg'),
    ('chr11', 'aaaa'),
    ('chr7', 'gtag'),
)
listB = (
    ('chr8', 'tagt'),
    ('chr1', 'tttt'),
    ('chr7', 'gtag'),
    ('chr11', 'aaaa'),
    ('chr9', 'atat'),
)
combined = set(listA).intersection(set(listB))
for c, d in combined:
    print(c, d)
You can also use the & operator, like this:
combined = set(listA) & set(listB)
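If you start from the raw tab-separated lines rather than tuples, something along these lines should get you to the same place. This is only a sketch: `to_pairs` is a hypothetical helper name, and the trailing newlines are an assumption about how the lines were read.

```python
def to_pairs(lines):
    """Split each 'chrom<TAB>seq' line into a (chrom, seq) tuple."""
    return [tuple(line.rstrip("\n").split("\t")) for line in lines]

lines_a = ["chr1\taatt\n", "chr8\ttagg\n", "chr11\taaaa\n", "chr7\tgtag\n"]
lines_b = ["chr8\ttagt\n", "chr1\ttttt\n", "chr7\tgtag\n", "chr11\taaaa\n", "chr9\tatat\n"]

# Tuples are hashable, so both columns are compared together as one item.
combined = set(to_pairs(lines_a)) & set(to_pairs(lines_b))

for c, d in sorted(combined):
    print(c, d)
```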
Upvotes: 4