user1620716

Reputation: 1533

Python: Searching for common values in two files

I have two files that I loaded into Python, and I am trying to output a list that will display the common values between the two.

The list from the first file looks like this (not full list, just part of it):

    [datetime.datetime(2010, 7, 30, 12, 20, 19, 143000), datetime.datetime(2010, 7, 30, 12, 22, 33, 631000), datetime.datetime(2010, 7, 30, 12, 22, 41, 236000), datetime.datetime(2010, 7, 30, 12, 23, 43, 547000), datetime.datetime(2010, 7, 30, 12, 23, 57, 453000), datetime.datetime(2010, 7, 30, 12, 26, 4, 713000), datetime.datetime(2010, 7, 30, 12, 26, 9, 46000), datetime.datetime(2010, 7, 30, 12, 28, 30, 313000)]

And the second list looks like this (again, not the full list, just part of it):

    [datetime.datetime(2010, 7, 30, 13, 43, 2, 993000), datetime.datetime(2010, 7, 30, 13, 43, 10, 917000), datetime.datetime(2010, 7, 30, 13, 48, 56, 697000), datetime.datetime(2010, 7, 30, 13, 49, 14, 399000), datetime.datetime(2010, 7, 30, 13, 51, 45, 882000), datetime.datetime(2010, 7, 30, 13, 52, 6, 432000), datetime.datetime(2010, 7, 30, 13, 54, 26, 873000), datetime.datetime(2010, 7, 30, 13, 59, 2, 164000), datetime.datetime(2010, 7, 30, 13, 59, 15, 515000), datetime.datetime(2010, 7, 30, 14, 3, 43, 742000), datetime.datetime(2010, 7, 30, 14, 5, 59, 975000), datetime.datetime(2010, 7, 30, 14, 13, 36, 887000), datetime.datetime(2010, 7, 30, 14, 13, 42, 92000)]

And here is what the code looks like:

    for infilelines in text:
        lspl1 = infilelines.split(',')
        cattime = lspl1[2]
        catdate.append(datetime.strptime(lspl1[1]+cattime,'%d/%m/%Y%H:%M:%S.%f'))

    #print catdate

    NNSRCfile = sf.readsrc(NNSRC)
    l1 = NNSRCfile['date']
    l2 = NNSRCfile['time']
    NNSRCfile['datetime2'] = zip(l1,l2)
    #print NNSRCfile['datetime2']


    NNSRCfile['datetimenew'] = [datetime.strptime(i+j,'%m-%d-%Y%H:%M:%S.%f') for i,j in zip(l1,l2)]

    #print NNSRCfile['datetimenew']
    ii = 0
    while ii <= len(catdate):
        try:
            #print NNSRCfile['datetimenew'].index(catdate[ii]), "WE FOUND A MATCH!!!!"
            indices = NNSRCfile['datetimenew'].index(catdate[ii])
        except:
            print "No match."
            #continue
        ii += 1

I am not getting an error, but the script is not working the way I intended it to. I would like a full list of all of the common elements between the two separate lists. I know I didn't include the full lists in this post, but when I manually look through it, I know there are commonalities between the two.

Any help is greatly appreciated!

Upvotes: 1

Views: 122

Answers (3)

mgilson

Reputation: 309929

I think that a set is in order here. Assuming you've parsed your files into the lists, you can do the following:

    intersection = set(list1) & set(list2)

And Bob's your uncle, you're done!


There's also a method form of the above:

    intersection = set(list1).intersection(list2)

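which has the advantage that you only need to build one set rather than two, because `intersection` accepts any iterable as its argument. This means you don't actually need to read both of your files into memory: you can build the set from the first file and then feed the second file to `intersection` lazily, for example through a generator.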
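As a rough sketch of that lazy approach: the helper `parse_stamps` and the two short lists of strings below are made up for illustration (they stand in for lines read from the two files), and the format strings assume a space between date and time, which may not match the real files.

```python
from datetime import datetime

def parse_stamps(lines, fmt):
    """Yield one datetime per non-empty line, without building a list."""
    for line in lines:
        line = line.strip()
        if line:
            yield datetime.strptime(line, fmt)

# Hypothetical stand-ins for the contents of the two files.
first = ["30/07/2010 12:20:19.143", "30/07/2010 12:22:33.631"]
second = ["07-30-2010 12:22:33.631", "07-30-2010 13:43:02.993"]

# Only the first file is held in memory as a set.
seen = set(parse_stamps(first, "%d/%m/%Y %H:%M:%S.%f"))

# The second file is consumed one timestamp at a time by intersection().
common = sorted(seen.intersection(parse_stamps(second, "%m-%d-%Y %H:%M:%S.%f")))

for stamp in common:
    print(stamp)
```

With real files, an open file object could be passed in place of `first` or `second`, since the helper only iterates over lines.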

Upvotes: 3

aschmid00

Reputation: 7158

If you want the intersection of two lists, you could do this:

    >>> a = [1,2,3,4]
    >>> b = [3,4,5,6]
    >>> set(a).intersection(set(b))
    set([3, 4])

Upvotes: 0

nneonneo

Reputation: 179422

If you're just looking for common values, use set intersection:

    a = set([1,3,5,7])
    b = set([3,5,6,9])
    print a.intersection(b)  # prints set([3, 5])
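As a side note on why the loop in the question produces no list: `indices` is overwritten on every pass, and the `ii <= len(catdate)` bound runs one step past the end of the list, which the bare `except` silently reports as "No match." If the positions of the matches are wanted as well as the values, a dictionary lookup may be enough. The sketch below uses small stand-in lists (the overlap between them is invented here for illustration; `datetimenew` stands in for `NNSRCfile['datetimenew']`).

```python
from datetime import datetime

# Illustrative stand-ins for the question's two lists (overlap is invented).
catdate = [
    datetime(2010, 7, 30, 12, 20, 19, 143000),
    datetime(2010, 7, 30, 12, 22, 33, 631000),
    datetime(2010, 7, 30, 12, 26, 9, 46000),
]
datetimenew = [
    datetime(2010, 7, 30, 12, 26, 9, 46000),
    datetime(2010, 7, 30, 13, 43, 2, 993000),
    datetime(2010, 7, 30, 12, 20, 19, 143000),
]

# Map each value in the second list to the first position where it occurs.
position = {}
for j, stamp in enumerate(datetimenew):
    position.setdefault(stamp, j)

# Collect (index in catdate, index in datetimenew, value) for every match.
matches = [(i, position[stamp], stamp)
           for i, stamp in enumerate(catdate)
           if stamp in position]

for i, j, stamp in matches:
    print("catdate[%d] == datetimenew[%d]: %s" % (i, j, stamp))
```

Note that this only finds timestamps that are exactly equal, down to the microsecond, just as the set-based versions do.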

Upvotes: 0
