PicklePilot

Reputation: 13

List Subtraction in Python

I've got a comma-delimited text file with content like this:

[email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

Let's call it emails1.csv. I've got another comma-delimited text file too:

[email protected], [email protected]

Let's call it emails2.csv. I need to subtract emails2.csv from emails1.csv using Python. In pseudocodenese:

emails1.csv = emails1.csv - emails2.csv

I'm completely new to Python, but I put this together from a couple of examples I found. Does it do what I think it does? That is, does it take the emails in emails2.csv out of emails1.csv and write the difference to a file called subtractomatic.csv?

import csv

# Read the first (and only) row of each file as a list of emails.
with open('emails1.csv', newline='') as fin:
    email_list1 = list(csv.reader(fin))[0]

with open('emails2.csv', newline='') as fin:
    email_list2 = list(csv.reader(fin))[0]

# Set difference: everything in emails1.csv that is not in emails2.csv.
email_list1 = list(set(email_list1) - set(email_list2))

# Write the remaining emails as a single unquoted row.
with open('subtractomatic.csv', 'w', newline='') as fout:
    writer = csv.writer(fout, quoting=csv.QUOTE_NONE)
    writer.writerow(email_list1)

I think it does because my original file, namely emails1.csv, has X emails in it, and when I open subtractomatic.csv there are emails in it, and when I run

grep @ -o subtractomatic.csv | wc -l

in the terminal I get X/2, which makes sense because emails1.csv has twice as many emails as emails2.csv, by design. I am, however, a novice, so I'm not sure I'm checking this the right way.

Upvotes: 1

Views: 76

Answers (3)

mgilson

Reputation: 309929

Rather than the all-set approach used by others, you can make B a set and filter its contents out of A:

b_set = set(B)
a_filtered = [a for a in A if a not in b_set]

This has the advantage of keeping the order of A in a_filtered (sans the elements you want to remove)...
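
As a minimal sketch of the same idea wrapped in a function: the name subtract_preserving_order and the example.com addresses below are made up for illustration, not taken from the question.

```python
def subtract_preserving_order(a, b):
    """Return the items of a that are not in b, in their original order."""
    b_set = set(b)  # set membership tests are O(1) on average
    return [item for item in a if item not in b_set]


# Hypothetical sample data; the addresses are placeholders.
emails1 = ["ann@example.com", "bob@example.com", "cy@example.com", "dee@example.com"]
emails2 = ["cy@example.com", "ann@example.com"]

remaining = subtract_preserving_order(emails1, emails2)
print(remaining)  # ['bob@example.com', 'dee@example.com']
```

Unlike a set difference, this also keeps any duplicates in A that do not appear in B.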

Upvotes: 0

Avinash Raj

Reputation: 174706

Use sets to find the difference between the two lists, then assign the result back to list 1. Python's built-in set type is an unordered collection of unique elements. Common uses include membership testing, removing duplicates from a sequence, and computing standard math operations such as intersection, union, difference, and symmetric difference.

>>> l1 = ['[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]']
>>> l2 = ['[email protected]', '[email protected]']
>>> set(l1)-set(l2)
{'[email protected]', '[email protected]', '[email protected]', '[email protected]'}
>>> list(set(l1)-set(l2))
['[email protected]', '[email protected]', '[email protected]', '[email protected]']
>>> l1 = list(set(l1)-set(l2))
>>> l1
['[email protected]', '[email protected]', '[email protected]', '[email protected]']
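
One caveat when applying this to files like the ones in the question: if the separator is a comma followed by a space, csv.reader keeps that space as part of each field by default, so the same address can fail to match depending on its position in the row. The sketch below uses made-up example.com addresses and in-memory strings instead of real files, and the helper name read_first_row is hypothetical; skipinitialspace is a standard csv formatting parameter.

```python
import csv
import io

# Hypothetical file contents, using ", " as the separator like the question does.
text1 = "ann@example.com, bob@example.com, cy@example.com, dee@example.com\n"
text2 = "cy@example.com, ann@example.com\n"


def read_first_row(text, skipinitialspace):
    """Parse the first CSV row from a string."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=skipinitialspace)
    return next(reader)


# Default parsing keeps the space after each comma as part of the field,
# so 'ann@example.com' and ' ann@example.com' count as different strings.
raw1 = read_first_row(text1, skipinitialspace=False)
raw2 = read_first_row(text2, skipinitialspace=False)
print(len(set(raw1) - set(raw2)))  # 4, nothing was removed

# With skipinitialspace=True the fields compare as intended.
clean1 = read_first_row(text1, skipinitialspace=True)
clean2 = read_first_row(text2, skipinitialspace=True)
print(sorted(set(clean1) - set(clean2)))  # ['bob@example.com', 'dee@example.com']
```

If your files have no space after the commas, the default settings are fine.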

Upvotes: 2

Marcin

Reputation: 238209

You can use sets:

difference = set(listA) - set(listB)
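
A small sketch of what that gives you, assuming made-up sample lists: the result is a set, so duplicates are dropped and the order is not defined. Sorting is one way to get a stable list before writing it out. The difference() method is shown as an equivalent spelling that accepts any iterable.

```python
# Hypothetical sample lists.
listA = ["ann@example.com", "bob@example.com", "bob@example.com", "cy@example.com"]
listB = ["cy@example.com"]

# difference() accepts any iterable, so listB need not be converted first.
difference = set(listA).difference(listB)

# A set has no defined order and drops duplicates; sort for stable output.
result = sorted(difference)
print(result)  # ['ann@example.com', 'bob@example.com']
```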

Upvotes: 2
