Reputation: 103
I have 2 CSV files containing two columns and a large number of rows. The first column is the id, and the second is the set of paired values. e.g.:
CSV1:
1 {[1,2],[1,4],[5,6],[3,1]}
2 {[2,4] ,[6,3], [8,3]}
3 {[3,2], [5,2], [3,5]}
CSV2:
1 {[2,4] ,[6,3], [8,3]}
2 {[3,4] ,[3,3], [2,3]}
3 {[1,4],[5,6],[3,1],[5,5]}
Now I need to get a CSV file which contains either exact matching items or subset which belongs to both CSVs.
Here the result should be:
{[2,4] ,[6,3], [8,3]}
{[1,4],[5,6],[3,1]}
Can anyone suggest python code to do this?
Upvotes: 3
Views: 2862
Reputation: 6276
As suggested by this answer you can use set.intersection
to get the intersection of two sets, however this does not work with lists as items. Instead you can also use filter
(comparable to this answer):
>>> l1 = [[1,2],[1,4],[5,6],[3,1]]
>>> l2 = [[1,4],[5,6],[3,1],[5,5]]
>>> filter(lambda q: q in l2, l1)
[[1, 4], [5, 6], [3, 1]]
In Python 3 you should convert it to list
since there filter
returns an iterable:
>>> list(filter(lambda x: x in l2,l1))
You can load CSV files (if they are really comma [or some other character] separated files) with csv.reader
or pandas.read_csv
for example.
Upvotes: 3