Reputation: 111
I'm working with datasets, and this is what I have written so far:
import csv
import itertools

import numpy


def counter(x):
    # Return an array of (item, count) rows for every distinct value in x.
    unique, counts = numpy.unique(x, return_counts=True)
    return numpy.asarray((unique, counts)).T


def findsubsets(S, m):
    # All m-element combinations of S, as a list of tuples.
    return list(itertools.combinations(S, m))


# Python 3: input() returns a string, so convert the support threshold once.
sup = int(input("enter min support\n"))
with open("test.csv", newline="") as f:
    X = list(csv.reader(f, delimiter=","))
result = numpy.array(X).astype(str)
print(result)
list1 = counter(result)
print("deleting items which have less support")
print(list1)
l = []
for item, count in list1:
    # The counts are stored as strings in list1, so convert before comparing.
    if int(count) >= sup:
        l.append(str(item))
print("after deleting")
print(l)
print("making sets")
o = findsubsets(l, 2)
print(o)
print(X)
The list o contains these tuples:
[('Beer', 'Bread'), ('Beer', 'Coke'), ('Beer', 'Diaper'), ('Beer', 'Milk'), ('Bread', 'Coke'), ('Bread', 'Diaper'), ('Bread', 'Milk'), ('Coke', 'Diaper'), ('Coke', 'Milk'), ('Diaper', 'Milk')]
and the list X contains:
[['Bread', 'Diaper', 'Beer', 'Eggs'], ['Milk', 'Diaper', 'Beer', 'Coke'], ['Bread', 'Milk', 'Diaper', 'Beer'], ['Bread', 'Milk', 'Diaper', 'Coke']]
For every tuple in list o, I want to count how many of the inner lists of X contain all of the tuple's items.
For example, ('Beer', 'Bread') is a tuple in list o. Beer and Bread appear together in 2 of the lists in X, so I want to return the count 2. How can I do this?
EDIT:
I did this using sets:
O = [('Beer', 'Bread'), ('Beer', 'Coke'), ('Beer', 'Diaper'), ('Beer', 'Milk'), ('Bread', 'Coke'), ('Bread', 'Diaper'), ('Bread', 'Milk'), ('Coke', 'Diaper'), ('Coke', 'Milk'), ('Diaper', 'Milk')]
X = [['Bread', 'Diaper', 'Beer', 'Eggs'], ['Milk', 'Diaper', 'Beer', 'Coke'], ['Bread', 'Milk', 'Diaper', 'Beer'], ['Bread', 'Milk', 'Diaper', 'Coke']]
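from collections import defaultdict

counts = defaultdict(int)  # named counts/tup to avoid shadowing the built-ins dict and tuple
for tup in O:
    for lst in X:
        # True when every item of tup also appears in lst
        if set(tup) <= set(lst):
            counts[tup] += 1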
Upvotes: 1
Views: 427
Reputation: 675
You could try something of the form
[(l2[0][i], l2[1][i]) == l1[i] for i in range(len(l1))]
The question is a little vague about the comparison you would like to make, but I can infer (I hope correctly) that the idea is to take the two lists and "stack" them side by side so they look comparable to the list of tuples. Then, I assume you want to perform an equality check.
Here, the equality check performed is an exact match on the whole tuple. This could be incorrect. If so, I can revise my answer.
I assume you are writing this as a Python question, but tags would be useful here.
The code I've supplied takes the first element of the second list you've provided (a list) and the second element of that same list (another list). It then builds a tuple from the values at each index and compares it to the matching element from the list of tuples. This returns a list of booleans: True if the tuple at that index is identical, False if it is not. The length of the result should be the same as the length of the list of tuples you've provided.
If I've interpreted your question correctly, the output is as expected
[False, False, False, False]
When asking a question like this, it is really useful to specify what you have already tried and why your code produces outputs that you don't expect. Makes it easier for folks to understand the issue and give useful answers! Comment and edit the question to clarify.
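To make the index-based comparison above concrete, here is a minimal sketch. The values of l1 and l2 are hypothetical, chosen only for illustration (the original pre-edit data is not shown here), and the zip-based variant is an alternative spelling, not something taken from the question.

```python
# Hypothetical inputs, chosen only to illustrate the comparison.
l1 = [("a", 1), ("b", 2), ("c", 4)]
l2 = [["a", "b", "c"], [1, 2, 3]]

# Index-based form, as in the one-liner above.
by_index = [(l2[0][i], l2[1][i]) == l1[i] for i in range(len(l1))]

# Equivalent form: zip(*l2) pairs up the two inner lists element by element.
by_zip = [pair == tup for pair, tup in zip(zip(*l2), l1)]

print(by_index)  # [True, True, False]
```

Both forms compare whole tuples, so a mismatch in either position gives False for that index.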
(Everything above was written BEFORE EDITS, Feb. 17.)
AFTER EDITS (Feb. 18)
I'll preserve the above in case people end up having a similar question. Your question has been edited. Let's see if this addresses the question.
You want to know the number of times a tuple's elements are ALL contained in a list of lists. This is a perfect application of sets.
lists_to_sets = [set(l) for l in X]
tuples_to_sets = [set(t) for t in o]
Now you want to count the number of supersets each subset is contained in, so:
[sum([t_set.issubset(l_set) for l_set in lists_to_sets]) for t_set in tuples_to_sets]
For each tuple, this counts the number of lists that contain every element of the tuple (that is, the lists of which the tuple is a subset). This gives the expected output:
[2, 1, 3, 2, 1, 3, 2, 2, 2, 3]
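If it helps to keep each count next to the tuple it belongs to, the same subset test can be wrapped in a small function that returns a dictionary. This is only a sketch: the name support_counts and its parameters are made up for illustration, and the data is copied from the question.

```python
def support_counts(candidates, transactions):
    """Map each candidate tuple to the number of transactions containing all its items."""
    # Convert each transaction to a set once, instead of once per candidate.
    transaction_sets = [set(t) for t in transactions]
    return {
        cand: sum(set(cand) <= t_set for t_set in transaction_sets)
        for cand in candidates
    }


o = [('Beer', 'Bread'), ('Beer', 'Coke'), ('Beer', 'Diaper'), ('Beer', 'Milk'),
     ('Bread', 'Coke'), ('Bread', 'Diaper'), ('Bread', 'Milk'),
     ('Coke', 'Diaper'), ('Coke', 'Milk'), ('Diaper', 'Milk')]
X = [['Bread', 'Diaper', 'Beer', 'Eggs'], ['Milk', 'Diaper', 'Beer', 'Coke'],
     ['Bread', 'Milk', 'Diaper', 'Beer'], ['Bread', 'Milk', 'Diaper', 'Coke']]

counts = support_counts(o, X)
print(counts[('Beer', 'Bread')])  # 2
```

The values of the dictionary, read in insertion order, match the list of counts shown above.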
Upvotes: 1
Reputation: 164623
collections.defaultdict provides one intuitive method:
from collections import defaultdict
lst_o = [('Beer', 'Bread'), ('Beer', 'Coke'), ('Beer', 'Diaper'), ('Beer', 'Milk'), ('Bread', 'Coke'), ('Bread', 'Diaper'), ('Bread', 'Milk'), ('Coke', 'Diaper'), ('Coke', 'Milk'), ('Diaper', 'Milk')]
lst_x = [['Bread', 'Diaper', 'Beer', 'Eggs'], ['Milk', 'Diaper', 'Beer', 'Coke'], ['Bread', 'Milk', 'Diaper', 'Beer'], ['Bread', 'Milk', 'Diaper', 'Coke']]
d = defaultdict(int)
for tup in lst_o:
    for lst in lst_x:
        if set(tup) <= set(lst):
            d[tup] += 1
# defaultdict(int,
# {('Beer', 'Bread'): 2,
# ('Beer', 'Coke'): 1,
# ('Beer', 'Diaper'): 3,
# ('Beer', 'Milk'): 2,
# ('Bread', 'Coke'): 1,
# ('Bread', 'Diaper'): 3,
# ('Bread', 'Milk'): 2,
# ('Coke', 'Diaper'): 2,
# ('Coke', 'Milk'): 2,
# ('Diaper', 'Milk'): 3})
See the set documentation for information on set operations.
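As a possible alternative to the nested loops, the pairs can be counted in one pass over the lists with collections.Counter and then looked up per tuple. This is a different technique from the one above, sketched under one assumption: the tuples in lst_o are in sorted order (as they are here), so they match the pairs generated from each sorted list.

```python
from collections import Counter
from itertools import combinations

lst_o = [('Beer', 'Bread'), ('Beer', 'Coke'), ('Beer', 'Diaper'), ('Beer', 'Milk'),
         ('Bread', 'Coke'), ('Bread', 'Diaper'), ('Bread', 'Milk'),
         ('Coke', 'Diaper'), ('Coke', 'Milk'), ('Diaper', 'Milk')]
lst_x = [['Bread', 'Diaper', 'Beer', 'Eggs'], ['Milk', 'Diaper', 'Beer', 'Coke'],
         ['Bread', 'Milk', 'Diaper', 'Beer'], ['Bread', 'Milk', 'Diaper', 'Coke']]

pair_counts = Counter()
for lst in lst_x:
    # sorted(set(lst)) drops duplicates and fixes the order, so each pair is
    # produced at most once per list and in the same order as the tuples in lst_o.
    pair_counts.update(combinations(sorted(set(lst)), 2))

# A Counter returns 0 for pairs that never occurred, so missing tuples are safe to look up.
result = {tup: pair_counts[tup] for tup in lst_o}
print(result[('Beer', 'Bread')])  # 2
```

This visits each list once rather than once per candidate tuple, at the cost of also counting pairs (such as those involving 'Eggs') that are not in lst_o.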
Upvotes: 1