Reputation: 878
Here is my code in python, which checks whether smaller sub-arrays in big array are inside in one of the sub-arrays. I don't know which is the biggest sub-array's index, so I can not compare against index. Maybe it's good to sort. But if there are more than one big subarrays? i.e. I want to have in final array only those big sub-arrays, which contain smaller sub-arrays
[[1, 2, 3], [4, 5, 6], [1, 2, 3, 4, 5, 6], [7,8], [9, 10, 11, 12], [7, 8, 9, 10, 11, 12]]
Second, what should I do if order of the elements can be random, e.g.:
[[1, 3, 2], [4, 5, 6], [1, 2, 3, 4, 5, 6], [7,8], [9, 10, 11, 12], [7, 8, 9, 10, 11, 12]]
Here is my code:
arr1 = [[1, 2, 3], [4, 5, 6], [1, 2, 3, 4, 5, 6], [7,8], [9, 10, 11, 12], [7, 8, 9, 10, 11, 12]]
arr2 = []
arrInd = 0
arrInd2 = len(arr1)
for k in arr1:
for n in reversed(arr1):
if set(n).issubset(set(k)):
arr2.append(k)
print(arr2)
I would like to see as output:
[[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
But I have:
[[4, 5, 6], [1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], [7, 8], [9, 10, 11, 12], [7, 8, 9, 10, 11, 12], [7, 8, 9, 10, 11, 12], [7, 8, 9, 10, 11, 12]]
UPDATE
Because there were so many good questions I have to add some more details.
1) There is an array of arrays:
arr1 = [[]]
2) it contains some sub-arrays like these ones [1, 2, 3], [4, 5, 6], [4, 6, 5], [7, 8, 9, 10, 11, 12], [11, 12, 13, 14, 15], [1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13, 14, 15], [111, 222, 333], [444, 555], [777, 888], [111, 222, 333, 444, 555], [111, 222, 333, 444, 555, 777, 888], [1000, 2000, 3000]
What is possible?
1) Inside arrays order may vary: [4, 5, 6], [4, 6, 5]
and [1, 2, 3, 4, 5, 6]
, so it won't be numerical order, in fact in real data (I am currently testing) there may be even letters, like A77
, B81
;
2) There are big and small arrays, there is always a biggest one (or several), which never intersects with other biggest arrays, so here [1, 2, 3, 4, 5, 6]
and [7, 8, 9, 10, 11, 12, 13, 14, 15]
- they can not intersect with each other, but they contain some mini sub-arrays;
3) Each big sub-array contains some smaller ones. However, not all of them and some small arrays can exist independently. For instance: [1, 2, 3, 4, 5, 6]
contains [1, 2, 3], [4, 5, 6], [4, 6, 5]
, but it doesn't contain [7, 8, 9, 10, 11, 12], [11, 12, 13, 14, 15]
or [111, 222, 333]
;
4) There also may be intermediate "big arrays": [111, 222, 333, 444, 555]
is smaller, compared to [111, 222, 333, 444, 555, 777, 888]
so the first one [111, 222, 333, 444, 555]
must be excluded.
5*)I would like to add big arrays to arr2
(final array) without small once, except for case *** [1000, 2000, 3000]
to arr2
(final array)
What my code was supposed to do?
It was supposed to go through arr1 and oppose different elements to each other in order to detect whether n element contains k element and vice-versa, as it goes from start to end for k in arr1:
and from end to start for n in reversed(arr1):
UPDATE 2. Comparison of lengths!
if len(n) < len(k):
for k in arr1:
for n in reversed(arr1):
if len(n) < len(k):
if set(n).issubset(set(k)):
arr2.append(k)
And it is much closer. Actually it is almost it, but it makes duplicates, judging by the amount of which, it seems that problem of order (1) is not a problem at all:
arr2 = [[1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13, 14, 15], [7, 8, 9, 10, 11, 12, 13, 14, 15], [111, 222, 333, 444, 555], [111, 222, 333, 444, 555], [111, 222, 333, 444, 555, 777, 888], [111, 222, 333, 444, 555, 777, 888], [111, 222, 333, 444, 555, 777, 888], [111, 222, 333, 444, 555, 777, 888]]
It is, of course possible to start the removal of duplicates and to run arr2 in such a way again to get rid of [111, 222, 333, 444, 555]
which is actually part of [111, 222, 333, 444, 555, 777, 888]
, but there must be a more elegant way.
Upvotes: 3
Views: 12135
Reputation: 1649
You can simply compare two arrays by converting them into strings then comparing by __contains__
print (''.join(arr1).__contains__(''.join(arr2)))
Upvotes: 0
Reputation: 2140
Are you looking for this?
def get_subset_list():
for arr in arrays:
others = [x for x in arrays if x != arr]
for other in others:
for val in arr:
if val not in other:
break
else:
break
else:
yield arr
import json
x = get_subset_list()
print set([json.dumps(y) for y in x])
Upvotes: 0
Reputation: 50220
You have a big list of lists, and you want to look up a lot of lists in it? Do it like you would handle looking up any other kind of object in a big collection: Store them (or a suitable version, to be exact) in a set to allow fast lookup.
Lists cannot be set elements, but you can transform the small lists to tuples and store those:
myindex = set(tuple(sm) for sm in big_list)
Then to check if [1, 3, 4]
is in your big_list
, you just say:
if tuple([1, 3, 4]) in myindex:
...
If you want to match lists regardless of order (so that [1, 3, 4]
is considered equal to [3, 1, 4]
), turn them into sets too. But not regular sets (they can't be set elements either), but frozenset
:
myindex = myindex = set(frozenset(sm) for sm in big_list)
That's about it.
Upvotes: 1
Reputation: 59586
So you are trying to get rid of all those lists which contain only elements which are also in another list?
def eachMasterList(allLists):
allSets = [ set(lst) for lst in allLists ]
for lst, s in zip(allLists, allSets):
if not any(s is not otherSet and s < otherSet for otherSet in allSets):
yield lst
To use the function, try this:
print(list(eachMasterList([[1, 2, 3], [4, 5, 6], [1, 2, 3, 4, 5, 6], [7,8], [9, 10, 11, 12], [7, 8, 9, 10, 11, 12]])))
Upvotes: 2
Reputation: 107347
You can use a recursion function :
>>> l =[(1, 3, 2), (4, 5, 6), (1, 2, 3, 4, 5, 6), (7, 8), (9, 10, 11, 12), (7, 8, 9, 10, 11, 12)]
>>> def find_max_sub(l):
... for i,j in enumerate(l):
... for t,k in enumerate(l[i+1:],1):
... if set(j).intersection(k):
... return find_max_sub([set(l.pop(i))|set(l.pop(t))]+l)
... return l
...
>>> find_max_sub(l)
[(1, 2, 3, 4, 5, 6), (7, 8, 9, 10, 11, 12)]
All you need is looping over your main list and compare the elements in list with other and when you find to sub lists that have any intersection you pop those elements and add new sub list to main list then call the function with new list.
Upvotes: 0