Reputation: 6201
If I have this:
a='abcdefghij'
b='de'
Then this finds b in a:
b in a => True
Is there a way of doing an similar thing with lists? Like this:
a=list('abcdefghij')
b=list('de')
b in a => False
The 'False' result is understandable - because its rightly looking for an element 'de', rather than (what I happen to want it to do) 'd' followed by 'e'
This is works, I know:
a=['a', 'b', 'c', ['d', 'e'], 'f', 'g', 'h']
b=list('de')
b in a => True
I can crunch the data to get what I want - but is there a short Pythonic way of doing this?
To clarify: I need to preserve ordering here (b=['e','d'], should return False).
And if it helps, what I have is a list of lists: these lists represents all possible paths (a list of visited nodes) from node-1 to node-x in a directed graph: I want to 'factor' out common paths in any longer paths. (So looking for all irreducible 'atomic' paths which constituent all the longer paths).
Upvotes: 19
Views: 43481
Reputation: 8953
Don't know if this is very pythonic, but I would do it in this way:
def is_sublist(a, b):
if not a: return True
if not b: return False
return b[:len(a)] == a or is_sublist(a, b[1:])
Shorter solution is offered in this discussion, but it suffers from the same problems as solutions with set
- it doesn't consider order of elements.
UPDATE:
Inspired by MAK I introduced more concise and clear version of my code.
UPDATE: There are performance concerns about this method, due to list copying in slices. Also, as it is recursive, you can encounter recursion limit for long lists. To eliminate copying, you can use Numpy slices which creates views, not copies. If you encounter performance or recursion limit issues you should use solution without recursion.
Upvotes: 9
Reputation: 83
This should work with whatever couple of lists, preserving the order. Is checking if b is a sub list of a
def is_sublist(b,a):
if len(b) > len(a):
return False
if a == b:
return True
i = 0
while i <= len(a) - len(b):
if a[i] == b[0]:
flag = True
j = 1
while i+j < len(a) and j < len(b):
if a[i+j] != b[j]:
flag = False
j += 1
if flag:
return True
i += 1
return False
Upvotes: 2
Reputation: 223172
I think this will be faster - It uses C implementation list.index
to search for the first element, and goes from there on.
def find_sublist(sub, bigger):
if not bigger:
return -1
if not sub:
return 0
first, rest = sub[0], sub[1:]
pos = 0
try:
while True:
pos = bigger.index(first, pos) + 1
if not rest or bigger[pos:pos+len(rest)] == rest:
return pos
except ValueError:
return -1
data = list('abcdfghdesdkflksdkeeddefaksda')
print find_sublist(list('def'), data)
Note that this returns the position of the sublist in the list, not just True
or False
. If you want just a bool
you could use this:
def is_sublist(sub, bigger):
return find_sublist(sub, bigger) >= 0
Upvotes: 7
Reputation: 29953
Use the lists' string representation and remove the square braces. :)
def is_sublist(a, b):
return str(a)[1:-1] in str(b)
EDIT: Right, there are false positives ... e.g. is_sublist([1], [11])
. Crappy answer. :)
Upvotes: 0
Reputation: 29953
I timed the accepted solution, my earlier solution and a new one with an index. The one with the index is clearly best.
EDIT: I timed nosklo's solution, it's even much better than what I came up with. :)
def is_sublist_index(a, b):
if not a:
return True
index = 0
for elem in b:
if elem == a[index]:
index += 1
if index == len(a):
return True
elif elem == a[0]:
index = 1
else:
index = 0
return False
def is_sublist(a, b):
return str(a)[1:-1] in str(b)[1:-1]
def is_sublist_copylist(a, b):
if a == []: return True
if b == []: return False
return b[:len(a)] == a or is_sublist_copylist(a, b[1:])
from timeit import Timer
print Timer('is_sublist([99999], range(100000))', setup='from __main__ import is_sublist').timeit(number=100)
print Timer('is_sublist_copylist([99999], range(100000))', setup='from __main__ import is_sublist_copylist').timeit(number=100)
print Timer('is_sublist_index([99999], range(100000))', setup='from __main__ import is_sublist_index').timeit(number=100)
print Timer('sublist_nosklo([99999], range(100000))', setup='from __main__ import sublist_nosklo').timeit(number=100)
Output in seconds:
4.51677298546
4.5824368
1.87861895561
0.357429027557
Upvotes: 3
Reputation: 26586
I suspect there are more pythonic ways of doing it, but at least it gets the job done:
l=list('abcdefgh')
pat=list('de')
print pat in l # Returns False
print any(l[i:i+len(pat)]==pat for i in xrange(len(l)-len(pat)+1))
Upvotes: 12
Reputation: 8879
So, if you aren't concerned about the order the subset appears, you can do:
a=list('abcdefghij')
b=list('de')
set(b).issubset(set(a))
True
Edit after you clarify: If you need to preserve order, and the list is indeed characters as in your question, you can use:
''.join(a).find(''.join(b)) > 0
Upvotes: 2