Reputation: 154564
For example, something like:
>>> [1, 2, 3].contains_sequence([1, 2])
True
>>> [1, 2, 3].contains_sequence([4])
False
I know that the in
operator can do this for strings:
>>> "12" in "123"
True
But I'm looking for something that operates on iterables.
Upvotes: 5
Views: 468
Reputation: 15944
As others have said, there's no builtin for this. Here's an implementation that is potentially more efficient than the other answers I've seen -- in particular, it scans through the iterable, just keeping track of what prefix sizes of the target sequence it's seen. But that increased efficiency comes at some expense in increased verbosity over some of the other approaches that have been suggested.
def contains_seq(iterable, seq):
"""
Returns true if the iterable contains the given sequence.
"""
# The following clause is optional -- leave it if you want to allow `seq` to
# be an arbitrary iterable; or remove it if `seq` will always be list-like.
if not isinstance(seq, collections.Sequence):
seq = tuple(seq)
if len(seq)==0: return True # corner case
partial_matches = []
for elt in iterable:
# Try extending each of the partial matches by adding the
# next element, if it matches.
partial_matches = [m+1 for m in partial_matches if elt == seq[m]]
# Check if we should start a new partial match
if elt==seq[0]:
partial_matches.append(1)
# Check if we have a complete match (partial_matches will always
# be sorted from highest to lowest, since older partial matches
# come before newer ones).
if partial_matches and partial_matches[0]==len(seq):
return True
# No match found.
return False
Upvotes: 2
Reputation: 215009
deque appears to be useful here:
from collections import deque
def contains(it, seq):
seq = deque(seq)
deq = deque(maxlen=len(seq))
for p in it:
deq.append(p)
if deq == seq:
return True
return False
Note that this accepts arbitrary iterables for both arguments (no slicing required).
Upvotes: 2
Reputation: 33407
As there's no builtin, I made a nice version:
import itertools as it
def contains(seq, sub):
seq = iter(seq)
o = object()
return any(all(i==j for i,j in zip(sub, it.chain((n,),seq,
(o for i in it.count())))) for n in seq)
This do not require any extra lists (if you use it.izip
or Py3k).
>>> contains([1,2,3], [1,2])
True
>>> contains([1,2,3], [1,2,3])
True
>>> contains([1,2,3], [2,3])
True
>>> contains([1,2,3], [2,3,4])
False
Extra points if you have no trouble reading it. (It does the job, but the implementation is not to be taked too seriously). ;)
Upvotes: 1
Reputation: 2491
If preserving of order is not necessary, you can use sets (builtin):
>>> set([1,2]).issubset([1,2,3])
True
>>> set([4]).issubset([1,2,3])
False
Otherwise:
def is_subsequence(sub, iterable):
sub_pos, sub_len = 0, len(sub)
for i in iterable:
if i == sub[sub_pos]:
sub_pos += 1
if sub_pos >= sub_len:
return True
else:
sub_pos = 0
return False
>>> is_subsequence([1,2], [0,1,2,3,4])
True
>>> is_subsequence([2,1], [0,1,2,3,4]) # order preserved
False
>>> is_subsequence([1,2,4], [0,1,2,3,4])
False
This one works with any iterator.
Upvotes: 2
Reputation: 251428
Is there a Python builtin? No. You can accomplish this task in various ways. Here is a recipe that does it, and also gives you the position of the subsequence in the containing sequence:
def _search(forward, source, target, start=0, end=None):
"""Naive search for target in source."""
m = len(source)
n = len(target)
if end is None:
end = m
else:
end = min(end, m)
if n == 0 or (end-start) < n:
# target is empty, or longer than source, so obviously can't be found.
return None
if forward:
x = range(start, end-n+1)
else:
x = range(end-n, start-1, -1)
for i in x:
if source[i:i+n] == target:
return i
return None
Upvotes: 3
Reputation: 310049
As far as I know, there's no way to do this. You can roll your own function pretty easily, but I doubt that will be terribly efficient.
>>> def contains_seq(seq,subseq):
... #try: junk=seq[:]
... #except: seq=tuple(seq)
... #try: junk=subseq[:]
... #except: subseq=tuple(subseq)
... ll=len(subseq)
... for i in range(len(seq)-ll): #on python2, use xrange.
... if(seq[i:i+ll] == subseq):
... return True
... return False
...
>>> contains_seq(range(10),range(3)) #True
>>> contains_seq(range(10),[2,3,6]) #False
Note that this solution does not work with generator type objects (it only works on objects that you can slice). You could check seq
to see if it is sliceable before proceeding and cast to a tuple
if it isn't sliceable -- But then you get rid of the benefits of slicing. You could re-write it to check one element at a time instead of using slicing, but I have a feeling performance would suffer even more.
Upvotes: 2
Reputation: 43860
Referenced from https://stackoverflow.com/a/6822773/24718 modified to use a list.
from itertools import islice
def window(seq, n=2):
"""
Returns a sliding window (of width n) over data from the iterable
s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...
"""
it = iter(seq)
result = list(islice(it, n))
if len(result) == n:
yield result
for elem in it:
result = result[1:] + [elem]
yield result
def contains_sequence(all_values, seq):
return any(seq == current_seq for current_seq in window(all_values, len(seq)))
test_iterable = [1,2,3]
search_sequence = [1,2]
result = contains_sequence(test_iterable, search_sequence)
Upvotes: 4
Reputation: 5537
You could convert it into a string and then do matching on it
full_list = " ".join([str(x) for x in [1, 2, 3]])
seq = " ".join([str(x) for x in [1, 2]])
seq in full_list
Upvotes: -2