Reputation: 4353
Given two lists
A = [1, 2, 3, 4, 5]
B = [4, 5, 6, 7]
I want the result
C = [1, 2, 3, 4, 5, 6, 7]
when I specify an overlap of 2.
Code so far:
concat_list = []
word_overlap = 2
for lst in [lst1, lst2, lst3]:
    if (len(concat_list) != 0):
        if (concat_list[-word_overlap:] != lst[:word_overlap]):
            concat_list += lst
        elif ([concat_list[-word_overlap:]] == lst[:word_overlap]):
            raise SystemExit
    else:
        concat_list += lst
I'm doing this for lists of strings, but it should work the same way.
EDIT:
What I want my code to do is, first, check if there is any overlap (of 1, of 2, etc), then concatenate lists, eliminating the overlap (so I don't get double elements).
[1,2,3,4,5] + [4,5,6,7] = [1,2,3,4,5,6,7]
but
[1,2,3] + [4,5,6] = [1,2,3,4,5,6]
I want it to also check for any overlap smaller than my set word_overlap.
Upvotes: 0
Views: 938
Reputation: 5474
You can use set and union:
s.union(t): new set with elements from both s and t
>>> list(set(A) | set(B))
[1, 2, 3, 4, 5, 6, 7]
But this way you can't control exactly how many elements overlap.
To answer your question, you will have to be a bit crafty and combine several set operations:
get only the number of elements you need in this list using slicing
get new list with elements in either A or B but not both
OVERLAP = 1
A = [1, 2, 3, 4, 5]
B = [4, 5, 6, 7]
C = list(set(A) | set(B))  # union: 1, 2, 3, 4, 5, 6, 7
D = sorted(set(A) & set(B))  # intersection, sorted because set order is arbitrary: [4, 5]
D = D[OVERLAP:]  # shared elements to drop: [5]
print(sorted(set(C) ^ set(D)))  # [1, 2, 3, 4, 6, 7]
Just for fun, a one-liner could give the same result:
list((set(A) | set(B)) ^ set(list(set(A) & set(B))[OVERLAP:])) # [1, 2, 3, 4, 6, 7]
where OVERLAP is the constant that controls how many of the shared elements are kept.
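One caveat with the set-based approach above: a set discards duplicates and does not guarantee any ordering, which matters for the lists of strings mentioned in the question. A minimal sketch of an order-preserving union, assuming Python 3.7+ (where dict keeps insertion order) and hashable elements; the helper name `ordered_union` is hypothetical:

```python
def ordered_union(a, b):
    """Union of a and b that keeps first-seen order (elements must be hashable)."""
    # dict keys are unique and, from Python 3.7 on, keep insertion order.
    return list(dict.fromkeys(list(a) + list(b)))


A = ["the", "quick", "brown", "fox"]
B = ["brown", "fox", "jumps"]
print(ordered_union(A, B))  # ['the', 'quick', 'brown', 'fox', 'jumps']
```

Like the set version, this still removes every repeated element, not only the ones in the overlapping region.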
Upvotes: 1
Reputation: 76
Assuming that both lists hold consecutive values and that list a always has smaller values than list b, I came up with this solution. It will also help you detect the overlap.
def concatenate_list(a, b):
    max_a = a[-1]
    min_b = b[0]
    if max_a >= min_b:
        print('overlap exists')
        # Drop the leading elements of b that are already at the end of a
        b = b[(max_a - min_b) + 1:]
    else:
        print('no overlap')
    return a + b
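The function above only works when both lists really do hold consecutive integers. A minimal sketch of a guard for that assumption, in case the inputs are not trusted; `is_consecutive` and `concatenate_consecutive` are hypothetical names, not part of the answer's code:

```python
def is_consecutive(values):
    """True if every element is exactly 1 greater than the previous one."""
    return all(y - x == 1 for x, y in zip(values, values[1:]))


def concatenate_consecutive(a, b):
    """Join two runs of consecutive integers, dropping the part of b already in a."""
    if not (a and b):
        return list(a) + list(b)
    if not (is_consecutive(a) and is_consecutive(b)):
        raise ValueError("both lists must hold consecutive integers")
    # Number of leading elements of b that already appear at the end of a.
    overlap = max(a[-1] - b[0] + 1, 0)
    return list(a) + list(b[overlap:])


print(concatenate_consecutive([1, 2, 3, 4, 5], [4, 5, 6, 7]))  # [1, 2, 3, 4, 5, 6, 7]
```

Raising an error for non-consecutive input is one possible design choice; falling back to an element-by-element comparison would be another.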
For strings, you can do this instead:
def concatenate_list_strings(a, b):
    # Try overlap sizes 1, 2, ... and stop at the first (shortest) match
    for size in range(1, min(len(a), len(b)) + 1):
        if a[-size:] == b[:size]:
            # Report the number of overlapping elements along with the result
            return 'overlap count ' + str(size), a + b[size:]
    return a + b
Upvotes: 0
Reputation: 15877
Here's a naïve variant:
def concat_nooverlap(a, b):
    maxoverlap = min(len(a), len(b))
    for overlap in range(maxoverlap, -1, -1):
        # Check for longest possible overlap first
        if a[-overlap:] == b[:overlap]:
            break  # Found an overlap, don't check any shorter
    return a + b[overlap:]
It would be more efficient with types that support slicing by reference, such as buffers or numpy arrays.
One quite odd thing this does is, upon reaching overlap=0, it compares the entirety of a (sliced, which is a copy for a list) with an empty slice of b. That comparison will fail unless they were empty, but it still leaves overlap=0, so the return value is correct. We can handle this case specifically with a slight rewrite:
def concat_nooverlap(a, b):
    maxoverlap = min(len(a), len(b))
    for overlap in range(maxoverlap, 0, -1):
        # Check for longest possible overlap first
        if a[-overlap:] == b[:overlap]:
            return a + b[overlap:]
    else:
        return a + b
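The question also sets a word_overlap limit and joins three lists. A minimal sketch of how the same idea could cover both, assuming the limit is meant as an upper bound on the overlap; the name `concat_capped` and its `max_overlap` parameter are hypothetical additions, not part of the code above:

```python
from functools import reduce


def concat_capped(a, b, max_overlap=None):
    """Join a and b, dropping the longest suffix/prefix overlap up to max_overlap."""
    limit = min(len(a), len(b))
    if max_overlap is not None:
        limit = min(limit, max_overlap)
    # Check the longest allowed overlap first, then shorter ones.
    for overlap in range(limit, 0, -1):
        if a[-overlap:] == b[:overlap]:
            return a + b[overlap:]
    return a + b


lists = [[1, 2, 3, 4, 5], [4, 5, 6, 7], [7, 8]]
# Fold the lists left to right, allowing at most 2 overlapping elements per join.
merged = reduce(lambda acc, nxt: concat_capped(acc, nxt, max_overlap=2), lists)
print(merged)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

With the cap in place, an overlap longer than max_overlap is not detected, so those elements would appear twice in the result.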
Upvotes: 1
Reputation: 808
Not sure if I correctly interpreted your question, but you could do it like this:
A = [1, 2, 3, 4, 5]
B = [4, 5, 6, 7]
overlap = 2
print(A[:-overlap] + B)
If you want to make sure they have the same value, your check could be along the lines of:
if A[-overlap:] == B[:overlap]:
    print(A[:-overlap] + B)
else:
    print("error")
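The check above can be wrapped in a function. One thing to watch for is overlap = 0: since -0 equals 0, A[0:-overlap] becomes A[0:0], which is an empty list. A minimal sketch that avoids this by slicing from the front and raises an exception instead of printing; the name `concat_exact_overlap` is hypothetical:

```python
def concat_exact_overlap(a, b, overlap):
    """Join a and b, requiring the last `overlap` items of a to equal the first of b."""
    if overlap < 0 or overlap > min(len(a), len(b)):
        raise ValueError("overlap out of range")
    split = len(a) - overlap  # explicit index, so overlap == 0 keeps all of a
    if a[split:] != b[:overlap]:
        raise ValueError("lists do not overlap by %d elements" % overlap)
    return a[:split] + b


print(concat_exact_overlap([1, 2, 3, 4, 5], [4, 5, 6, 7], 2))  # [1, 2, 3, 4, 5, 6, 7]
```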
Upvotes: 0