AnarKi
AnarKi

Reputation: 987

efficient way to merge consecutive list elements based on condition

given a list of tuples where the values are sorted, what is the most efficient way of merging consecutive elements if the difference between the elements is less than a value x

in the following list, if x=100 then (287, 790) and (855, 945) would be merged into (287,945), then (287,945) would merge with (955, 2205) into (287,2205) and so on

[(287, 790),
 (855, 945),
 (955, 2205),
 (2230, 2264),
 (2362, 2729),
 (3906, 4473)]

and this would be the output in this case:

[(287, 2729),
 (3906, 4473)]

Upvotes: 1

Views: 986

Answers (4)

wjandrea
wjandrea

Reputation: 32944

This is like M Z's answer, but using a nested list instead of a list of tuples, to make the merging easier:

def merge(intervals, difference):
    merged = []
    prev_end = float('-inf')  # Initial loop value
    for start, end in intervals:
        if start-prev_end < difference:
            # Merge, i.e. set the new top end value
            merged[-1][1] = end
        else:
            # The current interval becomes the top
            merged.append([start, end])
        prev_end = end  # For next loop
    return merged


some_list = [
    (287, 790),
    (855, 945),
    (955, 2205),
    (2230, 2264),
    (2362, 2729),
    (3906, 4473)]
print(merge(some_list, 100))  # -> [[287, 2729], [3906, 4473]]

Converting the result to a list of tuples is also O(n).

Upvotes: 0

Alec
Alec

Reputation: 135

x = 100
data = [
        (287, 790),
        (855, 945),
        (955, 2205),
        (2230, 2264),
        (2362, 2729),
        (3906, 4473)
]

def merger(x, data):
    merged = [data[0],]
    the_others = [data[i] for i in range(1, len(data)) if (data[i][0]-data[i-1][1])>x]
    merged.extend(the_others)
    return merged

print(merger(x, data))

Upvotes: 0

M Z
M Z

Reputation: 4799

If you're only merging this once (meaning you don't try to produce several different resulting tuples from one original tuple), the best you can do is going to be an O(n) solution, or basically a for loop.

x = 100
to_merge = [(287, 790),
 (855, 945),
 (955, 2205),
 (2230, 2264),
 (2362, 2729),
 (3906, 4473)]

def merge(lst, x):
    if not lst or len(lst) <= 1:
        return lst
    new_lst = []
    start, end = lst[0]
    for start1, end1 in lst[1:]:
        if start1 - end < x:
            end = end1
        else:
            new_lst.append((start, end))
            start, end = start1, end1
    new_lst.append((start, end))
    return new_lst

print(merge(to_merge, x))

If you were to do this multiple times to the same original list, then you might find using a dynamic programming method more efficient.

Upvotes: 3

Green Cloak Guy
Green Cloak Guy

Reputation: 24691

orig_list = [
 (287, 790),
 (855, 945),
 (955, 2205),
 (2230, 2264),
 (2362, 2729),
 (3906, 4473)]
merged_list = []
# iterate through original list
orig_iter = iter(orig_list)
# pop the first element and keep track of it as the "last tuple visited"
last_tuple = next(orig_iter)
for next_tuple in orig_iter:
    # if tuples are too close, merge them
    if (next_tuple[0] - last_tuple[1]) < 100:
        last_tuple = (last_tuple[0], next_tuple[1])
    # otherwise, push the last_tuple to the new list and replace it
    else:
        merged_list.append(last_tuple)
        last_tuple = next_tuple
# when we're done traversing the list, push whatever's remaining onto the end
merged_list.append(last_tuple)

print(merged_list)
# [(287, 2729), (3906, 4473)]

Note that (287, 2205) merges with (2230, 2264) since the difference is less than 100. I suspect this was a typo in the original question.

Upvotes: 3

Related Questions