D.C.
D.C.

Reputation: 31

Combining discrete and/or overlapping time sequences from lists

I have 2 lists of ordered times which are the start/stop times for ServiceA and ServiceB, respectively. I want to combine the lists in one list containing the start and stop times (in order) of when at least 1 of the services was running. (Both services always start after 00:01:00 and stop before 23:59:00.)

Example:

ListA = ["08:03:19","14:22:22","17:00:02","18:30:01"]
ListB = ["15:19:03","18:00:00","18:35:05","19:01:00"]
... the magic happens
Result =["08:03:19","14:22:22","15:19:03","18:30:01","18:35:05","19:01:00"]

The code below does not produce the desired results. After numerous attempts that accounted for most but not all possible cases, I'm posting what I have currently. The lists could be vastly different, for example there may be no overlap or complete overlap between start/stop times of the 2 services.

#!/usr/bin/python2.7

def combineOverlappingTimes(aList, bList, CurrentResults):
    ReturnList = []
    aStart = aList[0]
    aStop = aList[1]
    bStart = bList[0]
    bStop = bList[1]

    if len(CurrentResults) == 0:
        LastTimeInCurrentResults = "00:00:00"
    else:
        LastTimeInCurrentResults = CurrentResults[(len(CurrentResults)-1)]

    print "aStart= %s\naStop= %s\nbStart= %s\nbStop= %s" % (aStart,aStop,bStart,bStop)
    print "LastTimeInCurrentResults= %s" % LastTimeInCurrentResults
    if aStart >= LastTimeInCurrentResults and bStart >= LastTimeInCurrentResults:
        if aStart > bStart:
            if bStart > aStop:
                ReturnList.append( (aStart,aStop) )
            elif bStart < aStop:
                ReturnList.append( (bStart,bStop ) )
        else: #(aStart < bStart)
            if aStop < bStart:
                ReturnList.append( (bStart,bStop) )
            elif aStop > bStop: 
                ReturnList.append( (bStart,aStop) )
    elif aStart >= LastTimeInCurrentResults:
        ReturnList.append( (aStart, aStop) )
    else: # either A or B is beforeLastTime
        if aStart < LastTimeInCurrentResults:
            ReturnList.append( (LastTimeInCurrentResults, aStop) )
        elif bStart < LastTimeInCurrentResults:
            ReturnList.append( (LastTimeInCurrentResults, bStop) )

    print ( "combineOverlappingTime ReturnList= " + str(ReturnList))
    print "++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++\n\n"
    return ReturnList

# main()
#####################################################################
def main():

    ListA = ["08:03:19","14:22:22","14:22:25","14:22:30","18:00:02","18:30:01"]
    ListB = ["14:22:36","15:18:10","15:19:03","18:00:01","18:00:05","19:01:00"]
    ResultList = []

    i = 0
    while i < len(ListA):
        if i == 0:
            ListA_StartTime= ListA[i]
            ListA_StopTime = ListA[i+1]
        else:
            if i == len(ListA)-2:
                ListA_StartTime= ListA[i]
                ListA_StopTime = ListA[i+1]
            else:
                ListA_StartTime= ListA[i]
                ListA_StopTime = ListA[i+1]

        j = 0
        ListB_StartTime, ListB_StopTime = "",""
        for time in ListB:
            if j % 2 == 0:
                ListB_StartTime= time
            else:
                ListB_StopTime = time

            if ListB_StartTime!= "" and ListB_StopTime != "":
                tempSetA, tempSetB = [], []
                tempSetA.append(ListB_StartTime)
                tempSetA.append(ListB_StopTime)
                tempSetB.append(ListA_StartTime)
                tempSetB.append(ListA_StopTime)
                combinedTimes = combineOverlappingTimes(tempSetA, tempSetB, ResultList)
                for start,stop in combinedTimes:
                    ResultList.append(start)
                    ResultList.append(stop)
                ListB_StartTime, ListB_StopTime = "",""
            j += 1

        i += 2

    print "ResultList= %s \n\n" % str(ResultList)
    DesiredList = ["08:03:19","14:22:22","14:22:25","14:22:30","14:22:36","15:18:10","15:19:03","18:00:01","18:00:02","19:01:00"]
    print "Desired Results: %s" % str(DesiredList)


if __name__ == '__main__':
    main()

Upvotes: 3

Views: 82

Answers (2)

Moon Cheesez
Moon Cheesez

Reputation: 2701

Breaking down the problem

Your code can be more readable if you break it down into functions and plan before coding.

Your problem can be broken down into smaller manageable parts:

  1. Create a function that returns the total uptime of two timings.
  2. Sort the timings in aList and bList in ascending order
  3. Compare the first two timings of aList and bList to get the total uptime for the first two timings.

4a. If it returns 4 timings, the timings given are disjoint. Take the last two timings and compare with the next two in aList.

4b. If not, the timings are connected. Take the two timings and compare with the next two and compare with the next two in aList.

  1. With the results from 4, compare the timings in bList in a similar fashion as in 4.
  2. Repeat until end of the list.

The problem, after reading through your code, seems to be that it is too unclear what needs to be done (step by step).

Solving the problem

1. Create a function that returns the total uptime of two timings.

The timings will have these possible cases:

NOT CONNECTED
Case 1
A: ---
B:      -----
Case 2
A:      -----
B: ---

CONNECTION
Connected A starts first
Case 1
A: -------
B:   ---------
Case 2
A: -------
B:   ---

Connected B starts first
Case 3
A:   ---------
B: -------
Case 4
A:   ---
B: -------

EQUALITY
Starting Equality
Case 1
A: ---
B: -----
Case 2
A: -----
B: ---

Ending Equality
Case 3
A: -----
B:   ---
Case 4
A:   ---
B: -----

Total Equality
Case 5
A: -----
B: -----

With that in mind, you can go ahead and create the code for it.

def combined_uptime(a, b):
    # Only accept lists of length two
    aStart = a[0]
    aStop = a[1]
    bStart = b[0]
    bStop = b[1]

    # < means "earlier than"
    # > means "later than"

    # Unconnected
    # Not connected Case 1
    if aStop < bStart:
        return (aStart, aStop, bStart, bStop)
    # Not connected Case 2
    elif bStop < aStart:
        return (bStart, bStop, aStart, aStop)

    # From this point on, A and B are connected

    # CONNECTION
    # A starts first
    if aStart <= bStart:
        if aStop < bStop:
            # Connection Case 1 + Equality Case 1
            return (aStart, bStop)
        else:
            # Connection Case 2 + Equality Case 2 + Equality Case 3 + Equality Case 5
            return (aStart, aStop)
    else:
        if bStop < aStop:
            # Connection Case 3
            return (bStart, aStop)
        else:
            # Connection Case 4 + Equality Case 4
            return (bStart, bStop)

2. Sort the timings in aList and bList in ascending order.

This can be done by converting all your timings to be stored as tuples, sort it, then remove the tuples.

def sort_ascending(x):
    # Store each set in a tuple
    l = [(s, x[i*2 + 1]) for i, s in enumerate(x[::2])]

    # Sort in ascending order
    l.sort(key=lambda tup: tup[0])
    print l
    # Remove tuples
    ret_list = []
    [ret_list.extend(s) for s in l]
    return ret_list

3-6. The recursive function

You would see the last step includes repetition of the same process. This usually hints that a recursive function will do the job.

Using all the functions as mentioned above, here is the recursive function:

def uptime(a, b, result=None):
    print a
    print b
    print result
    ret_list = []

    # Return the result if either a or b is empty
    if a == [] and b == []:
        return result
    elif a and b == []:
        return result.extend(a) or result[:]
    elif b and a == []:
        return result.extend(b) or result[:]

    # Prevent overwriting, make a copy
    aList = list(a)[:]
    bList = list(b)[:]

    # Get results from previous iteration
    if result:
        # Process aList
        results_aList = list(combined_uptime(aList[0:2], result))
        del aList[0:2]

        if len(results_aList) != 2:
            ret_list.extend(results_aList[0:2])
            del results_aList[0:2]

        # Process bList
        results_bList = list(combined_uptime(bList[0:2], results_aList))
        del bList[0:2]

        if len(results_bList) != 2:
            ret_list.extend(results_bList[0:2])
            del results_bList[0:2]

        print "CLEAR"
        # Add result
        ret_list.extend(uptime(aList, bList, results_bList))
    else:
        # First iteration
        results_aList_bList = list(combined_uptime(aList[0:2], bList[0:2]))
        del aList[0:2]
        del bList[0:2]

        if len(results_aList_bList) != 2:
            # Disjoint
            ret_list.extend(results_aList_bList[0:2])
            del results_aList_bList[0:2]
        print "CLEAr"
        ret_list.extend(uptime(aList, bList, results_aList_bList))
    return ret_list

In your test case that you gave in the comments, it will return

["00:00:01","22:00:00", "22:00:00","23:58:59"]

when passed in

ListA = ["00:00:01","22:00:00"]
ListB = ["08:00:00", "09:00:00", "22:00:00","23:58:59"]

Upvotes: 0

aldanor
aldanor

Reputation: 3481

You could do this without a single for-loop by using itertools.reduce from the standard library to do the heavy lifting. Some consider this to be more idiomatic (except Guido of course who doesn't like the reduce function so much he chose to remove it from Python's prelude).

from functools import reduce

# this would work with any comparable values in `aList` and `bList`
aList = [0, 3, 7, 10, 13, 14]
bList = [2, 4, 10, 11, 13, 15]

# split both lists into tuples of the form `(start, stop)`
aIntervals = list(zip(aList[::2], aList[1::2]))
bIntervals = list(zip(bList[::2], bList[1::2]))

# sort the joint list of intervals by start time
intervals = sorted(aIntervals + bIntervals)

# reduction function, `acc` is the current result, `v` is the next interval
def join(acc, v):
    # if an empty list, return the new interval
    if not acc:
        return [v]
    # pop the last interval from the list
    last = acc.pop()
    # if the intervals are disjoint, return both
    if v[0] > last[1]:
        return acc + [last, v]
    # otherwise, join them together
    return acc + [(last[0], max(last[1], v[1]))]

# this is an iterator with joined intervals...
joined_intervals = reduce(join, intervals, [])

# ... which we can join back into a single list of start/stop times
print(list(sum(joined_intervals, ())))

The output is, as expected,

[0, 4, 7, 11, 13, 15]

By testing with the timelike values in the provided example:

aList = ['08:03:19', '14:22:22', '17:00:02', '18:30:01']
bList = ['15:19:03', '18:00:00', '18:35:05', '19:01:00']

this also yields the desired answer of

['08:03:19', '14:22:22', '15:19:03', '18:30:01', '18:35:05', '19:01:00']

Upvotes: 1

Related Questions