Zaya
Zaya

Reputation: 336

Schedule k tasks with times to finish w(k) and N workers, without any tasks occurring out of order within a single worker

Suppose I have several tasks with uneven lengths. For example, let k=8 and w(k)=[4,5,3,10,1,1,1,3]. How would I assign these to N=3 workers such that any tasks assigned to a given worker are sequential while minimizing the difference in weights assigned to each worker?

E.g. in the above example, if we assign tasks like {0,1,2}, {3}, {4,5,6,7} then the workers will have to do 4+5+3=12, 10=10, 1+1+1+3=6 units of work each, with the maximum imbalance of 12-6=6.

I vaguely remember something like this in an algorithms class (the knapsack problem maybe?) but the extra wrinkle here is that the order within workers must be sequential. Does an algorithm exist that finds the optimal, or pretty good solution to this problem?

Edit for extra clarification:

By sequential, I mean that if more than one task is assigned to a worker, the tasks should have indices {n, n+1, n+2, ...}. Another way to think about this is to partition the original weight/cost array into subarrays without changing the order. E.g. valid (but suboptimal) solutions are [[4,5],[10],[1,1,1,3]], [[4],[5],[10,1,1,1,3]] etc.

Upvotes: 0

Views: 80

Answers (1)

Chaitanya Katti
Chaitanya Katti

Reputation: 53

You can simplify your question to partitioning a N-sized sequence into k smaller sequences. Where N is number of tasks and k is number of workers. This is a Linear Partitioning Problem. Here is a solution form Steven Skienna's book The Algorithm Design Mannual that minimizes the the largest sum of the sequences.

def partition(arr, k):
    n = len(arr)
    # Compute prefix sums
    prefix_sum = [0] * (n + 1)
    for i in range(1, n + 1):
        prefix_sum[i] = prefix_sum[i - 1] + arr[i - 1]

    # Initialize DP and decision tables
    dp = [[float('inf')] * (k + 1) for _ in range(n + 1)]
    dividers = [[0] * (k + 1) for _ in range(n + 1)]

    # Base cases
    for i in range(1, n + 1):
        dp[i][1] = prefix_sum[i]  # Only one partition, sum all elements up to i
    for j in range(1, k + 1):
        dp[1][j] = arr[0]  # Only one element, every partition equals arr[0]

    # Fill DP table using the recurrence
    for i in range(2, n + 1):  # Array size
        for j in range(2, k + 1):  # Number of partitions
            for x in range(1, i):  # Possible partition points
                cost = max(dp[x][j - 1], prefix_sum[i] - prefix_sum[x])
                if dp[i][j] > cost:
                    dp[i][j] = cost
                    dividers[i][j] = x  # Store partition point

    # Recover the partitions
    def recover_partitions(d, n, k):
        partitions = []
        while k > 0:
            x = d[n][k]
            partitions.append((x, n))
            n = x
            k -= 1
        partitions.reverse()
        return partitions

    partitions = recover_partitions(dividers, n, k)
    return dp[n][k], partitions


# Example:
arr = [4, 5, 3, 10, 1, 1, 1, 3]
k = 3
min_cost, partitions = partition(arr, k)

# Format the output partitions
partitioned_arrays = []
for i, (start, end) in enumerate(partitions):
    partitioned_arrays.append(arr[start:end])

print("Partitions:", partitioned_arrays) # [[4, 5, 3], [10], [1, 1, 1, 3]]
print("Minimum largest sum:", min_cost) # 12

Note: n and k are swapped for code clarity.

Upvotes: 0

Related Questions