Job Scheduling Algorithm in Java

I need to design an efficient algorithm for a scheduling problem and I really don't have a clue.

There's a machine that produces pills in a certain rate. For example, the machine might be capable to produce 1 pill if it is allowed to continuously work for one day, 4 pills, if it is allowed to work for 3 days, 16 pills if it works for 5 days, and so on. If I stop the machine and take out all the pills then the machine will start from day 1 again. All pills that I took out from the machine must be used on the same day.

There is a certain amount of patients come and take pills everyday. The patients must be treated on the same day, untreated patients are ignored. The goal is to decide which days to stop the machine and treat as many patients as possible in n days.

Suppose the number of days n = 5, given example input

int[] machineRate = {1,2,4,8,16};
int[] patients =    {1,2,3,0,2};

In this case, if I stop the machine on day 3, I will have 4 pills. I can treat 3 patients and throw away 1 pill. Then I stop the machine on day 5 again, since it was stopped on day 3, it has been working for 2 days, therefore I have 2 pills to treat 2 patients. In the end 3+2=5, the output = 5 patients.

If I stop the machine on day 2, day 3, day 5. Then the output will be (2 pills for 2 patients on day 2) +(1 pill for 3 patients on day 3) +（2 pills for 2 patients on day 5）. That equals to 5 patients as well.

The machineRate[] and patients[] vary according to input.

What's the algorithm that finds the maximum number of treated patients?

Upvotes: 7

Answers (3)

Constantinos

Reputation: 1208

I've implemented chiastic-security's design, but the performance isn't great when n gets larger than 10000 or so. If anyone has any other ideas please let me know because I thought this was a pretty interesting problem. I tried it with recursion at first but kept running out of memory, so I had to do it in a loop instead. I was storing a big 2d array with all the results so far but then I realised that I only ever need to access the previous "row" of results so I'm only using 2 arrays: "current" and "previous":

static int calculateMax() {
    int[] previous = new int[n];
    for (int daysMachineRunning=0; daysMachineRunning<n; daysMachineRunning++) {
        previous[daysMachineRunning] = treatPatients(0, daysMachineRunning);
    }

    int[] current = null;
    for (int daysRemaining=1; daysRemaining<n; daysRemaining++) {
        current = new int[n-daysRemaining];
        for (int daysMachineRunning=0; daysMachineRunning<n-daysRemaining; daysMachineRunning++) {
            current[daysMachineRunning] = Math.max(
                    treatPatients(daysRemaining, daysMachineRunning) + previous[0],
                    previous[daysMachineRunning+1]
                );
        }
        previous = current;
    }
    return current[0];
}

static int treatPatients(int daysRemaining, int daysMachineRunning) {
    return Math.min(patients[n-1-daysRemaining], machineRate[daysMachineRunning]);
}

EDIT: I've now implemented a 2nd approach, but still getting issues where n>=10000 or so: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space. Here's my code if anyone is interested in pursuing further:

static final int[][] results = new int[n][n];
static final SortedSet<Target> queue = new TreeSet<>(new Comparator<Target>() {
    @Override
    public int compare(Target o1, Target o2) {
        if (o1.daysRemaining < o2.daysRemaining)
            return -1;
        else if (o1.daysRemaining > o2.daysRemaining)
            return 1;
        else if (o1.daysMachineRunning < o2.daysMachineRunning)
            return 1;
        else if (o1.daysMachineRunning > o2.daysMachineRunning)
            return -1;
        else return 0;
    }
});

public static void main(String[] args) {
    for (int i=0; i<n; i++) {
        Arrays.fill(results[i], -1);
    }

    if (n <= 10) {
        System.out.println(Arrays.toString(machineRate));
        System.out.println(Arrays.toString(patients));
    } else
        System.out.println(n);

    System.out.println(calculateMax());
}

static class Target {
    int daysRemaining, daysMachineRunning;
    Target(int daysRemaining, int daysMachineRunning) {
        this.daysRemaining = daysRemaining;
        this.daysMachineRunning = daysMachineRunning;
    }
}

static int calculateMax() {
    addTarget(n-1, 0);
    while (results[n-1][0]==-1) {
        Target t = queue.first();
        queue.remove(t);
        calculateMax(t);
    }
    return results[n-1][0];
}

static void calculateMax(Target t) {
    int daysRemaining = t.daysRemaining;
    int daysMachineRunning = t.daysMachineRunning;
    int treatedPatients = Math.min(patients[n-1-daysRemaining], machineRate[daysMachineRunning]);
    if (daysRemaining==0)
        results[0][daysMachineRunning] = treatedPatients;
    else {
        int resultA = results[daysRemaining-1][0];
        int resultB = results[daysRemaining-1][daysMachineRunning+1];
        if (resultA>=0 && resultB>=0) {
            results[daysRemaining][daysMachineRunning] = Math.max(treatedPatients + resultA, resultB);
        }
        else {
            if (resultA==-1)
                addTarget(daysRemaining-1, 0);
            if (resultB==-1)
                addTarget(daysRemaining-1, daysMachineRunning+1);
            addTarget(daysRemaining, daysMachineRunning);
        }
    }
}

static void addTarget(int a, int b) {
    queue.add(new Target(a,b));
}

Upvotes: 0

chiastic-security

Reputation: 20520

This is a nice dynamic programming problem.

The way to think of dynamic programming is to ask yourself two questions:

Is there a trivial version of this problem if I reduce one (or more) of the variables to zero (or similar)?
Is there a simple way of calculating the answer to a problem of size n+1 if I know answers to all problems of size n? (Here, "size" is problem-specific, and you need to find the right notion of size that helps with the problem in hand.)

For this problem, what would be a trivial version? Well, suppose the number of days was 1. Then it would be easy: I stop the machine, and treat as many patients as I can. There's no point doing anything else.

Now, if we consider the number of days left as our notion of size, we get an answer to the second question as well. Suppose we know all answers to all problems where there are n days left. Let's write maxTreat(days, running) for the maximum number we could treat if there were days days left, and if the machine had initially been running for running days.

Now there are n+1 days left; and the machine has been running so far for k days. We've got two options: (1) stop the machine; (2) don't stop it. If we stop the machine, we can treat some patients today (we can work out the number based on k), and thereafter we can treat maxTreat(n, 1) patients, because there are then n days left, and by tomorrow the machine will have been running again for just one day. If we don't stop the machine, we can't treat anyone today, but thereafter we'll be able to treat maxTreat(n,k+1) patients, because by tomorrow the machine will have been running for k+1 days.

I will leave you to work out the precise details, but to solve it efficiently, we create a multidimensional array, based on number of days left, and number of days for which the machine has been running so far. We then iterate through the array, solving all the possible problems, starting from the trivial (one day left) and working backwards (two days, then three days, and so on). At each stage, the problem we're solving is either trivial (so we can just write the answer in), or something we can calculate from entries we wrote into the array at the previous step.

The really cool thing about dynamic programming is that we're creating a cache of all the results as we go. So for problems where a recursive approach would otherwise end up needing to calculate the answer to a sub-problem several times, with dynamic programming we never end up solving a sub-problem more than once.

Additional comments now that I've seen your implementation:

For one, I'm not too surprised that it starts to slow down when you hit 10,000 or so. The algorithm is O(n^2), because at each iteration you have to fill the array with up to n entries before you can move to the next level. I'm quite certain that O(n^2) is the best asymptotic complexity you're going to get for this puzzle, though.

If you want to speed it up further, you could look at a top-down approach. Currently you're doing bottom-up dynamic programming: solving the cases of size 0, then of size 1, and so on. But you can also do it the other way round. Essentially this is the same algorithm as if you were writing a horribly inefficient recursive solution that calculates solutions to sub-problems on the fly, except that you cache a result every time you calculate it. So it looks something like this:

Set up your two-dimensional array to hold solutions to sub-problems. Pre-fill it with -1 for each case. A value of -1 will indicate that you haven't solved that sub-problem yet.
Write a routine that solves maxTreat(days, running) in terms of answers to sub-problems at the next level down. When you want the answers to the sub-problems, look in the array. If there's a -1 in there, you haven't solved that one yet, so you recursively solve it, and then put the answer into the array. If there's anything other than -1, you can just use the value you find there, because you've already calculated it. (You can also use a HashMap instead of the multidimensional array.)

This is better in one way and worse in another. It's worse because you have overheads associated with the recursion, and because you'll eventually run out of stack with the recursive calls. You might need to bump up the stack size with a command-line parameter to the JVM.

But it's better in one key respect: you don't calculate answers to all sub-problems, but only the ones you need to know the answers to. For some problems, that's a massive difference, and for some, it's not. It's hard to get the right intuition, but I think it might make a big difference here. After all, each answer depends on only two sub-problems from the previous row.

The ultimate solution (don't try this until you get the top-down recursive one going first!) is to do it top-down but without recursion. This will avoid the stack space issue. To do this, you create a stack of sub-problems (use an ArrayDeque) that need solving, and you keep taking them off the front of the queue until there are none left. The first thing is to push onto the stack the large problem for which you need a solution. Now, you iteratively pop problems off the stack until it's empty. Pop one off, and call it P. Then:

Look in your array or HashMap to see if P has been solved. If so, return the answer.
If not, look to see if the sub-problems for P have already been solved. If they have, then you can solve P, and you cache the answer. If the stack's now empty, then you've solved your final problem, and you output the answer for P.
If the sub-problems haven't all been solved, then push P back onto the stack. Then push any of P's sub-problems that haven't yet been solved onto the stack as well.

What will happen as you go is that your stack will grow initially as you push the main problem, and its sub-problems, and then its sub-problems, onto the stack. Then you'll start solving the smaller instances and putting the results into the cache, until eventually you have everything you need to solve the main problem.

It doesn't use significantly less memory than the recursive top-down approach, but it does use heap space rather than JVM stack space, and that means it scales up better because the JVM stack is much smaller than the heap.

This is quite difficult, though. At the very least, keep your working solution before you start coding up the more difficult version!

Upvotes: 6

Hannes

Reputation: 2073

Another approach would be to predict the next day or days. Say we have seen 1,2.patients in the last days, we could either take the two pills today and cure two patients or predict three or more for the next day and let machine run. If we have no raise like 1,1, we would predict one patient for tomorrow and take the one pill today. If the next day turns out diffrent like 1, 4, 0, we just adjust the prediction for the next day to be 1/2, i.e 2. Upside of this solution is that you can work with uncertainty, i.e. you do not know what tomorrow brings. this allows us to stream the data. Down side is that the first patient will always die.

Upvotes: 0

Job Scheduling Algorithm in Java

Answers (3)

Related Questions