Redouane Elghazi
Redouane Elghazi

Reputation: 19

Finding the cost of the cheapest way through a 2d array on x or y axis

was asked a similar problem on a job interview lately and i struggle to come up with a more efficient solution.

The problem's Rules, assuming travel is done through x axis, left to right for simplicity:

The goal is, given these restrictions find the cheapest way to travel left through right through the array.

Example:

Array: {
  {920, 239, 191, 267, 166, 879, 791, 998, 447,  617, 31,  169},
  {361, 516, 506, 279, 406, 231, 828, 410, 408,  199, 507, 671},
  {143, 69,  675, 847, 871, 704, 471, 796, 1000, 711, 42,  380},
  {559, 407, 555, 390, 672, 415, 902, 570, 803,  29,  394, 937},
  {407, 336, 427, 801, 509, 803, 267, 617, 47,   710, 529, 423},
  {377, 26,  561, 950, 134, 343, 542, 342, 549,  65,  39,  900}
};

Result: 2634
Path: 143, 69, 506, 279, 134, 343, 542, 342, 47, 29, 31

My example code:

#include <iostream>
#include <algorithm>
#include <list>
#include <sstream>
#include <set>

namespace stack_overflow_example {

    int const x_dim = 6;
    int const y_dim = 12;
    int expected = 2634;

    int arr[x_dim][y_dim] =
            {
                    {920, 239, 191, 267, 166, 879, 791, 998, 447,  617, 31,  169},
                    {361, 516, 506, 279, 406, 231, 828, 410, 408,  199, 507, 671},
                    {143, 69,  675, 847, 871, 704, 471, 796, 1000, 711, 42,  380},
                    {559, 407, 555, 390, 672, 415, 902, 570, 803,  29,  394, 937},
                    {407, 336, 427, 801, 509, 803, 267, 617, 47,   710, 529, 423},
                    {377, 26,  561, 950, 134, 343, 542, 342, 549,  65,  39,  900}
            };

    int minimumValue = INT32_MAX;

    void recursive_function(int x, int y, int value, std::set<int> unusedX) {
        int summed = value + arr[x][y];
        if (summed > minimumValue) {
            return;
        }
        if ((y + 1) < y_dim) {
            recursive_function(x, y + 1, summed, unusedX);
            unusedX.erase(x);
            for (auto const &i: unusedX) {
                recursive_function(i, y + 1, summed, unusedX);
            }
        } else {
            minimumValue = minimumValue < summed ? minimumValue : summed;
        }
    }

    void calculate() {
        std::set<int> unusedX;
        for (int i = 0; i < x_dim; i++) {
            unusedX.emplace(i);
        }
        for (int i = 0; i < x_dim; i++) {
            recursive_function(i, 0, 0, unusedX);
        }
        std::cout << "Expected was: " << expected << " received " << minimumValue;
    }

}

int main() {
    stack_overflow_example::calculate();
}

The solution I presented works correctly but a poor algorithmic complexity, an array of size [12][24] already takes an unacceptable amount of time to travel. The recruiter implied there is a faster answer but I am unable to come up with a solution, which is where i am hoping for your help. Any programming language will suffice, I just posted the problem in cpp because it's the language I use to do algorithmic problems. Thanks in advance!

Upvotes: 0

Views: 811

Answers (1)

Aivean
Aivean

Reputation: 10882

This is a dynamic programming problem.

The DP problem can be described in terms of the "subtask" (with its associated answer) and the "dynamic function" (a function that calculates the answer for the task based on the already solved subtasks).

In this case, the "subtask" is the minimal cost of the partial path from left to i-th column. More specifically, the "key" (or the "question") for the subtask can be defined as Pair<Set<Int>, Int>, where Set<Int> is the set of already visited rows, and second Int is the last visited row (the row where current subpath ended). The "value" (or "answer") for the subtask is the single Int — the minimal cost of the subpath associated with the "key".

The dynamic function needs to calculate the minimal cost of the subpath with length (i+1) based on the already calculated minimal subpaths with length i. This can be best illustrated by the following pseudocode:

for (set, last) in "valid subpath_next indices":
  subpath_next[set, last] = min(

    ## staying on the same (last) row 
    subpath_prev[set, last],    ​
​
​    ## changing row (from j to last)
    ​min(subpath_prev[set - last, j] for j in set)     

 ​) + arr[i+1, last]

As we don't know yet the "valid subpath_next indices", the better approach to this is the "bottom-up" way:

for (set, last) in subpath_prev.indices:
  ## staying on the same (last) row
  subpath_next[set, last] = min(
    subpath_next[set, last],
    subpath_prev[set, last] + arr[i+1, last]
  )

  ​## changing row (from last to j)
  for j in rows:
    if j is not in set:
      subpath_next[set + j, j] = min(
        subpath_next[set + j, j],
        subpath_prev[set, last] + arr[i+1, j]
      )

Initially, for the first step (i=1, pathes of length 1) the subpath_prev is initialized with the single value for each row (subpath_prev[{i}, i]=arr[i, 0]).

Having the set of solved subtasks for the ith step in the subpath_prev this loop calculates the values for the subtasks of the i+1th step, subpath_next. After that, subpath_next becomes subpath_prev for the next step. (Note, this outer loop for i is not shown in the pseudocode above).

After calculating subpath_next for the final column (i == n), the answer will be the minimal path among the elements of the subpath_next.


Now, you may notice the similarity of your recursive_function to the dynamic function given above. There are two key differences:

  1. Bottom-up approach calculates the subtask only once, while recursive function (without explicit caching) will do the same work many times. Using recursive function with the explicit cache is one possible way to solve DP problems with "top-down" approach.

  2. Instead of having x as the part of they "key" for the subtask, bottom-up approach allows us to store only the pertinent intermediate results, those of the previous step only. This allows to reduce the space usage. However, this works only if the problem doesn't require the full reconstructed path as the answer, only the cost. If we were to reconstruct the whole path, we'd need to store previous row as the part of the "key" or the "value" of the subtask.


Complexity analysis:

m is the number of rows and n is the number of columns.

Space: O(m * 2m):

This is basically how many keys subpath_prev has at each step.

Time: O(n * m * space) or O(n * m² * 2m):

i travels for 1 to n, for each i step we consider each key in subpath_prev (space), and for each key we also iterate every row (m).

Upvotes: 1

Related Questions