SDR
SDR

Reputation: 69

Dynamic programming algorithm to find palindromes in a directed acyclic graph

The problem is as follows: given a directed acyclic graph, where each node is labeled with a character, find all the longest paths of nodes in the graph that form a palindrome.

The initial solution that I thought of was to simply enumerate all the paths in the graph. This effectively generates a bunch of strings, on which we can then apply Manacher's algorithm to find all the longest palindromes. However, this doesn't seem that efficient, since the amount of paths in a graph is exponential in the number of nodes.

Then I started thinking of using dynamic programming directly on the graph, but my problem is that I cannot figure out how to structure my "dynamic programming array". My initial try was to use a 2d boolean array, where array[i][j] == true means that node i to node j is a palindrome but the problem is that there might be multiple paths from i to j.

I've been stuck on this problem for quite a while now I can't seem to figure it out, any help would be appreciated.

Upvotes: 2

Views: 2383

Answers (1)

Daniel Martin
Daniel Martin

Reputation: 23548

The linear-time trick of Manacher's algorithm relies on the fact that if you know that the longest palindrome centered at character 15 has length 5 (chars 13-17), and there's a palindrome centered at node 19 of length 13 (chars 13-25), then you can skip computing the longest palindrome centered at character 23 (23 = 19 + (19 - 15)) because you know it's just going to be the mirror of the one centered at character 15.

With a DAG, you don't have that kind of guarantee because the palindromes can go off in any direction, not just forwards and backwards. However, if you have a candidate palindrome path from node m to node n, whether you can extend that string to a longer palindrome doesn't depend on the path between m and n, but only on m and n (and the graph itself).

Therefore, I'd do this:

First, sort the graph nodes topologically, so that you have an array s[] of node indexes, and there being an edge from s[i] to s[j] implies that i < j.

I'll also assume that you build up an inverse array or hash structure sinv[] such that s[sinv[j]] == j and sinv[s[n]] == n for all integers j in 0..nodeCount-1 and all node indexes n.

Also, I'll assume that you have functions graphPredecessors, graphSuccessors, and graphLetter that take a node index and return the list of predecessors on the graph, the list of successors, or the letter at that node, respectively.

Then, make a two-dimensional array of integers of size nodeCount by nodeCount called r. When r[i][j] = y, and y > 0, it will mean that if there is a palindrome path from a successor of s[i] to a predecessor of s[j], then that path can be extended by adding s[i] to the front and s[j] to the back, and that the extension can be continued by y more nodes (including s[i] and s[j]) in each direction:

for (i=0; i < nodeCount; i++) {
  for (j=i; j < nodeCount; j++) {
    if (graphLetter(s[i]) == graphLetter(s[j])) {
      r[i][j] = 1;
      for (pred in graphPredecessors(s[i])) {
        for (succ in graphSuccessors(s[j])) {
          /* note that by our sorting, sinv[pred] < i <= j < sinv[succ] */
          if (r[sinv[pred]][sinv[succ]] >= r[i][j]) {
            r[i][j] = 1 + r[sinv[pred]][sinv[succ]];
          }
        }
      }
    } else {
      r[i][j] = 0;
    }
  }
}

Then find the maximum value of r[x][x] for x in 0..nodeSize-1, and of r[lft][rgt] where there is an edge from s[lft] to s[rgt]. Call that maximum value M, and say you found it at location [i][j]. Each such i, j pair will represent the center of a longest palindrome path. As long as M is greater than 1, you then extend each center by finding a pred in graphPredecessors(s[i]) and a succ in graphSuccessors(s[j]) such that r[sinv[pred]][sinv[succ]] == M - 1 (the palindrome is now pred->s[i]->s[j]->succ). You then extend that by finding the appropriate index with an r value of M - 2, etc., stopping when you reach a spot where the value in r is 1.

I think this algorithm overall ends up with a runtime of O(V^2 + E^2), but I'm not entirely certain of that.

Upvotes: 1

Related Questions