Optimizations for longest path problem in cyclic graph

Craig Younkins picture Craig Younkins · Nov 23, 2010 · Viewed 10.3k times · Source

What optimizations exist for trying to find the longest path in a cyclic graph?

Longest path in cyclic graphs is known to be NP-complete. What optimizations or heuristics can make finding the longest path faster than DFSing the entire graph? Are there any probabilistic approaches?

I have a graph with specific qualities, but I'm looking for an answer to this in the general case. Linking to papers would be fantastic. Here is a partial answer:

  1. Confirm it is cyclic. Longest path in acyclic graphs is easily computed using dynamic programming.

  2. Find out if the graph is planar (which algorithm is best?). If it is, you might see if it is a block graph, ptolemaic graph, or cacti graph and apply the methods found in this paper.

  3. Find out how many simple cycles there are using Donald B Johnson's algorithm (Java implementation). You can change any cyclic graph into an acyclic one by removing an edge in a simple cycle. You can then run the dynamic programming solution found on the Wikipedia page. For completeness, you would have to do this N times for each cycle, where N is the length of the cycle. Thus, for an entire graph, the number of times you have to run the DP solution is equal to the product of the lengths of all cycles.

  4. If you have to DFS the entire graph, you can prune some paths by computing the "reachability" of each node in advance. This reachability, which is mainly applicable to directed graphs, is the number of nodes each node can reach without repetitions. It is the maximum the longest path from that node could possibly be. With this information, if your current path plus the reachability of the child node is less than the longest you've already found, there is no point in taking that branch as it is impossible that you would find a longer path.

Answer

j_random_hacker picture j_random_hacker · Nov 24, 2010

Here is a O(n*2^n) dynamic programming approach that should be feasible for up to say 20 vertices:

m(b, U) = the maximum length of any path ending at b and visiting only (some of) the vertices in U.

Initially, set m(b, {b}) = 0.

Then, m(b, U) = max value of m(x, U - x) + d(x, b) over all x in U such that x is not b and an edge (x, b) exists. Take the maximum of these values for all endpoints b, with U = V (the full set of vertices). That will be the maximum length of any path.

The following C code assumes a distance matrix in d[N][N]. If your graph is unweighted, you can change every read access to this array to the constant 1. A traceback showing an optimal sequence of vertices (there may be more than one) is also computed in the array p[N][NBITS].

#define N 20
#define NBITS (1 << N)

int d[N][N];       /* Assumed to be populated earlier.  -1 means "no edge". */
int m[N][NBITS];   /* DP matrix.  -2 means "unknown". */
int p[N][NBITS];   /* DP predecessor traceback matrix. */

/* Maximum distance for a path ending at vertex b, visiting only
   vertices in visited. */
int subsolve(int b, unsigned visited) {
    if (visited == (1 << b)) {
        /* A single vertex */
        p[b][visited] = -1;
        return 0;
    }

    if (m[b][visited] == -2) {
        /* Haven't solved this subproblem yet */
        int best = -1, bestPred = -1;
        unsigned i;
        for (i = 0; i < N; ++i) {
            if (i != b && ((visited >> i) & 1) && d[i][b] != -1) {
                int x = subsolve(i, visited & ~(1 << b));
                if (x != -1) {
                    x += d[i][b];
                    if (x > best) {
                        best = x;
                        bestPred = i;
                    }
                }
            }
        }

        m[b][visited] = best;
        p[b][visited] = bestPred;
    }

    return m[b][visited];
}

/* Maximum path length for d[][].
   n must be <= N.
   *last will contain the last vertex in the path; use p[][] to trace back. */
int solve(int n, int *last) {
    int b, i;
    int best = -1;

    /* Need to blank the DP and predecessor matrices */
    for (b = 0; b < N; ++b) {
        for (i = 0; i < NBITS; ++i) {
            m[b][i] = -2;
            p[b][i] = -2;
        }
    }

    for (b = 0; b < n; ++b) {
        int x = subsolve(b, (1 << n) - 1);
        if (x > best) {
            best = x;
            *last = b;
        }
    }

    return best;
}

On my PC, this solves a 20x20 complete graph with edge weights randomly chosen in the range [0, 1000) in about 7s and needs about 160Mb (half of that is for the predecessor trace).

(Please, no comments about using fixed-size arrays. Use malloc() (or better yet, C++ vector<int>) in a real program. I just wrote it this way so things would be clearer.)