KMP prefix table

Question 1

string algorithm data-structures pattern-matching

Cratylus · Dec 9, 2012 · Viewed 31.1k times · Source

Answer

Each number corresponds to a prefix of the pattern ("a", "ab", "aba", ...), and for that prefix it gives the length of the longest suffix that is also a prefix of it. We do not count the whole string as its own suffix or prefix here; only proper suffixes and proper prefixes are considered (I know these as "self-suffix" and "self-prefix" from Russian).

So we have the string "ababaca". KMP computes the prefix function for every non-empty prefix. Let s[0:i] be the prefix of the string ending at index i (inclusive), and p[i] the value of the prefix function for that prefix. The matching prefix and suffix may overlap, as "aba" does in "ababa".

+---+----------+-------+------------------------+
| i |  s[0:i]  | p[i]  | Matching Prefix/Suffix |
+---+----------+-------+------------------------+
| 0 | a        |     0 |                        |
| 1 | ab       |     0 |                        |
| 2 | aba      |     1 | a                      |
| 3 | abab     |     2 | ab                     |
| 4 | ababa    |     3 | aba                    |
| 5 | ababac   |     0 |                        |
| 6 | ababaca  |     1 | a                      |
+---+----------+-------+------------------------+
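To make the definition concrete, here is a minimal brute-force sketch that computes the table directly from the definition, trying every proper border length from longest to shortest. The name naivePrefixFunction is hypothetical and not part of the original answer; it is far slower than the real KMP preprocessing and is only meant for checking the table by hand.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds the prefix table straight from the definition.
// Cubic time in the worst case, so only suitable for small examples.
std::vector<int> naivePrefixFunction(const std::string& s) {
    std::vector<int> p(s.size(), 0);
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Try every proper length k for the prefix s[0..i], longest first.
        for (std::size_t k = i; k > 0; --k) {
            // Compare the prefix s[0..k-1] with the suffix s[i-k+1..i].
            if (s.compare(0, k, s, i - k + 1, k) == 0) {
                p[i] = static_cast<int>(k);
                break;
            }
        }
    }
    return p;
}
```

Called on "ababaca", this should reproduce the p[i] column of the table above: 0, 0, 1, 2, 3, 0, 1.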

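Simple C++ code that computes the prefix function of a string s:

#include <string>
#include <vector>
using namespace std;

// Returns p, where p[i] is the length of the longest proper prefix of
// s[0..i] that is also a suffix of s[0..i].
vector<int> prefixFunction(const string& s) {
    vector<int> p(s.size());  // p[0] stays 0
    int j = 0;                // length of the currently matched prefix
    for (int i = 1; i < (int)s.size(); i++) {
        // On a mismatch, fall back to the next shorter candidate.
        while (j > 0 && s[j] != s[i])
            j = p[j-1];

        if (s[j] == s[i])
            j++;
        p[i] = j;
    }
    return p;
}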
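As for how the table helps when the pattern shifts: a rough sketch of the search phase is below. The function kmpSearch is a hypothetical helper, not part of the original answer, and the prefix function is repeated only so that the block compiles on its own. When j characters of the pattern have matched and the next one does not, the search keeps p[j-1] of them instead of starting over. With "ababaca", a mismatch after matching "ababa" (j = 5) falls back to j = p[4] = 3, because the last three matched characters "aba" are already a prefix of the pattern.

```cpp
#include <string>
#include <vector>

// Prefix function from the answer, repeated so this sketch is self-contained.
std::vector<int> prefixFunction(const std::string& s) {
    std::vector<int> p(s.size());
    int j = 0;
    for (int i = 1; i < (int)s.size(); i++) {
        while (j > 0 && s[j] != s[i]) j = p[j - 1];
        if (s[j] == s[i]) j++;
        p[i] = j;
    }
    return p;
}

// Hypothetical helper: returns the start index of every occurrence of
// pattern in text, including overlapping ones.
std::vector<int> kmpSearch(const std::string& text, const std::string& pattern) {
    std::vector<int> hits;
    if (pattern.empty()) return hits;  // assumption: an empty pattern yields no matches
    const std::vector<int> p = prefixFunction(pattern);
    int j = 0;  // number of pattern characters currently matched
    for (int i = 0; i < (int)text.size(); i++) {
        // Mismatch: keep the longest reusable part of the match instead of restarting.
        while (j > 0 && pattern[j] != text[i]) j = p[j - 1];
        if (pattern[j] == text[i]) j++;
        if (j == (int)pattern.size()) {
            hits.push_back(i - j + 1);  // match ends at i
            j = p[j - 1];               // continue, allowing overlaps
        }
    }
    return hits;
}
```

For example, kmpSearch("abababacababaca", "ababaca") should report matches starting at indices 2 and 8. The text index i never moves backwards, which is where the linear running time comes from.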

Question 2

I am reading about KMP for string matching.
It needs preprocessing of the pattern by building a prefix table.
For example, for the string ababaca the prefix table is: P = [0, 0, 1, 2, 3, 0, 1]
But I am not clear on what the numbers show. I read that the table helps to find matches of the pattern when it shifts, but I cannot connect this with the numbers in the table.
