Explain markov-chain algorithm in layman's terms

Takafu Keyomama picture Takafu Keyomama · Nov 2, 2010 · Viewed 17k times · Source

I don't quite understand this Markov... it takes two words a prefix and suffix saves up a list of them and makes random word?

    /* Copyright (C) 1999 Lucent Technologies */
/* Excerpted from 'The Practice of Programming' */
/* by Brian W. Kernighan and Rob Pike */

#include <time.h>
#include <iostream>
#include <string>
#include <deque>
#include <map>
#include <vector>

using namespace std;

const int  NPREF = 2;
const char NONWORD[] = "\n";    // cannot appear as real line: we remove newlines
const int  MAXGEN = 10000; // maximum words generated

typedef deque<string> Prefix;

map<Prefix, vector<string> > statetab; // prefix -> suffixes

void        build(Prefix&, istream&);
void        generate(int nwords);
void        add(Prefix&, const string&);

// markov main: markov-chain random text generation
int main(void)
{
    int nwords = MAXGEN;
    Prefix prefix;  // current input prefix

    srand(time(NULL));
    for (int i = 0; i < NPREF; i++)
        add(prefix, NONWORD);
    build(prefix, cin);
    add(prefix, NONWORD);
    generate(nwords);
    return 0;
}

// build: read input words, build state table
void build(Prefix& prefix, istream& in)
{
    string buf;

    while (in >> buf)
        add(prefix, buf);
}

// add: add word to suffix deque, update prefix
void add(Prefix& prefix, const string& s)
{
    if (prefix.size() == NPREF) {
        statetab[prefix].push_back(s);
        prefix.pop_front();
    }
    prefix.push_back(s);
}

// generate: produce output, one word per line
void generate(int nwords)
{
    Prefix prefix;
    int i;

    for (i = 0; i < NPREF; i++)
        add(prefix, NONWORD);
    for (i = 0; i < nwords; i++) {
        vector<string>& suf = statetab[prefix];
        const string& w = suf[rand() % suf.size()];
        if (w == NONWORD)
            break;
        cout << w << "\n";
        prefix.pop_front(); // advance
        prefix.push_back(w);
    }
}

Answer

Vivin Paliath picture Vivin Paliath · Nov 2, 2010

According to Wikipedia, a Markov Chain is a random process where the next state is dependent on the previous state. This is a little difficult to understand, so I'll try to explain it better:

What you're looking at, seems to be a program that generates a text-based Markov Chain. Essentially the algorithm for that is as follows:

  1. Split a body of text into tokens (words, punctuation).
  2. Build a frequency table. This is a data structure where for every word in your body of text, you have an entry (key). This key is mapped to another data structure that is basically a list of all the words that follow this word (the key) along with its frequency.
  3. Generate the Markov Chain. To do this, you select a starting point (a key from your frequency table) and then you randomly select another state to go to (the next word). The next word you choose, is dependent on its frequency (so some words are more probable than others). After that, you use this new word as the key and start over.

For example, if you look at the very first sentence of this solution, you can come up with the following frequency table:

According: to(100%)
to:        Wikipedia(100%)
Wikipedia: ,(100%)
a:         Markov(50%), random(50%)
Markov:    Chain(100%)
Chain:     is(100%)
is:        a(33%), dependent(33%), ...(33%)
random:    process(100%)
process:   with(100%)
.
.
.
better:    :(100%)

Essentially, the state transition from one state to another is probability based. In the case of a text-based Markov Chain, the transition probability is based on the frequency of words following the selected word. So the selected word represents the previous state and the frequency table or words represents the (possible) successive states. You find the successive state if you know the previous state (that's the only way you get the right frequency table), so this fits in with the definition where the successive state is dependent on the previous state.

Shameless Plug - I wrote a program to do just this in Perl, some time ago. You can read about it here.