Data structure for Markov Decision Process

JackAW · Dec 20, 2012

I have implemented the value iteration algorithm for a simple Markov decision process (Wikipedia) in Python. To represent the structure (states, actions, transitions, rewards) of the particular MDP and iterate over it, I use the following data structures (a small value-iteration sketch over them follows the list):

  1. a dictionary mapping each state to the actions available in that state:

    SA = {'state A': {'action 1', 'action 2', ...}, ...}

  2. a dictionary for the transition probabilities:

    T = {('state A', 'action 1'): {'state B': probability}, ...}

  3. a dictionary for the rewards:

    R = {('state A', 'action 1'): {'state B': reward}, ...}
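
To make this concrete, here is a minimal value-iteration sketch over these structures. The toy MDP, the discount factor `gamma`, and the convergence threshold `theta` are illustrative placeholders, not my actual problem:

    # Toy MDP in the dictionary layout described above (placeholder values).
    SA = {'A': {'stay', 'go'}, 'B': {'stay'}}
    T = {('A', 'stay'): {'A': 1.0},
         ('A', 'go'):   {'B': 0.9, 'A': 0.1},
         ('B', 'stay'): {'B': 1.0}}
    R = {('A', 'stay'): {'A': 0.0},
         ('A', 'go'):   {'B': 1.0, 'A': 0.0},
         ('B', 'stay'): {'B': 2.0}}

    gamma, theta = 0.9, 1e-6
    V = {s: 0.0 for s in SA}  # current value estimate per state

    while True:
        delta = 0.0
        for s in SA:
            # Q-value of each available action: expected immediate reward
            # plus the discounted value of the successor state.
            q = [sum(p * (R[(s, a)][s2] + gamma * V[s2])
                     for s2, p in T[(s, a)].items())
                 for a in SA[s]]
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break

    print(V)  # converged state values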

My question is: is this the right approach? What are the most suitable data structures (in Python) for an MDP?

Answer

Xiong Yiliang · Jan 11, 2013

I have implemented Markov decision processes in Python before and found the following code useful:

http://aima.cs.berkeley.edu/python/mdp.html

This code is taken from Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig.
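
The AIMA code wraps the MDP in a class that exposes actions, transitions, and rewards, and value iteration consumes that interface. A rough sketch in that spirit (not a copy of the linked code; the class and method names here are my own approximation) looks like this:

    # Sketch only: an MDP as a class with accessor methods, plus value
    # iteration written against that interface.
    class MDP:
        def __init__(self, states, actlist, transitions, rewards, gamma=0.9):
            self.states = states
            self.actlist = actlist          # actions available in every state
            self.transitions = transitions  # {(s, a): [(prob, s2), ...]}
            self.rewards = rewards          # {s: reward}
            self.gamma = gamma

        def actions(self, state):
            return self.actlist

        def T(self, state, action):
            return self.transitions[(state, action)]

        def R(self, state):
            return self.rewards[state]

    def value_iteration(mdp, epsilon=0.001):
        """Return a dict of converged state values."""
        V = {s: 0.0 for s in mdp.states}
        while True:
            V_prev, delta = V.copy(), 0.0
            for s in mdp.states:
                V[s] = mdp.R(s) + mdp.gamma * max(
                    sum(p * V_prev[s2] for (p, s2) in mdp.T(s, a))
                    for a in mdp.actions(s))
                delta = max(delta, abs(V[s] - V_prev[s]))
            if delta < epsilon * (1 - mdp.gamma) / mdp.gamma:
                break
        return V

Compared with the pure-dictionary approach in your question, the class mainly adds a stable interface, so the same value-iteration code can be reused for grid worlds and other MDP variants.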