Markov Chains – Explained

A Markov chain is a probabilistic process that relies on the current state to predict the next state. For a Markov chain to be effective, the next state has to depend on the current state in some way. For instance, from experience we know that if it looks cloudy outside, the next state we expect is rain. We can also say that when the rain starts to subside into cloudiness, the next state will most likely be sunny. Not every process is usefully modelled this way: in a lottery, this week's winning numbers have no dependence on the previous week's winning numbers, so knowing the current state tells us nothing about the next.

Usually when we have data, we calculate the probability of a state by counting the number of times it occurs out of the total count of all states. We might end up with 0.5 (50%) cloudy, 0.3 (30%) rainy, and 0.2 (20%) sunny days in a year, which contains no information about when they occur relative to each other. To get the dependent probabilities, we count the number of times each state follows a given state, and when we do that for all the states we end up with a Markov Table:

[Figure 1: Markov Table of Probabilities]
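
The counting step described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `transition_table` and the list of observed days are made up for the example and are not the data behind the table in this post.

```python
from collections import Counter, defaultdict

def transition_table(observations):
    """Count how often each state follows each state, then normalise each row."""
    counts = defaultdict(Counter)
    # Pair every observation with the one that comes right after it.
    for current, following in zip(observations, observations[1:]):
        counts[current][following] += 1
    table = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        table[current] = {state: n / total for state, n in followers.items()}
    return table

# Hypothetical observations, one per day (illustrative only, not real data).
days = ["Cloudy", "Rain", "Rain", "Sunny", "Cloudy",
        "Rain", "Sunny", "Sunny", "Cloudy", "Cloudy"]

table = transition_table(days)
for state, row in table.items():
    print(state, row)
```

Each row of the result sums to 1, because it holds the probabilities of every state that was seen to follow that row's state.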

To generate a simple prediction of events, we can say today is Cloudy, and from the table above we can determine that the next state will most likely be Rain, with 50%. Then we go to the Rain state, and the following day it will probably still be raining, with 60%. Markov chains can be represented as a state diagram, or as a matrix called a Transition Matrix:

[Figure 2: Markov State Diagram]
[Figure 3: Transition Matrix]
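
The simple "follow the most likely state" prediction can be sketched as a table lookup. The probabilities below are read off the worked calculations in this post (the table image itself is not reproduced here), and the names `TABLE`, `most_likely_next` and `predict` are assumptions made for this example.

```python
# Transition probabilities inferred from the worked calculations in this post.
# Outer key: current state. Inner dict: probability of each next state.
TABLE = {
    "Cloudy": {"Cloudy": 0.1, "Rain": 0.5, "Sunny": 0.4},
    "Rain":   {"Cloudy": 0.3, "Rain": 0.6, "Sunny": 0.1},
    "Sunny":  {"Cloudy": 0.4, "Rain": 0.1, "Sunny": 0.5},
}

def most_likely_next(state):
    """Return the next state with the highest probability for this row."""
    row = TABLE[state]
    return max(row, key=row.get)

def predict(start, days):
    """Follow the single most likely transition for a number of days."""
    path = [start]
    for _ in range(days):
        path.append(most_likely_next(path[-1]))
    return path

forecast = predict("Cloudy", 3)
print(forecast)  # ['Cloudy', 'Rain', 'Rain', 'Rain']
```

Always taking the top entry is the crudest way to use the table; it throws away the other probabilities, which is why the forecast settles on Rain.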

The Transition Matrix transitions from row to column, as in the Markov Table. We can do calculations with a Transition Matrix using a State Vector (a vector of our current conditions) to give us the probabilities of the next states.

[Figure 4: Current State Vector]

The above figure is set to 1 (100%) cloudy for the current state. To calculate the probabilities of the next state, we multiply the Current State Vector by the Matrix in figure 3:

State2 = Vector * Matrix
= [(1*0.1 + 0*0.3 + 0*0.4); (1*0.5 + 0*0.6 + 0*0.1); (1*0.4 + 0*0.1 + 0*0.5)]
= [C; R; S] = [0.1; 0.5; 0.4]

[Figure 5: Vector x Matrix Animation]

No surprise, most likely rain. Now we can calculate the probabilities of the state after that by multiplying the resultant vector by the matrix in figure 3:

State3 = State2 * Matrix
= [(0.1*0.1 + 0.5*0.3 + 0.4*0.4); (0.1*0.5 + 0.5*0.6 + 0.4*0.1); (0.1*0.4 + 0.5*0.1 + 0.4*0.5)]
= [C; R; S] = [0.32; 0.39; 0.29]

Rain is most likely again, and we can calculate the probabilities of the state after that by multiplying by the Transition Matrix in figure 3 again. That is the method of the Markov chain of probabilities.
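
The repeated multiplication can be sketched in plain Python as follows. The matrix values are the ones used in the calculations above, in the order Cloudy, Rain, Sunny; the names `MATRIX`, `next_state` and `history` are just for this example.

```python
STATES = ["Cloudy", "Rain", "Sunny"]

# Row = current state, column = next state (values from the calculations above).
MATRIX = [
    [0.1, 0.5, 0.4],  # from Cloudy
    [0.3, 0.6, 0.1],  # from Rain
    [0.4, 0.1, 0.5],  # from Sunny
]

def next_state(vector, matrix):
    """Row vector times matrix: entry j is the sum of vector[i] * matrix[i][j]."""
    size = len(matrix)
    return [sum(vector[i] * matrix[i][j] for i in range(size))
            for j in range(size)]

state = [1.0, 0.0, 0.0]  # 100% Cloudy today
history = []
for step in range(2, 5):
    state = next_state(state, MATRIX)
    history.append(state)
    print(f"State{step}:", dict(zip(STATES, (round(p, 3) for p in state))))
```

The first two vectors printed correspond to State2 and State3 above; the loop simply keeps feeding each result back in as the new current state vector.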

The Markov chains that I have been working with are called 1st order Markov chains; they only use 1 state to predict the next. In the above example, as you can see, if we always follow the most likely transition, once it moves from cloudy to rain it stays in the rain state and never leaves it. This happens because the Transition Table only holds information about the last state; we don’t know if it was sunny or raining before it was cloudy. You could have a 2nd order Markov chain that takes the last two states to get the probabilities of the next states. All that is required is grouping the last two states into 1 state, as in the example table below:

[Figure 6: 2nd Order Markov Table]
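
Grouping the last two states into one can be sketched by using a pair of states as the table key. This is a rough illustration under assumptions: the function name `second_order_table` and the sequence of days are invented for the example and do not reproduce the table in this post.

```python
from collections import Counter, defaultdict

def second_order_table(observations):
    """Key each row on (previous state, current state) instead of one state."""
    counts = defaultdict(Counter)
    triples = zip(observations, observations[1:], observations[2:])
    for previous, current, following in triples:
        counts[(previous, current)][following] += 1
    return {
        pair: {state: n / sum(followers.values())
               for state, n in followers.items()}
        for pair, followers in counts.items()
    }

# Hypothetical observations, one per day (illustrative only, not real data).
days = ["Sunny", "Cloudy", "Rain", "Rain", "Cloudy", "Sunny",
        "Sunny", "Cloudy", "Rain", "Cloudy", "Sunny"]

table2 = second_order_table(days)
for pair, row in table2.items():
    print(pair, row)
```

In this made-up data, Cloudy after Sunny is always followed by Rain, while Cloudy after Rain is always followed by Sunny, which is exactly the distinction a 1st order table cannot make.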

Markov chains go much more in-depth, and I’ve only scratched the surface. When programming Markov chains, most developers use the table method, linking each state to its list of next-state probabilities. One toy program that is practically synonymous with Markov chains is the Markov chain text generator. It is trained on text: basically, the states are words, and each word is linked to a list of words that have appeared after it in the training text.
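
A text generator along these lines can be sketched very compactly. Note the assumptions: this is a 1st order version with a tiny made-up training sentence, and the names `build_chain` and `generate` are chosen for the example, so it is not the generator the quote below came from.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Link each word to the list of words that appeared right after it."""
    words = text.split()
    chain = defaultdict(list)
    for word, following in zip(words, words[1:]):
        chain[word].append(following)
    return chain

def generate(chain, start, length, rng):
    """Walk the chain, picking each next word at random from the linked list."""
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # the word never had a successor in the training text
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

# Tiny made-up training text, just to show the mechanics.
training_text = "the cat sat on the mat and the dog sat on the rug"
chain = build_chain(training_text)
sentence = generate(chain, "the", 8, random.Random(0))
print(sentence)
```

Because a word is added to the list every time it follows the key word, repeated followers appear more than once, so a uniform random pick from the list already follows the observed frequencies without storing explicit probabilities.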

Here’s a quote from my 3rd order Markov chain text generator trained on the Bible: “Then Joshua said to the olive-tree, Be king over us.”

This tutorial is not complete – I will be adding pseudo-code, and turning the calculations into images. Thank You for Reading.

Link outs: <- Comprehensive PDF


19 thoughts on “Markov Chains – Explained”

  1. How are Markov decision process models different from chains? There the outcomes are partly random. How would that change the process you outlined here? Thanks.


    1. I was thinking of writing about HMMs, but I’m looking for a real-world example I could use, instead of theoretical analogies. I might use encryption cracking, using the English alphabet as hidden states.


  2. So, when you do a 2nd order Markov chain, do you also have to change the current state in the formula and provide the state before the current one, or is only the transition matrix updated?

