Imagine a game of chance where you can only be in one of three positions: 'A', 'B', or 'C'. Suppose these are the rules:
 | A | B | C |
---|---|---|---|
If you're currently on 'A' | .5 | .3 | .2 |

 | A | B | C |
---|---|---|---|
If you're currently on 'B' | .4 | .2 | .4 |

 | A | B | C |
---|---|---|---|
If you're currently on 'C' | .3 | .1 | .6 |
These could be combined into one matrix:
The probability you'll end up here → | A | B | C |
---|---|---|---|
If you're currently on 'A' | .5 | .3 | .2 |
If you're currently on 'B' | .4 | .2 | .4 |
If you're currently on 'C' | .3 | .1 | .6 |
Or, just...
 | A | B | C |
---|---|---|---|
A | .5 | .3 | .2 |
B | .4 | .2 | .4 |
C | .3 | .1 | .6 |
where each row represents an 'input state' (where you are now) and each column represents an 'output state' (where you'll end up next). Let's call this "Matrix M".
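In code, Matrix M is just a 3×3 grid of numbers. Here's a minimal sketch in Python (using NumPy; the variable names are mine), which also checks a property every Markov matrix must have: each row sums to 1, because wherever you are now, you must end up *somewhere*.

```python
import numpy as np

# Rows are input states (where you are now); columns are output
# states (where you'll end up next), in the order A, B, C.
M = np.array([
    [0.5, 0.3, 0.2],  # currently on 'A'
    [0.4, 0.2, 0.4],  # currently on 'B'
    [0.3, 0.1, 0.6],  # currently on 'C'
])

# Each row sums to 1: wherever you are, you must land somewhere.
print(M.sum(axis=1))  # [1. 1. 1.]
```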
The thing is, once you've applied Matrix M to move from position A, B, or C to your next position, you can turn around and apply Matrix M again. And again, as many times as you like. This kind of process was first studied by the Russian mathematician Andrey Markov (1856–1922), and repeatedly applying a matrix like the one above is called a 'Markov chain'. It turns out that Markov chains (or 'Markov processes') are good at modeling all sorts of real-world processes.
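You can watch the chaining happen in code. A small sketch, not part of the original game: raising M to the nth power gives the probabilities of where you'll be after n moves, and you can also just simulate the walk one move at a time.

```python
import numpy as np

states = ["A", "B", "C"]
M = np.array([
    [0.5, 0.3, 0.2],
    [0.4, 0.2, 0.4],
    [0.3, 0.1, 0.6],
])

# Where might you be after 10 moves, starting from 'A'?
# Row 0 of M^10 is the resulting distribution over A, B, C.
print(np.linalg.matrix_power(M, 10)[0])

# Or just play the game: simulate one move at a time.
rng = np.random.default_rng()
position = 0  # start on 'A'
for move in range(10):
    position = rng.choice(3, p=M[position])
    print(states[position], end=" ")
```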
We could imagine a similar game where the probabilities of your next position depend on both your current position and your previous position. This is called a '2nd-order' Markov matrix because it depends on two states: where you are now and where you were just before now. By that logic, Matrix M is '1st-order', because it depends on only one 'input' state: where you are right now. A '3rd-order' Markov matrix depends on three input states. And so forth.
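One common way to represent a higher-order chain in code is to key the transition table on a tuple of recent states. Here's a hypothetical 2nd-order sketch; the probabilities below are invented for illustration, not taken from the game above.

```python
import random

# A 2nd-order chain keys its transition table on the pair
# (previous position, current position). These probabilities
# are made up purely for illustration.
M2 = {
    ("A", "A"): {"A": 0.6, "B": 0.3, "C": 0.1},
    ("A", "B"): {"A": 0.2, "B": 0.5, "C": 0.3},
    ("A", "C"): {"A": 0.1, "B": 0.2, "C": 0.7},
    ("B", "A"): {"A": 0.4, "B": 0.4, "C": 0.2},
    ("B", "B"): {"A": 0.3, "B": 0.3, "C": 0.4},
    ("B", "C"): {"A": 0.2, "B": 0.1, "C": 0.7},
    ("C", "A"): {"A": 0.5, "B": 0.3, "C": 0.2},
    ("C", "B"): {"A": 0.1, "B": 0.6, "C": 0.3},
    ("C", "C"): {"A": 0.3, "B": 0.2, "C": 0.5},
}

def step(prev, now):
    """Pick the next position given the last two positions."""
    dist = M2[(prev, now)]
    return random.choices(list(dist), weights=dist.values())[0]

print(step("A", "B"))
```

Notice the table now has 3² = 9 rows instead of 3: each extra order multiplies the table size by the number of states, which is the price you pay for the better approximation.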
Brain Bend Moment:
Generally speaking, the higher the order, the better the approximation. In Claude Shannon's seminal paper *A Mathematical Theory of Communication*¹, he briefly considers probabilistic approximations of English, first using sequences of letters, then of words (with probabilities derived by analysing English text):
The point is that as the order becomes larger, the results become more organic and lifelike.
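To get a feel for what Shannon was doing, here's a minimal sketch of a 1st-order word approximation: record which words follow which in some source text, then generate new text by sampling from those records. The corpus here is a toy placeholder; Shannon-style results need far more text.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words seen to follow it."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers

def generate(followers, start, length=20):
    out = [start]
    for _ in range(length - 1):
        options = followers.get(out[-1])
        if not options:
            break  # dead end: this word was never followed by anything
        out.append(random.choice(options))
    return " ".join(out)

corpus = "the cat sat on the mat and the dog sat on the rug"
chain = build_chain(corpus)
print(generate(chain, "the"))
```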