Consider a discrete Markov source, {\mathscr{X} \ = \ X_i\}_{i=1}^{\infty}} ona finite alphabet set. Let the initial distribution be {Q} and the transition probability for the {n^{th}} step be {P_n}. When can we say that {\mathscr{X}} is stationary?
Clearly, the source has to be time invariant and thefore we need {P_n = P, \forall n}. For {\mathscr{X}} to be stationary, we need
\displaystyle f(X_1) = f(X_2) = \ \ldots \ f(X_n)
etc, where {f()} is the distribution. But {f(X_n) = QP^n}. Thus {Q = QP} guarantees that all {X_n}‘s have the same distribution. Now, consider, say, {f(X_1;X_2;X_3)} and {f(X_2;X_3;X_4)}.
\displaystyle f(X_1;X_2;X_3) = f(X_1)f(X_2/X_1)f(X_3/X_2).
\displaystyle f(X_2;X_3;X_4) = f(X_2)f(X_3/X_2)f(X_4/X_3).
It is clear that for the two joint distributions to be equal, {f(X_1) = f(X_2)} is enough and therefore {Q = QP} is sufficient.


The great Prof. Varadhan made a visit to the IISc on Febrary 13th. He gave a lecture at the IISc faculty hall on entropy and large deviations. The following example was interesting.

Consider a bug with limited energy trapped in the valley of a steep peak. It tries to scale the peak and reach the other side. Evey time it fails, it falls to the bottom of the valley. It changes its strategy and starts all over again. After a very large number of attempts, it succeeds and reaches the peak. As it goes down the other side, an observer on the other valley sees the bug coming down. He becomes curious, goes to the top of the peak and looks below to see how steep was the bug’s scale.

A bug scaling a peak

Now, having seen the bug, what can he conclude about the strategy adopted by the bug? Prof Varadhan made the following comment. The observer can surely conclude that the bug would have adopted the most efficient strategy. That has to be the case because as the number of iterations becomes very large, the probability of success is dominated by that of the best strategy and if a success ocurs it must come from the best strategy.

Let us consider the following problem which is a popular example in game theory.

A mother wants to divide a cake between her two children. To make both of them happy she has to ensure that each one gets an equal share. That is she has to cut the cake into two equal halves. But the problems is that the cake does not have a regular shape. Hence what is equal to her eyes may not be equal to the eyes of her children. The consequences of an unequal division are imaginable. So how can she make both her kids happy?

The solution is to ask one of the kids to cut the cake into two and the other to choose the piece. It can be verified that this solves the problem. The interesting thing about the solution is that the mother was able to satisfy both the kids even without knowing what will make them happy.

Now if the mother has N kids what will she do?

What will happen if some kids form a collusion and try to get a bigger share for them ?

While reading through the achievements of the Kerala school of mathematics I came across the Madhava-Leibniz formula for computing the value of pi/4. Though the formula was discovered by Madhava its popularly known as Leibniz formula. This is an instance of Stiglers Law, which is the tendency of NOT attributing a discovery to its original discoverer.

Popular instances of the law include the United States of America, Halleys comet, Planck’s constant and, to my surprise, Gaussian distribution! I came to know that the Gaussian distribution was not first proposed by Gauss, a very late discovery for a communication engineer. The distribution was first proposed by de Moivre of the de Moivre-Laplace formula to approximate binomial distributions for large n. Now by recursive application of Stiglers Law can we say that De Moivre also was not the first proposer of the normal distribution ?