
### Asymptotic Variance of Estimation; Mean Squared Error

For the statistical model introduced in Section 3.4.2 we now investigate the asymptotic behavior of the variance of the MCMC estimator $\hat\varphi_n$ defined in (70) as $n \to \infty$.

Theorem 3.19   Let the Markov chain $X_0, X_1, \ldots$ with finite state space $E = \{1, \ldots, \ell\}$ be ergodic with transition matrix $\mathbf{P}$ and stationary distribution $\boldsymbol\alpha = (\alpha_1, \ldots, \alpha_\ell)^\top$. Define $\boldsymbol\Pi$ as the $\ell \times \ell$ matrix each of whose rows equals $\boldsymbol\alpha^\top$ and let $\mathbf{Z} = (z_{ij}) = \bigl(\mathbf{I} - (\mathbf{P} - \boldsymbol\Pi)\bigr)^{-1}$ be the fundamental matrix of $\mathbf{P}$ defined by (74). Then, for every function $\varphi : E \to \mathbb{R}$ and every initial distribution of $X_0$,

$\lim_{n \to \infty} n \operatorname{Var} \hat\varphi_n \;=\; \sigma^2(\varphi) \;=\; \sum_{i,j=1}^{\ell} \alpha_i\, \varphi(i)\, \bigl(2 z_{ij} - \delta_{ij} - \alpha_j\bigr)\, \varphi(j)$    (78)

Proof

• Clearly,

$\operatorname{Var} \hat\varphi_n = \dfrac{1}{n^2} \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} \operatorname{Cov}\bigl(\varphi(X_j), \varphi(X_k)\bigr)$    (79)

and thus

$n \operatorname{Var} \hat\varphi_n = \dfrac{1}{n} \sum_{k=0}^{n-1} \operatorname{Var} \varphi(X_k) + \dfrac{2}{n} \sum_{j=0}^{n-2} \sum_{k=j+1}^{n-1} \operatorname{Cov}\bigl(\varphi(X_j), \varphi(X_k)\bigr)\,.$

• This representation will now be used to show (78) for the case that the initial distribution of $X_0$ is the stationary distribution $\boldsymbol\alpha$.
• In this case we observe, for every $k \ge 0$,

$\mathbb{E}\, \varphi(X_k) = \sum_{i=1}^{\ell} \alpha_i\, \varphi(i) =: \mu_\varphi$

and

$\operatorname{Var} \varphi(X_k) = \sum_{i=1}^{\ell} \alpha_i\, \varphi^2(i) - \mu_\varphi^2\,.$

• Furthermore, by the stationarity of the Markov chain $X_0, X_1, \ldots$,

$\operatorname{Cov}\bigl(\varphi(X_j), \varphi(X_{j+k})\bigr) = \sum_{i,i'=1}^{\ell} \alpha_i\, \varphi(i)\, \bigl(p^{(k)}_{ii'} - \alpha_{i'}\bigr)\, \varphi(i')$    for all $j, k \ge 0$,

where $\mathbf{P}^{(k)} = (p^{(k)}_{ii'}) = \mathbf{P}^k$ denotes the matrix of the $k$-step transition probabilities.
• A combination of the results above yields

$\lim_{n\to\infty} n \operatorname{Var} \hat\varphi_n = \operatorname{Var} \varphi(X_0) + \lim_{n\to\infty} 2 \sum_{k=1}^{n-1} \Bigl(1 - \dfrac{k}{n}\Bigr) \operatorname{Cov}\bigl(\varphi(X_0), \varphi(X_k)\bigr) = \operatorname{Var} \varphi(X_0) + 2 \sum_{k=1}^{\infty} \operatorname{Cov}\bigl(\varphi(X_0), \varphi(X_k)\bigr)\,,$

• where the second equality is due to the identity

$\lim_{n\to\infty} \sum_{k=1}^{n-1} \Bigl(1 - \dfrac{k}{n}\Bigr) a_k = \sum_{k=1}^{\infty} a_k \qquad \text{whenever } \sum_{k=1}^{\infty} |a_k| < \infty\,;$

here the summability follows from the geometric rate of convergence of $p^{(k)}_{ii'}$ towards $\alpha_{i'}$.
• Taking into account the representation formula (76) for $\mathbf{Z}$ this implies (78).
• It is left to show that (78) is also true for an arbitrary initial distribution $\boldsymbol\beta$ of $X_0$.
• At this point we will use a more precise notation: We will write $\operatorname{Var}_{\boldsymbol\beta}$, $\operatorname{Cov}_{\boldsymbol\beta}$ instead of $\operatorname{Var}$, $\operatorname{Cov}$ if $X_0 \sim \boldsymbol\beta$, and $\operatorname{Var}_{\boldsymbol\alpha}$, $\operatorname{Cov}_{\boldsymbol\alpha}$ if $X_0 \sim \boldsymbol\alpha$.
• It suffices to show that

$\lim_{n\to\infty} n\, \bigl(\operatorname{Var}_{\boldsymbol\beta} \hat\varphi_n - \operatorname{Var}_{\boldsymbol\alpha} \hat\varphi_n\bigr) = 0\,.$    (80)

• For this purpose we introduce the following notation: For $j, k \ge 0$ let

$c_{\boldsymbol\beta}(j,k) = \operatorname{Cov}_{\boldsymbol\beta}\bigl(\varphi(X_j), \varphi(X_k)\bigr)$

and

$c_{\boldsymbol\alpha}(j,k) = \operatorname{Cov}_{\boldsymbol\alpha}\bigl(\varphi(X_j), \varphi(X_k)\bigr)\,.$

• Then, by (79),

$n\, \bigl(\operatorname{Var}_{\boldsymbol\beta} \hat\varphi_n - \operatorname{Var}_{\boldsymbol\alpha} \hat\varphi_n\bigr) = \dfrac{1}{n} \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} \bigl(c_{\boldsymbol\beta}(j,k) - c_{\boldsymbol\alpha}(j,k)\bigr)\,.$

• As the state space $E$ is finite and the Markov chain $X_0, X_1, \ldots$ is ergodic, the $k$-step transition probabilities converge geometrically, i.e. there are constants $c < \infty$ and $\varrho \in (0,1)$ such that $\bigl|p^{(k)}_{ii'} - \alpha_{i'}\bigr| \le c\, \varrho^{\,k}$ for all $i, i' \in E$ and $k \ge 0$.
• This implies, for $j \le k$, that the distribution $\boldsymbol\beta^\top \mathbf{P}^j$ of $X_j$ and the expectation $\mathbb{E}_{\boldsymbol\beta}\, \varphi(X_k)$ differ from their stationary counterparts by terms of order $O(\varrho^{\,j})$ and $O(\varrho^{\,k})$, respectively, and hence, by symmetry in $j$ and $k$,

$\bigl| c_{\boldsymbol\beta}(j,k) - c_{\boldsymbol\alpha}(j,k) \bigr| \le c'\, \varrho^{\max(j,k)}$    for all $j, k \ge 0$,

where the finiteness of the constant $c'$ follows from the boundedness of $\varphi$ on the finite state space $E$.
• Due to $\sum_{j,k \ge 0} \varrho^{\max(j,k)} \le 2 \sum_{k=0}^{\infty} (k+1)\, \varrho^{\,k} < \infty$, the double sum above remains bounded as $n \to \infty$ and hence, after division by $n$, becomes arbitrarily small for sufficiently large $n$. This completes the proof of (80).
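The variance formula of Theorem 3.19 can be checked numerically. The following sketch (chain, function, and all variable names are illustrative choices, not from the text) computes the asymptotic variance once via the fundamental matrix $\mathbf{Z} = (\mathbf{I} - (\mathbf{P} - \boldsymbol\Pi))^{-1}$ and once via the covariance series $\operatorname{Var}\varphi(X_0) + 2\sum_{k\ge1}\operatorname{Cov}(\varphi(X_0),\varphi(X_k))$ used in the proof:

```python
import numpy as np

# A small ergodic chain and a test function; both chosen only for illustration.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
phi = np.array([1.0, -2.0, 3.0])

# Stationary distribution: left eigenvector of P for eigenvalue 1.
w, V = np.linalg.eig(P.T)
alpha = np.real(V[:, np.argmin(np.abs(w - 1))])
alpha /= alpha.sum()

# Fundamental matrix Z = (I - (P - Pi))^{-1}, Pi having identical rows alpha.
Pi = np.outer(np.ones(3), alpha)
Z = np.linalg.inv(np.eye(3) - (P - Pi))

# Asymptotic variance via (78): sum_ij alpha_i phi(i) (2 z_ij - delta_ij - alpha_j) phi(j).
sigma2 = phi @ (np.diag(alpha) @ (2 * Z - np.eye(3) - Pi)) @ phi

# Cross-check: sigma2 = Var phi(X_0) + 2 * sum_{k>=1} Cov(phi(X_0), phi(X_k)).
mu = alpha @ phi
series = alpha @ (phi - mu) ** 2
Pk = np.eye(3)
for _ in range(2000):          # geometric convergence makes truncation safe
    Pk = Pk @ P
    series += 2 * alpha @ ((phi - mu) * (Pk @ (phi - mu)))
print(sigma2, series)          # the two values agree to high precision
```

The cross-check uses only the stationarity of the chain, so it works for any ergodic (not necessarily reversible) transition matrix.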

Remarks

• Recall that the mean squared error of an estimator decomposes into its variance and its squared bias:
• by Theorem 2.1 from “Statistik I” we obtain for the mean squared error of the MCMC estimator $\hat\varphi_n$ of $\mu_\varphi = \sum_{i=1}^{\ell} \alpha_i\, \varphi(i)$ defined in (70) that

$\mathbb{E}\,\bigl(\hat\varphi_n - \mu_\varphi\bigr)^2 = \bigl(\mathbb{E}\, \hat\varphi_n - \mu_\varphi\bigr)^2 + \operatorname{Var} \hat\varphi_n\,,$    (81)

• i.e., the mean squared error of the MCMC estimator is equal to the sum of the squared bias and the variance of the estimator $\hat\varphi_n$.
• Both summands on the right-hand side of (81) converge to $0$ as $n \to \infty$, but with different rates of convergence.
• In Theorem 3.19 we showed that $\operatorname{Var} \hat\varphi_n = \sigma^2(\varphi)/n + o(1/n)$.
• On the other hand, by Theorem 3.18 we get that $\mathbb{E}\, \hat\varphi_n - \mu_\varphi = O(1/n)$, so the squared bias is of order $O(1/n^2)$.
• Consequently, the asymptotic behavior of the mean squared error of $\hat\varphi_n$ is crucially influenced by the asymptotic variance of the estimator, whereas the bias plays a minor role.

• In other words: It can make sense to choose the simulation matrix $\mathbf{P}$ such that
• the asymptotic variance $\sigma^2(\varphi)$ is as small as possible,
• even if this results in a certain increase of the asymptotic bias.
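The different rates of the two summands in (81) can be illustrated by a small exact computation (a sketch; the chain, the function and the starting distribution are arbitrary choices, not from the text). The squared bias shrinks like $1/n^2$, the variance only like $1/n$, so the variance dominates the mean squared error:

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
phi = np.array([1.0, -2.0, 3.0])
beta = np.array([1.0, 0.0, 0.0])   # chain started in state 0, far from stationarity

w, V = np.linalg.eig(P.T)
alpha = np.real(V[:, np.argmin(np.abs(w - 1))])
alpha /= alpha.sum()
mu = alpha @ phi

def bias(n):
    """Exact bias of (1/n) sum_{k<n} phi(X_k) when X_0 ~ beta."""
    b, total = beta.copy(), 0.0
    for _ in range(n):
        total += b @ phi
        b = b @ P                  # distribution of the next step
    return total / n - mu

def var_stationary(n):
    """Exact variance of the estimator under the stationary start."""
    c = phi - mu
    cov = [alpha @ (c * c)]        # lag-0 term: Var phi(X_0)
    Pk = np.eye(len(phi))
    for m in range(1, n):
        Pk = Pk @ P
        cov.append(alpha @ (c * (Pk @ c)))   # Cov(phi(X_0), phi(X_m))
    m = np.arange(1, n)
    return (n * cov[0] + 2 * ((n - m) * np.array(cov)[1:]).sum()) / n**2

for n in (100, 1000):
    print(n, bias(n) ** 2, var_stationary(n))
# squared bias decays like 1/n^2, variance like 1/n: variance dominates the MSE
```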

In order to investigate this problem more deeply we introduce the following notation: Let

$\sigma^2(\varphi, \mathbf{P}) = \sum_{i,j=1}^{\ell} \alpha_i\, \varphi(i)\, \bigl(2 z_{ij} - \delta_{ij} - \alpha_j\bigr)\, \varphi(j)\,,$

where $\varphi : E \to \mathbb{R}$ is an arbitrary function, $(\mathbf{P}, \boldsymbol\alpha)$ is an arbitrary reversible pair, and $\mathbf{Z} = (z_{ij})$ denotes the fundamental matrix of $\mathbf{P}$.

Theorem 3.20
• Let $\mathbf{P}_1 = (p^{(1)}_{ij})$ and $\mathbf{P}_2 = (p^{(2)}_{ij})$ be two ergodic transition matrices on $E$ such that $(\mathbf{P}_1, \boldsymbol\alpha)$ and $(\mathbf{P}_2, \boldsymbol\alpha)$ are reversible.
• For arbitrary $i, j \in E$ such that $i \ne j$ let $p^{(2)}_{ij} \ge p^{(1)}_{ij}$,
• i.e., outside the diagonal all entries of the transition matrix $\mathbf{P}_2$ are greater than or equal to the corresponding entries of the transition matrix $\mathbf{P}_1$.
• Then, for any function $\varphi : E \to \mathbb{R}$,

$\sigma^2(\varphi, \mathbf{P}_2) \le \sigma^2(\varphi, \mathbf{P}_1)\,.$    (82)

Proof

• For $t \in [0,1]$ let $\mathbf{P}(t) = (1-t)\, \mathbf{P}_1 + t\, \mathbf{P}_2$; this is a transition matrix such that the pair $(\mathbf{P}(t), \boldsymbol\alpha)$ is reversible, and it inherits the ergodicity of $\mathbf{P}_1$ and $\mathbf{P}_2$, so that its fundamental matrix is well defined. It suffices to show that

$\dfrac{d}{dt}\, \sigma^2\bigl(\varphi, \mathbf{P}(t)\bigr) \le 0$    for all $t \in (0,1)$.    (83)

• By Theorem 3.19,

$\sigma^2\bigl(\varphi, \mathbf{P}(t)\bigr) = \sum_{i,j=1}^{\ell} \alpha_i\, \varphi(i)\, \bigl(2 z_{ij}(t) - \delta_{ij} - \alpha_j\bigr)\, \varphi(j)\,,$    (84)

where $\mathbf{Z}(t) = (z_{ij}(t)) = \bigl(\mathbf{I} - (\mathbf{P}(t) - \boldsymbol\Pi)\bigr)^{-1}$ denotes the fundamental matrix of $\mathbf{P}(t)$ introduced by (74).
• On the other hand, as $\frac{d}{dt}\, \mathbf{Z}^{-1}(t) = -(\mathbf{P}_2 - \mathbf{P}_1) =: -\mathbf{A}$, we get that

$\dfrac{d}{dt}\, \mathbf{Z}(t) = -\, \mathbf{Z}(t)\, \Bigl(\dfrac{d}{dt}\, \mathbf{Z}^{-1}(t)\Bigr)\, \mathbf{Z}(t)$

and thus

$\dfrac{d}{dt}\, \mathbf{Z}(t) = \mathbf{Z}(t)\, \mathbf{A}\, \mathbf{Z}(t)\,.$

• Taking into account (84) this implies

$\dfrac{d}{dt}\, \sigma^2\bigl(\varphi, \mathbf{P}(t)\bigr) = 2\, \boldsymbol\varphi^\top \mathbf{D}\, \mathbf{Z}(t)\, \mathbf{A}\, \mathbf{Z}(t)\, \boldsymbol\varphi\,,$    (85)

where $\mathbf{D} = \operatorname{diag}(\alpha_1, \ldots, \alpha_\ell)$ and $\boldsymbol\varphi = (\varphi(1), \ldots, \varphi(\ell))^\top$.
• As the pair $(\mathbf{P}(t), \boldsymbol\alpha)$ is reversible, by the representation formula (75) for the fundamental matrix that was derived in Lemma 3.3 we obtain for arbitrary $i, j \in E$

$\alpha_i\, z_{ij}(t) = \alpha_j\, z_{ji}(t)\,, \quad \text{i.e.} \quad \mathbf{D}\, \mathbf{Z}(t) = \mathbf{Z}^\top(t)\, \mathbf{D}\,.$

• This implies

$\boldsymbol\varphi^\top \mathbf{D}\, \mathbf{Z}(t)\, \mathbf{A}\, \mathbf{Z}(t)\, \boldsymbol\varphi = \bigl(\mathbf{Z}(t)\, \boldsymbol\varphi\bigr)^\top \mathbf{D}\, \mathbf{A}\, \bigl(\mathbf{Z}(t)\, \boldsymbol\varphi\bigr)\,.$

• Thus, by (85),

$\dfrac{d}{dt}\, \sigma^2\bigl(\varphi, \mathbf{P}(t)\bigr) = 2\, \boldsymbol\psi^\top \mathbf{D}\, \mathbf{A}\, \boldsymbol\psi \quad \text{with } \boldsymbol\psi = \mathbf{Z}(t)\, \boldsymbol\varphi\,.$    (86)

• As $\mathbf{P}(t)$ is a stochastic matrix and $(\mathbf{P}(t), \boldsymbol\alpha)$ is reversible, the matrix $\mathbf{A} = (a_{ij})$ satisfies $a_{ij} \ge 0$ for $i \ne j$, $\sum_{j} a_{ij} = 0$ for every $i$, and $\alpha_i\, a_{ij} = \alpha_j\, a_{ji}$.
• For arbitrary $\mathbf{x} = (x_1, \ldots, x_\ell)^\top$ the quadratic form of the matrix product $\mathbf{D}\mathbf{A}$ is given by

$\mathbf{x}^\top \mathbf{D}\, \mathbf{A}\, \mathbf{x} = \sum_{i \ne j} \alpha_i\, a_{ij}\, x_i x_j + \sum_{i} \alpha_i\, a_{ii}\, x_i^2 = -\, \dfrac{1}{2} \sum_{i \ne j} \alpha_i\, a_{ij}\, (x_i - x_j)^2\,.$

• This implies that the matrix $-\mathbf{D}\mathbf{A}$ is non-negative definite, i.e., for all $\mathbf{x} \in \mathbb{R}^\ell$

$\mathbf{x}^\top \mathbf{D}\, \mathbf{A}\, \mathbf{x} \le 0\,.$

• By (86) this yields $\frac{d}{dt}\, \sigma^2(\varphi, \mathbf{P}(t)) \le 0$ for arbitrary $t \in (0,1)$.
• This completes the proof of (83) and hence of (82), since $\sigma^2(\varphi, \mathbf{P}_2) = \sigma^2(\varphi, \mathbf{P}(1)) \le \sigma^2(\varphi, \mathbf{P}(0)) = \sigma^2(\varphi, \mathbf{P}_1)$.
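Theorem 3.20 can be illustrated numerically. The sketch below (all variable names hypothetical, the target distribution chosen only for illustration) builds a Metropolis chain $\mathbf{P}_2$ for a given $\boldsymbol\alpha$ and a lazier chain $\mathbf{P}_1 = \frac{1}{2}(\mathbf{I} + \mathbf{P}_2)$, which is reversible with respect to the same $\boldsymbol\alpha$ but has smaller off-diagonal entries, and compares the two asymptotic variances:

```python
import numpy as np

alpha = np.array([0.2, 0.3, 0.5])
phi = np.array([1.0, -1.0, 2.0])

# P2: Metropolis chain for alpha with uniform proposal over the other states.
P2 = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        if i != j:
            P2[i, j] = 0.5 * min(1.0, alpha[j] / alpha[i])
    P2[i, i] = 1.0 - P2[i].sum()

# P1: a "lazier" chain, reversible w.r.t. the same alpha, with
# P1[i, j] <= P2[i, j] for all i != j (the hypothesis of Theorem 3.20).
P1 = 0.5 * np.eye(3) + 0.5 * P2

def asym_var(P, alpha, phi):
    """sigma^2(phi, P) via the fundamental matrix, as in (78)."""
    ell = len(alpha)
    Pi = np.outer(np.ones(ell), alpha)
    Z = np.linalg.inv(np.eye(ell) - (P - Pi))
    return phi @ (np.diag(alpha) @ (2 * Z - np.eye(ell) - Pi)) @ phi

print(asym_var(P1, alpha, phi), asym_var(P2, alpha, phi))
# the chain with the larger off-diagonal entries has the smaller asymptotic variance
```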

Remarks
As a particular consequence of Theorem 3.20 we get that
• the simulation matrix $\mathbf{P}$ of the Metropolis algorithm (i.e., if we consider equality in (50)) minimizes the asymptotic variance $\sigma^2(\varphi, \mathbf{P})$
• within the class of all Metropolis-Hastings algorithms having an arbitrary but fixed “potential transition matrix” $\mathbf{Q}$.
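As an illustration of this remark, the following sketch compares the Metropolis acceptance rule with Barker's rule $r/(1+r)$. Barker's rule is not discussed in the text; it serves here only as an example of another reversible Metropolis-Hastings chain with the same proposal matrix whose off-diagonal entries are smaller, so that by Theorem 3.20 the Metropolis chain cannot have the larger asymptotic variance (all variable names are hypothetical):

```python
import numpy as np

alpha = np.array([0.1, 0.2, 0.3, 0.4])
phi = np.array([2.0, -1.0, 0.5, 1.0])
ell = len(alpha)
Q = np.full((ell, ell), 1.0 / (ell - 1))   # uniform proposal over the other states
np.fill_diagonal(Q, 0.0)

def mh_matrix(accept):
    """Metropolis-Hastings matrix for the symmetric proposal Q and acceptance rule `accept`."""
    P = np.zeros((ell, ell))
    for i in range(ell):
        for j in range(ell):
            if i != j:
                P[i, j] = Q[i, j] * accept(alpha[j] / alpha[i])
        P[i, i] = 1.0 - P[i].sum()
    return P

P_metropolis = mh_matrix(lambda r: min(1.0, r))   # equality in (50)
P_barker = mh_matrix(lambda r: r / (1.0 + r))     # Barker's rule, for comparison only

def asym_var(P):
    """sigma^2(phi, P) via the fundamental matrix, as in (78)."""
    Pi = np.outer(np.ones(ell), alpha)
    Z = np.linalg.inv(np.eye(ell) - (P - Pi))
    return phi @ (np.diag(alpha) @ (2 * Z - np.eye(ell) - Pi)) @ phi

print(asym_var(P_metropolis), asym_var(P_barker))
# Metropolis accepts at least as often off the diagonal, so by Theorem 3.20
# its asymptotic variance is no larger than Barker's
```

Since $r/(1+r) \le \min(1, r)$ for all $r > 0$, both chains are reversible with respect to the same $\boldsymbol\alpha$ and the hypothesis of Theorem 3.20 holds with $\mathbf{P}_2 = $ `P_metropolis`.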

Ursa Pantle 2006-07-20