Metropolis Algorithm

by Weidong Liang

Beijing, 2014.04.09


Introduction

Similar to rejection sampling, the Metropolis algorithm draws samples from a proposal distribution; the difference is that it maintains a record of the current state \( x^{(\tau)} \), so that each new candidate is generated from \( q(x | x^{(\tau)}) \). The Metropolis algorithm also requires the proposal distribution to be symmetric, i.e. \( q(x_A | x_B) = q(x_B | x_A) \) for all values of \( x_A \) and \( x_B \). The candidate sample \( x^* \) is then accepted with probability: $$ A(x^*, x^{(\tau)}) = \min\left(1, \frac{\tilde{p}( x^* )}{\tilde{p}( x^{(\tau)} )} \right) $$ where \( p(x) = \tilde{p}(x) / Z \) is the desired distribution. If the candidate is accepted, then \( x^{(\tau + 1)} = x^* \); otherwise \( x^{(\tau + 1)} = x^{(\tau)} \), i.e. the current state is retained as the next sample. In either case a new candidate is then drawn from \( q( x | x^{(\tau + 1)} ) \) and the process is repeated.
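The accept/reject step can be sketched as a small helper; this is a minimal illustration, assuming the unnormalized density \( \tilde{p} \) is available as a callable (the names here are illustrative, not from the post):

```python
import math
import random

def metropolis_accept(p_tilde, x_new, x_cur):
    """Accept x_new with probability min(1, p~(x_new) / p~(x_cur))."""
    a = min(1.0, p_tilde(x_new) / p_tilde(x_cur))
    return random.random() < a

# example: with a standard Gaussian target, a move toward the mode
# has acceptance ratio > 1 and is therefore always accepted
print(metropolis_accept(lambda x: math.exp(-x * x), 0.0, 5.0))
```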


Simulation

In this simulation, we apply the Metropolis algorithm to sample from a bivariate Gaussian, using an isotropic Gaussian as the proposal distribution: \( q(X, Y) = q(X)q(Y) \), with \( q(X) \sim N(\mu, \sigma^2) \) and \( q(Y) \sim N(\mu, \sigma^2) \).

For a multivariate Gaussian distribution, the pdf is: $$ f_{\mathbf{x}}(x_1, \ldots, x_k) = \frac{1}{ \sqrt{ (2\pi)^k | \Sigma | } } \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right) $$ In the bivariate case, we have: $$ \mu = \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix}, \quad \Sigma = \begin{bmatrix} \sigma_x^2 & \rho \sigma_x \sigma_y \\ \rho \sigma_x \sigma_y & \sigma_y^2 \end{bmatrix} $$ therefore the pdf of the bivariate Gaussian distribution is: $$ f(x, y) = \frac{1}{2 \pi \sigma_x \sigma_y \sqrt{1 - \rho^2} } \exp\left( -\frac{1}{2 (1 - \rho^2) } \left[ \frac{ (x - \mu_x)^2 }{ \sigma_x^2 } + \frac{ (y - \mu_y)^2 }{ \sigma_y^2 } - \frac{2 \rho (x - \mu_x) (y - \mu_y) }{ \sigma_x \sigma_y } \right] \right) $$
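The bivariate pdf above translates directly into code; a minimal sketch (function and parameter names are my own):

```python
import math

def bivariate_gaussian_pdf(x, y, mu_x=0.0, mu_y=0.0,
                           sigma_x=1.0, sigma_y=1.0, rho=0.0):
    """Density of a bivariate Gaussian with correlation rho."""
    # quadratic form inside the exponent
    z = ((x - mu_x) ** 2 / sigma_x ** 2
         + (y - mu_y) ** 2 / sigma_y ** 2
         - 2 * rho * (x - mu_x) * (y - mu_y) / (sigma_x * sigma_y))
    # normalization constant 2*pi*sigma_x*sigma_y*sqrt(1 - rho^2)
    norm = 2 * math.pi * sigma_x * sigma_y * math.sqrt(1 - rho ** 2)
    return math.exp(-z / (2 * (1 - rho ** 2))) / norm

# at the mean of a standard bivariate Gaussian the density is 1/(2*pi)
print(bivariate_gaussian_pdf(0.0, 0.0))
```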

To sample from the bivariate Gaussian distribution, we draw candidates from a proposal isotropic Gaussian distribution \( q(x | x^{\tau}) = N(x^{\tau}, \sigma^2) \). The chain starts at the point colored orange, and the subsequent samples are drawn and connected with a line, which is colored green if the candidate was accepted and red if it was rejected. The blue points form an equi-probability contour, generated from the desired bivariate Gaussian distribution at one standard deviation. Note how the samples drawn eventually move from the lower right to the area enclosed by the contour line; thus the samples eventually move from an area of low probability to an area of high probability (as determined by the target probability distribution).
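The simulation loop just described can be sketched as follows (without the plotting; this is an illustrative reimplementation, not the post's actual simulation code):

```python
import math
import random

def run_metropolis(p_tilde, x0, n_steps, step_sigma=0.5, seed=0):
    """Metropolis chain over R^2 with an isotropic Gaussian proposal."""
    rng = random.Random(seed)
    x, y = x0
    chain = [(x, y)]
    accepted = 0
    for _ in range(n_steps):
        # isotropic Gaussian proposal q(x* | x) = N(x, step_sigma^2 I)
        xp = x + rng.gauss(0.0, step_sigma)
        yp = y + rng.gauss(0.0, step_sigma)
        a = min(1.0, p_tilde(xp, yp) / p_tilde(x, y))
        if rng.random() < a:
            x, y = xp, yp           # green segment: candidate accepted
            accepted += 1
        # red segment: candidate rejected, state repeated
        chain.append((x, y))
    return chain, accepted / n_steps

# start in the low-probability lower right, as in the plot
chain, rate = run_metropolis(
    lambda x, y: math.exp(-0.5 * (x * x + y * y)), (4.0, -4.0), 2000)
```

Coloring each segment green or red based on the accept flag reproduces the walk shown in the figure.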

TODO: add histogram plot for x(0), x(1)


Algorithm

Metropolis-Algorithm(tilde_p(x), QSampler(x) )

  1. if \( x^{(\tau)} \) not initialized
    1. \( x^{(\tau)} = \arg\max_x{\tilde{p}(x)} \)  // if not available, any x will do
  2. \( x^* \sim QSampler(x^{(\tau)}) \)  // sample from \( q(x|x^{(\tau)}) \)
  3. \( A(x^*, x^{(\tau)}) = \min(1, \frac{\tilde{p}(x^*)}{\tilde{p}(x^{(\tau)})}) \)  // calculate acceptance probability
  4. \( u \sim U(0, 1) \)  // draw a uniform random sample
  5. if \( u \lt A(x^*, x^{(\tau)}) \)
    1. \( x^{(\tau)} = x^* \)
  6. return \( x^{(\tau)} \)  // on rejection, the unchanged state is returned as the new sample

Note that a rejection must return the current state as a repeated sample rather than retrying until a candidate is accepted; retrying would change the distribution the chain converges to.


Implementation


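A straightforward Python rendering of the pseudocode above, iterated to produce a whole chain (names are illustrative; the target \( \tilde{p} \) and symmetric proposal sampler are passed in as callables):

```python
import math
import random

def metropolis_sampler(p_tilde, q_sampler, x_init, n_samples, seed=None):
    """Generate n_samples from p(x) = p~(x)/Z via the Metropolis algorithm.

    p_tilde   -- unnormalized target density p~(x)
    q_sampler -- draws a candidate x* from the symmetric q(x | x_cur)
    """
    rng = random.Random(seed)
    x = x_init
    samples = []
    for _ in range(n_samples):
        x_star = q_sampler(x, rng)                      # step 2
        a = min(1.0, p_tilde(x_star) / p_tilde(x))      # step 3
        if rng.random() < a:                            # steps 4-5
            x = x_star
        # on rejection, the current state is recorded again (step 6)
        samples.append(x)
    return samples

if __name__ == "__main__":
    # example: sample from an unnormalized standard Gaussian
    draws = metropolis_sampler(
        p_tilde=lambda x: math.exp(-0.5 * x * x),
        q_sampler=lambda x, rng: x + rng.gauss(0.0, 1.0),
        x_init=0.0, n_samples=10000, seed=0)
    print(sum(draws) / len(draws))  # should be near 0
```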
Note

From the formulation of the acceptance probability \( A(x^*, x^{(\tau)} ) \), we know that if the new sample is more likely than the current one, it will always be accepted; otherwise, the less likely it is compared to \( x^{(\tau)} \), the less likely it is to be accepted. Therefore the samples generated eventually fall within the area where \( p(x) \) is large, i.e. where samples are more likely to be generated.

The samples generated are highly correlated; the correlation between samples may be reduced by retaining only every n-th generated sample (thinning), or by increasing the step size between state transitions.
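The effect of thinning can be checked by measuring the lag-1 autocorrelation before and after; a small sketch (helper names are my own):

```python
import random

def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1)) / n
    return cov / var

def thin(samples, k):
    """Keep only every k-th sample to reduce correlation."""
    return samples[::k]

if __name__ == "__main__":
    # an AR(1) process mimics a strongly correlated chain
    rng = random.Random(0)
    xs = [0.0]
    for _ in range(5000):
        xs.append(0.9 * xs[-1] + rng.gauss(0.0, 1.0))
    print(lag1_autocorr(xs), lag1_autocorr(thin(xs, 10)))
```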


Reference

