This post presents exercises using the `MCMCglmm` package to compute parameter estimates in a Bayesian fashion, relying on Markov Chain Monte Carlo (MCMC) methods. The `MCMCglmm` package typically deals with Generalized Linear Mixed Models (GLMM).

This package mainly uses Gibbs sampling to update the parameters, but also the Metropolis-Hastings algorithm. Course notes are available as a vignette for this package. An overview is also available.

**Background**

After the first introduction to Bayesian inference with MCMC techniques in the first sets of exercises (here and here), you can find exercises related to the two popular MCMC algorithms that are used in `MCMCglmm`: the Gibbs sampler and the Metropolis algorithm.

We will use the packages `MCMCglmm` and `coda`, together with the packages `evd` (to generate samples from the Gumbel distribution), `reshape2`, and `tidyverse`.

Solutions for this set of exercises can be found here.

**Exercise 1**

Load the data set “Traffic” from the package “MASS”. (HINT: use the `data()` function.)

Show a quick summary of the structure of this data set, e.g. the number of observations and columns, the type of each variable, etc.
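A minimal sketch of this step (the `Traffic` data set, from the Swedish speed-limit experiment, ships with `MASS`):

```r
# Load the Traffic data set from MASS and inspect its structure
library(MASS)

data(Traffic)
str(Traffic)      # number of observations, columns, and variable types
summary(Traffic)  # quick numeric summary of each column
```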

**Exercise 2**

Explain the variable “*y*” (the number of citizens injured in road accidents) with a generalized linear (frequentist) model, taking into account the limit, the year, and the day.

(HINT: since “*y*” is a count variable, use the Poisson family.)

Show the summary of the fitted model. How many parameters are significant at the 5% level?
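One way to sketch the frequentist fit; the additive formula below (limit, year, and day entering as main effects) is an assumption about the intended model:

```r
library(MASS)

data(Traffic)

# Frequentist Poisson GLM for the injury counts
fit <- glm(y ~ limit + year + day, family = poisson, data = Traffic)
summary(fit)

# Count how many coefficients are significant at the 5% level
sum(coef(summary(fit))[, "Pr(>|z|)"] < 0.05)
```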

**Exercise 3**

Now, create the same model in a Bayesian setting by using the `MCMCglmm` framework. Note that the prior distribution is an Inverse-Wishart in `MCMCglmm`:

**a.** Set weak prior parameters. (HINT: check the information given in the help page `?MCMCglmm` and have a look at the parameter `prior`. Set only the R matrix, since we don’t have random effects.)

**b.** Run the model with 10,000 iterations.
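A sketch of the Bayesian counterpart; the diffuse values `V = 1, nu = 0.002` are one common weak-prior choice for the R structure, not the only possibility:

```r
library(MASS)
library(MCMCglmm)

data(Traffic)

# Weak (diffuse) inverse-Wishart prior on the residual variance R only
prior1 <- list(R = list(V = 1, nu = 0.002))

set.seed(1)
m1 <- MCMCglmm(y ~ limit + year + day, family = "poisson",
               data = Traffic, prior = prior1,
               nitt = 10000, verbose = FALSE)
summary(m1)
```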

**Exercise 4**

Plot the posterior chains by using the plotting method of `MCMCglmm`; e.g. plot the individual trace plots of the variables year and day (only), together with the density plots.

What can you say about the convergence of these chains and the Bayesian point estimates of these parameters?

Then, plot the posterior chain of the variance-covariance of the residual error.
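A sketch of the plotting step (the model is refitted here so the snippet stands alone; `m1$Sol` holds the fixed-effect chains and `m1$VCV` the residual variance chain):

```r
library(MASS)
library(MCMCglmm)

data(Traffic)
prior1 <- list(R = list(V = 1, nu = 0.002))
set.seed(1)
m1 <- MCMCglmm(y ~ limit + year + day, family = "poisson",
               data = Traffic, prior = prior1,
               nitt = 10000, verbose = FALSE)

# Trace and density plots for the year and day chains only
plot(m1$Sol[, c("year", "day")])

# Posterior chain of the residual variance-covariance
plot(m1$VCV)
```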

**Exercise 5**

Compute the posterior mean residual for a particular data point: the day “85”.
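One possible sketch, using `predict.MCMCglmm` for the posterior mean fitted values and subtracting them from the observed counts; selecting the point via `day == 85` is an assumption (note that both 1961 and 1962 contain a day 85):

```r
library(MASS)
library(MCMCglmm)

data(Traffic)
prior1 <- list(R = list(V = 1, nu = 0.002))
set.seed(1)
m1 <- MCMCglmm(y ~ limit + year + day, family = "poisson",
               data = Traffic, prior = prior1,
               nitt = 10000, verbose = FALSE)

# Posterior mean residual(s) for the observation(s) with day == 85
pred <- predict(m1)               # posterior mean predictions on the data scale
idx  <- which(Traffic$day == 85)
Traffic$y[idx] - pred[idx]
```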

**Exercise 6**

Now, run the same model, but with “*year*” taken as a random effect, and plot the posterior chains. Use a burn-in of 1,000.

What can you observe?
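A sketch with `year` as a random effect; note the extra G element in the prior and the explicit burn-in. Converting `year` with `as.factor` is an assumption, since it is stored numerically:

```r
library(MASS)
library(MCMCglmm)

data(Traffic)
Traffic$year <- as.factor(Traffic$year)  # grouping variable for the random effect

# Weak priors for both the residual (R) and the random-effect (G) variances
prior2 <- list(R = list(V = 1, nu = 0.002),
               G = list(G1 = list(V = 1, nu = 0.002)))

set.seed(1)
m2 <- MCMCglmm(y ~ limit + day, random = ~ year, family = "poisson",
               data = Traffic, prior = prior2,
               nitt = 10000, burnin = 1000, verbose = FALSE)

plot(m2$Sol)  # fixed-effect chains
plot(m2$VCV)  # year and residual variance chains
```

With only two year levels, the year variance component has very little information to draw on, which shows up in how its chain mixes.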

**Exercise 7**

Check the convergence of the model with:

– The Raftery-Lewis diagnostic

– The Geweke diagnostic

– (BONUS) The Gelman-Rubin diagnostic. (HINT: you can use `mclapply` from the `parallel` package to run the chains with different starting values faster.)
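A sketch of the three diagnostics with `coda`. The Gelman-Rubin part refits the model under different seeds as a stand-in for different starting values; `mc.cores = 2` assumes a Unix-like system (use 1 on Windows):

```r
library(MASS)
library(MCMCglmm)
library(coda)
library(parallel)

data(Traffic)
prior1 <- list(R = list(V = 1, nu = 0.002))

# Helper: one MCMCglmm fit for a given seed
fit_once <- function(seed) {
  set.seed(seed)
  MCMCglmm(y ~ limit + year + day, family = "poisson",
           data = Traffic, prior = prior1,
           nitt = 10000, verbose = FALSE)
}

m1 <- fit_once(1)
raftery.diag(m1$Sol)  # dependence factors and required run lengths
geweke.diag(m1$Sol)   # z-scores comparing early vs. late parts of the chain

# (BONUS) Gelman-Rubin: several chains run in parallel from different seeds
chains <- mclapply(1:3, function(s) fit_once(s)$Sol, mc.cores = 2)
gelman.diag(mcmc.list(chains))
```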

**Exercise 8**

Make a prediction from this model for the year 1961 **and** the year 1962 separately.

Plot these predictions together with density plots using the `ggplot2` (or `tidyverse`) package. Place the two plots side-by-side. (HINT: use the `gridExtra` package, for example.)
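A sketch of the prediction-and-plot step with `ggplot2` and `gridExtra`; splitting the posterior mean predictions on the `year` column is an assumption about how the two years are separated:

```r
library(MASS)
library(MCMCglmm)
library(ggplot2)
library(gridExtra)

data(Traffic)
prior1 <- list(R = list(V = 1, nu = 0.002))
set.seed(1)
m1 <- MCMCglmm(y ~ limit + year + day, family = "poisson",
               data = Traffic, prior = prior1,
               nitt = 10000, verbose = FALSE)

# Posterior mean predictions, split by calendar year
pred <- predict(m1)
d61  <- data.frame(pred = pred[Traffic$year == 1961])
d62  <- data.frame(pred = pred[Traffic$year == 1962])

p61 <- ggplot(d61, aes(x = pred)) + geom_density() + ggtitle("1961")
p62 <- ggplot(d62, aes(x = pred)) + geom_density() + ggtitle("1962")
grid.arrange(p61, p62, ncol = 2)  # side-by-side
```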

**Exercise 9**

Compute the posterior estimates for years 1961 and 1962.

(HINT: take the mode of the posterior distribution.)
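A sketch using `posterior.mode()` from `MCMCglmm` on the random-effect model; storing the year-level effect chains requires `pr = TRUE`, which is an assumption about how the exercise is meant to be solved:

```r
library(MASS)
library(MCMCglmm)

data(Traffic)
Traffic$year <- as.factor(Traffic$year)

prior2 <- list(R = list(V = 1, nu = 0.002),
               G = list(G1 = list(V = 1, nu = 0.002)))

set.seed(1)
m2 <- MCMCglmm(y ~ limit + day, random = ~ year, family = "poisson",
               data = Traffic, prior = prior2,
               nitt = 10000, burnin = 1000,
               pr = TRUE,  # store the random-effect (year) chains in m2$Sol
               verbose = FALSE)

# Posterior mode of each chain, including the 1961 and 1962 year effects
posterior.mode(m2$Sol)
```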
