Below are the solutions to these exercises on “Bayesian Inference: introduction for MCMC techniques (part 1)”.

###############
#             #
# Exercise 1  #
#             #
###############

# a. Binomial distribution with n = 1000 and probability of 'success' = 735/1000
# (plotted on a percentage scale, hence size = 100 in the call below)
plot(dbinom(x = seq(1, 100, 1), size = 100, prob = 735/1000),
     type = "l", lwd = 3, main = "Likelihood",
     ylab = "density", xlab = "% less than 8 hours")
abline(v = 73.5, col = "red")

# b. The Beta distribution should be used, since its domain is [0, 1], just like a proportion.
# In R, have a look at ?rbeta: the functions dbeta, pbeta, qbeta and rbeta give you the
# density, the distribution function, the quantiles and random draws from this distribution.
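As a quick illustration of these four functions (the shape parameters 2 and 2 here are arbitrary, chosen only because Beta(2, 2) has a simple closed-form density):

```r
# Beta(2, 2) has density f(x) = 6 * x * (1 - x) on [0, 1]
dbeta(0.5, shape1 = 2, shape2 = 2)  # density at 0.5 -> 1.5
pbeta(0.5, shape1 = 2, shape2 = 2)  # P(X <= 0.5) -> 0.5 (the distribution is symmetric)
qbeta(0.5, shape1 = 2, shape2 = 2)  # median -> 0.5
rbeta(3, shape1 = 2, shape2 = 2)    # three random draws from Beta(2, 2)
```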

###############
#             #
# Exercise 2  #
#             #
###############

By looking at the properties of this distribution found in a., we can see that the mean and variance are functions of the two “shape” parameters. By solving this system of two equations, we can find the parameters (look at this answer). The following function allows us to do that:
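The function's code did not survive extraction; a minimal sketch of a method-of-moments version is given below. The name `estBetaParams` and its argument names are assumptions, but the formulas follow directly from mean = α/(α+β) and var = αβ/((α+β)²(α+β+1)):

```r
# Method-of-moments estimator for the Beta shape parameters,
# given a target mean (mu) and variance (var).
# (Function and argument names are illustrative, not necessarily the original's.)
estBetaParams <- function(mu, var) {
  alpha <- ((1 - mu) / var - 1 / mu) * mu^2
  beta  <- alpha * (1 / mu - 1)
  list(alpha = alpha, beta = beta)
}
```

As a sanity check, `estBetaParams(0.5, 1/12)` returns alpha = beta = 1, i.e. the uniform distribution, which indeed has mean 1/2 and variance 1/12.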

Then, we can use it to compute our parameters and finally find our prior.
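The corresponding code is also missing; a self-contained sketch follows. The prior mean of 0.7 and standard deviation of 0.05 are placeholders (substitute the prior belief given in the exercise), but the objects `prior.params` and `x_grid` are the ones used by the code in the following exercises:

```r
# Placeholder prior belief: mean 0.7, sd 0.05 (not the exercise's actual values)
mu_prior <- 0.7
sd_prior <- 0.05

# Method-of-moments solution for the two Beta shape parameters
alpha_prior <- ((1 - mu_prior) / sd_prior^2 - 1 / mu_prior) * mu_prior^2
beta_prior  <- alpha_prior * (1 / mu_prior - 1)
prior.params <- list(alpha = alpha_prior, beta = beta_prior)

# Grid on [0, 1] used for plotting the densities
x_grid <- seq(from = 0, to = 1, by = 0.001)
plot(x = x_grid, y = dbeta(x_grid, prior.params$alpha, prior.params$beta),
     type = "l", lwd = 3, ylab = "density", main = "Prior")
```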

###############
#             #
# Exercise 3  #
#             #
###############

# We know that posterior ~ Beta(alpha + n_success, beta + n - n_success)
n <- 1000
n_success <- 735
alpha_post <- prior.params$alpha + n_success
beta_post <- prior.params$beta + n - n_success
plot(x = x_grid, dbeta(x = x_grid, shape1 = alpha_post, shape2 = beta_post),
     lwd = 3, type = "l", ylab = "density", col = "red",
     main = "posterior and prior") # Posterior
lines(x = x_grid, dbeta(x = x_grid, shape1 = prior.params$alpha, shape2 = prior.params$beta),
      lwd = 3, lty = 2) # Prior
legend("topright", c("Posterior", "Prior"), col = c("red", "black"),
       lwd = c(3, 3), lty = c(1, 2))

# The prior has nearly no effect on the posterior.

###############
#             #
# Exercise 4  #
#             #
###############

n <- 10
n_success <- 7
alpha_post_small <- prior.params$alpha + n_success
beta_post_small <- prior.params$beta + n - n_success
plot(x = x_grid, dbeta(x = x_grid, shape1 = alpha_post_small, shape2 = beta_post_small),
     lwd = 3, type = "l", ylab = "density", col = "red",
     main = "Old and new posterior",
     xlab = "proportion sleeping less than 8 hours") # Posterior
lines(x = x_grid, dbeta(x = x_grid, shape1 = alpha_post, shape2 = beta_post),
      lwd = 3, lty = 2)
legend("topleft", c("Large sample", "Small sample"), col = c("black", "red"),
       lwd = c(3, 3), lty = c(2, 1))

# ---> The posterior is more influenced by the prior, since there is less data (a weaker
# likelihood). The posterior is shifted towards the prior (to the left) and is more spread out.

###############
#             #
# Exercise 5  #
#             #
###############

M <- 1000
set.seed(1234) # Make it reproducible
post_sample <- rbeta(M, shape1 = alpha_post, shape2 = beta_post)
# Plot the true posterior density together with the empirical density of the generated sample
plot(x = seq(from = 0, to = 1, by = 0.01),
     y = dbeta(x = seq(from = 0, to = 1, by = 0.01), shape1 = alpha_post, shape2 = beta_post),
     type = "l", xlim = c(0.6, 0.85), lwd = 3, ylab = "density",
     xlab = "proportion less than 8 hours", main = "Posterior")
lines(density(post_sample), lwd = 3, col = "red", lty = 2)
legend("topleft", c("Theoretical", "Generated"), col = c("black", "red"),
       lwd = c(3, 3), lty = c(1, 2))

###############
#             #
# Exercise 6  #
#             #
###############

mean(post_sample) # Very close to the theoretical posterior mean

## [1] 0.7350159

median(post_sample)

## [1] 0.7353194

sd(post_sample)

## [1] 0.01364187

###############
#             #
# Exercise 7  #
#             #
###############

# Find the 2.5% and 97.5% quantiles of the random sample:
quantile(post_sample, probs = c(0.025, 0.975))

##      2.5%     97.5%
## 0.7058056 0.7601061

# 95% of the probability mass of the posterior distribution of p lies inside this interval,
# based on this particular random sample.
# In other words, there is a 95% posterior probability that p lies within this interval,
# given the data we have observed.
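The sample-based interval can be sanity-checked against the exact posterior quantiles via `qbeta`. For self-containedness, this sketch assumes a flat Beta(1, 1) prior instead of the informative prior from Exercise 2, so the numbers will differ slightly from those above:

```r
# Exact 95% equal-tailed credible interval from the posterior
# Beta(735 + 1, 1000 - 735 + 1), i.e. the observed data with a flat Beta(1, 1) prior
qbeta(c(0.025, 0.975), shape1 = 735 + 1, shape2 = 1000 - 735 + 1)
```

Unlike the empirical quantiles of a finite Monte Carlo sample, these values have no simulation error.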
