In this set of exercises, we are going to explore some of the probability functions in R through practical applications. Basic probability knowledge is required.

Note: We are going to use random number and random process functions in R, such as `runif`. A problem with these functions is that every time you run them you will obtain a different value. To make your results reproducible, you can specify the value of the seed using `set.seed(any_number)` before calling a random function. (If you are not familiar with seeds, think of them as the tracking number of your random numbers.) For this set of exercises we will use `set.seed(1)`; don’t forget to specify it before every random exercise.
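
As a minimal sketch of why the seed matters (the object names `first_draw`, `second_draw` and `third_draw` are only illustrative), resetting the seed before each call makes `runif` return the same values:

```r
# Draw three uniform numbers twice, resetting the seed before each draw.
set.seed(1)
first_draw <- runif(3)

set.seed(1)
second_draw <- runif(3)

# Same seed, same call: the two vectors are identical.
identical(first_draw, second_draw)

# Without resetting the seed, the next draw continues the random stream
# and gives different values.
third_draw <- runif(3)
identical(first_draw, third_draw)
```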

Answers to the exercises are available here

If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.

**Exercise 1**

**Generating random numbers.** Set your seed to 1, generate 10 random numbers using `runif`, and save them in an object called `random_numbers`.

**Exercise 2**

Using the function `ifelse` and the object `random_numbers`, simulate coin tosses. Hint: If an element of `random_numbers` is bigger than 0.5, the result is heads; otherwise it is tails.

Another way of generating random coin tosses is by using the `rbinom` function. Set the seed again to 1 and simulate 10 coin tosses with this function. Note: The value you will obtain is the total number of heads of those 10 coin tosses.
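
Note that `rbinom(n, size, prob)` takes two different counts: `n` is how many values to return, and `size` is how many trials each value summarises. The sketch below (object names are illustrative, and no seed-specific output is claimed) contrasts the two ways of asking for 10 tosses:

```r
set.seed(1)
# One value: the number of successes (heads) in 10 tosses of a fair coin.
total_heads <- rbinom(n = 1, size = 10, prob = 0.5)

set.seed(1)
# Ten values: each is a single toss, coded 1 for success and 0 for failure.
single_tosses <- rbinom(n = 10, size = 1, prob = 0.5)

length(total_heads)    # one count
length(single_tosses)  # ten individual outcomes
sum(single_tosses)     # heads among the ten single tosses
```

Because the two calls consume the random stream differently, `total_heads` and `sum(single_tosses)` need not agree even with the same seed.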

**Exercise 3**

Use the function `rbinom` to generate 10 unfair coin tosses with a success probability of 0.3. Set the seed to 1.

**Exercise 4**

We can simulate rolling a die in R with `runif`. Save in an object called `die_roll` 1 random number with `min = 0` and `max = 6`. This means that we will generate a random real number between 0 and 6.

Apply the function `ceiling` to `die_roll`. Don’t forget to set the seed to 1 before calling `runif`.
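
The choice of `min = 0` matters once `ceiling` is applied. A rough sketch with many draws (the object names and the number of draws are assumptions for illustration) shows which faces each choice can produce:

```r
set.seed(1)
n_draws <- 10000

# min = 0: draws fall in (0, 6), so ceiling() gives the faces 1 to 6.
rolls_from_zero <- ceiling(runif(n_draws, min = 0, max = 6))

# min = 1: draws fall in (1, 6), so ceiling() can only give 2 to 6.
rolls_from_one <- ceiling(runif(n_draws, min = 1, max = 6))

table(rolls_from_zero)
table(rolls_from_one)

# sample() is another common way to draw a die face directly.
sample(1:6, size = 1)
```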

**Exercise 5**

Simulate normal distribution values. Imagine a population in which the average height is 1.70 m with a standard deviation of 0.1 m. Using `rnorm`, simulate the height of 100 people and save it in an object called `heights`.

To get an idea of the values of `heights`, apply the function `summary` to it.

**Exercise 6**

a) What’s the probability that a person will be shorter than or equal to 1.90 m? Use `pnorm`

b) What’s the probability that a person will be taller than or equal to 1.60 m? Use `pnorm`
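
As a reminder, `pnorm(q, mean, sd)` returns the lower-tail probability P(X ≤ q), and the upper tail is its complement. A small sketch on a standard normal (the cut-off 1.96 is just an illustrative value, not part of the exercise):

```r
q <- 1.96

# Lower tail: P(X <= q) for a standard normal (mean 0, sd 1).
lower <- pnorm(q, mean = 0, sd = 1)

# Upper tail: P(X >= q), computed two equivalent ways.
upper_complement <- 1 - lower
upper_direct <- pnorm(q, mean = 0, sd = 1, lower.tail = FALSE)

lower  # about 0.975
all.equal(upper_complement, upper_direct)
```

Because the normal distribution is continuous, P(X < q) and P(X ≤ q) are the same number.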

**Exercise 7**

The waiting time (in minutes) at a doctor’s clinic follows an exponential distribution with a rate parameter of 1/50. Use the function `rexp` to simulate the waiting time of 30 people at the doctor’s office.

**Exercise 8**

What’s the probability that a person will wait less than 10 minutes? Use `pexp`
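
For an exponential distribution with rate λ, the cumulative probability is P(T ≤ t) = 1 − exp(−λt), and the theoretical mean is 1/λ. The sketch below checks `pexp` against that closed form using made-up values (`rate` and `cutoff` are illustrative, not the exercise's numbers):

```r
rate <- 1 / 20  # hypothetical rate: one event per 20 minutes on average
cutoff <- 5     # hypothetical cut-off in minutes

# P(T <= cutoff) from the built-in distribution function.
p_builtin <- pexp(cutoff, rate = rate)

# The same probability from the closed-form CDF.
p_formula <- 1 - exp(-rate * cutoff)

all.equal(p_builtin, p_formula)

# Theoretical mean of the distribution.
1 / rate
```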

**Exercise 9**

What’s the waiting time average?

**Exercise 10**

Let’s assume that patients with a waiting time longer than 60 minutes leave. Out of 100 patients that arrive at the clinic, how many are expected to leave? Use `pexp`
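
An expected count is the number of arrivals times the probability of the event, and it is generally not a whole number. A sketch with hypothetical values (`n_patients`, `rate` and `limit` are assumptions chosen to differ from the exercise):

```r
n_patients <- 200  # hypothetical number of arrivals
rate <- 1 / 30     # hypothetical rate parameter
limit <- 45        # hypothetical patience limit in minutes

# P(T > limit): the upper tail of the exponential distribution.
p_leave <- pexp(limit, rate = rate, lower.tail = FALSE)

# Expected number of patients who leave (a real number).
expected_leaving <- n_patients * p_leave

expected_leaving
round(expected_leaving)  # one way to report a whole number of people
```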

Simon Reinsperger says

Nice exercise set!

For exercise 10, you probably meant to use pexp().

Also the solution is for 100 patients.

Francisco Méndez says

Hi Simon,

Thanks for taking a look at the set and for letting us know about the error. I have already fixed it.

Lee Tibbert says

Thank you for this exercise set.

Could you take a look at the directions for exercise 4.

It says:

“Save in an object called die_roll 1 random number with min = 1 and max = 6.

This mean that we will generate a random number between 1 and 6.”

The R help for runif indicates that, except for rare cases, runif() will never return the extreme values. That is, it returns an _open_ interval between min & max. Looks to me like a die roll of 1 will never happen (and that the interval from min to min + 1 will always be one point short, asymptotic to uniform, not exactly).

I checked my understanding by running the exercise with 100 draws and never got a 1. Am I missing something?

I report this because if there is a bug, it will probably confuse people.

Lee

Lee Tibbert says

To be clear, when ceiling() is applied, a 1 spot will not appear, since 1.n will be forced to 2 spots.

(I believe there is also a pedantic “short by one point” issue at the max end.)

Thanks again. I am still working my way through, but have found these exercises very useful.

Lee

Francisco Méndez says

Yeah, the mistake was in min = 1; it should suggest min = 0. The ceiling function will always round a number up to the nearest integer.

Francisco Méndez says

Hi Lee,

Thank you so much for taking the time to check the set of exercises. You are right, the exercise should say min = 0 instead of 1. I will fix it ASAP.

Lee Tibbert says

Francisco,

Could you check the description of Exercise 2 against the supplied answers?

The description says:

Note: The value you will obtain is the total number of heads of those 10 coin tosses.

The (kindly) supplied answer appears to give the number of _tails_ seen (by counting the line above in the answer).

Am I missing something? Is reporting errata useful to you?

Thank you.

Lee

Francisco Méndez says

This one is tricky. What the function rbinom reports is the number of successful events, which in this case I decided to be heads (randomly).

The tricky part is that the probability of success and the probability of failure are both 0.5, so in this particular exercise you can treat the result as heads or tails.

In exercise 3 this distinction is clear because the probability of success is different from the probability of failure.

Lee Tibbert says

Francisco,

Thank you for your timely & helpful reply. I concur that the results of exercise 3 are clearer.

Sorry to be dense, but I still do not understand the provided solution for exercise 2. The variable coin_tosses_1 in the answer set clearly has 6 heads (as I would expect from the printout of variable random_numbers in the solution to exercise 1). The variable coin_tosses in part 2 of exercise 2 is the inverse of coin_tosses_1 and clearly differs.

If the intent of the exercise is to show two ways of calculating the same thing, the difference is puzzling & confusing.

Thank you for any help.

Lee

The C code in rbinom.c is pretty dense but it appears to have a floating point comparison to 0.5. Example 2, part 2 appears to be triggering the inversion of the sense of success by falling on the unexpected side of the fencepost. Someone, somewhere must love floating point math, perhaps its mother!

Lee Tibbert says

Francisco,

Could you check the match between the description of Exercise 6 and the (kindly) supplied answer?

I believe that the supplied answer is the intended exercise but that the description contains two subtle fencepost errors. The description uses “smaller than” and “taller than”. By my understanding of English these translate to “strictly less than” and “strictly taller than”, or < and >. That is, _exclusive_ of the density at the endpoint.

If I understand pnorm() correctly, it returns a density _inclusive_ (=) of the endpoint.

One can fuss around with the math to exclude the endpoint, but that makes the problem both harder and different than the supplied answer.

My apology if I am missing something obvious.

Lee

Francisco Méndez says

That’s an interesting comment. You’re right, pnorm() returns a density _inclusive_ (=), but since the normal distribution is continuous, the probability of the endpoint alone is 0 in a strict sense. In other words, the probability that someone’s height is 1.6800000 (imagine more zeros after) is 0; there will be persons super close to that value but never exactly at it.

So the probability that a person will be smaller than 1.90 is the same as the probability that a person will be smaller or equal to 1.90.

Lee Tibbert says

Francisco,

Could you take a look at the (mis)match of the description & solution for Exercise 10? The relevant fragment of the description is:

Out of 100 patients that arrive to the clinic how many are expected to leave?

Given that description, I believe there is a missing truncation in the (kindly) provided solution.

Punting a long philosophical discussion about the essential natures of people & beers(1), people & beers come in whole units. This gives an integer result for “how many”. Admittedly a fine point.

I believe a fix would be to either add “whole number” to the description, change the solution to truncate (better teaching option), or to just change the solution.

Lee

1) Agreed that the concept of beers having discrete units rather than a composite amount is relatively recent, since bottling,