Below are the solutions to these exercises on Examining data.
####################
#                  #
#    Exercise 1    #
#                  #
####################
data(islands)
length(islands)
## [1] 48
####################
#                  #
#    Exercise 2    #
#                  #
####################
mean(islands)
## [1] 1252.729
median(islands)
## [1] 41
####################
#                  #
#    Exercise 3    #
#                  #
####################
range(islands)[1]
## [1] 12
range(islands)[2]
## [1] 16988
####################
#                  #
#    Exercise 4    #
#                  #
####################
sd(islands)
## [1] 3371.146
range(islands)[2] - range(islands)[1]
## [1] 16976
####################
#                  #
#    Exercise 5    #
#                  #
####################
quantile(islands)
##       0%      25%      50%      75%     100% 
##    12.00    20.50    41.00   183.25 16988.00
quantile(islands, c(.05, .95))
##      5%     95% 
##   13.00 8481.75
####################
#                  #
#    Exercise 6    #
#                  #
####################
IQR(islands)
## [1] 162.75
####################
#                  #
#    Exercise 7    #
#                  #
####################
hist(islands)

hist(islands, prob = TRUE)

####################
#                  #
#    Exercise 8    #
#                  #
####################
boxplot(islands)

boxplot(islands, outline = FALSE)

####################
#                  #
#    Exercise 9    #
#                  #
####################
boxplot(islands, plot = FALSE)$out
##        Africa    Antarctica          Asia     Australia        Europe 
##         11506          5500         16988          2968          3745 
##     Greenland North America South America 
##           840          9390          6795
####################
#                  #
#    Exercise 10   #
#                  #
####################
stem(islands)
## 
##   The decimal point is 3 digit(s) to the right of the |
## 
##    0 | 00000000000000000000000000000111111222338
##    2 | 07
##    4 | 5
##    6 | 8
##    8 | 4
##   10 | 5
##   12 | 
##   14 | 
##   16 | 0
Thanks for this!
I do have a couple of comments. In 5b, the question asks for 0.05%, but the Solution is for 5%. The 0.05% quantile is 12.0235, but given that the question also asked for 95%, I assume the real typo was in the question.
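For anyone who wants to check the difference between the two readings, here is a minimal sketch. It assumes the default quantile type in `quantile()`, and the variable names `q_tiny` and `q_five` are only illustrative.

```r
# Compare the 0.05% and 5% quantiles of the built-in islands data.
# quantile() takes probabilities, so 0.05% is 0.0005 and 5% is 0.05.
data(islands)

q_tiny <- quantile(islands, 0.0005)  # the 0.05% quantile
q_five <- quantile(islands, 0.05)    # the 5% quantile

print(q_tiny)  # about 12.0235, just above the minimum of 12
print(q_five)  # 13, matching the posted solution
```

The 0.05% quantile barely moves off the smallest value, which is why it makes little sense as a companion to the 95% quantile.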
And for 9, there is a hint to use the “prob=F” option, but I don’t think that is really helpful here. I’d suggest changing the hint to mention that there is more output from the boxplot function than the plot.
Thanks for your observations. You are right, there is a typo in the question. The point about the hint is also a good one: it is not intuitive that the function returns more information apart from the plot.
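To see what else `boxplot()` returns, a short sketch along these lines may help. The variable name `b` is just for illustration; the component names are the ones documented for the list that `boxplot()` returns.

```r
# Inspect the list that boxplot() returns when plotting is suppressed.
data(islands)

b <- boxplot(islands, plot = FALSE)

print(names(b))  # components include stats, n, conf, out, group, names
print(b$stats)   # lower whisker, lower hinge, median, upper hinge, upper whisker
print(b$out)     # points beyond the whiskers, as used in Exercise 9
```

Calling `str(b)` gives the same overview in a more compact form.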
7b asked for a “histogram of proportions”. You have plotted the density. I’m not sure this is right. The first bar is 41 out of a total of 48. Surely as a proportion this is 41/48 = 0.85 (or 85%). I can’t find a way of doing this automatically so I used the approach suggested here:
http://stackoverflow.com/questions/7324683/use-hist-function-in-r-to-get-percentages-as-opposed-to-raw-frequencies
Other than that, a nice exercise, and I certainly learnt some little tricks.
An alternative, but equivalent, strategy would be to multiply the vector of densities by the width of the bar:
h <- hist(islands)
h$density <- h$density * 2000  # 2000 is the bar width chosen by hist(islands)
plot(h, freq=FALSE)
Remember that densities have the property that the total area under the histogram is 1, so density times bar width gives the proportion of observations in each bar.
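The two routes to proportions discussed in this thread can be checked against each other. This is only a sketch: it reads the bar widths from `h$breaks` instead of hard-coding 2000, and the names `prop_from_counts` and `prop_from_density` are made up for the example.

```r
# Two equivalent ways to get per-bar proportions from a histogram object.
data(islands)

h <- hist(islands, plot = FALSE)  # compute the bins without drawing

prop_from_counts  <- h$counts / sum(h$counts)     # counts divided by the total
prop_from_density <- h$density * diff(h$breaks)   # density times bar width

print(all.equal(prop_from_counts, prop_from_density))  # TRUE
print(prop_from_counts[1])                             # 41/48, about 0.854

# Store the proportions in the density slot and plot them on the y axis.
h$density <- prop_from_counts
plot(h, freq = FALSE, ylab = "Proportion")
```

Using `diff(h$breaks)` keeps the calculation correct if the breaks change, whereas a fixed width of 2000 only fits the default bins for this data set.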