Below are the solutions to these exercises on forecasting with multivariate regression.

####################
#                  #
#    Exercise 1    #
#                  #
####################

auto <- read.csv("vehicles.csv")
plot(auto$sales, type = "n", ylim = c(0, 5000),
     ylab = "Sales, '000 units", xlab = "Period",
     main = "US light vehicle sales in 1976-2016")
lines(auto$sales)

####################
#                  #
#    Exercise 2    #
#                  #
####################

auto$trend <- seq_len(nrow(auto))
# Lag each regressor by one period; the first observation has no predecessor
auto$income_lag <- c(NA, head(auto$income, -1))
auto$unemp_lag <- c(NA, head(auto$unemp, -1))
auto$rate_lag <- c(NA, head(auto$rate, -1))

####################
#                  #
#    Exercise 3    #
#                  #
####################

require(leaps)  # provides regsubsets()
regressions_result <- regsubsets(sales ~ ., data = auto)
plot(regressions_result, col = colorRampPalette(c("darkgreen", "grey"))(10))
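The lagged columns from Exercise 2 simply shift each series down by one period and pad the start with `NA`. Below is a minimal sketch of that idea on toy data; `lag_by` is a hypothetical helper and the numbers are made up, not taken from `vehicles.csv`. It also illustrates a precedence pitfall worth knowing when writing such an index by hand: `1:n - 1` parses as `(1:n) - 1`, not as `1:(n - 1)`.

```r
# Hypothetical helper (not part of the original solution): shift a vector
# down by k periods, padding the start with NA so the length is unchanged.
lag_by <- function(x, k = 1) {
  stopifnot(k >= 0, k < length(x))
  c(rep(NA, k), head(x, length(x) - k))
}

income <- c(100, 102, 105, 103)   # toy values, purely illustrative
income_lag <- lag_by(income)
print(income_lag)

# Precedence check: ':' binds tighter than '-', so 1:n - 1 is 0, 1, ..., n - 1
n <- length(income)
same_as_zero_based <- all((1:n - 1) == 0:(n - 1))
print(same_as_zero_based)
```

Indexing with `0:(n - 1)` happens to return the first `n - 1` elements because R silently drops the zero index, which is why explicit parentheses or `head(x, -1)` are easier to read.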

####################
#                  #
#    Exercise 4    #
#                  #
####################

require(leaps)
regressions_result_extended <- regsubsets(sales ~ ., data = auto, nbest = 2)
plot(regressions_result_extended, col = colorRampPalette(c("darkgreen", "grey"))(10))

####################
#                  #
#    Exercise 5    #
#                  #
####################

# Given that the vertical scale of the plots is reversed, the model with the lowest BIC
# is the uppermost one (the model that includes unemp, rate, and their lagged values as
# explanatory variables)
fit <- lm(sales ~ unemp + rate + unemp_lag + rate_lag, data = auto)
summary(fit)

##
## Call:
## lm(formula = sales ~ unemp + rate + unemp_lag + rate_lag, data = auto)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1145.76  -230.19    -3.68   233.05   818.06
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  5963.35     192.77  30.935  < 2e-16 ***
## unemp        -465.56      47.06  -9.892  < 2e-16 ***
## rate         -262.21      72.24  -3.630 0.000383 ***
## unemp_lag     239.50      46.45   5.156 7.46e-07 ***
## rate_lag      197.41      73.44   2.688 0.007961 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 347.1 on 158 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.6401, Adjusted R-squared:  0.631
## F-statistic: 70.26 on 4 and 158 DF,  p-value: < 2.2e-16
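The model in Exercise 5 was picked because it has the lowest BIC among the candidates. As a rough illustration of that selection rule, the sketch below compares a few hand-picked candidate models with base R's `BIC()` on simulated data. The data, the variable names `x1`, `x2`, `x3`, and the candidate list are all assumptions made for this example; `regsubsets` may report BIC on a different scale, but the "lowest is best" reading is the same.

```r
set.seed(1)                       # simulated data, purely illustrative
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)                    # a predictor unrelated to y
y  <- 2 + 1.5 * x1 - 1 * x2 + rnorm(n)
toy <- data.frame(y, x1, x2, x3)

# Candidate models of increasing size
candidates <- list(
  small  = lm(y ~ x1, data = toy),
  medium = lm(y ~ x1 + x2, data = toy),
  large  = lm(y ~ x1 + x2 + x3, data = toy)
)

# BIC trades goodness of fit against the number of parameters; lower is better
bic <- sapply(candidates, BIC)
print(round(bic, 1))

best <- names(which.min(bic))
print(best)
```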

####################
#                  #
#    Exercise 6    #
#                  #
####################

require(forecast)
assumptions <- read.csv("vehicles_assumptions.csv")
fcast <- forecast(fit, newdata = assumptions, h = 4)
summary(fcast)

##
## Forecast method: Linear regression model
##
## Model Information:
##
## Call:
## lm(formula = sales ~ unemp + rate + unemp_lag + rate_lag, data = auto)
##
## Coefficients:
## (Intercept)        unemp         rate    unemp_lag     rate_lag
##      5963.3       -465.6       -262.2        239.5        197.4
##
##
## Error measures:
##                         ME     RMSE      MAE        MPE     MAPE      MASE
## Training set -3.071816e-14 341.6871 273.3704 -0.9414979 7.769387 0.5926906
##
## Forecasts:
##   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 1       3703.978 3240.687 4167.270 2992.952 4415.005
## 2       4517.922 4061.405 4974.439 3817.293 5218.551
## 3       4122.130 3668.325 4575.935 3425.663 4818.597
## 4       4364.266 3910.470 4818.063 3667.812 5060.721
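The `Lo` and `Hi` columns in the forecast table are 80% and 95% prediction intervals around each point forecast. A table of the same shape can be sketched in base R with `predict()` on an `lm` object, given assumed future values of the regressors. Everything below (the simulated data, `toy_fit`, and the `scenario` values) is made up for illustration and is not the vehicle sales model.

```r
set.seed(42)                      # simulated data, purely illustrative
n <- 100
toy <- data.frame(x = runif(n, 0, 10))
toy$y <- 5 - 0.8 * toy$x + rnorm(n)
toy_fit <- lm(y ~ x, data = toy)

# Assumed future values of the regressor (plays the role of the assumptions file)
scenario <- data.frame(x = c(2, 4, 6, 8))

# Prediction intervals at two coverage levels
pi80 <- predict(toy_fit, newdata = scenario, interval = "prediction", level = 0.80)
pi95 <- predict(toy_fit, newdata = scenario, interval = "prediction", level = 0.95)

# Arrange the columns like the forecast table
intervals <- cbind(pi80, pi95[, c("lwr", "upr")])
colnames(intervals) <- c("Point", "Lo 80", "Hi 80", "Lo 95", "Hi 95")
print(round(intervals, 3))
```

The 95% interval always contains the 80% interval, which is a quick sanity check when reading such a table.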

####################
#                  #
#    Exercise 7    #
#                  #
####################

fcast_sales <- append(auto$sales, fcast$mean)
fcast_sales <- ts(fcast_sales, start = c(1976, 1), frequency = 4)

####################
#                  #
#    Exercise 8    #
#                  #
####################

plot(window(fcast_sales, start = c(2000, 1), end = c(2017, 4)), type = "n",
     ylab = "Sales, '000 units", main = "Light Weight Vehicle Sales")
lines(window(fcast_sales, start = c(2000, 1), end = c(2016, 4)))
lines(window(fcast_sales, start = c(2016, 4), end = c(2017, 4)), col = "blue", lwd = 3)
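In Exercise 8 the forecast segment starts at 2016 Q4, the last observed quarter, so that the blue line joins the historical line instead of leaving a gap. The sketch below shows how `ts()` and `window()` behave on a short made-up quarterly series; the values and the 2015 start date are assumptions chosen only to keep the example small.

```r
sales_hist <- c(10, 12, 11, 13, 14, 15, 13, 16)   # toy history: 2015 Q1 to 2016 Q4
sales_fc   <- c(17, 18, 16, 19)                   # toy forecasts: 2017 Q1 to 2017 Q4

# Combine history and forecasts into one quarterly series
combined <- ts(c(sales_hist, sales_fc), start = c(2015, 1), frequency = 4)

# History ends at the last observed quarter
history <- window(combined, end = c(2016, 4))

# The forecast segment deliberately starts at that same quarter,
# so the two plotted lines share one point and connect
future <- window(combined, start = c(2016, 4))

print(history)
print(future)
```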

####################
#                  #
#    Exercise 9    #
#                  #
####################

require(lmtest)
bgtest(fit, 4)

##
##  Breusch-Godfrey test for serial correlation of order up to 4
##
## data:  fit
## LM test = 95.5866, df = 4, p-value < 2.2e-16
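For readers curious where the LM statistic comes from, here is a minimal base R sketch of the Breusch-Godfrey idea on simulated data: regress the OLS residuals on the original regressors plus their own lags, then compare n times the R-squared of that auxiliary regression with a chi-squared distribution. The data, the model `fit0`, and the choice to fill pre-sample residuals with zeros are assumptions of this sketch; it is not meant to reproduce the number reported for the vehicle sales model.

```r
set.seed(123)                     # simulated data, purely illustrative
n <- 200
x <- rnorm(n)
e <- as.numeric(arima.sim(list(ar = 0.6), n = n))   # autocorrelated errors
y <- 1 + 2 * x + e
fit0 <- lm(y ~ x)                 # hypothetical model with serially correlated residuals

p <- 4                            # test for autocorrelation of orders 1 to p
u <- as.numeric(residuals(fit0))

# Matrix of lagged residuals; pre-sample values are set to 0 (an assumption)
lagged <- sapply(1:p, function(k) c(rep(0, k), head(u, n - k)))
colnames(lagged) <- paste0("u_lag", 1:p)

# Auxiliary regression: residuals on the regressors and the lagged residuals
aux <- lm(u ~ x + lagged)

# LM statistic and its chi-squared p-value with p degrees of freedom
lm_stat <- n * summary(aux)$r.squared
p_value <- pchisq(lm_stat, df = p, lower.tail = FALSE)
print(c(LM = lm_stat, p.value = p_value))
```

A small p-value means the lagged residuals help explain the current residuals, which is evidence against the null hypothesis of no serial correlation.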

# The p-value reported by bgtest is less than 2.2e-16, which implies that the null
# hypothesis of no autocorrelation at orders 1-4 can be rejected.

####################
#                  #
#    Exercise 10   #
#                  #
####################

require(forecast)
res <- residuals(fit)
Pacf(res)

# The coefficients located beyond the area enclosed by the two blue dotted lines
# are statistically significant at the 5% level (the default significance level
# for the Pacf function).
# The plot shows that partial autocorrelation is present at lags 1, 3, 4, 6, 9, 13, and 19.
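The blue lines on a partial autocorrelation plot are approximate 95% bounds at plus or minus `qnorm(0.975) / sqrt(n)`. The significant lags can therefore also be listed numerically instead of being read off the plot. The sketch below does this with `pacf()` from base R's stats package on a simulated residual series; `res_toy` is made up, so the lags it flags have nothing to do with the vehicle sales model.

```r
set.seed(7)                       # simulated series, purely illustrative
res_toy <- as.numeric(arima.sim(list(ar = 0.7), n = 160))

# Partial autocorrelations for lags 1..20, without drawing the plot
pac <- pacf(res_toy, lag.max = 20, plot = FALSE)

# Approximate 95% bound, the same rule used for the dashed lines on the plot
bound <- qnorm(0.975) / sqrt(length(res_toy))

# Lags whose partial autocorrelation falls outside the bounds
significant_lags <- which(abs(pac$acf[, 1, 1]) > bound)
print(round(bound, 4))
print(significant_lags)
```

With 20 lags tested at the 5% level, roughly one spurious exceedance can be expected by chance alone, so isolated significant spikes at high lags deserve some caution.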


bambangpe says

regressions_result plot(regressions_result, col = colorRampPalette(c("darkgreen", "grey"))(10))

Error in plot.new() : figure margins too large

How can I solve this problem? Thanks.

Kostiantyn Kravchuk says

Thank you for the interest in the exercise.

Try to expand the plot panel in RStudio (a panel in the lower right corner).

It looks like the problem is in the `plot.regsubsets` function from the `leaps` package. The function sets the plot margins to large values (`par(mar=c(7,5,6,3)+0.1)`) and provides no way to change those values. If the plot panel is too small to draw the plot with such large margins, you get the error you mentioned.
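If enlarging the RStudio panel is not convenient, another option is to draw into a graphics device whose size you set yourself. The sketch below writes to a PDF file with base R's `pdf()`, using the same margins quoted above and a placeholder plot standing in for `plot(regressions_result, ...)`. The 10 x 8 inch size and the temporary file name are arbitrary choices for this example.

```r
# Draw into a device of known size instead of the (possibly small) plot panel
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 10, height = 8)     # size in inches, chosen arbitrarily

par(mar = c(7, 5, 6, 3) + 0.1)            # the large margins mentioned above
plot(1:10, main = "Placeholder for plot(regressions_result, ...)")

dev.off()                                 # close the device to finish the file
print(file.exists(out_file))
```

In an interactive session, `dev.new(width = 10, height = 8)` should open a separate, larger window in a similar way, although its behaviour depends on the platform and on RStudio settings.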