Everything went fine until the last lines of code in this exercise, when:

predictions <- predict(fit.lda, validation)

Resulted in:

Error in eval(predvars, data, env) : object 'Sepal.Length' not found

I repeated the whole exercise to check whether I had missed anything, but I can't find the cause. Has anyone had the same problem?

I'm an R beginner, so it's difficult for me to understand what might have happened.

R.
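For anyone hitting the same error: it generally means that predict() could not find a column named Sepal.Length in the data frame it was given, which can happen if the column names were never set on the dataset (for example after loading the CSV without a header). Below is a minimal sketch of a check, not a definitive diagnosis. It uses the built-in iris data as a stand-in for the tutorial's dataset, and the names `broken`, `expected`, `missing_ok` and `missing_broken` are hypothetical, introduced only for illustration.

```r
# Built-in iris data; every 5th row stands in for the tutorial's validation set.
data(iris)
validation <- iris[seq(1, nrow(iris), by = 5), ]

# Hypothetical broken copy: column names lost, e.g. after read.csv(header = FALSE).
broken <- validation
names(broken) <- paste0("V", seq_along(broken))

# Predictors that a model trained with Species ~ . looks for at predict() time.
expected <- setdiff(names(iris), "Species")

# Predictors that are absent from each data frame.
missing_ok <- setdiff(expected, names(validation))
missing_broken <- setdiff(expected, names(broken))

print(missing_ok)      # nothing missing in the intact data frame
print(missing_broken)  # all four predictors missing in the renamed copy
```

If the second check lists any names for your own validation data frame, setting the column names before splitting the data and then re-running the exercise is worth trying.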

> confusionMatrix(predictions, validation$Species)

Confusion Matrix and Statistics

            Reference
Prediction   setosa versicolor virginica
  setosa         10          0         0
  versicolor      0         10         3
  virginica       0          0         7

Overall Statistics

               Accuracy : 0.9
                 95% CI : (0.7347, 0.9789)
    No Information Rate : 0.3333
    P-Value [Acc > NIR] : 1.665e-10

                  Kappa : 0.85
 Mcnemar's Test P-Value : NA

Statistics by Class:

                     Class: setosa Class: versicolor Class: virginica
Sensitivity                 1.0000            1.0000           0.7000
Specificity                 1.0000            0.8500           1.0000
Pos Pred Value              1.0000            0.7692           1.0000
Neg Pred Value              1.0000            1.0000           0.8696
Prevalence                  0.3333            0.3333           0.3333
Detection Rate              0.3333            0.3333           0.2333
Detection Prevalence        0.3333            0.4333           0.2333
Balanced Accuracy           1.0000            0.9250           0.8500

> confusionMatrix(predictions, validation$Species)

Confusion Matrix and Statistics

            Reference
Prediction   setosa versicolor virginica
  setosa         10          0         0
  versicolor      0         10         0
  virginica       0          0        10

Overall Statistics

               Accuracy : 1
                 95% CI : (0.8843, 1)
    No Information Rate : 0.3333
    P-Value [Acc > NIR] : 4.857e-15

                  Kappa : 1
 Mcnemar's Test P-Value : NA

Statistics by Class:

                     Class: setosa Class: versicolor Class: virginica
Sensitivity                 1.0000            1.0000           1.0000
Specificity                 1.0000            1.0000           1.0000
Pos Pred Value              1.0000            1.0000           1.0000
Neg Pred Value              1.0000            1.0000           1.0000
Prevalence                  0.3333            0.3333           0.3333
Detection Rate              0.3333            0.3333           0.3333
Detection Prevalence        0.3333            0.3333           0.3333
Balanced Accuracy           1.0000            1.0000           1.0000

My choice is based on this:

results <- resamples(list(lda=fit.lda, cart=fit.cart, knn=fit.knn, svm=fit.svm, rf=fit.rf))

summary(results)

and this:

# compare accuracy of models

dotplot(results)

What results did you find when you ran the solution?

The exercise implies that the LDA was the most accurate model, yet in the results it appeared as though the knn was stronger. Does your choice of LDA relate to the k=7 in the knn?