Cross-Validation

Lab 4C

Directions: Follow along with the slides, completing the questions in blue on your computer, and answering the questions in red in your journal.

What is cross-validation?

Step 1: training-test split

set.seed(123)
training_rows <- sample(1:____, size = 85)
training <- slice(arm_span, ____)
test <- slice(____, - ____)

Aside: set.seed()

Whenever you split data into training and test sets, always call set.seed() first so that the random split can be reproduced.
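
A small sketch of why the seed matters, using only base R. The names first_draw and second_draw are made up for this illustration and are not part of the lab.

```r
# Set the seed, then draw 5 random numbers from 1 to 100.
set.seed(123)
first_draw <- sample(1:100, size = 5)

# Resetting the same seed restarts the random number generator
# from the same place, so the same numbers are drawn again.
set.seed(123)
second_draw <- sample(1:100, size = 5)

# Evaluates to TRUE: the two draws match exactly.
identical(first_draw, second_draw)
```

Without the calls to set.seed(), the two draws would almost always differ, and so would any training-test split built from them.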

Aside: training-test ratio

Step 2: training the model

Step 3: test the model

test <- mutate(test, ____ = predict(best_training, newdata = ____))
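
The three steps can be sketched end to end on a different data set. This is only an illustration under stated assumptions: it uses R's built-in mtcars data instead of arm_span, base R indexing instead of slice() and mutate(), and the names fit, predicted_mpg and test_mse are hypothetical. It does not fill in the blanks above.

```r
# Step 1: training-test split (mtcars has 32 rows; 24 go to training)
set.seed(123)
training_rows <- sample(1:nrow(mtcars), size = 24)
training <- mtcars[training_rows, ]
test <- mtcars[-training_rows, ]   # every row NOT chosen for training

# Step 2: fit the model using the training data only
fit <- lm(mpg ~ wt, data = training)

# Step 3: predict for the unseen test rows and summarize the errors
test$predicted_mpg <- predict(fit, newdata = test)

# Mean squared error: the average squared gap between actual and predicted
test_mse <- mean((test$mpg - test$predicted_mpg)^2)
test_mse
```

The mean squared error is one common way to summarize how far the predictions land from the actual values in the test set; smaller values mean better predictions.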

Recap

Why cross-validate?

Example of overfitting

Example of overfitting, continued
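
One possible way to see overfitting in code, sketched with simulated data rather than the lab's data. Everything here is an assumption made for illustration: the data are generated from a straight line plus random noise, and the names sim, simple_fit, flexible_fit and mse are hypothetical.

```r
set.seed(123)

# Simulated data (made up for this sketch): a straight-line trend plus noise
x <- runif(40, min = 0, max = 1)
y <- 2 * x + rnorm(40, sd = 0.3)
sim <- data.frame(x = x, y = y)

# Training-test split: 30 rows for training, 10 for testing
training_rows <- sample(1:nrow(sim), size = 30)
training <- sim[training_rows, ]
test <- sim[-training_rows, ]

# A simple model (a line) and a very flexible one (degree-10 polynomial)
simple_fit <- lm(y ~ x, data = training)
flexible_fit <- lm(y ~ poly(x, 10), data = training)

# Helper: mean squared error of a model's predictions on a data frame
mse <- function(model, data) {
  mean((data$y - predict(model, newdata = data))^2)
}

# On the training data, the flexible model can never do worse than the line.
train_mse_simple <- mse(simple_fit, training)
train_mse_flexible <- mse(flexible_fit, training)

# On the test data there is no such guarantee; compare these two yourself.
test_mse_simple <- mse(simple_fit, test)
test_mse_flexible <- mse(flexible_fit, test)

c(train_mse_simple, train_mse_flexible, test_mse_simple, test_mse_flexible)
```

The flexible model always matches the training data at least as closely as the line does, because it can bend to follow the noise. Whether it also predicts the test data well is exactly what cross-validation checks.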