Most models automatically included in finnts, including all multivariate models, have hyperparameters whose values need to be chosen before a model is trained. Finn solves this by leveraging the tune package within the tidymodels ecosystem. When `prep_models()` is run, hyperparameters and back test splits are calculated and written to disk. You can retrieve the results by calling `get_prepped_models()`.
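For context, output like that shown below comes from a workflow along these lines. This is a hedged sketch: the input data set, combo/target variable names, and argument values are illustrative assumptions based on the finnts documentation, not the exact code that produced this output.

```r
library(finnts)

# illustrative only: the data and settings below are assumptions
hist_data <- timetk::m4_monthly %>%
  dplyr::rename(Date = date)

run_info <- set_run_info(
  experiment_name = "finnts_fcst",
  run_name = "get_prepped_models"
)

# clean the data and write it to disk
prep_data(
  run_info         = run_info,
  input_data       = hist_data,
  combo_variables  = c("id"),
  target_variable  = "value",
  date_type        = "month",
  forecast_horizon = 6
)

# create workflows, hyperparameter combos, and train/test splits
prep_models(
  run_info            = run_info,
  models_to_run       = c("arima", "ets", "xgboost"),
  num_hyperparameters = 10
)

# read back the prepped splits, workflows, and hyperparameters
get_prepped_models(run_info = run_info)
```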
#> Loading required package: modeltime
#> Finn Submission Info
#> * Experiment Name: finnts_fcst
#> * Run Name: get_prepped_models-20231130T175731Z
#>
#> i Prepping Data
#>
#> v Prepping Data [5.8s]
#>
#> i Creating Model Workflows
#> v Creating Model Workflows [538ms]
#>
#> i Creating Model Hyperparameters
#> v Creating Model Hyperparameters [823ms]
#>
#> i Creating Train Test Splits
#> v Creating Train Test Splits [1.5s]
#> # A tibble: 31 x 4
#>    Run_Type        Train_Test_ID Train_End  Test_End
#>    <chr>                   <dbl> <date>     <date>
#>  1 Future_Forecast             1 2015-06-01 2015-12-01
#>  2 Back_Test                   2 2015-05-01 2015-06-01
#>  3 Back_Test                   3 2015-04-01 2015-06-01
#>  4 Back_Test                   4 2015-03-01 2015-06-01
#>  5 Back_Test                   5 2015-02-01 2015-06-01
#>  6 Back_Test                   6 2015-01-01 2015-06-01
#>  7 Back_Test                   7 2014-12-01 2015-06-01
#>  8 Back_Test                   8 2014-11-01 2015-05-01
#>  9 Back_Test                   9 2014-10-01 2015-04-01
#> 10 Validation                 10 2014-09-01 2014-10-01
#> # i 21 more rows
#> [1] "Future_Forecast" "Back_Test" "Validation" "Ensemble"
#> # A tibble: 3 x 3
#>   Model_Name Model_Recipe Model_Workflow
#>   <chr>      <chr>        <list>
#> 1 arima      R1           <workflow>
#> 2 ets        R1           <workflow>
#> 3 xgboost    R1           <workflow>
#> # A tibble: 12 x 4
#>    Model   Recipe Hyperparameter_Combo Hyperparameters
#>    <chr>   <chr>                 <dbl> <list>
#>  1 arima   R1                        1 <tibble [0 x 0]>
#>  2 ets     R1                        1 <tibble [0 x 0]>
#>  3 xgboost R1                        1 <tibble [1 x 4]>
#>  4 xgboost R1                        2 <tibble [1 x 4]>
#>  5 xgboost R1                        3 <tibble [1 x 4]>
#>  6 xgboost R1                        4 <tibble [1 x 4]>
#>  7 xgboost R1                        5 <tibble [1 x 4]>
#>  8 xgboost R1                        6 <tibble [1 x 4]>
#>  9 xgboost R1                        7 <tibble [1 x 4]>
#> 10 xgboost R1                        8 <tibble [1 x 4]>
#> 11 xgboost R1                        9 <tibble [1 x 4]>
#> 12 xgboost R1                       10 <tibble [1 x 4]>
The outputs above let a Finn user see which hyperparameters are chosen for tuning and how the model refitting process will work. When tuning hyperparameters, Finn uses the "Validation" train/test splits, with the final parameters chosen by RMSE. Models like ARIMA that don't follow a traditional hyperparameter tuning process are instead fit from scratch across all train/test splits. After hyperparameters are chosen, each model is refit across the "Back_Test" and "Future_Forecast" splits. The "Back_Test" splits are the true hold-out data used when selecting the final "Best-Model". "Ensemble" splits are also created as training data for ensemble models, if those are chosen to run; ensemble models follow a similar tuning process.
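As a quick illustration, the list-columns in the hyperparameter table above can be expanded with standard tidyverse tools to inspect each tuning combination. This is a sketch, assuming the 12-row hyperparameter tibble shown above has been read into an object named `model_hyperparameters` (a hypothetical name, not a finnts object):

```r
library(dplyr)
library(tidyr)

# keep only models with tunable hyperparameters, then expand the
# Hyperparameters list-column into one row per parameter combination
model_hyperparameters %>%
  filter(Model == "xgboost") %>%
  unnest(Hyperparameters)
```

Models such as arima and ets carry empty `<tibble [0 x 0]>` entries because they have no hyperparameters to tune, which is why they drop out after the `filter()`.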