importance_perm() computes model-agnostic variable importance scores by permuting individual predictors (one at a time) and measuring how much worse model performance becomes.

importance_perm(
  wflow,
  data,
  metrics = NULL,
  type = "original",
  size = 500,
  times = 10,
  eval_time = NULL,
  event_level = "first"
)

Arguments

wflow

A fitted workflows::workflow().

data

A data frame of the data passed to workflows::fit.workflow(), including the outcome and case weights (if any).

metrics

A yardstick::metric_set() or NULL.

type

A character string for which level of predictors to compute importance. A value of "original" (the default) returns values for the predictors in the same representation as the data. Using "derived" computes them for any derived features/predictors given to the model, such as dummy indicator columns.

size

How many data points to predict for each permutation iteration.

times

The number of times to repeat the calculations.

eval_time

For censored regression models, a vector of time points at which the survival probability is estimated. This is only needed if a dynamic metric is used, such as the Brier score or the area under the ROC curve.

event_level

A single string. Either "first" or "second" to specify which level of truth to consider as the "event". This argument is only applicable when estimator = "binary".

Value

A tibble with extra classes "importance_perm" and either "original_importance_perm" or "derived_importance_perm". The columns are:

  • .metric: the name of the performance metric

  • predictor: the predictor

  • n: the number of usable results (should be the same as times)

  • mean: the average of the differences in performance. For each metric, larger values indicate worse performance (i.e., higher importance).

  • std_err: the standard error of the differences.

  • importance: the mean divided by the standard error (a short usage sketch follows this list).

  • For censored regression models, an additional .eval_time column may also be included (depending on the metric requested).
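
As a rough sketch, these columns can be used to rank predictors; here res stands in for a hypothetical tibble returned by importance_perm(), and the metric name in the filter is also hypothetical:

# Largest importance scores (mean difference divided by std_err) first
res |>
  dplyr::arrange(dplyr::desc(importance))

# Or look at one performance metric at a time
res |>
  dplyr::filter(.metric == "brier_class") |>
  dplyr::arrange(dplyr::desc(importance))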

Details

The function can compute importance at two different levels.

  • The "original" predictors are the unaltered columns in the source data set. For example, for a categorical predictor used with linear regression, the original predictor is the factor column.

  • "Derived" predictors are the final versions given to the model. For the categorical predictor example, the derived versions are the binary indicator variables produced from the factor version.

This distinction can make a difference when pre-processing/feature engineering is used, and it can help us understand at which level a predictor is important.
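
For illustration, a minimal sketch (assuming a hypothetical data frame dat with a factor predictor grp, a numeric predictor x, and an outcome y) shows the derived columns that a recipe creates from an original factor; these are the columns that type = "derived" would score individually:

library(recipes)

rec <- recipe(y ~ grp + x, data = dat) |>
  step_dummy(grp)

# The indicator columns (e.g., grp_*) are the derived predictors
names(bake(prep(rec), new_data = NULL))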

Importance scores are computed for each predictor (at the specified level) and each performance metric. If no metric is specified, default metrics appropriate to the model's mode are used.
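
A minimal sketch of passing metrics explicitly (cls_fit and dat are hypothetical: a fitted classification workflow and the data given to fit()):

library(yardstick)

importance_perm(
  cls_fit,
  data = dat,
  metrics = metric_set(roc_auc, brier_class),
  times = 20
)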

For censored data, importance is computed for each evaluation time (when a dynamic metric is specified).
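
A minimal sketch of the censored case (surv_fit and surv_dat are hypothetical: a fitted censored regression workflow and its data); eval_time is needed here because brier_survival() is a dynamic metric:

library(yardstick)

importance_perm(
  surv_fit,
  data = surv_dat,
  metrics = metric_set(brier_survival),
  eval_time = c(1, 5, 10),
  times = 10
)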

Examples

if (rlang::is_installed(c("modeldata", "parsnip", "recipes", "workflows"))) {
  library(modeldata)
  library(parsnip)
  library(recipes)
  library(workflows)
  library(dplyr)

  set.seed(12)
  dat_tr <-
    sim_logistic(250, ~ .1 + 2 * A - 3 * B + 1 * A * B, corr = .7) |>
    dplyr::bind_cols(sim_noise(250, num_vars = 10))

  rec <-
    recipe(class ~ ., data = dat_tr) |>
    step_interact(~ A:B) |>
    step_normalize(all_numeric_predictors()) |>
    step_pca(contains("noise"), num_comp = 5)

  lr_wflow <- workflow(rec, logistic_reg())
  lr_fit <- fit(lr_wflow, dat_tr)

  set.seed(39)
  orig_res <- importance_perm(lr_fit, data = dat_tr, type = "original",
                              size = 100, times = 25)
  orig_res

  set.seed(39)
  deriv_res <- importance_perm(lr_fit, data = dat_tr, type = "derived",
                               size = 100, times = 25)
  deriv_res
}