Chapter 6 Model Predictions

To be consistent with snake_case, new_data should be used instead of newdata.

The function to produce predictions should be a class-specific predict method with arguments object, new_data, and possibly type. Other arguments, such as level, should be standardized.
The main predict method can internally defer to separate, unexported functions (predict_class, etc).
type should also come from a set of pre-defined values such as

type	application
`numeric`	numeric predictions
`class`	hard class predictions
`prob`	class probabilities, survivor probabilities
`link`	`glm` linear predictor
`conf_int`	confidence intervals
`pred_int`	prediction intervals
`raw`	direct access to prediction function
`param_pred`	predictions across tuning parameters
`quantile`	quantile predictions

and should be validated using match.arg().
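As a minimal sketch of this layout, assume a hypothetical model class my_model that simply wraps lm(). The names fit_my_model, predict_numeric, and predict_conf_int are illustrative only and do not come from any package; the point is the shape of the predict method, with type validated by match.arg() and the work deferred to unexported helpers.

```r
# Hypothetical constructor: wraps lm() so the example is runnable.
fit_my_model <- function(formula, data) {
  fit <- stats::lm(formula, data = data)
  structure(list(fit = fit), class = "my_model")
}

# Class-specific predict method. `new_data` has no default, so omitting it
# is an error. `type` is restricted to the choices listed in the signature.
predict.my_model <- function(object, new_data,
                             type = c("numeric", "conf_int"),
                             level = 0.95, ...) {
  type <- match.arg(type)
  switch(
    type,
    numeric  = predict_numeric(object, new_data),
    conf_int = predict_conf_int(object, new_data, level = level)
  )
}

# Unexported helpers (illustrative names), one per prediction type.
predict_numeric <- function(object, new_data) {
  pred <- stats::predict(object$fit, newdata = new_data)
  tibble::tibble(.pred = unname(pred))
}

predict_conf_int <- function(object, new_data, level = 0.95) {
  ci <- stats::predict(object$fit, newdata = new_data,
                       interval = "confidence", level = level)
  tibble::tibble(
    .pred_lower = unname(ci[, "lwr"]),
    .pred_upper = unname(ci[, "upr"])
  )
}

mod <- fit_my_model(mpg ~ wt, data = mtcars)
predict(mod, mtcars[1:3, ])
predict(mod, mtcars[1:3, ], type = "conf_int", level = 0.90)
```

An unsupported value such as type = "bogus" fails inside match.arg() with a message listing the allowed choices.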

To determine whether or not to return standard errors for predictions, use a std_error argument that takes a single TRUE/FALSE value. By default, do not report standard errors or other measures of uncertainty, as these can be expensive to compute. Clearly document whether any standard errors are for confidence or prediction intervals.
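One way this could look for a model backed by lm() is sketched below. The helper name predict_numeric_lm is hypothetical; the se.fit argument of predict.lm() is used only as a convenient source of standard errors.

```r
# Hypothetical helper: numeric predictions with optional standard errors.
predict_numeric_lm <- function(fit, new_data, std_error = FALSE) {
  stopifnot(is.logical(std_error), length(std_error) == 1, !is.na(std_error))

  if (!std_error) {
    pred <- stats::predict(fit, newdata = new_data)
    return(tibble::tibble(.pred = unname(pred)))
  }

  # se.fit is the standard error of the fitted mean, i.e. the quantity
  # behind confidence intervals, not prediction intervals.
  res <- stats::predict(fit, newdata = new_data, se.fit = TRUE)
  tibble::tibble(
    .pred      = unname(res$fit),
    .std_error = unname(res$se.fit)
  )
}

fit <- lm(mpg ~ wt, data = mtcars)
predict_numeric_lm(fit, mtcars[1:2, ])
predict_numeric_lm(fit, mtcars[1:2, ], std_error = TRUE)
```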

Other values of type should be assigned by consensus.

6.1 Input Data

If new_data is not supplied, an error should be thrown. It should not default to an archived version of the training set contained in the model object.
The data requirements for new_data should be the same as those for the original model fit function.
The model outcome should never be required to be in new_data.
new_data should be tolerant of extra columns. For example, if all variables are in some data frame dataset, predict(object, dataset) should immediately know which variables are required for prediction, check for their presence, and select only those from dataset before proceeding.
The prediction code should work whether new_data has multiple rows or a single row.
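The input checks above can be sketched with a small helper. The name prepare_new_data is hypothetical, and the sketch assumes the model stores (or can derive) the names of its predictors; for an lm() fit they can be recovered from the terms object with the outcome dropped.

```r
# Hypothetical helper: validate new_data and keep only the needed columns.
prepare_new_data <- function(new_data, predictors) {
  if (missing(new_data) || is.null(new_data)) {
    stop("`new_data` must be supplied.", call. = FALSE)
  }

  missing_cols <- setdiff(predictors, names(new_data))
  if (length(missing_cols) > 0) {
    stop(
      "`new_data` is missing required columns: ",
      paste(missing_cols, collapse = ", "),
      call. = FALSE
    )
  }

  # drop = FALSE keeps a data frame even for a single row or single column.
  new_data[, predictors, drop = FALSE]
}

fit <- lm(mpg ~ wt + hp, data = mtcars)

# Predictor names only; the outcome (mpg) is never required.
predictors <- all.vars(stats::delete.response(stats::terms(fit)))

# Extra columns are tolerated and a single row works.
prepare_new_data(mtcars[1, ], predictors)
```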

Predictions should not depend on which observations are present in new_data.
When novel factor levels appear in the test set for factor predictors, the default behavior should be to throw an informative error. For models that can reasonably make predictions on novel factor levels, users should have to opt in to this behavior explicitly, and it’s good practice to message() for these prediction cases.
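A sketch of such a check follows. The function check_novel_levels and its allow_novel argument are hypothetical names; training_levels is assumed to be a named list of the levels seen at fit time (for lm() fits, the xlevels element of the model object has this shape).

```r
# Hypothetical helper: error on unseen factor levels unless the user opts in.
check_novel_levels <- function(new_data, training_levels, allow_novel = FALSE) {
  for (col in names(training_levels)) {
    seen  <- unique(as.character(new_data[[col]]))
    novel <- setdiff(seen[!is.na(seen)], training_levels[[col]])
    if (length(novel) == 0) next

    msg <- sprintf(
      "Column `%s` has levels not seen in training: %s",
      col, paste(novel, collapse = ", ")
    )
    if (!allow_novel) stop(msg, call. = FALSE)
    message(msg)  # opted in: proceed, but tell the user
  }
  invisible(new_data)
}

fit <- lm(Sepal.Length ~ Species, data = droplevels(iris[iris$Species != "virginica", ]))
fit$xlevels
check_novel_levels(iris[1:5, ], fit$xlevels)      # all levels known: passes silently
try(check_novel_levels(iris[101:105, ], fit$xlevels))  # "virginica" is novel: error
```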

The return value is a tibble with the same number of rows as the data being predicted and in the same order. This implies that na.action should not affect the dimensions of the returned object (i.e., it should be ignored). The class of the tibble can be overloaded to accommodate specialized methods as long as basic data frame functionality is maintained. For observations with missing data such that a prediction cannot be generated, we recommend returning NA.
The return tibble can contain extra attributes for values relevant to the prediction (e.g. level for intervals) but care should be taken to make sure that these attributes are not destroyed when standard operations are applied to the tibble (e.g. arrange, filter, etc.). Columns of constant values (e.g. adding level as a column) should be avoided.
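The row-preservation rule can be illustrated with lm(). In this sketch, one predictor value is set to NA; passing na.action = na.pass (which is also the default for predict.lm()) keeps that row in the output with an NA prediction instead of dropping it.

```r
fit <- lm(mpg ~ wt, data = mtcars)

new_data <- mtcars[1:4, "wt", drop = FALSE]
new_data$wt[2] <- NA  # this row cannot be predicted

# na.pass keeps the incomplete row so that output rows align with input rows.
pred <- stats::predict(fit, newdata = new_data, na.action = stats::na.pass)
res  <- tibble::tibble(.pred = unname(pred))

nrow(res) == nrow(new_data)
is.na(res$.pred)
```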

Specific cases:

For univariate, numeric point estimates, the column should be named .pred. For multivariate numeric predictions (excluding probabilities), the columns should be named .pred_{outcome name}.
Class predictions should be factors with the same levels as the original outcome and named .pred_class.
For class probability predictions, the columns should be named after the factor levels, i.e., .pred_{level}, and there should be as many columns as factor levels.
If interval estimates are produced (e.g. prediction/confidence/credible), the column names should be .pred_lower and .pred_upper. If a standard error is produced, the column should be named .std_error. If intervals are produced for class probabilities, the levels should be included (e.g., .pred_lower_{level}).
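The naming rules for class and probability predictions can be sketched with a two-class logistic regression. The data preparation and the 0.5 cutoff are assumptions made for illustration; glm() treats the first factor level as the reference, so type = "response" returns the probability of the second level.

```r
# Recode the 0/1 transmission indicator as a factor outcome.
dat <- transform(
  mtcars,
  am = factor(am, levels = c(0, 1), labels = c("automatic", "manual"))
)
fit  <- glm(am ~ wt, data = dat, family = binomial())
lvls <- levels(dat$am)

# Probability of the second level ("manual") on the response scale.
p_second <- unname(stats::predict(fit, newdata = dat[1:3, ], type = "response"))

# One column per factor level, named .pred_{level}.
probs <- tibble::as_tibble(
  stats::setNames(list(1 - p_second, p_second), paste0(".pred_", lvls))
)

# Hard class predictions: a factor with the same levels as the outcome.
classes <- tibble::tibble(
  .pred_class = factor(ifelse(p_second > 0.5, lvls[2], lvls[1]), levels = lvls)
)

names(probs)
levels(classes$.pred_class)
```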

For predictions that are not simple scalars, such as distributions or non-rectangular structures, the .pred column should be a list-column.
In cases where the outcome is being directly predicted, the predictions should be on the same scale as the outcome. The same would apply to associated interval estimates. This is equivalent to type = "response" for generalized linear models and the like. Reasonable exceptions include estimation of the standard error of prediction (perhaps occurring on the link-level/scale of the linear predictors).
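For the list-column case mentioned above, a rough sketch is to store one vector of quantiles per row. The quantile computation here is an assumption made purely for illustration (a normal approximation around the lm() fit that ignores parameter uncertainty); only the shape of the result matters. The quantiles are on the same scale as the outcome.

```r
fit      <- lm(mpg ~ wt, data = mtcars)
new_data <- mtcars[1:3, "wt", drop = FALSE]
probs    <- c(0.1, 0.5, 0.9)

mu    <- unname(stats::predict(fit, newdata = new_data))
sigma <- summary(fit)$sigma

# One numeric vector of quantiles per observation (illustrative approximation).
quantiles <- lapply(mu, function(m) stats::qnorm(probs, mean = m, sd = sigma))

# .pred is a list-column: still one row per row of new_data.
res <- tibble::tibble(.pred = quantiles)

nrow(res)
res$.pred[[1]]
```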