Embed categorical predictors

step_embed() embed_control()
Encoding Factors into Multiple Columns
step_feature_hash()
Dummy Variables Creation via Feature Hashing
step_lencode_bayes()
Supervised Factor Conversions into Linear Functions using Bayesian Likelihood Encodings
step_lencode_glm()
Supervised Factor Conversions into Linear Functions using Likelihood Encodings
step_lencode_mixed()
Supervised Factor Conversions into Linear Functions using Mixed Model Likelihood Encodings
step_woe()
Weight of evidence transformation

Collapse categorical predictors

step_collapse_cart()
Supervised Collapsing of Factor Levels
step_collapse_stringdist()
Collapse factor levels using stringdist

Embed numeric predictors

step_discretize_cart()
Discretize numeric variables with CART
step_discretize_xgb()
Discretize numeric variables with XGBoost
step_pca_sparse()
Sparse PCA Signal Extraction
step_pca_sparse_bayes()
Sparse Bayesian PCA Signal Extraction
step_pca_truncated()
Truncated PCA Signal Extraction
step_umap()
Supervised and unsupervised uniform manifold approximation and projection (UMAP)

Datasets

solubility
Compound solubility data

Weight of evidence helpers

add_woe()
Add WoE in a data frame
woe_table()
Crosstable with WoE between a binary outcome and a predictor variable

Miscellaneous

dictionary()
Weight of evidence dictionary