tidymodels is a “meta-package” for modeling and statistical analysis that shares the underlying design philosophy, grammar, and data structures of the tidyverse.

It includes a core set of packages that are loaded on startup:

  • broom takes the messy output of built-in functions in R, such as lm, nls, or t.test, and turns it into tidy data frames.

  • dials has tools to create and manage values of tuning parameters.

  • dplyr contains a grammar for data manipulation.

  • ggplot2 implements a grammar of graphics.

  • infer is a modern approach to statistical inference.

  • parsnip is a tidy, unified interface for creating models.

  • purrr is a functional programming toolkit.

  • recipes is a general data preprocessor with a modern interface. It can create model matrices that incorporate feature engineering, imputation, and other preprocessing steps.

  • rsample has infrastructure for resampling data so that models can be assessed and empirically validated.

  • tibble is a modern re-imagining of the data frame.

  • tune contains the functions to optimize model hyper-parameters.

  • workflows has methods to combine pre-processing steps and models into a single object.

  • yardstick contains tools for evaluating models (e.g., accuracy, RMSE).
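
To show how several of the core packages fit together, here is a minimal sketch of a modeling pipeline. It is an illustration only, not an example from the package authors: the choice of the built-in mtcars data, the predictors, the 75/25 split, and all object names (car_split, car_recipe, lm_spec, car_fit, car_rmse) are assumptions made for this example.

```r
library(tidymodels)

set.seed(123)

# rsample: split the built-in mtcars data into training and testing sets
car_split <- initial_split(mtcars, prop = 0.75)
car_train <- training(car_split)
car_test  <- testing(car_split)

# recipes: declare the outcome and predictors, then center and scale
# the predictors
car_recipe <- recipe(mpg ~ wt + hp + disp, data = car_train) %>%
  step_normalize(all_predictors())

# parsnip: specify an ordinary linear regression fitted with lm()
lm_spec <- linear_reg() %>%
  set_engine("lm")

# workflows: bundle the preprocessing and the model, then fit both
car_fit <- workflow() %>%
  add_recipe(car_recipe) %>%
  add_model(lm_spec) %>%
  fit(data = car_train)

# yardstick: predict on the held-out rows and compute the RMSE
car_pred <- predict(car_fit, new_data = car_test) %>%
  bind_cols(car_test)

car_rmse <- rmse(car_pred, truth = mpg, estimate = .pred)
car_rmse
```

The result is a one-row tibble holding the metric name and its estimate. Swapping the model for another parsnip specification should leave the rest of the pipeline unchanged, which is the main reason to keep the recipe and the model in a single workflow object.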

There are a few modeling packages that are also installed along with tidymodels (but are not attached on startup):

  • tidypredict translates some model prediction equations to SQL for high-performance computing.

  • tidyposterior can be used to compare models using resampling and Bayesian analysis.

  • tidytext contains tidy tools for quantitative text analysis, including basic text summarization, sentiment analysis, and text modeling.
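
Because these packages are installed but not attached, they need their own library() call (or a pkg:: prefix) before use. As a rough sketch, assuming tidypredict and dbplyr are installed, the following translates a small linear model into an R expression and into SQL. The model, the object names, and the use of dbplyr's simulated connection in place of a real database are assumptions for illustration.

```r
library(tidypredict)

# A small linear model on the built-in mtcars data
lm_model <- lm(mpg ~ wt + hp, data = mtcars)

# Translate the prediction equation into a plain R expression
fit_expr <- tidypredict_fit(lm_model)
fit_expr

# Translate the same equation into SQL; a simulated connection
# stands in for a real database here
fit_sql <- tidypredict_sql(lm_model, dbplyr::simulate_dbi())
fit_sql

# Evaluating the expression against the data should reproduce predict()
manual_pred <- eval(fit_expr, mtcars)
```

With a real database connection, the generated SQL can be run inside the database, so predictions are computed without pulling the data into R.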

To install the development version from GitHub:

# install.packages("devtools")  # run first if devtools is not installed
devtools::install_github("tidymodels/tidymodels")

When loading the package, the versions and conflicts are listed:

library(tidymodels)
## Registered S3 method overwritten by 'xts':
##   method     from
##   as.zoo.xts zoo
## ── Attaching packages ─────────────────────────────────────────────────────────────────────────────────────────────── tidymodels 0.0.3 ──

## ✓ broom     0.5.4     ✓ recipes   0.1.9
## ✓ dials     0.0.4     ✓ rsample   0.0.5
## ✓ dplyr     0.8.4     ✓ tibble    2.1.3
## ✓ ggplot2   3.2.1     ✓ tune      0.0.1
## ✓ infer     0.5.1     ✓ workflows 0.1.0
## ✓ parsnip   0.0.5     ✓ yardstick 0.0.5
## ✓ purrr     0.3.3 
## ── Conflicts ────────────────────────────────────────────────────────────────────────────────────────────────── tidymodels_conflicts() ──
## ✖ purrr::discard()  masks scales::discard()
## ✖ dplyr::filter()   masks stats::filter()
## ✖ dplyr::lag()      masks stats::lag()
## ✖ ggplot2::margin() masks dials::margin()
## ✖ dials::offset()   masks stats::offset()
## ✖ recipes::step()   masks stats::step()
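
Each conflict line means that an unqualified call now resolves to the masking function, so filter() refers to dplyr's version once tidymodels is attached. A small sketch, using only dplyr and base R, of calling either function explicitly with the :: operator (the object names are illustrative):

```r
library(dplyr)

# dplyr::filter() keeps the rows of a data frame that match a condition
four_cyl <- dplyr::filter(mtcars, cyl == 4)

# stats::filter() applies a linear filter to a series; here, a
# 3-point centered moving average
smoothed <- stats::filter(1:10, rep(1 / 3, 3))
```

The masked functions remain available through their namespace, so nothing is lost by attaching tidymodels. Calling tidymodels_conflicts(), the function named in the header of the listing above, shows the conflict list again at any point in a session.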