The main goal of tidypredict is to enable running predictions inside databases. It reads the model, extracts the components needed to calculate the prediction, and then creates an R formula that can be translated into SQL. In other words, it is able to parse a model such as this one:
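As a minimal sketch, assuming a linear regression fitted on the `mtcars` data set that ships with R (an illustrative choice, not necessarily the model the original example used), the parsed result can be inspected with `tidypredict_fit()`:

```r
library(tidypredict)

# Assumed example model: a linear regression on the built-in mtcars data
model <- lm(mpg ~ wt + cyl, data = mtcars)

# tidypredict_fit() returns the prediction as an unevaluated R expression
fit_formula <- tidypredict_fit(model)
fit_formula
```

Because the result is a plain R expression, it can be evaluated against any data frame that has the same column names as the model's predictors.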
tidypredict can return a SQL statement that is ready to run inside the database. Because it uses dplyr’s database interface, it works with several database back-ends, such as MS SQL:
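A hedged sketch of that call, assuming the same illustrative `mtcars` model and using `dbplyr::simulate_mssql()` as a stand-in for a live MS SQL connection:

```r
library(tidypredict)

# Assumed example model: a linear regression on the built-in mtcars data
model <- lm(mpg ~ wt + cyl, data = mtcars)

# simulate_mssql() mimics an MS SQL connection, so no live database is needed
sql_text <- tidypredict_sql(model, dbplyr::simulate_mssql())
sql_text
```

Swapping the simulated connection for a real DBI connection should produce SQL in that database's dialect.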
Install tidypredict from CRAN using:
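The standard CRAN installation call would be:

```r
# Installs the released version from the configured CRAN mirror
install.packages("tidypredict")
```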
Or install the development version using devtools as follows:
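A sketch of the devtools call, assuming the development repository lives at `tidymodels/tidypredict` on GitHub (the repository path is an assumption; check the package's project page if it has moved):

```r
# install.packages("devtools")  # uncomment if devtools is not yet installed

# Assumed repository path; adjust if the project is hosted elsewhere
devtools::install_github("tidymodels/tidypredict")
```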
tidypredict has only a few functions, and that number is not expected to grow much. The main focus at this time is to add support for more models.
| Function | Description |
|---|---|
| tidypredict_fit() | Returns an R formula that calculates the prediction |
| tidypredict_sql() | Returns a SQL query based on the formula from tidypredict_fit() |
| tidypredict_to_column() | Adds a new column using the formula from tidypredict_fit() |
| tidypredict_test() | Tests tidyverse predictions against the model’s native predict() function |
| tidypredict_interval() | Same as tidypredict_fit() but for intervals (only works with lm and glm) |
| tidypredict_sql_interval() | Same as tidypredict_sql() but for intervals (only works with lm and glm) |
| parse_model() | Creates a list spec based on the R model |
| as_parsed_model() | Prepares an object to be recognized as a parsed model |
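
As a brief illustration of two of the helpers in the table, the following sketch assumes an example `lm()` model on the built-in `mtcars` data. It adds the prediction as a column and then checks the formula-based predictions against `predict()`:

```r
library(tidypredict)

# Assumed example model: a linear regression on the built-in mtcars data
model <- lm(mpg ~ wt + cyl, data = mtcars)

# Add the prediction as a new column; by default the column is named "fit"
scored <- tidypredict_to_column(mtcars, model)
head(scored[, c("mpg", "fit")])

# Compare the formula-based predictions with predict() on the model's own data
tidypredict_test(model)
```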

Instead of translating directly to a SQL statement, tidypredict creates an R formula that can then be used inside dplyr. The overall workflow is illustrated in the image above and described here:
1. tidypredict reads the model and creates a list object with the components needed to run predictions.
2. tidypredict builds an R formula based on the list object.
3. dplyr evaluates the formula created by tidypredict.
4. dplyr translates the formula into a SQL statement, or into any other interface.
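
The first three steps can be sketched locally as follows, assuming an example `lm()` model on the built-in `mtcars` data. The fourth step happens automatically when the same `mutate()` call is run against a remote database table instead of a local data frame:

```r
library(dplyr)
library(tidypredict)

# Assumed example model: a linear regression on the built-in mtcars data
model <- lm(mpg ~ wt + cyl, data = mtcars)

# Step 1: read the model into a list-based spec
parsed <- parse_model(model)

# Step 2: build an R formula from the spec
fit_formula <- tidypredict_fit(parsed)

# Step 3: dplyr evaluates the formula; !! splices it into mutate()
scored <- mtcars %>% mutate(fit = !!fit_formula)
head(scored$fit)
```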
tidypredict writes and reads a spec based on a model. Instead of simply writing the R formula directly, splitting the spec from the formula adds the following capabilities:
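1. No more saving models as .rds files. This applies specifically to cases when the model needs to be used for predictions in a Shiny app.
2. Beyond R models. Anything that can write a proper spec can be read into tidypredict. It also means that the parsed model spec can become a good alternative to using PMML.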
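
One way to round-trip a spec through a file is sketched below. It assumes the `yaml` package as the file format and a temporary file path, both of which are illustrative choices rather than requirements of tidypredict:

```r
library(tidypredict)
library(yaml)

# Assumed example model: a linear regression on the built-in mtcars data
model <- lm(mpg ~ wt + cyl, data = mtcars)

# Write the parsed spec to a YAML file (a temporary path is used here)
spec_path <- tempfile(fileext = ".yml")
write_yaml(parse_model(model), spec_path)

# Read the spec back and mark it as a parsed model
loaded_model <- as_parsed_model(read_yaml(spec_path))

# The reloaded spec yields a prediction formula without the original model object
loaded_formula <- tidypredict_fit(loaded_model)
loaded_formula
```

A text format may round coefficients when writing, so predictions from a reloaded spec can differ slightly from those of the original model.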
The following models are supported by tidypredict:
- lm()
- glm()
- randomForest::randomForest()
- ranger - ranger::ranger()
- earth::earth()
- xgboost::xgb.Booster.complete()
- Cubist::cubist()
- partykit - partykit::ctree()
parsnip
tidypredict supports models fitted via the parsnip interface. The ones confirmed to currently work in tidypredict are:
- lm() - parsnip: linear_reg() with “lm” as the engine.
- randomForest::randomForest() - parsnip: rand_forest() with “randomForest” as the engine.
- ranger::ranger() - parsnip: rand_forest() with “ranger” as the engine.
- earth::earth() - parsnip: mars() with “earth” as the engine.

broom
The tidy() function from broom works with linear models parsed via tidypredict.
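
A short sketch of that usage, assuming an example linear model on the built-in `mtcars` data and that a `tidy()` method is registered for parsed regression models:

```r
library(tidypredict)
library(broom)

# Parse an assumed example linear model, then tidy the parsed spec
pm <- parse_model(lm(wt ~ ., mtcars))
pm_tidy <- tidy(pm)
pm_tidy
```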