mlr3db extends the mlr3 package with data backends for working transparently with databases. Two additional backends are currently implemented:
DataBackendDplyr: Relies internally on the abstraction of dplyr and dbplyr.
DataBackendDuckDB: Connector to duckdb.
You can install the released version of mlr3db from CRAN with:

    install.packages("mlr3db")

And the development version from GitHub with:

    # install.packages("devtools")
    devtools::install_github("mlr-org/mlr3db")
    library(mlr3)
    library(mlr3db)

    # Create a classification task:
    task = tsk("spam")

    # Convert the task backend from a data.table backend to a DuckDB backend.
    # By default, a temporary directory is used to store the database files.
    # Note that the in-memory data is not used anymore; its memory will be freed
    # by the garbage collector.
    task$backend = as_duckdb_backend(task$backend)

    # The requested data will be queried from the database in the background:
    learner = lrn("classif.rpart")
    ids = sample(task$row_ids, 3000)
    learner$train(task, row_ids = ids)
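The same conversion can be sketched for the dplyr-based backend. The example below is a minimal sketch, not taken from the text above: it assumes that `as_sqlite_backend()` is available in your installed version of mlr3db and that the RSQLite, DBI, dplyr and dbplyr packages are installed. The variable `n_before` is introduced here for illustration only.

```r
library(mlr3)
library(mlr3db)

# Create a classification task backed by an in-memory data.table:
task = tsk("spam")
n_before = task$nrow

# Swap in an SQLite database; the data is then served through the
# dplyr-based backend (assumption: RSQLite, DBI, dplyr and dbplyr are installed).
task$backend = as_sqlite_backend(task$backend)

# From the outside the task is unchanged, only the storage differs:
class(task$backend)[1]
task$nrow == n_before

# Only the requested rows and columns are fetched from the database:
task$data(rows = task$row_ids[1:5], cols = "type")
```

Because the backend is swapped on the task itself, learners and resampling code should not need to change; they keep requesting data through the task as before.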