miceDRF provides an imputation method for the mice framework based on distributional random forests (DRF).
The package extends multiple imputation by chained equations (MICE) with a nonparametric approach that models full conditional distributions rather than only conditional means. This allows flexible imputation for data with complex structure, nonlinear effects, and heterogeneous conditional distributions.
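To see why imputing from a conditional distribution differs from imputing a conditional mean, here is a minimal base R sketch. It is not the DRF algorithm: it uses a plain linear regression on simulated data, and every name in it (`y_obs`, `y_mean`, `y_draw`, and so on) is a hypothetical chosen for illustration. It only shows that plugging in fitted means shrinks the spread of the imputed values, while drawing from an estimated conditional distribution roughly preserves it.

```r
set.seed(1)

# Simulated data (illustrative assumption): y depends linearly on x plus noise
n <- 2000
x <- runif(n)
y <- 2 * x + rnorm(n, sd = 1)

# Remove about 30% of y completely at random
miss <- runif(n) < 0.3
y_obs <- ifelse(miss, NA, y)

# Fit a regression on the observed cases (lm drops rows with NA by default)
fit <- lm(y_obs ~ x)
mu_miss <- predict(fit, newdata = data.frame(x = x[miss]))

# (a) Conditional-mean imputation: plug in the fitted value
y_mean <- y_obs
y_mean[miss] <- mu_miss

# (b) Conditional-draw imputation: fitted value plus a resampled residual
y_draw <- y_obs
y_draw[miss] <- mu_miss + sample(residuals(fit), sum(miss), replace = TRUE)

# Compare the spread of the imputed values with that of the true values
sd_true <- sd(y[miss])
sd_mean <- sd(y_mean[miss])
sd_draw <- sd(y_draw[miss])
print(c(true = sd_true, mean_imputed = sd_mean, draw_imputed = sd_draw))
```

In this toy setting the mean-imputed values have a clearly smaller standard deviation than the true ones, whereas the draw-based values come much closer. A distributional method aims for the second behaviour without assuming a linear model.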
The method can be used directly within the standard mice workflow via:
method = "DRF"
Installation
Install the development version from GitHub with:
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("KrystynaGrzesiak/miceDRF")
Example
library(mice)
library(miceDRF)
set.seed(123)
# Generate data
n <- 200
d <- 5
X <- matrix(runif(n * d), nrow = n, ncol = d)
# Introduce missing values
pmiss <- 0.2
X.NA <- apply(X, 2, function(x) {
  U <- runif(length(x))
  ifelse(U <= pmiss, NA, x)
})
# Imputation with DRF
imp <- mice(X.NA, m = 1, method = "DRF")
Ximp <- complete(imp)
References
Näf, J., Scornet, E., and Josse, J. (2024). What is a good imputation under MAR missingness? arXiv preprint arXiv:2403.19196. https://arxiv.org/abs/2403.19196
Cevid, D., Michel, L., Näf, J., Meinshausen, N., and Buehlmann, P. (2022). Distributional random forests: Heterogeneity adjustment and multivariate distributional regression. Journal of Machine Learning Research, 23(333), 1–79.
Citation
If you use miceDRF in your research, please cite:
Näf, J., Grzesiak, K., and Scornet, E. (2025). How to rank imputation methods? arXiv preprint arXiv:2507.11297. https://doi.org/10.48550/arXiv.2507.11297