Double Machine Learning for Causal Inference in Julia
DoubleML.jl implements double/de-biased machine learning methods for causal inference, following Chernozhukov et al. (2018).
This package is inspired by, and aims to closely follow, the DoubleML Python package, but is unaffiliated with it.
Why DoubleML.jl?
- Leverage Julia's speed, with up to 10x faster model fitting compared to Python (based on early benchmarks).
- MLJ Integration: Use any MLJ-compatible model for nuisance estimation, with the flexibility to control model iteration and model tuning (see examples)
- StatsAPI Compliance: `coef()`, `stderror()`, `confint()`, `coeftable()`
- Cross-fitting: K-fold sample splitting with multiple repetitions
- Bootstrap Inference: Joint confidence intervals with bootstrapped standard errors
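
To make the cross-fitting idea in the list above concrete, here is a minimal sketch in plain Julia that uses only the standard library. It does not use or reproduce DoubleML.jl's API: the names `kfold_ids`, `fit_ols`, `predict_ols`, and `crossfit_plr` are hypothetical, and ordinary least squares stands in for the MLJ learners one would normally plug in. The estimator shown is the standard partialling-out estimator for a partially linear model, with a plug-in standard error.

```julia
using Random, Statistics, LinearAlgebra

# Assign each of n observations to one of k folds, as evenly as possible.
function kfold_ids(n::Int, k::Int, rng::AbstractRNG)
    ids = [mod1(i, k) for i in 1:n]
    return shuffle(rng, ids)
end

# Ordinary least squares with an intercept (a stand-in for an ML learner).
fit_ols(X, y) = hcat(ones(size(X, 1)), X) \ y
predict_ols(beta, X) = hcat(ones(size(X, 1)), X) * beta

# Cross-fitted partialling-out estimate of theta in Y = theta*D + g(X) + e.
function crossfit_plr(Y, D, X; n_folds=5, rng=Random.default_rng())
    n = length(Y)
    folds = kfold_ids(n, n_folds, rng)
    res_y = similar(Y, Float64)
    res_d = similar(D, Float64)
    for k in 1:n_folds
        test = folds .== k
        train = .!test
        # Fit both nuisance models on the training folds only
        beta_l = fit_ols(X[train, :], Y[train])   # approximates E[Y | X]
        beta_m = fit_ols(X[train, :], D[train])   # approximates E[D | X]
        # Residualize the held-out fold with out-of-sample predictions
        res_y[test] = Y[test] .- predict_ols(beta_l, X[test, :])
        res_d[test] = D[test] .- predict_ols(beta_m, X[test, :])
    end
    theta = dot(res_d, res_y) / dot(res_d, res_d)
    # Plug-in standard error based on the score (res_y - theta*res_d) * res_d
    psi = (res_y .- theta .* res_d) .* res_d
    J = mean(res_d .^ 2)
    se = sqrt(mean(psi .^ 2) / J^2 / n)
    return theta, se
end

# Simulated data with a known treatment effect of 0.5
rng = MersenneTwister(42)
n = 2000
X = randn(rng, n, 3)
D = X * [1.0, -0.5, 0.25] .+ randn(rng, n)
Y = 0.5 .* D .+ X * [0.3, 0.2, -0.1] .+ randn(rng, n)

theta, se = crossfit_plr(Y, D, X; rng=rng)
println("theta = ", round(theta, digits=3), ", se = ", round(se, digits=3))
```

Because each observation is residualized by models that never saw it during fitting, overfitting in the nuisance learners does not leak into the final estimate. Repeating the procedure over several random splits and aggregating the results, as the package's repeated cross-fitting does, reduces the dependence on any single partition.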
This package remains in early development and testing stages. The following models are currently implemented:
| Model | Use Case | Learners | Status |
|---|---|---|---|
| `DoubleMLPLR` | Continuous/binary treatment | `ml_l`, `ml_m` (+ `ml_g` for IV-type) | Implemented |
| `DoubleMLIRM` | Binary treatment only | `ml_g`, `ml_m` (classifier) | Implemented |
| `DoubleMLLPLR` | Binary outcome (Y ∈ {0,1}) | `ml_M`, `ml_t`, `ml_m` (+ `ml_a`) | |
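
For readers new to the learner names in the table, the partially linear regression (PLR) model of Chernozhukov et al. (2018) and its partialling-out estimator can be written as below. The mapping of `ml_l` to the regression of the outcome on the covariates and of `ml_m` to the regression of the treatment on the covariates follows the convention of the DoubleML Python package. It is assumed here to carry over to this package and is not confirmed by this README.

```latex
\begin{aligned}
Y &= \theta_0 D + g_0(X) + \varepsilon, & \mathbb{E}[\varepsilon \mid D, X] &= 0, \\
D &= m_0(X) + V, & \mathbb{E}[V \mid X] &= 0, \\
\ell_0(X) &= \mathbb{E}[Y \mid X], \\
\hat{\theta} &= \frac{\sum_{i} \bigl(D_i - \hat{m}(X_i)\bigr)\bigl(Y_i - \hat{\ell}(X_i)\bigr)}{\sum_{i} \bigl(D_i - \hat{m}(X_i)\bigr)^2}.
\end{aligned}
```

Here $\hat{\ell}$ and $\hat{m}$ are the cross-fitted predictions of the two nuisance learners, and $\theta_0$ is the treatment effect of interest.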

```julia
using DoubleML, MLJ, DataFrames, StableRNGs

# Simulate partially linear regression data (500 observations, alpha = 0.5)
data = make_plr_CCDDHNR2018(500, alpha=0.5, rng=StableRNG(42))

# Load a random forest regressor through MLJ for both nuisance learners
RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree verbosity=0
ml_l = RandomForestRegressor()
ml_m = RandomForestRegressor()

# Set up the model with 5-fold cross-fitting, then fit it
model = DoubleMLPLR(data, ml_l, ml_m, n_folds=5)
fit!(model)
summary(model)

println("Treatment effect: ", coef(model)[1])
println("95% CI: ", confint(model))
```

- User Guide - Installation, concepts, and workflow
- Tutorials - Step-by-step examples
- API Reference - Complete API documentation
- Examples - Pluto notebooks
Many features and models are not yet implemented in this package. The broad roadmap is to reach feature parity with the DoubleML Python package.
Currently, a variety of tests compare results against the Python package to check that the DoubleMLPLR, DoubleMLIRM, and DoubleMLLPLR models behave similarly.
In early benchmarks, the Julia implementation runs up to 10x faster than the Python package (see the benchmark).
Similar Julia packages include CausalELM, which takes a very lightweight approach to causal machine learning in which every learner is an extreme learning machine. In comparison, this package aims to match the features of the DoubleML Python package more closely and to leave the choice of model to the user.