DAGassist contains tools for using directed acyclic graphs (DAGs) to align regressions with an estimand and its identifying assumptions. DAGs are causal graphs that nonparametrically encode the relationships between a model’s variables. For good introductory articles on DAGs, see Pearl (1995), Pearl (2009), Hünermund et al. (2025), and Elwert (2013).
The DAGassist workflow has five steps: (1) declare an estimand; (2) draw a DAG; (3) classify control variables by role; (4) estimate models using DAG-consistent adjustment sets; and (5) recover the interpretable estimand. This guide provides an applied introduction to the DAGassist workflow.
Declaring an estimand in Step 1 ensures that a study maintains a consistent quantity of interest throughout evaluation (Lundberg et al. 2021; Findley et al. 2021). Of course, some estimands may be more policy-relevant than others (Deaton 2010).
For the purpose of this guide, we are interested in the sample average treatment effect (SATE).
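As a reference point, the textbook definition of the SATE for a binary treatment is sketched below. The potential-outcome notation $Y_i(1)$ and $Y_i(0)$ is an assumption of this sketch rather than notation taken from DAGassist. The exposure used later in this guide (edu_year) is continuous, and how the package generalizes the quantity to that case is not specified here.

```latex
\mathrm{SATE} \;=\; \frac{1}{n} \sum_{i=1}^{n} \bigl[\, Y_i(1) - Y_i(0) \,\bigr]
```

Here $n$ is the number of units in the sample, and the average is taken over the units actually observed rather than over a wider population.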
DAGs have three basic building blocks: variables, arrows, and missing arrows. In DAG terminology, variables are called nodes or vertices, and arrows are called edges or arcs (Tennant et al. 2021). A missing arrow is equivalent to a strong null hypothesis: it asserts that one variable has no direct effect on the other.
| variable | type | Min | Q1 | Median | Mean | Q3 | Max |
|---|---|---|---|---|---|---|---|
| id | integer | 1.00 | 250.75 | 500.50 | 500.50 | 750.25 | 1000.00 |
| year | integer | 0.00 | 1.00 | 2.00 | 2.00 | 3.00 | 4.00 |
| age | numeric | 0.00 | 27.60 | 37.70 | 37.76 | 47.40 | 86.20 |
| pref | numeric | 0.00 | 1.35 | 2.03 | 2.06 | 2.74 | 4.94 |
| edu_year | numeric | 0.00 | 11.80 | 13.10 | 13.07 | 15.20 | 22.00 |
| married | integer | 0.00 | 0.00 | 1.00 | 0.56 | 1.00 | 1.00 |
| birth_control | integer | 0.00 | 0.00 | 1.00 | 0.71 | 1.00 | 1.00 |
| income | numeric | 2344.00 | 43141.75 | 87560.50 | 125387.86 | 162098.50 | 1817478.00 |
| children | numeric | 0.00 | 0.00 | 0.00 | 2.03 | 3.00 | 12.00 |
| job_stability_t | numeric | -3.00 | -0.27 | 0.55 | 0.49 | 1.29 | 3.00 |
| variable | type | top_levels |
|---|---|---|
| gender | factor | Male:2565 Female:2435 |
| immigrant | factor | No:4380 Yes:620 |
| urban | factor | Urban:3560 Rural:1440 |
| class | ordered | Working:2080 Middle:1580 Low:885 (Other):455 |
| religion | factor | Christian:2005 Unaffiliated:1725 Muslim:460 (Other):810 |
| contract | factor | Temporary:1905 Permanent:1810 Informal:1285 |
| edu_degree | factor | HS_grad:1610 Some_college:1390 BA:975 (Other):1025 |
Example: The Causal Effects of Family Background and Life Course Events on Fertility Patterns
For the purpose of this guide, we visualize a common social science question: how does education affect fertility (Morgan and Winship 2015)? The DAG encodes a plausible, but not exhaustive, set of covariates.
## DAGassist Report:
##
## Roles:
## variable role Exp. Out. conf med col dOut dMed dCol dConfOn dConfOff NCT NCO
## edu_year exposure x
## children outcome x
## age confounder x
## class confounder x
## contract confounder x
## gender confounder x
## immigrant confounder x
## urban confounder x
## birth_control mediator x x
## income mediator x x
## job_stability_t mediator x
## married mediator x x
## pref nco x
## religion nco x
##
## Roles legend: Exp. = exposure/treatment; Out. = outcome; CON = confounder; MED = mediator; COL = collider; dOut = descendant of outcome; dMed = descendant of mediator; dCol = descendant of collider; dConfOn = descendant of a confounder on a back-door path; dConfOff = descendant of a confounder off a back-door path; NCT = neutral control on treatment; NCO = neutral control on outcome
Interpreting the roles table:
DAGassist classifies the variables in your formula by causal role, based on the relationships in your DAG. It uses the following categories:
- treatment / independent variable / exposure (X).
- outcome / dependent variable (Y).
- confounder (Z): a common cause of X and Y. Confounders create a spurious association between X and Y and must be adjusted for.
- mediator (M): a variable that lies on a path from X to Y and transmits some of the effect of X on Y. Do not adjust for mediators when estimating the total effect of X on Y.
- collider (C): a direct common descendant of X and Y. A collider already blocks the path it sits on, so adjusting for it opens a spurious association between X and Y.
- descendant of the outcome: a descendant of Y, which introduces bias if adjusted for.
- descendant of a mediator: should not be adjusted for when estimating the total effect.
- descendant of a collider: adjusting for it opens a spurious association between X and Y.
- descendant of a confounder on a back-door path: a descendant of Z that affects Y.
- descendant of a confounder off a back-door path: a descendant of Z that does not affect Y.
DAGassist(dag_model,
formula = lm(children ~ edu_year + age + class + gender +
immigrant + urban + birth_control + income +
married + job_stability_t + contract + pref, data = dat))
## DAGassist Report:
##
## Roles:
## variable role Exp. Out. conf med col dOut dMed dCol dConfOn dConfOff NCT NCO
## edu_year exposure x
## children outcome x
## age confounder x
## class confounder x
## contract confounder x
## gender confounder x
## immigrant confounder x
## urban confounder x
## birth_control mediator x x
## income mediator x x
## job_stability_t mediator x
## married mediator x x
## pref nco x
##
## (!) Bad controls in your formula: {birth_control, income, married, job_stability_t}
## Minimal controls 1: {age, class, contract, gender, immigrant, urban}
## Canonical controls: {age, class, contract, gender, immigrant, pref, urban}
##
## Formulas:
## original: children ~ edu_year + age + class + gender + immigrant + urban + birth_control + income + married + job_stability_t + contract + pref
##
## Balance diagnostics:
## legend: (S)MD compares covariate means between the Original complete-case sample
## and each spec's sample; |(S)MD| > 0.10 flags a covariate whose sample
## composition shifts (binary vars use a raw difference in means).
## Original vs Minimal 1: n = 5000 vs 5000 balanced
## Original vs Canonical: n = 5000 vs 5000 balanced
## Minimal 1 vs Canonical: n = 5000 vs 5000 balanced
##
## Model comparison:
##
## +-------------------+-----------+-----------+-----------+
## | | Original | Minimal 1 | Canonical |
## +===================+===========+===========+===========+
## | edu_year | -0.122*** | -0.080*** | -0.080*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.015) | (0.013) | (0.013) |
## +-------------------+-----------+-----------+-----------+
## | age | 0.070*** | 0.095*** | 0.096*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.004) | (0.003) | (0.003) |
## +-------------------+-----------+-----------+-----------+
## | genderMale | 0.181* | 0.179* | 0.190* |
## +-------------------+-----------+-----------+-----------+
## | | (0.085) | (0.087) | (0.085) |
## +-------------------+-----------+-----------+-----------+
## | immigrantYes | -0.246+ | -0.172 | -0.243+ |
## +-------------------+-----------+-----------+-----------+
## | | (0.128) | (0.131) | (0.129) |
## +-------------------+-----------+-----------+-----------+
## | urbanUrban | 0.121 | 0.238* | 0.175+ |
## +-------------------+-----------+-----------+-----------+
## | | (0.094) | (0.096) | (0.094) |
## +-------------------+-----------+-----------+-----------+
## | birth_control | 0.133 | | |
## +-------------------+-----------+-----------+-----------+
## | | (0.103) | | |
## +-------------------+-----------+-----------+-----------+
## | income | 0.000 | | |
## +-------------------+-----------+-----------+-----------+
## | | (0.000) | | |
## +-------------------+-----------+-----------+-----------+
## | married | 0.703*** | | |
## +-------------------+-----------+-----------+-----------+
## | | (0.122) | | |
## +-------------------+-----------+-----------+-----------+
## | job_stability_t | 0.285*** | | |
## +-------------------+-----------+-----------+-----------+
## | | (0.047) | | |
## +-------------------+-----------+-----------+-----------+
## | contractTemporary | 0.710*** | 0.772*** | 0.804*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.110) | (0.112) | (0.110) |
## +-------------------+-----------+-----------+-----------+
## | contractPermanent | 0.893*** | 1.116*** | 1.093*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.114) | (0.113) | (0.111) |
## +-------------------+-----------+-----------+-----------+
## | pref | 0.581*** | | 0.578*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.042) | | (0.042) |
## +-------------------+-----------+-----------+-----------+
## | Num.Obs. | 5000 | 5000 | 5000 |
## +-------------------+-----------+-----------+-----------+
## | R2 | 0.227 | 0.183 | 0.213 |
## +===================+===========+===========+===========+
## | + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 |
## +===================+===========+===========+===========+
##
## Roles legend: Exp. = exposure; Out. = outcome; CON = confounder; MED = mediator; COL = collider; dOut = descendant of outcome; dMed = descendant of mediator; dCol = descendant of collider; dConfOn = descendant of a confounder on a back-door path; dConfOff = descendant of a confounder off a back-door path; NCT = neutral control on treatment; NCO = neutral control on outcome
Interpreting the model comparison table:
- Minimal is the smallest adjustment set necessary to close all back-door paths from the independent to the dependent variable. The minimal set only includes confounders as controls.
- Canonical is the largest permissible adjustment set. Essentially, the canonical set contains all control variables that are not colliders, mediators, intermediate outcomes, descendants of mediators, or descendants of colliders.
The table below illustrates the variable roles permitted by each set.
| Path / Node Type | Minimal | Canonical |
|---|---|---|
| Fork/Common–Cause Confounder (Z) | ✓ | ✓ |
| Chain/Mediator (M) | ✗ | ✗ |
| Collider (C) | ✗ | ✗ |
| Descendant of Mediator (N) | ✗ | ✗ |
| Descendant of Collider (Q) | ✗ | ✗ |
| Descendant of Outcome (I) | ✗ | ✗ |
| M-Bias | ✗ | ✗ |
| Butterfly Bias | ✗ | ✗ |
| Neutral Control on Treatment (E → X) | ✗ | ✓ |
| Neutral Control on Outcome (F → Y) | ✗ | ✓ |
| Descendant of Confounder off Backdoor Path (W) | ✗ | ✗ |
| Descendant of Confounder on Backdoor Path (V) | Z or V | Z and V |
Note: ✓ = adjust; ✗ = do not adjust. There may be multiple minimal sets; the canonical set is unique.
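To make the canonical set concrete, here is a minimal base-R sketch of one standard construction: take all ancestors of the exposure or outcome, then drop the exposure, the outcome, and every node on (or descending from) a causal path from X to Y. The toy graph, the helper names (`reach`, `descendants`, `ancestors`), and the edge list are all hypothetical; they reuse the node letters from the table above and are not DAGassist's implementation.

```r
# Hypothetical toy DAG: Z confounds X and Y, M mediates X -> Y,
# E is a neutral control on treatment, F a neutral control on outcome.
edges <- rbind(c("Z", "X"), c("Z", "Y"), c("X", "M"),
               c("M", "Y"), c("E", "X"), c("F", "Y"))

# All nodes reachable from `start` by following directed edges.
reach <- function(start, edges) {
  seen <- character(0)
  frontier <- start
  while (length(frontier) > 0) {
    nxt <- unique(edges[edges[, 1] %in% frontier, 2])
    frontier <- setdiff(nxt, seen)
    seen <- c(seen, frontier)
  }
  seen
}
descendants <- function(v, edges) reach(v, edges)
ancestors   <- function(v, edges) reach(v, edges[, 2:1, drop = FALSE])

x <- "X"
y <- "Y"

# Nodes on a causal path from X to Y (the outcome itself included).
on_causal_path <- intersect(descendants(x, edges), c(y, ancestors(y, edges)))

# Forbidden nodes: the exposure, causal-path nodes, and their descendants.
forbidden <- unique(c(x, on_causal_path, descendants(on_causal_path, edges)))

# Canonical set: ancestors of X or Y that are not forbidden.
canonical <- setdiff(union(ancestors(x, edges), ancestors(y, edges)),
                     c(forbidden, y))
sort(canonical)
```

In this toy graph the result contains the confounder Z together with both neutral controls E and F, while the mediator M is excluded, which matches the Canonical column of the table.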
DAGassist(dag_model,
formula = lm(children ~ edu_year + age + class + gender +
immigrant + urban + birth_control + income +
married + job_stability_t + contract + pref, data = dat),
estimand = "SATE")
## DAGassist Report:
##
## Roles:
## variable role Exp. Out. conf med col dOut dMed dCol dConfOn dConfOff NCT NCO
## edu_year exposure x
## children outcome x
## age confounder x
## class confounder x
## contract confounder x
## gender confounder x
## immigrant confounder x
## urban confounder x
## birth_control mediator x x
## income mediator x x
## job_stability_t mediator x
## married mediator x x
## pref nco x
##
## (!) Bad controls in your formula: {birth_control, income, married, job_stability_t}
## Minimal controls 1: {age, class, contract, gender, immigrant, urban}
## Canonical controls: {age, class, contract, gender, immigrant, pref, urban}
##
## Formulas:
## original: children ~ edu_year + age + class + gender + immigrant + urban + birth_control + income + married + job_stability_t + contract + pref
##
## Balance diagnostics:
## legend: (S)MD compares covariate means between the Original complete-case sample
## and each spec's sample; |(S)MD| > 0.10 flags a covariate whose sample
## composition shifts (binary vars use a raw difference in means).
## Original vs Minimal 1: n = 5000 vs 5000 balanced
## Original vs Canonical: n = 5000 vs 5000 balanced
## Minimal 1 vs Canonical: n = 5000 vs 5000 balanced
##
## Model comparison:
##
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | Original | Minimal 1 | Minimal 1 (SATE) | Canonical | Canonical (SATE) |
## +===================+===========+===========+==================+===========+==================+
## | edu_year | -0.122*** | -0.080*** | -0.077*** | -0.080*** | -0.077*** |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | (0.015) | (0.013) | (0.016) | (0.013) | (0.015) |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | age | 0.070*** | 0.095*** | | 0.096*** | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | (0.004) | (0.003) | | (0.003) | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | genderMale | 0.181* | 0.179* | | 0.190* | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | (0.085) | (0.087) | | (0.085) | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | immigrantYes | -0.246+ | -0.172 | | -0.243+ | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | (0.128) | (0.131) | | (0.129) | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | urbanUrban | 0.121 | 0.238* | | 0.175+ | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | (0.094) | (0.096) | | (0.094) | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | birth_control | 0.133 | | | | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | (0.103) | | | | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | income | 0.000 | | | | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | (0.000) | | | | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | married | 0.703*** | | | | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | (0.122) | | | | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | job_stability_t | 0.285*** | | | | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | (0.047) | | | | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | contractTemporary | 0.710*** | 0.772*** | | 0.804*** | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | (0.110) | (0.112) | | (0.110) | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | contractPermanent | 0.893*** | 1.116*** | | 1.093*** | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | (0.114) | (0.113) | | (0.111) | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | pref | 0.581*** | | | 0.578*** | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | | (0.042) | | | (0.042) | |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | Num.Obs. | 5000 | 5000 | 5000 | 5000 | 5000 |
## +-------------------+-----------+-----------+------------------+-----------+------------------+
## | R2 | 0.227 | 0.183 | 0.172 | 0.213 | 0.206 |
## +===================+===========+===========+==================+===========+==================+
## | + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 |
## +===================+===========+===========+==================+===========+==================+
##
## Weight diagnostics:
## legend: w range reports the min-max weights by group; ESS is kish effective sample size.
## Minimal 1 (SATE): w range=0.04726..4.878 | ESS (weighted)=4368.24
## Canonical (SATE): w range=0.04731..4.877 | ESS (weighted)=4368.17
##
## Roles legend: Exp. = exposure; Out. = outcome; CON = confounder; MED = mediator; COL = collider; dOut = descendant of outcome; dMed = descendant of mediator; dCol = descendant of collider; dConfOn = descendant of a confounder on a back-door path; dConfOff = descendant of a confounder off a back-door path; NCT = neutral control on treatment; NCO = neutral control on outcome
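The weight diagnostics in the report above describe ESS as the Kish effective sample size. As a rough illustration, the standard Kish formula can be written in a few lines of base R. The function name `kish_ess` and the simulated weights are hypothetical, and this sketch does not claim to reproduce how DAGassist computes its weights or its reported ESS.

```r
# Kish effective sample size: (sum of weights)^2 / (sum of squared weights).
kish_ess <- function(w) {
  stopifnot(is.numeric(w), all(is.finite(w)), all(w >= 0), sum(w) > 0)
  sum(w)^2 / sum(w^2)
}

# With equal weights, the effective sample size equals the number of rows.
ess_equal <- kish_ess(rep(1, 5000))

# Hypothetical unequal weights (not taken from DAGassist) reduce the ESS.
set.seed(1)
w <- runif(5000, min = 0.05, max = 4.9)
ess_unequal <- kish_ess(w)

c(equal = ess_equal, unequal = ess_unequal)
```

The more the weights vary, the further the ESS falls below the nominal sample size, which is why the report prints the weight range next to it.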
In some cases, the target estimand is the average controlled direct
effect. DAGassist supports recovering the controlled direct
effect using sequential g-estimation via integration with the
DirectEffects R package.
Using the prior example, we can use DAGassist to
estimate the effect of years of education on a person’s number of
children, except through birth control, income, and marital status.
library(DirectEffects)
DAGassist(dag_model,
formula = lm(children ~ edu_year + age + class + gender +
immigrant + urban + birth_control + income +
married + job_stability_t + contract + pref, data = dat),
estimand = c("SATE", "SACDE"),
type = "dotwhisker")
Figure: Visualizing all estimands
To export DAGassist reports as files, users must first install a few commonly used packages. Dependencies vary by export file type:
- modelsummary to build the model comparison table for LaTeX, Word, Excel, and plaintext.
- broom as a fallback for report generation.
- knitr to build the intermediate .md file for Word and plaintext report generation.
- rmarkdown to convert .md files to .docx files for Word report generation.
- writexl to export Excel files.
Essentially, to export:
- LaTeX: modelsummary
- Excel: modelsummary and writexl
- plaintext: modelsummary and knitr
- Word: modelsummary, knitr, and rmarkdown
Users can print LaTeX reports to the console (default) or write them to an output file via the out = parameter:
DAGassist(dag_model,
formula = lm(children ~ edu_year + age + class + gender +
immigrant + urban + birth_control + income +
married + job_stability_t + contract + pref, data = dat),
type = "latex",
out = "out/path/filename.tex")
Word and Excel output require an out = parameter:
#word example
DAGassist(dag_model,
formula = lm(children ~ edu_year + age + class + gender +
immigrant + urban + birth_control + income +
married + job_stability_t + contract + pref, data = dat),
type = "word", #or, type = "docx"
out = "out/path/filename.docx")
#excel example
DAGassist(dag_model,
formula = lm(children ~ edu_year + age + class + gender +
immigrant + urban + birth_control + income +
married + job_stability_t + contract + pref, data = dat),
type = "excel", #or, type = "xlsx"
out = "out/path/filename.xlsx")
Because DAGs encode difficult-to-verify assumptions about the data-generating process (DGP), the direction of some edges may be uncertain (Haber et al. 2022). In the example above, for instance, Urban/Rural is specified as a parent of income. In many cases, place of residence temporally precedes employment and therefore earnings. In others, however, income determines where an individual can afford to live. When the causal direction is genuinely ambiguous, selecting a single orientation may impose an unjustifiable assumption.
DAGassist addresses this problem with partially directed acyclic graphs (PDAGs). Using DAGassist::pdag_robustness(), users can designate edges whose directions are uncertain. The function enumerates all acyclic orientations of those edges and reports whether the minimal adjustment set, canonical adjustment set, or the role of any covariate changes across admissible orientations. These diagnostics indicate whether the proposed estimand is robust to directional ambiguity in the DGP.
DAGassist::pdag_robustness(dag_model,
formula = children ~ edu_year + age + class + gender +
immigrant + urban + birth_control + income + married +
job_stability_t + contract + pref,
uncertain_edges = c("urban -- income",
"income -- immigrant",
"income -- married",
"income -- edu_year"))
##
## PDAG robustness summary:
## - uncertain edges specified: 4
## - worlds evaluated (acyclic orientations): 2
## - minimal adjustment set changed: no
## - canonical adjustment set changed: no
## - covariate role classifications changed: none
## - re-estimation recommended: no
Users may alternatively specify uncertain edges through the main DAGassist() function:
DAGassist(dag_model,
formula = children ~ edu_year + age + class + gender + immigrant + urban +
birth_control + income + married + job_stability_t + contract + pref, data = dat,
uncertain_edges = c("urban -- income",
"income -- immigrant",
"income -- married",
"income -- edu_year"))
## DAGassist Report:
##
## Roles:
## variable role Exp. Out. conf med col dOut dMed dCol dConfOn dConfOff NCT NCO
## edu_year exposure x
## children outcome x
## age confounder x
## class confounder x
## contract confounder x
## gender confounder x
## immigrant confounder x
## urban confounder x
## birth_control mediator x x
## income mediator x x
## job_stability_t mediator x
## married mediator x x
## pref nco x
##
## (!) Bad controls in your formula: {birth_control, income, married, job_stability_t}
## Minimal controls 1: {age, class, contract, gender, immigrant, urban}
## Canonical controls: {age, class, contract, gender, immigrant, pref, urban}
##
## Formulas:
## original: children ~ edu_year + age + class + gender + immigrant + urban + birth_control + income + married + job_stability_t + contract + pref
##
## Balance diagnostics:
## legend: (S)MD compares covariate means between the Original complete-case sample
## and each spec's sample; |(S)MD| > 0.10 flags a covariate whose sample
## composition shifts (binary vars use a raw difference in means).
## Original vs Minimal 1: n = 5000 vs 5000 balanced
## Original vs Canonical: n = 5000 vs 5000 balanced
## Minimal 1 vs Canonical: n = 5000 vs 5000 balanced
##
## Model comparison:
##
## +-------------------+-----------+-----------+-----------+
## | | Original | Minimal 1 | Canonical |
## +===================+===========+===========+===========+
## | edu_year | -0.122*** | -0.080*** | -0.080*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.015) | (0.013) | (0.013) |
## +-------------------+-----------+-----------+-----------+
## | age | 0.070*** | 0.095*** | 0.096*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.004) | (0.003) | (0.003) |
## +-------------------+-----------+-----------+-----------+
## | genderMale | 0.181* | 0.179* | 0.190* |
## +-------------------+-----------+-----------+-----------+
## | | (0.085) | (0.087) | (0.085) |
## +-------------------+-----------+-----------+-----------+
## | immigrantYes | -0.246+ | -0.172 | -0.243+ |
## +-------------------+-----------+-----------+-----------+
## | | (0.128) | (0.131) | (0.129) |
## +-------------------+-----------+-----------+-----------+
## | urbanUrban | 0.121 | 0.238* | 0.175+ |
## +-------------------+-----------+-----------+-----------+
## | | (0.094) | (0.096) | (0.094) |
## +-------------------+-----------+-----------+-----------+
## | birth_control | 0.133 | | |
## +-------------------+-----------+-----------+-----------+
## | | (0.103) | | |
## +-------------------+-----------+-----------+-----------+
## | income | 0.000 | | |
## +-------------------+-----------+-----------+-----------+
## | | (0.000) | | |
## +-------------------+-----------+-----------+-----------+
## | married | 0.703*** | | |
## +-------------------+-----------+-----------+-----------+
## | | (0.122) | | |
## +-------------------+-----------+-----------+-----------+
## | job_stability_t | 0.285*** | | |
## +-------------------+-----------+-----------+-----------+
## | | (0.047) | | |
## +-------------------+-----------+-----------+-----------+
## | contractTemporary | 0.710*** | 0.772*** | 0.804*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.110) | (0.112) | (0.110) |
## +-------------------+-----------+-----------+-----------+
## | contractPermanent | 0.893*** | 1.116*** | 1.093*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.114) | (0.113) | (0.111) |
## +-------------------+-----------+-----------+-----------+
## | pref | 0.581*** | | 0.578*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.042) | | (0.042) |
## +-------------------+-----------+-----------+-----------+
## | Num.Obs. | 5000 | 5000 | 5000 |
## +-------------------+-----------+-----------+-----------+
## | R2 | 0.227 | 0.183 | 0.213 |
## +===================+===========+===========+===========+
## | + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 |
## +===================+===========+===========+===========+
##
## Roles legend: Exp. = exposure; Out. = outcome; CON = confounder; MED = mediator; COL = collider; dOut = descendant of outcome; dMed = descendant of mediator; dCol = descendant of collider; dConfOn = descendant of a confounder on a back-door path; dConfOff = descendant of a confounder off a back-door path; NCT = neutral control on treatment; NCO = neutral control on outcome
##
## PDAG robustness summary:
## - uncertain edges specified: 4
## - worlds evaluated (acyclic orientations): 2
## - minimal adjustment set changed: no
## - canonical adjustment set changed: no
## - covariate role classifications changed: none
## - re-estimation recommended: no
The two implementations differ primarily in their outputs. pdag_robustness() requires only a DAG and a model formula, making it useful for evaluating identification assumptions before data are available, or when one's models are not compatible with DAGassist(). Conversely, DAGassist() requires a data frame and returns the PDAG diagnostics alongside the standard covariate-role table and re-estimated regression models.
PDAG diagnostics are calculated only over acyclic orientations. This constraint can matter substantively. In the example above, reversing income -- edu_year would appear to change income from a mediator to a confounder. Yet that reversal creates a directed cycle: income → edu_year → job_stability_t → income. Because this orientation is inadmissible, it does not contribute to the robustness summary; consequently, reversing income -- edu_year alone does not alter the minimal set, canonical set, or any covariate role among the remaining acyclic DAGs.
Treating the job_stability_t -- income edge as uncertain as well relaxes this constraint and permits additional acyclic orientations. The resulting changes in the robustness summary illustrate how PDAG diagnostics can identify assumptions about causal direction that are consequential for empirical practice.
DAGassist::pdag_robustness(dag_model,
formula = children ~ edu_year + age + class + gender +
immigrant + urban + birth_control + income + married +
job_stability_t + contract + pref,
uncertain_edges = c("urban -- income",
"income -- immigrant",
"income -- married",
"income -- edu_year",
"job_stability_t -- income"))
##
## PDAG robustness summary:
## - uncertain edges specified: 5
## - worlds evaluated (acyclic orientations): 7
## - minimal adjustment set changed: yes
## - canonical adjustment set changed: yes
## - covariate role changed: mediator -> ambiguous (confounder / mediator) for income (good/bad control flip)
## - re-estimation recommended: yes
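The acyclicity filter described above can be sketched in base R. The example below uses only the three-node excerpt named in the text (edu_year, job_stability_t, income); the helper names `is_acyclic` and `admissible_orientations` are hypothetical and this is not the code behind pdag_robustness(). It tries every orientation of the uncertain edges and keeps those without a directed cycle.

```r
# TRUE if the directed graph has no directed cycle (Kahn-style peeling).
# `edges` is a two-column character matrix: from, to.
is_acyclic <- function(edges) {
  nodes <- unique(c(edges[, 1], edges[, 2]))
  repeat {
    sources <- setdiff(nodes, edges[, 2])   # nodes with no incoming edge
    if (length(sources) == 0) break
    nodes <- setdiff(nodes, sources)
    edges <- edges[!(edges[, 1] %in% sources), , drop = FALSE]
  }
  length(nodes) == 0                        # leftover nodes imply a cycle
}

# One logical per orientation of the uncertain edges: is the graph acyclic?
admissible_orientations <- function(fixed, uncertain) {
  k <- nrow(uncertain)
  flips <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), k)))
  unname(apply(flips, 1, function(flip) {
    from <- ifelse(flip, uncertain[, 2], uncertain[, 1])
    to   <- ifelse(flip, uncertain[, 1], uncertain[, 2])
    is_acyclic(rbind(fixed, cbind(from, to)))
  }))
}

# Case 1: only edu_year -- income is uncertain.
fixed1 <- rbind(c("edu_year", "job_stability_t"),
                c("job_stability_t", "income"))
unc1 <- rbind(c("edu_year", "income"))
one_uncertain <- admissible_orientations(fixed1, unc1)

# Case 2: job_stability_t -- income is uncertain as well.
fixed2 <- rbind(c("edu_year", "job_stability_t"))
unc2 <- rbind(c("edu_year", "income"),
              c("job_stability_t", "income"))
two_uncertain <- admissible_orientations(fixed2, unc2)

c(case1 = sum(one_uncertain), case2 = sum(two_uncertain))
```

In this excerpt, only the as-drawn orientation survives in the first case, because reversing edu_year -- income closes the cycle through job_stability_t. Once job_stability_t -- income may also flip, three of the four orientations are acyclic, including ones in which income points into edu_year.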
add_edges()
Whereas PDAGs address directional uncertainty in existing edges, a second set of assumptions concerns missing edges. A missing arrow in a DAG encodes a strong null (Haber et al. 2022). Because these exclusion assumptions are rarely testable, it is useful to consider whether an estimand would survive their violation.
The add_edges argument introduces uncertainty to
specific omitted pathways. DAGassist reports whether adding an edge
changes the minimal or canonical adjustment set, alters a covariate’s
role, or renders the effect unidentifiable. Edges may be directed
("A -> B"), representing an omitted causal path, or
bidirected ("A <-> B"), representing unmeasured
confounding.
DAGassist::add_edges_robustness(dag_model,
formula = children ~ edu_year + age + class + gender + immigrant + urban +
birth_control + income + married + job_stability_t + contract + pref,
add_edges = c("pref -> edu_year", "religion -> edu_year", "edu_year <-> children"))
##
## Edge-addition (exclusion) robustness:
## - edges tested: 3
## - pref -> edu_year: minimal changed: yes; canonical changed: no
## new minimal set(s): {age, class, contract, gender, immigrant, pref, urban}
## role changes: pref: nco->confounder
## - religion -> edu_year: minimal changed: yes; canonical changed: no
## new minimal set(s): {age, class, contract, gender, immigrant, religion, urban}
## role changes: religion: nco->confounder
## - edu_year <-> children: effect NOT identifiable if this pathway exists (no adjustment set blocks it)
## - re-estimation recommended: yes
As with PDAGs, these diagnostics are also available through the main
DAGassist() interface, where they are returned in the
standard report:
DAGassist(dag_model,
formula = children ~ edu_year + age + class + gender + immigrant + urban +
birth_control + income + married + job_stability_t + contract + pref, data = dat,
add_edges = c("pref -> edu_year", "edu_year <-> children"))
## DAGassist Report:
##
## Roles:
## variable role Exp. Out. conf med col dOut dMed dCol dConfOn dConfOff NCT NCO
## edu_year exposure x
## children outcome x
## age confounder x
## class confounder x
## contract confounder x
## gender confounder x
## immigrant confounder x
## urban confounder x
## birth_control mediator x x
## income mediator x x
## job_stability_t mediator x
## married mediator x x
## pref nco x
##
## (!) Bad controls in your formula: {birth_control, income, married, job_stability_t}
## Minimal controls 1: {age, class, contract, gender, immigrant, urban}
## Canonical controls: {age, class, contract, gender, immigrant, pref, urban}
##
## Formulas:
## original: children ~ edu_year + age + class + gender + immigrant + urban + birth_control + income + married + job_stability_t + contract + pref
##
## Balance diagnostics:
## legend: (S)MD compares covariate means between the Original complete-case sample
## and each spec's sample; |(S)MD| > 0.10 flags a covariate whose sample
## composition shifts (binary vars use a raw difference in means).
## Original vs Minimal 1: n = 5000 vs 5000 balanced
## Original vs Canonical: n = 5000 vs 5000 balanced
## Minimal 1 vs Canonical: n = 5000 vs 5000 balanced
##
## Model comparison:
##
## +-------------------+-----------+-----------+-----------+
## | | Original | Minimal 1 | Canonical |
## +===================+===========+===========+===========+
## | edu_year | -0.122*** | -0.080*** | -0.080*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.015) | (0.013) | (0.013) |
## +-------------------+-----------+-----------+-----------+
## | age | 0.070*** | 0.095*** | 0.096*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.004) | (0.003) | (0.003) |
## +-------------------+-----------+-----------+-----------+
## | genderMale | 0.181* | 0.179* | 0.190* |
## +-------------------+-----------+-----------+-----------+
## | | (0.085) | (0.087) | (0.085) |
## +-------------------+-----------+-----------+-----------+
## | immigrantYes | -0.246+ | -0.172 | -0.243+ |
## +-------------------+-----------+-----------+-----------+
## | | (0.128) | (0.131) | (0.129) |
## +-------------------+-----------+-----------+-----------+
## | urbanUrban | 0.121 | 0.238* | 0.175+ |
## +-------------------+-----------+-----------+-----------+
## | | (0.094) | (0.096) | (0.094) |
## +-------------------+-----------+-----------+-----------+
## | birth_control | 0.133 | | |
## +-------------------+-----------+-----------+-----------+
## | | (0.103) | | |
## +-------------------+-----------+-----------+-----------+
## | income | 0.000 | | |
## +-------------------+-----------+-----------+-----------+
## | | (0.000) | | |
## +-------------------+-----------+-----------+-----------+
## | married | 0.703*** | | |
## +-------------------+-----------+-----------+-----------+
## | | (0.122) | | |
## +-------------------+-----------+-----------+-----------+
## | job_stability_t | 0.285*** | | |
## +-------------------+-----------+-----------+-----------+
## | | (0.047) | | |
## +-------------------+-----------+-----------+-----------+
## | contractTemporary | 0.710*** | 0.772*** | 0.804*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.110) | (0.112) | (0.110) |
## +-------------------+-----------+-----------+-----------+
## | contractPermanent | 0.893*** | 1.116*** | 1.093*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.114) | (0.113) | (0.111) |
## +-------------------+-----------+-----------+-----------+
## | pref | 0.581*** | | 0.578*** |
## +-------------------+-----------+-----------+-----------+
## | | (0.042) | | (0.042) |
## +-------------------+-----------+-----------+-----------+
## | Num.Obs. | 5000 | 5000 | 5000 |
## +-------------------+-----------+-----------+-----------+
## | R2 | 0.227 | 0.183 | 0.213 |
## +===================+===========+===========+===========+
## | + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 |
## +===================+===========+===========+===========+
##
## Roles legend: Exp. = exposure; Out. = outcome; CON = confounder; MED = mediator; COL = collider; dOut = descendant of outcome; dMed = descendant of mediator; dCol = descendant of collider; dConfOn = descendant of a confounder on a back-door path; dConfOff = descendant of a confounder off a back-door path; NCT = neutral control on treatment; NCO = neutral control on outcome
add_edges()