
Generate a (console/LaTeX/word/excel/txt) report classifying nodes and comparing models
Source:R/assist.R
DAGassist.RdDAGassist() validates a DAG + model specification, classifies node roles,
builds minimal and canonical adjustment sets, fits comparable models, and
renders a compact report in several formats (console, LaTeX fragment, DOCX,
XLSX, plain text). It also supports passing a single engine call (e.g.
feols(Y ~ X + Z | fe, data = df)) instead of a plain formula.
Usage
DAGassist(
dag,
formula = NULL,
data = NULL,
exposure,
outcome,
engine = stats::lm,
labels = NULL,
verbose = TRUE,
type = c("console", "latex", "word", "docx", "excel", "xlsx", "text", "txt", "dwplot",
"dotwhisker"),
show = c("all", "roles", "models"),
out = NULL,
imply = FALSE,
eval_all = FALSE,
exclude = NULL,
omit_intercept = TRUE,
omit_factors = TRUE,
bivariate = FALSE,
estimand = c("none", "ATE", "ATT"),
engine_args = list()
)Arguments
- dag
A dagitty object (see
dagitty::dagitty()).- formula
Either (a) a standard model formula
Y ~ X + ..., or (b) a single engine call such asfeols(Y ~ X + Z | fe, data = df, ...). When an engine call is provided,engine,data, and extra arguments are automatically extracted from the call.- data
A
data.frame(or compatible, e.g. tibble). Optional if supplied via the engine call informula.- exposure
Optional character scalar; if missing/empty, inferred from the DAG (must be unique).
- outcome
Optional character scalar; if missing/empty, inferred from the DAG (must be unique).
- engine
Modeling function, default stats::lm. Ignored if
formulais a single engine call (in that case the function is taken from the call).- labels
Optional variable labels (named character vector or data.frame).
- verbose
Logical (default
TRUE). Controls verbosity in the console printer (formulas + notes).- type
Output type. One of
"console"(default),"latex"/"docx"/"word","excel"/"xlsx","text"/"txt", or the plotting types"dwplot"/"dotwhisker". Fortype = "latex", if noout=is supplied, a LaTeX fragment is printed to the console instead of being written to disk.- show
Which sections to include in the output. One of
"all"(default),"roles"(only the roles grid), or"models"(only the model comparison table/plot). This makes it possible to generate and export just roles or just comparisons.- out
Output file path for the non-console types:
type="latex": a LaTeX fragment written toout(usually.tex); when omitted, the fragment is printed to the console.type="text"/"txt": a plain-text file written toout; when omitted, the report is printed to console.type="dotwhisker"/"dwplot": a image (.png) file written toout; when omitted, the plot is rendered within RStudio.type="docx"/"word": a Word (.docx) file written toout.type="excel"/"xlsx": an Excel (.xlsx) file written toout. Ignored fortype="console".
- imply
Logical; default
FALSE. Specifies evaluation scope.If
FALSE(default): restrict DAG evaluation to variables named in the formula (prune the DAG to exposure, outcome, and RHS terms). Roles/sets/bad-controls are computed on this pruned graph, and the roles table only shows those variables. Essentially, it fits the DAG to the formula.If
TRUE: evaluate on the full DAG and allow DAG-implied controls in the minimal/canonical sets. The roles table shows all DAG nodes, and the printout notes any variables added beyond your RHS. Essentially, it fits the formula to the DAG.
- eval_all
Logical; default
FALSE. WhenTRUE, keep all original RHS terms that are not in the DAG (e.g., fixed effects, interactions, splines, convenience covariates) in the minimal and canonical formulas. WhenFALSE(default), RHS terms not present as DAG nodes are dropped from those derived formulas.- exclude
Optional character vector to remove neutral controls from the canonical set. Recognized values are
"nct"(drop neutral-on-treatment controls) and"nco"(drop neutral-on-outcome controls). You can supply one or both, e.g.exclude = c("nco", "nct"); each requested variant is fitted and shown as a separate "Canon. (-...)" column in the console/model exports.- omit_intercept
Logical; drop intercept rows from the model comparison display (default
TRUE).- omit_factors
Logical; drop factor-level rows from the model comparison display (default
TRUE). This parameter only suppresses factor output–they are still included in the regression.- bivariate
Logical; if
TRUE, include a bivariate (exposure-only) specification in the comparison table in addition to the user's original and DAG-derived models.- estimand
Character; causal estimand for binary treatments. Currently only supported for
type = "console". One of"none"(default),"ATE", or"ATT". When"ATE"or"ATT", the console print method will attempt to compute inverse-probability weights via the WeightIt package and add weighted versions of each comparison model as additional columns.- engine_args
Named list of extra arguments forwarded to
engine(...). Ifformulais an engine call, arguments from the call are merged withengine_args(call values take precedence).
Value
An object of class "DAGassist_report", invisibly for file and plot
outputs, and printed for type="console". The list contains:
validation- result fromvalidate_spec(...)which verifies acyclicity and X/Y declarations.roles- raw roles data.frame fromclassify_nodes(...)(logic columns).roles_display- roles grid after labeling/renaming for exporters.bad_in_user- variables in the user's RHS that areMED/COL/dOut/dMed/dCol.controls_minimal- (legacy) one minimal set (character vector).controls_minimal_all- list of all minimal sets (character vectors).controls_canonical- canonical set (character vector; may be empty).controls_canonical_excl- named list of filtered canonical sets (e.g.$nco,$nct) whenexcludeis used.formulas- list withoriginal,minimal,minimal_list,canonical, and any filtered canonical formulas.models- list with fitted modelsoriginal,minimal,minimal_list,canonical, and any filtered canonical models.verbose,imply- flags as provided.
Details
In addition to tabular export formats, you can create a dot-whisker plot
(via type = "dwplot" or type = "dotwhisker") for the model comparison.
Engine-call parsing. If formula is a call (e.g., feols(Y ~ X | fe, data=df)),
DAGassist extracts the engine function, formula, data argument, and any additional
engine arguments directly from that call; these are merged with engine/engine_args
you pass explicitly (call arguments win).
fixest tails. For engines like fixest that use | to denote FE/IV parts,
DAGassist preserves any | ... tail when constructing minimal/canonical formulas
(e.g., Y ~ X + controls | fe | iv(...)).
Roles grid. The roles table displays short headers:
Exp.(exposure),Out.(outcome),CON(confounder),MED(mediator),COL(collider),dOut(descendant ofY),dMed(descendant of any mediator),dCol(descendant of any collider),dConfOn(descendant of a confounder on a back-door path),dConfOff(descendant of a confounder off a back-door path),NCT(neutral control on treatment),NCO(neutral control on outcome). These extra flags are used to (i) warn about bad controls, and (ii) build filtered canonical sets such as “Canonical (–NCO)” for export.
Bad controls. For total-effect estimation, DAGassist flags as bad controls
any variables that are MED, COL, dOut, dMed, or dCol. These are warned in
the console and omitted from the model-comparison table. Valid confounders (pre-treatment)
are eligible for minimal/canonical adjustment sets.
Output types.
consoleprints roles, adjustment sets, formulas (ifverbose), and a compact model comparison (using{modelsummary}if available, falling back gracefully otherwise).latexwrites or prints a LaTeX fragment you can\\input{}into a paper — it usestabularraylong tables and will include any requested Canon. (-NCO / -NCT) variants.docx/wordwrites a Word doc (respectsoptions(DAGassist.ref_docx=...)if set).excel/xlsxwrites an Excel workbook with tidy tables.text/txtwrites a plain-text report for logs/notes.dwplot/dotwhiskerproduces a dot-whisker visualization of the fitted models.
Dependencies. Core requires {dagitty}. Optional enhancements: {modelsummary}
(pretty tables), {broom} (fallback tidying), {rmarkdown} + pandoc (DOCX),
{writexl} (XLSX), {dotwhisker}/{ggplot2} for plotting.
Interpreting the output
See the vignette articles for worked examples on generating roles-only, models-only, and LaTeX/Word/Excel reports.
Model Comparison:
Minimal - the smallest adjustment set that blocks all back-door paths (confounders only).
Canonical - the largest permissible set: includes all controls that are not
MED,COL,dOut,dMed, ordCol.
Errors and edge cases
If exposure/outcome cannot be inferred uniquely, the function stops with a clear message.
Fitting errors (e.g., FE collinearity) are captured and displayed in comparisons without aborting the whole pipeline.
See also
print.DAGassist_report() for the console printer, and the helper
exporters in report_* modules.
Examples
# generate a console DAGassist report
DAGassist(dag = g,
formula = lm(Y ~ X + Z + C + M, data = df))
#> DAGassist Report:
#>
#> Roles:
#> variable role Exp. Out. conf med col dOut dMed dCol dConfOn dConfOff NCT NCO
#> X exposure x
#> Y outcome x
#> Z confounder x
#> M mediator x
#> C collider x x x
#>
#> (!) Bad controls in your formula: {C, M}
#> Minimal controls 1: {Z}
#> Canonical controls: {Z}
#>
#> Formulas:
#> original: Y ~ X + Z + C + M
#>
#> Model comparison:
#>
#> +---+----------+-----------+-----------+
#> | | Original | Minimal 1 | Canonical |
#> +===+==========+===========+===========+
#> | X | 0.467*** | 1.306*** | 1.306*** |
#> +---+----------+-----------+-----------+
#> | | (0.122) | (0.098) | (0.098) |
#> +---+----------+-----------+-----------+
#> | Z | 0.185+ | 0.235+ | 0.235+ |
#> +---+----------+-----------+-----------+
#> | | (0.102) | (0.127) | (0.127) |
#> +---+----------+-----------+-----------+
#> | C | 0.368*** | | |
#> +---+----------+-----------+-----------+
#> | | (0.076) | | |
#> +---+----------+-----------+-----------+
#> | M | 0.512*** | | |
#> +---+----------+-----------+-----------+
#> | | (0.077) | | |
#> +===+==========+===========+===========+
#> | + p < 0.1, * p < 0.05, ** p < 0.01, |
#> | *** p < 0.001 |
#> +===+==========+===========+===========+
#>
#> Roles legend: Exp. = exposure; Out. = outcome; CON = confounder; MED = mediator; COL = collider; dOut = descendant of outcome; dMed = descendant of mediator; dCol = descendant of collider; dConfOn = descendant of a confounder on a back-door path; dConfOff = descendant of a confounder off a back-door path; NCT = neutral control on treatment; NCO = neutral control on outcome
# generate a LaTeX DAGassist report in console
DAGassist(dag = g,
formula = lm(Y ~ X + Z + C + M, data = df),
type = "latex")
#> % --------------------- DAGassist LaTeX fragment ---------------------
#> % Requires: \usepackage{tabularray} \UseTblrLibrary{booktabs,siunitx,talltblr}
#> \begingroup\footnotesize
#> \begingroup\setlength{\emergencystretch}{3em}
#> % needs \usepackage{graphicx} for \rotatebox
#> \begin{longtblr}[presep=0pt, postsep=0pt, caption={DAGassist Report:}, label={tab:dagassist}]%
#> {width=\textwidth,colsep=1.5pt,rowsep=0pt,abovesep=0pt,belowsep=0pt,column{3}={colsep=6pt},colspec={X[35,l]X[15,l]X[8,c]X[8,c]X[8,c]X[8,c]X[8,c]X[8,c]X[8,c]X[8,c]X[8,c]X[8,c]X[8,c]X[8,c]}}
#> \toprule
#> Variable & Role & \rotatebox[origin=c]{60}{Exp.} & \rotatebox[origin=c]{60}{Out.} & \rotatebox[origin=c]{60}{CON} & \rotatebox[origin=c]{60}{MED} & \rotatebox[origin=c]{60}{COL} & \rotatebox[origin=c]{60}{dOut} & \rotatebox[origin=c]{60}{dMed} & \rotatebox[origin=c]{60}{dCol} & \rotatebox[origin=c]{60}{dConfOn} & \rotatebox[origin=c]{60}{dConfOff} & \rotatebox[origin=c]{60}{NCT} & \rotatebox[origin=c]{60}{NCO} \\
#> \midrule
#> C & collider & & & & & x & x & x & & & & & \\
#> M & mediator & & & & x & & & & & & & & \\
#> X & exposure & x & & & & & & & & & & & \\
#> Y & outcome & & x & & & & & & & & & & \\
#> Z & confounder & & & x & & & & & & & & & \\
#> \bottomrule
#> \end{longtblr}
#> \endgroup
#> % no vertical glue between tables
#> \nointerlineskip
#> \begin{longtblr}[presep=0pt,postsep=0pt,%% tabularray outer open
#> entry=none,label=none,
#> note{}={+ p \num{< 0.1}, * p \num{< 0.05}, ** p \num{< 0.01}, *** p \num{< 0.001}},
#> ] %% tabularray outer close
#> { %% tabularray inner open
#> colspec={X[]X[]X[]X[]},
#> hline{2}={1-4}{solid, black, 0.05em},
#> hline{10}={1-4}{solid, black, 0.05em},
#> hline{1}={1-4}{solid, black, 0.1em},
#> hline{12}={1-4}{solid, black, 0.1em},
#> column{2-4}={}{halign=c},
#> column{1}={}{halign=l},
#> } %% tabularray inner close
#> & Original & Minimal 1 & Canonical \\
#> X & \num{0.467}*** & \num{1.306}*** & \num{1.306}*** \\
#> & (\num{0.122}) & (\num{0.098}) & (\num{0.098}) \\
#> Z & \num{0.185}+ & \num{0.235}+ & \num{0.235}+ \\
#> & (\num{0.102}) & (\num{0.127}) & (\num{0.127}) \\
#> C & \num{0.368}*** & & \\
#> & (\num{0.076}) & & \\
#> M & \num{0.512}*** & & \\
#> & (\num{0.077}) & & \\
#> Num.Obs. & \num{150} & \num{150} & \num{150} \\
#> R2 & \num{0.806} & \num{0.693} & \num{0.693} \\
#> \end{longtblr}
#> \par\endgroup
#> \addtocounter{table}{-1}
#> \vspace{1em}
#> \footnotesize
#> \noindent\textit{Controls (minimal):} {Z}\\
#> \textit{Controls (canonical):} {Z}
#> \\
#> \scriptsize
#> \textit{Roles legend:} Exp. (exposure); Out. (outcome); CON (confounder); MED (mediator); COL (collider); dOut (proper descendant of Y); dMed (proper descendant of any mediator); dCol (proper descendant of any collider); dConfOn (descendant of a confounder on a back-door path); dConfOff (descendant of a confounder off a back-door path); NCT (neutral control on treatment); NCO (neutral control on outcome).
# generate just the roles table in the console
DAGassist(dag = g,
show = "roles")
#> DAGassist Report:
#>
#> Roles:
#> variable role Exp. Out. conf med col dOut dMed dCol dConfOn dConfOff NCT NCO
#> X exposure x
#> Y outcome x
#> Z confounder x
#> M mediator x
#> C collider x x x
#> A nco x
#> B nco x
#>
#> Roles legend: Exp. = exposure; Out. = outcome; CON = confounder; MED = mediator; COL = collider; dOut = descendant of outcome; dMed = descendant of mediator; dCol = descendant of collider; dConfOn = descendant of a confounder on a back-door path; dConfOff = descendant of a confounder off a back-door path; NCT = neutral control on treatment; NCO = neutral control on outcome