
Introduction

DAGassist() is designed to be easy to use, and most of its features are available through a simple two-argument call:

DAGassist(
  dag = your_dag_model,
  formula = your_regression_call
)

However, DAGassist() includes several parameters for more specific applications. This vignette explains how to use those parameters to get the most out of DAGassist().

formula arguments

The formula argument accepts either a bare formula or a regression call.

# bare formula
DAGassist(
  dag = dag_model,
  formula = Y ~ X + C,
  data = df,
  exposure = "X",
  outcome = "Y"
)

# regression call; the formula and data are inferred from it
DAGassist(
  dag = dag_model,
  formula = lm(Y ~ X + C, data=df)
)

The two calls above print identical output.
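
One way to see why the regression-call form needs no separate data argument is that an unevaluated call already carries the formula and the name of the data object. The sketch below uses only base R. extract_spec() is a hypothetical helper written for illustration; it is not part of DAGassist and may not reflect how DAGassist parses its input.

```r
# Hypothetical helper (not part of DAGassist): read the pieces of a
# regression call without fitting the model.
extract_spec <- function(call_expr) {
  expr <- substitute(call_expr)             # capture the call unevaluated
  fun  <- eval(expr[[1]], parent.frame())   # the modelling function, e.g. lm
  full <- match.call(fun, expr)             # attach argument names
  list(
    engine  = deparse(expr[[1]]),
    formula = eval(full$formula, parent.frame()),
    data    = deparse(full$data)
  )
}

spec <- extract_spec(lm(Y ~ X + C, data = df))
spec$engine              # name of the fitting function
all.vars(spec$formula)   # variables named in the formula
spec$data                # name of the data object
```

Because the call is never evaluated, the sketch runs even when no data frame named df has been created.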

imply arguments

To restrict DAGassist to the variables that appear in your formula, use imply = FALSE.

DAGassist(
  dag = dag_model,
  formula = lm(Y~X+C, data = df),
  imply = FALSE
)
#> DAGassist Report: 
#> 
#> Roles:
#> variable  role       X  Y  conf  med  col  IO  dMed  dCol
#> X         exposure   x                                   
#> Y         outcome       x                                
#> C         collider                    x    x             
#> 
#>  (!) Bad controls in your formula: {C}
#> Minimal controls 1: {}
#> Canonical controls: {}
#> 
#> Formulas:
#>   original:  Y ~ X + C
#> 
#> Model comparison:
#> 
#> +---+----------+-----------+-----------+
#> |   | Original | Minimal 1 | Canonical |
#> +===+==========+===========+===========+
#> | X | 0.908*** | 1.415***  | 1.415***  |
#> +---+----------+-----------+-----------+
#> |   | (0.030)  | (0.021)   | (0.021)   |
#> +---+----------+-----------+-----------+
#> | C | 0.475*** |           |           |
#> +---+----------+-----------+-----------+
#> |   | (0.022)  |           |           |
#> +===+==========+===========+===========+
#> | + p < 0.1, * p < 0.05, ** p < 0.01,  |
#> | *** p < 0.001                        |
#> +===+==========+===========+===========+

To let DAGassist consider every variable in your DAG, including those that are not in your formula, use imply = TRUE.

DAGassist(
  dag = dag_model,
  formula = lm(Y~X+C, data = df),
  imply = TRUE
)
#> DAGassist Report: 
#> 
#> Roles:
#> variable  role        X  Y  conf  med  col  IO  dMed  dCol
#> X         exposure    x                                   
#> Y         outcome        x                      x         
#> Z         confounder        x                             
#> M         mediator                x                       
#> C         collider                     x    x   x         
#> A         other                                           
#> B         other                                           
#> 
#>  (!) Bad controls in your formula: {C}
#> Minimal controls 1: {Z}
#> Canonical controls: {A, B, Z}
#> 
#> Formulas:
#>   original:  Y ~ X + C
#>   minimal 1 : Y ~ X + Z
#>   canonical: Y ~ X + A + B + Z
#> 
#> Note: DAGassist added variables not in your formula, based on the
#> relationships in your DAG, to block back-door paths
#> between X and Y.
#>   - Minimal 1 added: {Z}
#>   - Canonical added: {A, B, Z}
#> 
#> Model comparison:
#> 
#> +---+----------+-----------+-----------+
#> |   | Original | Minimal 1 | Canonical |
#> +===+==========+===========+===========+
#> | X | 0.908*** | 1.256***  | 1.256***  |
#> +---+----------+-----------+-----------+
#> |   | (0.030)  | (0.027)   | (0.026)   |
#> +---+----------+-----------+-----------+
#> | C | 0.475*** |           |           |
#> +---+----------+-----------+-----------+
#> |   | (0.022)  |           |           |
#> +---+----------+-----------+-----------+
#> | Z |          | 0.311***  | 0.309***  |
#> +---+----------+-----------+-----------+
#> |   |          | (0.034)   | (0.033)   |
#> +---+----------+-----------+-----------+
#> | A |          |           | 0.187***  |
#> +---+----------+-----------+-----------+
#> |   |          |           | (0.026)   |
#> +---+----------+-----------+-----------+
#> | B |          |           | -0.057*   |
#> +---+----------+-----------+-----------+
#> |   |          |           | (0.026)   |
#> +===+==========+===========+===========+
#> | + p < 0.1, * p < 0.05, ** p < 0.01,  |
#> | *** p < 0.001                        |
#> +===+==========+===========+===========+

DAGassist reports which variables it added. By default, imply = FALSE.
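
This vignette does not show how dag_model was built. As a cross-check on the sets reported above, the sketch below assumes a dagitty DAG whose edges were chosen to reproduce the reported roles (the exact edges, especially those for A and B, are a guess) and asks dagitty for the minimal and canonical adjustment sets directly.

```r
library(dagitty)

# Hypothetical DAG: the edges are assumed, chosen to match the roles in the report
dag_model <- dagitty('dag {
  X [exposure]
  Y [outcome]
  Z -> X
  Z -> Y
  X -> M
  M -> Y
  X -> C
  Y -> C
  A -> Y
  B -> Y
}')

# Smallest sets that close the back-door paths between X and Y
min_sets <- adjustmentSets(dag_model, type = "minimal")

# Canonical set: ancestors of X or Y, excluding X, Y, and any
# descendants of nodes on the causal paths from X to Y
can_set <- adjustmentSets(dag_model, type = "canonical")

# C is a common child of X and Y, which is why controlling for it is flagged
c_is_collider <- "C" %in% intersect(children(dag_model, "X"),
                                    children(dag_model, "Y"))

min_sets
can_set
c_is_collider
```

Under these assumed edges, M is excluded from both sets because it lies on the causal path from X to Y, and C is excluded because it is a descendant of the outcome.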

omit_factors and omit_intercept arguments

DAGassist omits factor and intercept rows from the model comparison table by default. To show them, set omit_factors = FALSE and omit_intercept = FALSE. Terms that are not nodes in your DAG, such as i(region) below, are not evaluated by DAGassist and are not included in the minimal or canonical models.

DAGassist(
  dag = dag_model,
  formula = fixest::feols(
    Y ~ X + C + i(region),  
    data = df),
  omit_factors = FALSE,
  omit_intercept = FALSE
)
#> DAGassist Report: 
#> 
#> Roles:
#> variable  role       X  Y  conf  med  col  IO  dMed  dCol
#> X         exposure   x                                   
#> Y         outcome       x                                
#> C         collider                    x    x             
#> 
#>  (!) Bad controls in your formula: {C}
#> Minimal controls 1: {}
#> Canonical controls: {}
#> 
#> Note: The following regressors, which are included in the below models, were not evaluated by DAGassist because they are not nodes in the DAG:
#>   {i(region)}
#> 
#> Formulas:
#>   original:  Y ~ X + C + i(region)
#> 
#> Model comparison:
#> 
#> +----------------+----------+-----------+-----------+
#> |                | Original | Minimal 1 | Canonical |
#> +================+==========+===========+===========+
#> | (Intercept)    | 0.060    | -0.011    | -0.011    |
#> +----------------+----------+-----------+-----------+
#> |                | (0.049)  | (0.027)   | (0.027)   |
#> +----------------+----------+-----------+-----------+
#> | X              | 0.908*** | 1.415***  | 1.415***  |
#> +----------------+----------+-----------+-----------+
#> |                | (0.030)  | (0.021)   | (0.021)   |
#> +----------------+----------+-----------+-----------+
#> | C              | 0.474*** |           |           |
#> +----------------+----------+-----------+-----------+
#> |                | (0.022)  |           |           |
#> +----------------+----------+-----------+-----------+
#> | region = North | -0.030   |           |           |
#> +----------------+----------+-----------+-----------+
#> |                | (0.069)  |           |           |
#> +----------------+----------+-----------+-----------+
#> | region = South | -0.085   |           |           |
#> +----------------+----------+-----------+-----------+
#> |                | (0.069)  |           |           |
#> +----------------+----------+-----------+-----------+
#> | region = West  | -0.167*  |           |           |
#> +----------------+----------+-----------+-----------+
#> |                | (0.069)  |           |           |
#> +================+==========+===========+===========+
#> | + p < 0.1, * p < 0.05, ** p < 0.01, *** p <       |
#> | 0.001                                             |
#> +================+==========+===========+===========+
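
To anticipate which regressors will appear in that note, you can compare the right-hand-side terms of a formula with the node names of the DAG. The sketch below uses only base R; dag_nodes is a hypothetical vector typed in by hand, standing in for the node names of dag_model.

```r
# Formula from the example above; terms() parses it without needing data
f <- Y ~ X + C + i(region)

# Assumed node names for illustration (not read from a real DAG object)
dag_nodes <- c("X", "Y", "C")

# Right-hand-side terms exactly as written in the formula
rhs_terms <- attr(terms(f), "term.labels")

# Terms that have no matching node in the DAG
not_in_dag <- setdiff(rhs_terms, dag_nodes)
not_in_dag
```

This is a plain string comparison, so a term such as i(region) counts as unmatched even if the DAG contained a node called region.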

labels arguments

Pass a named list to labels to replace variable names with descriptive labels in the output.

labs <- list(
  X = "Exposure",
  C = "Collider"
)

DAGassist(
  dag = dag_model,
  formula = lm(
    Y ~ X + C, data = df),
  labels = labs
)
#> DAGassist Report: 
#> 
#> Roles:
#> variable  role       X  Y  conf  med  col  IO  dMed  dCol
#> Exposure  exposure   x                                   
#> Y         outcome       x                                
#> Collider  collider                    x    x             
#> 
#>  (!) Bad controls in your formula: {C}
#> Minimal controls 1: {}
#> Canonical controls: {}
#> 
#> Formulas:
#>   original:  Y ~ X + C
#> 
#> Model comparison:
#> 
#> +----------+----------+-----------+-----------+
#> |          | Original | Minimal 1 | Canonical |
#> +==========+==========+===========+===========+
#> | Exposure | 0.908*** | 1.415***  | 1.415***  |
#> +----------+----------+-----------+-----------+
#> |          | (0.030)  | (0.021)   | (0.021)   |
#> +----------+----------+-----------+-----------+
#> | Collider | 0.475*** |           |           |
#> +----------+----------+-----------+-----------+
#> |          | (0.022)  |           |           |
#> +==========+==========+===========+===========+
#> | + p < 0.1, * p < 0.05, ** p < 0.01, *** p   |
#> | < 0.001                                     |
#> +==========+==========+===========+===========+

The labels argument follows the coef_rename logic of modelsummary(), so an incomplete label list does not throw an error: variables without a label keep their original names.

DAGassist(
  dag = dag_model,
  formula = lm(
    Y ~ X + C, data = df),
  labels = labs,
  imply = TRUE
)
#> DAGassist Report: 
#> 
#> Roles:
#> variable  role        X  Y  conf  med  col  IO  dMed  dCol
#> Exposure  exposure    x                                   
#> Y         outcome        x                      x         
#> Z         confounder        x                             
#> M         mediator                x                       
#> Collider  collider                     x    x   x         
#> A         other                                           
#> B         other                                           
#> 
#>  (!) Bad controls in your formula: {C}
#> Minimal controls 1: {Z}
#> Canonical controls: {A, B, Z}
#> 
#> Formulas:
#>   original:  Y ~ X + C
#>   minimal 1 : Y ~ X + Z
#>   canonical: Y ~ X + A + B + Z
#> 
#> Note: DAGassist added variables not in your formula, based on the
#> relationships in your DAG, to block back-door paths
#> between X and Y.
#>   - Minimal 1 added: {Z}
#>   - Canonical added: {A, B, Z}
#> 
#> Model comparison:
#> 
#> +----------+----------+-----------+-----------+
#> |          | Original | Minimal 1 | Canonical |
#> +==========+==========+===========+===========+
#> | Exposure | 0.908*** | 1.256***  | 1.256***  |
#> +----------+----------+-----------+-----------+
#> |          | (0.030)  | (0.027)   | (0.026)   |
#> +----------+----------+-----------+-----------+
#> | Collider | 0.475*** |           |           |
#> +----------+----------+-----------+-----------+
#> |          | (0.022)  |           |           |
#> +----------+----------+-----------+-----------+
#> | Z        |          | 0.311***  | 0.309***  |
#> +----------+----------+-----------+-----------+
#> |          |          | (0.034)   | (0.033)   |
#> +----------+----------+-----------+-----------+
#> | A        |          |           | 0.187***  |
#> +----------+----------+-----------+-----------+
#> |          |          |           | (0.026)   |
#> +----------+----------+-----------+-----------+
#> | B        |          |           | -0.057*   |
#> +----------+----------+-----------+-----------+
#> |          |          |           | (0.026)   |
#> +==========+==========+===========+===========+
#> | + p < 0.1, * p < 0.05, ** p < 0.01, *** p   |
#> | < 0.001                                     |
#> +==========+==========+===========+===========+
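
The fallback behaviour seen above, where Y, Z, A, and B keep their names, can be pictured as a lookup with a default. The sketch below is a base R illustration of that idea; relabel() is a hypothetical helper and is not the code that DAGassist or modelsummary actually runs.

```r
labs <- list(
  X = "Exposure",
  C = "Collider"
)

# Hypothetical helper: swap in a label where one exists, else keep the name
relabel <- function(vars, labels) {
  out <- vars
  hit <- vars %in% names(labels)                        # which names have a label
  out[hit] <- unlist(labels[vars[hit]], use.names = FALSE)
  out
}

shown <- relabel(c("X", "Y", "Z", "C"), labs)
shown
```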